Start from the raw sequences, for example the file Input_raw_seqs.fastq.

1. ./process_Rawseqs.pl Enzyme_code \
      number_of_base_cut_at_the_end \
      Barcode_file.txt \
      Input_raw_seqs.fastq \
      Output_selected_seqs.fastq

   Repeat the above step for all your raw sequence files. Note: for each
   file, use only the barcodes you want to select from that file.

2. ./BarcodeSplit.pl RAD Barcodes.txt *.fastq

   Note: all the fastq files produced by step 1 must be in the same folder,
   and Barcodes.txt must contain the barcodes for all samples.
   This step produces one RAD_barcode.fasta file for each sample.

3. ./HashSeqs.pl *.fasta

   This step produces the *.hash files.

4. $ RepeatMasker -lib INRArepbase1.txt -pa 8 sample.hash

   RepeatMasker must be installed. Run this step for every *.hash file.
   Note: -pa 8 tells RepeatMasker to use 8 processors. You can change this
   number to suit your computer's configuration.

5. ./list_unmasked.pl *.masked

   The *.masked files are generated by RepeatMasker in step 4.

6. ./select_hash.pl 200 test.fna *.unmasked

   Note: we chose 200 as the upper limit of reads, but you can use any number.
   The *.unmasked files are generated in step 5.
   The file test.fna is the output of this step. It contains all the selected
   unmasked sequences.

7. ./re-hash.pl test test.fna test_rehash.fna test_rehash.dat 

   Here test.fna is the file generated in step 6.

8. ./novoindex \
     test_rehash.idx \
     test_rehash.fna

9. $ novoalign -r E 20 -t 250 -F FA -d test_rehash.idx \
     -f test_rehash.fna > test_rehash.novo &

10. ./find_bi_allelic.pl test_rehash.novo test_biAllelic.txt 90

   Note: the last parameter, 90, allows up to 3 mismatches between the two
   alleles; a value of 30 allows 1 mismatch.

11. ./call_snp.pl samples.txt \
      test_rehash.dat \
      test_biAllelic.txt \
      test_genotype.txt \
      test_genotype.fna 1

   Note: the last input parameter, 1, means we want to output the loci that
   have 1 SNP; if it is 3, we output the loci that have up to 3 SNPs.
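
Steps 1 and 4 have to be repeated for every input file. One possible way to
do that is with small shell wrappers such as the ones sketched below. This is
only an illustration, not part of the pipeline itself. It assumes that the
barcode file for X.fastq is named X.barcodes.txt, that the selected reads go
into a separate folder (OUTDIR, default "selected"), and that NPROC holds the
-pa value; these names, and the RUN variable, are made up for this example.
It also relies on RepeatMasker writing its output next to the input as
<input>.masked.

```shell
#!/bin/sh
# Sketch only: batch wrappers for step 1 and step 4.
# Assumptions (not part of the original scripts):
#   - the barcode file for X.fastq is named X.barcodes.txt
#   - selected reads are written to $OUTDIR (default: selected)
#   - NPROC holds the RepeatMasker -pa value (default: 8)
# Set RUN=echo to print the commands instead of running them.

# Step 1 for several raw files:
#   process_all Enzyme_code number_of_base_cut_at_the_end file1.fastq ...
process_all() {
    enzyme=$1    # Enzyme_code
    ncut=$2      # number_of_base_cut_at_the_end
    shift 2
    outdir=${OUTDIR:-selected}
    mkdir -p "$outdir" || return 1
    for raw in "$@"; do
        base=${raw%.fastq}     # path without the .fastq suffix
        name=${base##*/}       # file name without its directory
        ${RUN:-} ./process_Rawseqs.pl "$enzyme" "$ncut" \
            "$base.barcodes.txt" "$raw" "$outdir/$name.fastq" || return 1
    done
}

# Step 4 for several hash files:
#   mask_all *.hash
mask_all() {
    for hash in "$@"; do
        # Skip files that already have RepeatMasker output.
        [ -e "$hash.masked" ] && continue
        ${RUN:-} RepeatMasker -lib INRArepbase1.txt -pa "${NPROC:-8}" \
            "$hash" || return 1
    done
}
```

For example, "RUN=echo mask_all *.hash" prints the RepeatMasker commands
without running them. Writing the selected reads to a separate folder keeps
*.fastq in step 2 from also matching the raw files.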

--
Gao, Guangtu
Guangtu.Gao@ARS.USDA.GOV