CRI-MAP Users Forum Posted mail
From jillian.maddoxalumni.unimelb.edu.au  Wed Feb 22 18:09:35 2012
From: Jill Maddox <jillian.maddoxalumni.unimelb.edu.au>
To: Multiple Recipients of <crimap-usersanimalgenome.org>
Subject: Bug fixes for bash recombination script, unmerge
Date: Wed, 22 Feb 2012 18:09:35 -0600

Hi All

I discovered a bug in my chrompic_recomb_ord.sh script. The original left
off the first group of 10 phases for paternal lines in the chrompic output.
I have also uploaded an updated version of the unmerge program
(unmerge_src_310112.tar.gz) to the animal genome file download site. The
previous version of unmerge had a bug when deleting loci from gen files;
it worked fine when extracting loci. Please note that the part of the
unmerge program that handles deleting families needs further work. Also,
if you specify a locus that is not in the gen file, an incorrect locus
gets extracted. The fixes for these are still on my to-do list.

Both can be downloaded from
http://www.animalgenome.org/tools/share/crimap/downloads. If people have
crimap related tools that they wish to share then they can make them
available via http://www.animalgenome.org/cgi-bin/util/fileshare

Here's the corrected version of the chrompic_recomb_ord.sh script.

=================================================================================== 

#!/bin/bash
# chrompic_recomb_ord.sh
# Select lines containing at least min_num_recomb recombinants and order
# them from lowest to highest
# Input: chrompic filename, minimum number of recombinants
# Output: chrompic filename with .recomb suffix
if [ $# -ne 2 ]; then
     echo 1>&2 Usage: $0 chrompic_filename min_num_recomb 
     exit 127 
fi 
if ! [ -a ./${1} ]; then 
    echo "file ${1} not found" 
    exit 127 
fi 
if ! [[ "$2" =~ ^[0-9]+$ ]]; then 
     echo "$2 is not a suitable number" 
     exit 127 
fi 

egrep " [-0o1i]{10} " $1 | sed 's/ \([0-9]*\)$/\t\1/g' | \
gawk -v var=$2 'BEGIN{FS=OFS="\t"; line1=0; minrec=var;} \
  {if (line1 == 0) \
    {line1 = 1; num_in_line=split($1, lineinfo, " "); ind = lineinfo[1]; \
     if ($2 >= minrec) \
       {for (i=2; i <= num_in_line; i++) \
          printf "%s", lineinfo[i]; \
        printf "\t%d\t%d\n", $2, ind;} \
     else next;} \
  else {line1 = 0; \
     if ($2 >= minrec) \
     {num_in_line=split($1, lineinfo, " "); \
      for (i=1; i <= num_in_line; i++) \
          printf "%s", lineinfo[i]; \
      printf "\t%d\t%d\n", $2, ind;}}}' | \
sort -n -k2 -n -k3 > ${1}.recomb
exit 

============================================================================================== 
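For anyone who wants to see what the first two stages of the pipeline do
before gawk takes over, here is a small sketch. The sample lines are made up
and only approximate the shape of chrompic output (an ID line followed by a
line without an ID, each ending in a recombinant count). The names
sample_lines and split_count are hypothetical and not part of the script.
The sketch uses grep -E, the equivalent of egrep, and assumes GNU sed,
since "\t" in the replacement is a GNU extension.

```shell
#!/bin/bash
# Made-up sample (an assumption about the general layout, not real data):
# two phase lines ending in a recombinant count, plus one unrelated line.
sample_lines() {
    printf '%s\n' \
        '  12   o1o1o1o1o1 ii-ii-ii-i   3' \
        '       0000000000 1111100000   1' \
        'Sex-averaged map:'
}

# Keep only lines holding a 10-character phase group, then turn the
# space before the trailing count into a tab so the count becomes field 2.
split_count() {
    grep -E ' [-0o1i]{10} ' | sed 's/ \([0-9]*\)$/\t\1/'
}
```

Running "sample_lines | split_count | cut -f2" prints only the counts, which
is the field gawk compares against min_num_recomb.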

If you get an error message please check that the "\" character is the last
character on a line and is preceded by a space in the gawk part of the
script.
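A quick way to check for this is sketched below. It only looks for
whitespace (spaces, tabs, or Windows carriage returns) after a "\" at the
end of a line, which is an assumption about the most likely cause, and the
function names are hypothetical.

```shell
#!/bin/bash
# List (with line numbers) every line where a backslash is followed
# by trailing whitespace instead of being the last character.
find_bad_continuations() {
    grep -n '\\[[:space:]][[:space:]]*$' "$1"
}

# Print the file with that trailing whitespace removed, leaving the
# backslash as the last character on the line.
fix_continuations() {
    sed 's/\\[[:space:]][[:space:]]*$/\\/' "$1"
}
```

For example, "find_bad_continuations chrompic_recomb_ord.sh" shows the
offending lines, and "fix_continuations chrompic_recomb_ord.sh > fixed.sh"
writes a cleaned copy without touching the original.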

Regards

 
Jill 

*************************************************************** 

Jill Maddox 16 Park Square Port Melbourne, 3207 Australia phone: 03 9646
0428 E-mail: jillian.maddoxalumni.unimelb.edu.au

*************************************************************** 


 

 
