From jillian.maddoxalumni.unimelb.edu.au Wed Feb 22 18:09:35 2012
From: Jill Maddox <jillian.maddoxalumni.unimelb.edu.au>
To: Multiple Recipients of <crimap-usersanimalgenome.org>
Subject: Bug fixes for bash recombination script, unmerge
Date: Wed, 22 Feb 2012 18:09:35 -0600
Hi All
I discovered a bug in my chrompic_recomb_ord.sh script. The original left
off the first group of 10 phases for paternal lines in the chrompic output.
I have also uploaded an updated version of the unmerge program
(unmerge_src_310112.tar.gz) to the animal genome file download site. The
previous version of unmerge had a bug when deleting loci from gen files,
although it worked fine when extracting loci. Please note two known issues:
the part of the unmerge program that handles deleting families needs
further work, and if you specify a locus that is not in the gen file then
an incorrect locus gets extracted. The fixes for these are still on my
to-do list.
Both can be downloaded from
http://www.animalgenome.org/tools/share/crimap/downloads. If people have
crimap-related tools that they wish to share, they can make them
available via http://www.animalgenome.org/cgi-bin/util/fileshare.
Here's the corrected version of the chrompic_recomb_ord.sh script.
===================================================================================
#!/bin/bash
# chrompic_recomb_ord.sh
# keep lines with at least min_num_recomb recombinants and
# order from lowest to highest
# Input: chrompic filename, minimum number of recombinants
# Output: chrompic filename with .recomb suffix
if [ $# -ne 2 ]; then
echo 1>&2 Usage: $0 chrompic_filename min_num_recomb
exit 127
fi
if ! [ -a ./${1} ]; then
echo "file ${1} not found"
exit 127
fi
if ! [[ "$2" =~ ^[0-9]+$ ]]; then
echo "$2 is not a suitable number"
exit 127
fi
egrep " [-0o1i]{10} " $1 | sed 's/ \([0-9]*\)$/\t\1/g' | \
gawk -v var=$2 'BEGIN{FS=OFS="\t"; line1=0; minrec=var;} \
{if (line1 == 0) \
{line1 = 1; num_in_line=split($1, lineinfo, " "); ind = lineinfo[1]; \
if ($2 >= minrec) \
{for (i=2; i <= num_in_line; i++) \
printf "%s", lineinfo[i]; \
printf "\t%d\t%d\n", $2, ind;} \
else next;} \
else {line1 = 0; \
if ($2 >= minrec) \
{num_in_line=split($1, lineinfo, " "); \
for (i=1; i <= num_in_line; i++) \
printf "%s", lineinfo[i]; \
printf "\t%d\t%d\n", $2, ind;}}}' | \
sort -n -k2 -n -k3 > ${1}.recomb
exit
==============================================================================================
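In the script above, the first line of each pair starts joining phase groups at index 2 (index 1 holds the individual ID), while the second line of the pair starts at index 1. The sketch below shows why the starting index matters for a line that has no leading ID. The sample line and the join_from helper are made up for illustration and are not part of the script; the earlier, buggy version is not reproduced here.

```shell
# Hypothetical paternal line: two phase groups, no leading individual ID.
line='0001110o1i 11-0010110'

# join_from START LINE: concatenate whitespace-separated fields START..n.
join_from() {
    printf '%s\n' "$2" | awk -v start="$1" '{
        n = split($0, f, " ")
        for (i = start; i <= n; i++) printf "%s", f[i]
        printf "\n"
    }'
}

join_from 1 "$line"   # keeps every phase group
join_from 2 "$line"   # drops the first group of 10 phases
```

Starting at index 2 is only appropriate for lines that begin with an individual ID.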
If you get an error message please check that the "\" character is the last
character on a line and is preceded by a space in the gawk part of the
script.
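A quick way to look for this problem after copying the script out of an email is sketched below. The check_continuations function is a hypothetical helper, not part of the script, and it only uses standard grep patterns. It prints the line numbers where a backslash is followed by trailing whitespace (including a stray carriage return) or where a trailing backslash is not preceded by a space.

```shell
# Hypothetical helper: report suspicious continuation backslashes in a file.
check_continuations() {
    script="$1"
    # Backslash followed by spaces, tabs or a carriage return before end of line.
    grep -n '\\[[:space:]][[:space:]]*$' "$script"
    # Trailing backslash that is not preceded by a space.
    grep -n '[^ ]\\$' "$script"
    # grep returns non-zero when nothing matches; that is not an error here.
    return 0
}
```

For example, check_continuations chrompic_recomb_ord.sh prints nothing if every continuation line looks right.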
Regards
Jill
***************************************************************
Jill Maddox
16 Park Square
Port Melbourne, 3207
Australia
phone: 03 9646 0428
E-mail: jillian.maddoxalumni.unimelb.edu.au
***************************************************************