Archived Post

From Noelle.Cockettusu.edu  Tue Jan 29 18:58:56 2013 
From: Noelle Cockett <Noelle.Cockettusu.edu> 
  (by way of Jill Maddox <jillian.maddoxalumni.unimelb.edu.au>) 
To: Multiple Recipients of <sheepmodelsanimalgenome.org> 
Subject: ISGC - Minutes of the January 14, 2013 meeting 
Date: Tue, 29 Jan 2013 18:58:56 -0600 

Dear all,

Thank you for your participation at the ISGC meeting held on January 14,
2013 in San Diego. Here are the minutes of the meeting, prepared by Brian
Dalrymple, John McEwan and myself.

Best wishes, Noelle


International Sheep Genome Consortium January 14, 2013 - San Diego, CA
(Plant and Animal Genome XXIth Meeting)

Welcome: Noelle Cockett (Utah State University) and Brian Dalrymple (CSIRO
Animal, Food and Health Sciences) welcomed around 40 people in attendance.

Ovine Parentage SNP Chips: John McEwan (AgResearch) reported that three
parentage chips are being developed (AgResearch, CSIRO, USDA/ARS) and all
have common SNPs selected from the Illumina Ovine SNP50 BeadChip.

Mike Heaton (USDA/ARS) indicated that he had selected 168 SNP with MAF >
0.30 from the SNP50 chip for inclusion on the USDA/ARS parentage chip. He
also used the resequencing dataset of 75 animals to annotate sequence
surrounding those SNP.

A motion was made and approved that the ISGC supports the development of
this parentage chip, which will be "branded" as an ISGC product.

Ovine Custom SNP Chips: John reported on the ovine LD chip that was
produced in 2011 by Illumina as a custom chip. A two-stage selection of the
SNP resulted in 91-96% accuracy for imputation to the SNP50 chip tested
across 5,000 animals. He also reported on the Bovine LD+6K ovine "MooBaa"
chip, which was produced in 2012 and should work across sheep and cattle.
The ovine content is the same as the previous ovine LD chip. He indicated
that 30% of all New Zealand born 2012 rams will have come from flocks using
SNP technology through either direct genotyping or imputation. Typically
only 5-10% of candidate rams are tested. He indicated that designs of these
chips are freely available on request.

Ovine HD SNP Chip: John reported that development of the HD SNP chip, which
is an ISGC project, is almost completed. The bulk of the SNP were
identified from the 75 sheep that have been sequenced to 10X coverage at
BCM-HGSC (see Table 1 for breeds of sheep included in the resequencing).
From these data, over 32.1 million SNP were identified using the SNP
pipelines of BCM-HGSC and DPI. In addition, DPI examined the .bam files and
found good concordance with the SNP50 chip genotypes on the resequenced
animals (all have been typed with the SNP50 chip).

Over 18.7 million SNP had MAF > 0.1, with the minor allele in at least two
animals, and of those, approximately 6 million SNP have sequence on at least
one side of the SNP. This last group of SNP was sent to Illumina for
screening for the HD chip. The HD chip will also include 30K functional
SNP, 50K from the SNP50 chip, 1K other functional SNP (mutations), and 20K
that match a proposed GBS protocol. The SNP selected from the 6 million SNP
will be equally spaced and will have a range of MAF, which is a slightly
different strategy than what was used on the cattle chip which selected SNP
based on MAF in various subspecies and breeds. 
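
The MAF filter described above (MAF > 0.1, minor allele seen in at least two
animals) can be illustrated with a minimal Python sketch. This is not the
BCM-HGSC or DPI pipeline; the function names and the genotype encoding (count
of alternate alleles per animal, None for a missing call) are assumptions
made only for illustration.

```python
from typing import Optional, Sequence, Tuple


def minor_allele_stats(genotypes: Sequence[Optional[int]]) -> Tuple[float, int]:
    """Return (MAF, number of animals carrying the minor allele).

    Each genotype is the count of alternate alleles (0, 1 or 2);
    None marks a missing call and is ignored. (Assumed encoding.)
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0, 0
    alt_freq = sum(called) / (2 * len(called))
    if alt_freq <= 0.5:
        # The alternate allele is the minor allele.
        carriers = sum(1 for g in called if g > 0)
        return alt_freq, carriers
    # The reference allele is the minor allele.
    carriers = sum(1 for g in called if g < 2)
    return 1.0 - alt_freq, carriers


def passes_filter(genotypes: Sequence[Optional[int]],
                  min_maf: float = 0.1,
                  min_carriers: int = 2) -> bool:
    """Apply the thresholds quoted in the minutes to a single SNP."""
    maf, carriers = minor_allele_stats(genotypes)
    return maf > min_maf and carriers >= min_carriers
```

For example, a SNP whose only two minor alleles sit in a single homozygous
animal has a MAF of 0.125 among eight animals but fails the carrier
requirement, whereas two heterozygous animals give the same MAF and pass.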

Cindy Lawley (Illumina) said that the ovine HD SNP will be referred to as
"Illumina sheep HD SNP chip". The bulk of the SNP have been selected from
the 6M SNPs sent by John and Illumina is now ready to start production.
January 31 is the deadline for John to submit the list of other SNP and
Illumina expects to ship the HD chip in early May. For orders at the
consortium price, Illumina first extended the September 2012 deadline to
December 31, 2012 because it expected to have the SNP list by January 1,
2013. Because construction of the HD SNP chip is slightly delayed, Illumina
will extend the deadline to order under the consortium price to January 24,
2013.

Sequences from the 72 animals are in the short read archives of NCBI. The
.bam and .vcf files are available to members of the consortium by
contacting James Kijas (james.kijascsiro.au). The files are not to be
distributed beyond the initial recipients until the full release of the
data. The .vcf files have been filtered but not "heavily" and include a
full set of SNP, a subset of the SNP that have been agreed upon to have
further analysis and a subset of the SNP that are being considered for the
Illumina HD SNP chip. At the time of the release of the annotation by
Ensembl (probably mid-September, 2013), the release will include the
individual animal .bam/.vcf files as tracks.

Update on Ovine SNP50 BeadChip: John reported that around 40,000-50,000
animals have been genotyped with the Illumina Ovine SNP50 BeadChip to date.
A survey of the literature indicated about 22-25 publications using data
from the SNP50 chip.

Cindy indicated that Illumina has sufficient bead pool for another run of
the SNP50 chip using the 12-bead pool format. Therefore, they have no plans
to re-synthesize the oligos or change to a different format.

It was noted that the SNP50 chip has ovine Oar v1.0 SNP coordinates and
position assignments. Illumina is willing to update the position locations
to Oar v3.1 but recommends that the SNP names remain the same. Brian will
send the Oar v3.1 coordinates for the SNP on the SNP50 chip to Illumina.

Ovine whole genome assembly: Brian Dalrymple provided a summary of Oar v3.1
whole genome assembly. As presented last year, there were gaps in GC-rich
regions in Oar v2.0 so strategies for sequencing across these regions were
implemented at BCM-HGSC and Roslin using the male Texel animal. These
efforts have improved the GC-rich regions but there may still be some
issues with methylation analyses. An updated version of the assembly
(Oar v4.0) won't be released until late 2014. Patches will be released in
the meantime, but they won't include these more "global" changes.

Brian said Jiang Yu has analyzed the 72 resequences for CNV and deletions.
Jiang used five contiguous 200 bp windows and required presence of the CNV
or deletion in at least five animals. The results of this analysis will
contribute to the refinement of the assembly of CNV regions in Oar v4.0.
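
One possible reading of the window criteria above can be sketched in Python.
The minutes do not describe Jiang's actual method, so everything here is an
assumption made for illustration: the per-window boolean flags (for example,
abnormal read depth), the function names, and the way the two thresholds are
combined.

```python
from typing import Dict, List, Tuple

WINDOW_BP = 200  # window size quoted in the minutes


def true_runs(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index pairs for each run of True."""
    runs = []
    start = None
    for i, flag in enumerate(list(flags) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    return runs


def shared_regions(per_animal: Dict[str, List[bool]],
                   min_run: int = 5,
                   min_animals: int = 5) -> List[Tuple[int, int]]:
    """Regions (in bp) supported by long runs in enough animals.

    Assumed logic: within each animal, only runs of at least `min_run`
    consecutive flagged windows count; a window is kept when at least
    `min_animals` animals support it.
    """
    if not per_animal:
        return []
    n_windows = len(next(iter(per_animal.values())))
    support = [0] * n_windows
    for flags in per_animal.values():
        for start, end in true_runs(flags):
            if end - start >= min_run:
                for i in range(start, end):
                    support[i] += 1
    shared = [count >= min_animals for count in support]
    return [(s * WINDOW_BP, e * WINDOW_BP) for s, e in true_runs(shared)]
```

With these thresholds, a candidate spans at least 1 kb (five 200 bp windows)
in each supporting animal, and shorter runs in an animal do not count toward
the five-animal requirement.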

Jiang also mapped all contigs in Oar v3.1 to the bovine assembly, then
looked at every place where the two species differed. Are these a true
difference between cow and sheep or are they a problem with the assemblies?
Users of the assembly will need to make their own decision on which
assembly is the issue. Almost 1/3 of the differences occur on the X
chromosome, but that chromosome has been an issue for both assemblies.

A call has gone out for RNA seq data that will be used for annotation. Cut-
off date to send these data to Brian is January 31, 2013 and the files are
best transferred by FTP. Ensembl will add in sequences that are publicly
available but Brian needs to let them know where these sequences are
located.

Brian will send all collected RNA data from CSIRO to Ensembl on a hard
drive. Ensembl believes it will take six months to finish annotation (so
should be completed around September, 2013). Thibaut Hourlier (Ensembl,
Sanger Institute) said that Ensembl will develop gene models using pooled
data across all sequences within a tissue and then have a single "tissue"
track which will include the number of tracks that contributed to that
variant. Ensembl can return .bam files to submitters, and contributors can
request that .bam files be released or not released.

PacBio project: Kim Worley (BCM-HGSC) reported that most of the gaps in the
ovine whole genome assembly are small (40% or 50,000 are in the first peak)
and are most often found in repeats at scaffold ends. She will be using
PBJelly (developed by BCM-HGSC) to fill gaps with captured reads. Reads from
any high-throughput sequencing platform can be used, although the approach
is designed for PacBio sequencing. Improvements in the process will ensure
longer and more
reads. A grant proposal has been submitted to the USDA/AFRI Tools and
Resources program area. ISGC has committed another $100,000 to this
project. If completed, the resulting data will have 7X coverage, the contig
N50 will be doubled or tripled, and the number of gaps in the assembly
will be reduced by half.

Epigenomic analysis: Chris Couldrey (AgResearch) described an epigenomic
analysis which requires both whole genome sequence (provided by the ovine
assembly) and gene expression (RNA seq) data, as well as miRNA and DNA
methylation (reduced representation bisulfite sequencing) data. When finished,
the information will be added to the annotation of the whole genome
assembly and would allow a look at methylation levels at individual CpG
sites. A major consideration is how to put the results across multiple
tissues/animals together and how to annotate. Ensembl has ways to visualize
the data, probably as a "track", but to include it in Ensembl, the data will
need to be publicly available.

Contribution of 3SR to the whole genome assembly: Huw Jones (3SR project)
indicated that the 3SR project is scheduled to end in October, 2013. The
contribution of 3SR to the whole genome assembly has been revised from the
original proposal. The revised plan is separated into three areas. First,
samples from CNV animals will be selected by James Kijas (CSIRO) and sent
to INRA to include in their analysis of CNV funded under the CNV work plan.
Second, 3SR funds will be used for targeted BAC sequencing of ~500 BAC in
the next 3-4 months. Brian will select the BACs based on gaps/comparative
analysis and have the BAC clones sent directly from the CHORI-243 library
to Roslin which will do the sequencing. A test run using a 96-well plate
will be done first and based on those results additional BACs will be done.
Selected clones will include regions that are functionally interesting to
3SR as well as problems in the assembly. Third, Roslin Institute will
obtain RNA seq from about 20-30 tissues across a Texel ram, a Texel ewe, a
Texel ewe-lamb and a whole 16-d embryo (see attached Table 2). Some pooling
of tissues or animals in a lane will be done utilizing bar coding.
Expression profiles will be made available through the http://biogps.org
site. There is some concern on how to do the assembly of the transcripts
from the RNA seq data since the annotation isn't done. It was noted by
Garth Brown, the NCBI representative, that NCBI can also assemble from the
RNA seq reads.

Analysis of the ovine X chromosome: Wan-Sheng Liu (Penn State University)
presented results from BTAY and BTAX done in his lab under a USDA/AFRI
grant, as well as a proposed pathway for developing information for the X
and Y chromosomes in sheep.

Sheep Genomes Project: Noelle Cockett described a USDA/AFRI Tools and
Resources grant proposal that will result in a project to develop a
resequencing database that includes extensive annotation of variants. The
database will include sequence data from the 75 animals already sequenced
at BCM-HGSC and an additional 25 animals that will be sequenced using ISGC
funding at BCM-HGSC, as well as exome sequencing by BCM-HGSC from 145
animals (current pricing, requested funding in the proposal). Sequence and
variant data will be available through NCBI.

Publication of the whole genome assembly paper: Brian led a discussion on
the assembly paper, with a central question on whether to publish now or
hold off for more annotation and biological stories to add with it. There
is some concern with a delay in publishing because Ensembl requires a
"released" sequence. Also, there may be hesitation for people to use the
assembly because it's not published. The bar is going up for "strong"
publications on whole genome sequences and therefore the scope of the sheep
paper will need to be high. Unfortunately, key individuals haven't
coordinated a large effort to get this "big story" organized. Ensembl says
there will be some statistics on the assembly that could be added in a few
weeks.

It was decided that biologically interesting stories, such as mothering
ability, reproduction function which seems different from cattle and goat,
wool, X chromosome, litter size, etc., would be added to the publication.
Brian/Jiang will circulate the draft paper in late January/early February.
People should indicate their interest in contributing by the end of
February. A deadline for submitting sections was set as the end of March.
Ensembl will "share" the results before the full annotation and then the
details can be updated closer to date of publication (as was done for
swine). Provisional submission date for the manuscript to the journal is
August 2013.

Alan indicated that companion papers will be able to "link" to the paper up
to 6 months after publication of the main paper. Genome Biology might be a
good target for the main paper.

ENCODE effort: Alan Archibald (Roslin) indicated that an ENCODE project for
farmed/companion animals has been proposed. The project would combine gene
biology and variation information. The results would be put in the
ENCODE browser and linked to the whole genome assemblies of each species.
Alan suggested that the focus be on target tissues (e.g. musculoskeletal,
immune tissues) and limited assays (e.g. DNaseI, FAIREseq, histone markers,
methylation, etc.). Samples would be shared across participants and would
likely be cells (transformed, primary cells, iPS cells) and then the raw
data would be shared across the group. A good way to visualize the data
will be important.

Plans are to 1) generate a white paper on the project, 2) develop a data
management strategy, 3) review/promote ENCODE experimental protocols, 4)
develop/review cell line resources, 5) develop communications strategy, 6)
establish an EU-US Biotechnology Working Party (close to PAG XXII).

Other projects: Jennifer Thompson (Montana State University) reported that
she had access to sheep reproduction lines that have been closed since
1988. She is developing a proposal to perform selection sweeps across the
population and conduct RNA seq on reproductive tissues from selected
animals. Jennifer is seeking collaborators.

The meeting ended at 3:00 p.m.


Table 1
Animal Identifier	Breed	Animal_ID	Contributor
BGE2	Bangladeshi	BGE2	Faruque Mdomar
BGE4	Bangladeshi	BGE4	Faruque Mdomar
GAR14	Garole	GAR14	Faruque Mdomar
40	Indian Garole	GAR4	Vidya Gupta
CHA02	Changthangi	CHA02	Jorn Bennenwitz
CHA05	Changthangi	CHA05	Jorn Bennenwitz
IXX.3178	Garut	GUR4	Herman Raadsma
IXX.3530	Garut	GUR5	Herman Raadsma
I99.1574	Sumatra	SUM2	Herman Raadsma
I99.1595	Sumatra	SUM7	Herman Raadsma
ZB08	Northern Tibetan	ZB08	Kui Li
ZD11	Eastern Tibetan	ZD11	Kui Li
NA_02_NO TAG	Namaqua Afrikaner	NQA11	James Kijas
RDA_99_007	Ronderib Afrikaner	RDA2	James Kijas
RDA_017034	Ronderib Afrikaner	RDA4	James Kijas
WD_032101	African White Dorper	AWD1	Miika Tapio
WD_032122	African White Dorper	AWD3	Miika Tapio
M01	Ethiopian Menz	EMZ1	Miika Tapio
KR4	Karya	KR4	Ibrahim Cemal
CC50	Cine Capari	CC50	Ibrahim Cemal
SZ3	Sakiz	SKZ1	Ibrahim Cemal
SZ6	Sakiz	SKZ4	Ibrahim Cemal
NZ1	Norduz	NDZ1	Ibrahim Cemal
NZ4	Norduz	NDZ4	Ibrahim Cemal
1i	Turkish Awassi	AWT1	Ibrahim Cemal
3i	Turkish Awassi	AWT2	Ibrahim Cemal
Afsh-032	Afshari	AFS32	Henner Simianer
Afsh-033	Afshari	AFS33	Henner Simianer
KK3	Karakas	KRS3	Ibrahim Cemal
KK7	Karakas	KRS5	Ibrahim Cemal
T7	Cheviot	CHVA1	Steve Bishop
896	Cheviot	CHVC1	Steve Bishop
Bl	Salz	SALA1	Luis V. Monteagudo Ibáñez
Neg	Salz	SALA2	Luis V. Monteagudo Ibáñez
40	Salz	SALC1	Luis V. Monteagudo Ibáñez
BSI3	Santa Inês	BSI3	Samuel Paiva
BSI4	Santa Inês	BSI4	Samuel Paiva
BMN3	Morada Nova	BMN3	Samuel Paiva
BMN4	Morada Nova	BMN4	Samuel Paiva
GCN4	Gulf Coast native	GCN4	Noelle Cockett
GCN5	Gulf Coast native	GCN5	Noelle Cockett
BCS1	Brazilian Creole	BCS1	Samuel Paiva
BCS3	Brazilian Creole	BCS3	Samuel Paiva
131-5g	Ovis canadensis	OCAN1	Dave Coltman
131-6g	Ovis canadensis	OCAN2	Dave Coltman
26783	Ovis dalli	ODAL1	Dave Coltman
26761	Ovis dalli	ODAL2	Dave Coltman
FIN1	Finnsheep	FIN1	Juha Kantanen
FIN4	Finnsheep	FIN4	Juha Kantanen
CHU01	Churra	CHU1	Juan Jose Arranz
CHU02	Churra	CHU2	Juan Jose Arranz
64	Ovis canadensis	OCAN3	Stefan Hiendleder
CAS 01_ULE	Castellana	CAS1	Juan Jose Arranz
CAS 03_ULE	Castellana	CAS3	Juan Jose Arranz
OJA04	Ojalada	OJA4	Juan Jose Arranz
OJA05	Ojalada	OJA5	Juan Jose Arranz
SWA03	Swiss White Alpine	SWAN3	Cord Droegemueller
SWA04	Swiss White Alpine	SWAN4	Cord Droegemueller
SWA 27	Swiss White Alpine	SWAA27	Cord Droegemueller
SWA 29	Swiss White Alpine	SWAA29	Cord Droegemueller
VBS02	Valais Blacknose	VBS2	Cord Droegemueller
SMS02	Swiss Mirror	SMS2	Cord Droegemueller
604343	Meat Lacaune	LAC1	Carole Moreno
604299	Milk Lacaune	LAC84	Carole Moreno
1560	Merino	MERC1	Kristen Nowak
398	Merino	MERA1	Kristen Nowak
SNP1_040129	Poll Dorset	PD454	James Kijas
SNP2_951 0000488312	Merino	MER454	James Kijas
SNP3_9009	Awassi	AW454	Herman Raadsma
SNP4_113/05	Texel	TEX454	John McEwan
SNP5_180/05	Romney	ROM454	John McEwan
SNP6_590771	Scottish Blackface	SBF454	Steve Bishop
1	Dollgellau Welsh Mountain	DWM1	Denis Larkin
10	Welsh Hardy Speckled Face	WHSF1	Denis Larkin
20	Tregaon Welsh mountain	TWM1	Denis Larkin


Table 2 - the full table was an xlsx attachment and is not included here
Tissues are:
Ram
Brain cerebellum
Brain cerebrum
Brain frontal lobe
Brain stem
Hypothalamus
Pituitary gland
Cornea
Lens
Optic nerve
Retina
Sclera
Abomasum mucosa
Caecum
Colon
Duodenum
Omentum
Rectum
Rumen
Larynx
Oesophagus
Pharynx
Salivary gland parotid
Thyroid gland
Tongue dermis
Tongue muscle
Tonsil
Aorta
Atrium
Ventricle
Adrenal gland
Kidney cortex
Kidney medulla
Liver
Lung
Lymph node mesenteric
Lymph node prescapular
Testes
Testes epididymis
Bladder
Diaphragm
Pancreas
Fat subcutaneous
Fat kidney
Muscle long dorsal
Muscle biceps
Muscle intercostal
Skin back
Spleen
Packed blood cells

Ewe dam
Tissue
Cerebellum
Cerebrum
Frontal lobe
Brain stem
Hypothalamus
Pituitary
Cornea
Lens
Optic nerve
Retina
Sclera
Cervix
Mammary gland
Corpus luteum
Ovary
Placenta + membranes
Uterus
WHOLE EMBRYO
Abomasum
Caecum
Colon
Duodenum
Omentum
Peyer's patch
Rectum
Rumen
Larynx
Oesophagus
Pharynx
Salivary gland
Thyroid gland
Tongue dermis
Tonsil
Aorta
Heart atrium
Heart ventricle
Adrenal gland
Kidney cortex
Kidney medulla
Liver
Lung
Lymph node mediastinal
Lymph node mesenteric
Lymph node prescapular
Bladder
Diaphragm
Pancreas
Fat subcutaneous
Fat kidney
Tongue muscle
Muscle long dorsal
Muscle biceps
Muscle intercostal
Skin side
Spleen
Packed blood cells

Lamb
Tissue
Cerebellum
Cerebrum
Frontal lobe
Brain stem
Hypothalamus
Pituitary gland
Cornea
Lens
Optic nerve
Retina
Eye membrane (not sclera)
Cervix
Mammary gland
Ovarian follicles
Ovary
Uterus
Abomasum
Caecum
Colon
Duodenum
Omentum
Peyer's patch
Rectum
Rumen
Pharynx
Oesophagus
Larynx
Salivary gland parotid
Thyroid gland
Tongue dermis
Tongue muscle
Tonsil
Aorta
Heart atrium
Ventricle
Adrenal Gland
Kidney cortex
Kidney medulla
Liver
Lung
Lymph node mediastinal
Lymph node mesenteric
Lymph node prescapular
Bladder
Diaphragm
Pancreas
Fat kidney
Fat subcutaneous
Muscle long dorsal
Muscle biceps
Muscle intercostal
Skin back
Packed blood cells
Spleen

 

 
