An Introduction to QTLdb

A comprehensive tool set for QTL repository, comparisons, dynamic linking to comparative structural genome information for positional gene mining and more

  1. What are QTL?
  2. What is QTLdb? Are there any publications about it?
  3. What are the main distinctions between QTL and association mapping?
  4. What animal species are currently included in the QTLdb?
  5. What are "flanking markers" and what do they represent?
  6. What is Trait Ontology and how is it used in QTLdb?
  7. How are public QTL data curated into the QTLdb?
  8. Can I enter my QTL data into the QTLdb?
  9. What questions does the QTLdb attempt to address?
  10. What functionality does QTLdb offer?
  11. How do I access the information in the QTLdb?
  12. What structural genomics information is aligned in the QTLdb and how can I access it?
  13. Which traits have the most QTL?
  14. Are data within QTLdb static?
  15. I wish to search a cytogenetic band region of interest for QTL. How can I do that?
  16. I wish to "zoom in" to examine a local region of interest. Can I do that, and how?
  17. Some chromosomes have so many QTL that the chromosome view becomes a very wide picture (extending far off the right edge of the screen), making it hard to compare some alignments. Is there any way I can see a more manageable view?
  18. For genomic mining of a QTL region, we often need to align a QTL against the genome to find the underlying genes. Is this possible in the QTLdb?
  19. How accurate is the QTL location alignment to, say, transcript locations on the most recent genome assembly?
  20. Can I download the raw data from the QTLdb?
  21. Terminology

  1. What are QTL?


    This graph is modified from the
    Rat GDB with kind permission

    Quantitative Trait Loci (QTL) are hypotheses that specific chromosomal regions contain genes that make a significant contribution to the expression of a complex trait. QTL are generally identified by comparing the linkage (degree of co-variation) of polymorphic molecular markers and phenotypic trait measurements.

    The ultimate goal of complex trait dissection is to identify the actual genes involved in the trait and to understand the cellular roles and functions of these genes.

    The accuracy and precision of locating QTL depend, in part, on the density of the linkage map created. The higher the density of the map, the more precise the location of the putative QTL. When QTL can be mapped to a relatively small chromosomal region or regions, other methods, such as positional cloning, can be used effectively to isolate specific genes. Unfortunately, the denser the map, the more likely it is that false positive QTL will be detected.

    Most, but not all, complex traits are conditioned by more than one locus. QTL often interact in complex ways and their expression can also be influenced by non-genetic factors. Because QTL are hypotheses, they are subject to reinterpretation and revision. Because the locations of QTL are provisional, their nomenclature is likely to be fluid and temporary. (Originally by Carol J. Bult)

  2. What is QTLdb? Are there any publications about it?

    QTLdb is short for "QTL Database". It contains published QTL data organized into structured tables in a relational database, MySQL. The user and manager interfaces to the database are web-based and programmed in Perl/CGI.

    The QTLdb is under active development at Iowa State University (for a bit of history and its releases, see this note). Progress on QTLdb development has been presented at the 13th North American Colloquium on Animal Cytogenetics & Gene Mapping (2003), the Midwestern ADSA/ASAS Annual Meeting (2005), the annual Plant and Animal Genome (PAG) conferences in 2005, 2006, 2007, 2008 and 2009, and the International Society for Animal Genetics (ISAG) conference in 2008. Three papers by Hu et al. [Mammalian Genome (2005), Nucleic Acids Research (2006) and Mammalian Genome (2007)] represent three milestones in the course of QTLdb development. See the publication notes for more details. The QTLdb is listed in the NAR Database Collections (in the "Model organisms, comparative genomics" section of the "Human and other Vertebrate Genomes" category).

  3. What are the main distinctions between QTL and association mapping?

    The main differences between QTL and association mapping are: (1) the level of resolution (in terms of distance along the DNA or chromosome), and (2) the level of generality (in terms of the number of traits that can be studied with a given set of markers). (1) QTL analyses resolve the locations of genes (or gene clusters) influencing a trait only down to the level of chromosomal segments from one to 20 cM in size (roughly one million to 20 million base pairs). (Originally from http://www.panzea.org/info/faq.html)

  4. What animal species are currently included in the QTLdb?

    The QTLdb is designed to house QTL results from multiple livestock species. It was originally developed with pig QTL (2005). Subsequently, QTL data from cattle and chicken were added (2006). Currently, sheep QTL are being added to the QTLdb by Jill Maddox's group at the Faculty of Veterinary Science, The University of Melbourne, Australia. QTL from other animal species, such as rat, human and mouse, will be added in the near future. This is part of an effort toward an upcoming comparative QTL study.

  5. What are "flanking markers" and what do they represent?

    There are different ways to determine whether a detected QTL is significant enough to be "real". The permutation test is one popular method. According to Lander and Kruglyak (1995), a suggestive linkage is expected to occur one time at random in a genome scan and has an estimated minimum LOD score of 2.0; a significant linkage is expected to occur 0.05 times at random in a genome scan and has an estimated minimum LOD score of 3.4 (in real life the "cut-off" LOD scores may vary depending on actual permutation tests). Therefore, in an ideal situation, a QTL has its peak at one marker and is flanked by two pairs of markers (see Figure).

    In the QTLdb we try to use flanking markers A1, A2, B1, B2 when they are available.
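
    As a rough illustration of the thresholds and the "peak plus two pairs of flanking markers" idea, the sketch below labels LOD scores and finds, for a given threshold, the outermost markers around the peak whose LOD stays at or above it. The function names, the marker names and the LOD values are all made up for the example, and the default cut-offs are simply the estimated minimums quoted above. Which pair corresponds to A1/A2 and which to B1/B2 is defined by the figure and is not assumed here.

```python
from typing import List, Optional, Tuple

# (marker name, map position in cM, LOD score); all values below are made up.
Marker = Tuple[str, float, float]


def classify_linkage(lod: float, suggestive: float = 2.0, significant: float = 3.4) -> str:
    """Label a LOD score using the estimated minimum thresholds quoted in the text."""
    if lod >= significant:
        return "significant"
    if lod >= suggestive:
        return "suggestive"
    return "below suggestive"


def flanking_pair(markers: List[Marker], threshold: float) -> Optional[Tuple[str, str]]:
    """Return the outermost markers around the peak whose LOD stays >= threshold."""
    ordered = sorted(markers, key=lambda m: m[1])  # order along the chromosome
    peak = max(range(len(ordered)), key=lambda i: ordered[i][2])
    if ordered[peak][2] < threshold:
        return None  # the peak itself does not reach this threshold
    left = right = peak
    while left > 0 and ordered[left - 1][2] >= threshold:
        left -= 1
    while right < len(ordered) - 1 and ordered[right + 1][2] >= threshold:
        right += 1
    return ordered[left][0], ordered[right][0]


# Hypothetical genome-scan results for one chromosome.
scan = [("M1", 0.0, 0.5), ("M2", 10.0, 2.1), ("M3", 20.0, 3.5), ("M4", 30.0, 4.8),
        ("M5", 40.0, 3.6), ("M6", 50.0, 2.4), ("M7", 60.0, 1.0)]

significant_pair = flanking_pair(scan, 3.4)  # inner pair around the peak
suggestive_pair = flanking_pair(scan, 2.0)   # outer pair around the peak
```

    In practice the cut-offs would come from the permutation tests of the study in question rather than from fixed defaults.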

  6. What is Trait Ontology and how is it used in QTLdb?

    Livestock production traits are sets of animal phenotypes described by their nature, quality, quantity and biological stage. Due to differences in methods of detection or measurement, scope of description and/or customs, a trait may be described in several different ways. To compare QTL discovered by different labs with different methods correctly, we need a "standard" way of describing traits. To solve this problem, we introduced a "Trait Ontology" to classify and organize the traits for management in the database.

    An ontology is a classification methodology that defines a common vocabulary in a structured way for useful information sharing. Animal production traits may be classified in many different ways based on their functions, features, properties, etc. One of the most useful constructs of the trait ontology is that animal traits may be classified by how they are measured as commercial products. In the QTLdb, we use three levels of controlled vocabulary to describe each production trait: Trait Class, Trait Type and the Trait itself. For their definitions, see FAQ #21 "Terminology" below.

    The classification of traits helps to share common understanding of information structure among people or software agents.
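
    To make the three-level vocabulary concrete, here is a minimal sketch of how differently worded trait descriptions could be resolved to one controlled term. It is not the QTLdb schema: the class and function names are invented, the free-text synonyms are hypothetical, and placing "Meat Color-L" under the "Meat Color" type and "Meat Quality" class is an assumption based on the examples in FAQ #21.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Trait:
    trait_class: str  # broadest level, e.g. "Meat Quality"
    trait_type: str   # middle level, e.g. "Meat Color"
    name: str         # the controlled trait name itself


# Assumed placement of one trait in the three-level hierarchy.
MEAT_COLOR_L = Trait("Meat Quality", "Meat Color", "Meat Color-L")

# Hypothetical free-text descriptions as they might appear in different papers.
SYNONYMS: Dict[str, Trait] = {
    "meat color-l": MEAT_COLOR_L,
    "minolta l value": MEAT_COLOR_L,
    "loin lightness": MEAT_COLOR_L,
}


def normalize(reported_name: str) -> Optional[Trait]:
    """Map a reported trait description to its controlled term, if known."""
    return SYNONYMS.get(reported_name.strip().lower())


def comparable(a: str, b: str) -> bool:
    """Two reported names are comparable when they resolve to the same trait."""
    ta, tb = normalize(a), normalize(b)
    return ta is not None and ta == tb
```

    With such a mapping, QTL reported under "Minolta L value" and "loin lightness" would be compared as the same trait, while an unknown description is left unresolved instead of being guessed.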

  7. How are public QTL data curated into the QTLdb?

    The following are extracted from each publication, when available: Experimental design, Population structure and design, Testing Model and Methods, Trait names on which significant QTL are detected, Trait Description and Measurements; QTL location (Chromosome, Position, 95% CI on the Location), Flanking markers (A1, A2, B1, B2 and the Peak; see Figure for FAQ #5), Test Statistics (LOD_score, LS_means, P_values, F_values, Variance), QTL effects (Dominance effect, Additive effect), Candidate genes, etc. Publication title, authors, journal and abstracts are also included.

    Taking pig data as an example, the QTLdb uses the USDA-MARC pig linkage map (MARC-Map) as a map reference to show the relative location of each QTL, as the MARC map is the single largest pig map to date, and its markers are used by most QTL studies for genome or chromosome scans. When a non-MARC-Map marker is used to describe a QTL, the actual marker location in the experimental map is interpolated to the MARC map and the interpolated map locations are stored in the QTLdb.
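
    The text does not say how the interpolation is done. One common approach, shown here only as an assumed sketch, is linear interpolation between the two nearest markers that appear on both the experimental map and the reference map. The function name and the positions in the example are hypothetical.

```python
from typing import Tuple

# An anchor is a marker placed on both maps: (experimental cM, reference cM).
Anchor = Tuple[float, float]


def interpolate_to_reference(pos_cm: float, left: Anchor, right: Anchor) -> float:
    """Linearly interpolate an experimental-map position onto the reference map."""
    (exp_left, ref_left), (exp_right, ref_right) = left, right
    if exp_right == exp_left:
        raise ValueError("anchors must have distinct experimental positions")
    fraction = (pos_cm - exp_left) / (exp_right - exp_left)
    return ref_left + fraction * (ref_right - ref_left)


# Hypothetical case: a marker at 20 cM on the experimental map lies midway
# between anchors that sit at 12 cM and 40 cM on the reference map.
reference_pos = interpolate_to_reference(20.0, left=(10.0, 12.0), right=(30.0, 40.0))
```

    A position outside the anchor interval is extrapolated by the same formula, so a real pipeline would probably want to check that the anchors actually bracket the marker.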

    The flanking or underlying markers on the QTL map are linked to LocusLink and the Gene Database through NCBI's pipeline.

  8. Can I enter my QTL data into the QTLdb?

    Yes. The Animal QTLdb is open to the public for data entry and update. You must apply to be a curator in order to do so. As a curator, you will be able to

    • have access to your data
    • update your data any time
    • keep your data private
    • take advantage of the QTLdb tools to examine your data against other public data in a private mode
    • choose a time to release your data for public access
    • have your public data populated to the NCBI database automatically

    When you submit your data to the QTLdb, your data set joins the other QTL data sets published in the past 10+ years and becomes subject to within- and cross-species comparisons. For more details, see the paper by Hu et al., "Animal QTLdb: Beyond a Repository - A Public Platform for QTL Comparisons and Integration with Diverse Types of Structural Genomic Information", Mammalian Genome, Volume 18, 1-4 (2007).

  9. What questions does the QTLdb attempt to address?

    The QTLdb was initially designed to address the following questions:

    • What is the chromosomal location for each QTL? Can multiple QTL be viewed in a "synthetic" manner?
    • Can QTL from different studies be easily compared by location?
    • Can all markers underlying a QTL be shown, and is marker information easily retrievable?
    • What are the significance values for each QTL, and what method was used for detection?
    • Have any other phenotypic traits been mapped to the chromosome segment that my QTL falls into or is part of?
    • What percentage of phenotypic variation is associated with each QTL? Is the effect dominant or additive?
    • Can QTL markers be matched to public sequences, via LocusLink or UniSTS in GenBank, if at all?

    As we build up the QTLdb, we find that the utility of the QTLdb can extend beyond what we originally anticipated. Efforts are continually made to add more functionality to the utility of the QTLdb.

  10. What functionality does QTLdb offer?

    The Animal QTLdb offers a number of functions that let users easily retrieve, compare and synthesize QTL information.

    By searching or browsing the QTLdb, one can

    1. Find all QTL on one chromosome
    2. Find all chromosomes that bear QTL for the same trait
    3. List all QTL from a particular publication
    4. Find all markers underlying a QTL
    5. Find DNA sequences associated with certain markers
    6. Use LocusLink to further search for candidate genes by comparative maps
    7. Find experiment details in brief for a given publication
    8. Find all parameters describing a QTL, as well as test statistics
    9. etc.

    The Figure on the right shows an example of multiple QTL, identified by different studies, that map to pig chromosome 3. With further details, users can synthesize their own picture of the most promising chromosomal region where the best candidate gene for a trait may reside.

  11. How do I access the information in the QTLdb?

    The QTLdb web interface is designed to be easily accessed by searching and browsing. Each search or browse result is in turn dynamically linked for further searching or browsing. In this way users can quickly find information in the QTLdb by traversing it in multiple directions. The following paths may seem daunting to read, but each is only a couple of mouse clicks away:

    • Draw Single Trait QTL on Multiple Chromosomes
      Go to Search page, input a keyword > Click "GO" > Click on your trait > Click on "Find all QTLs"
    • Draw Multiple QTL on Single Chromosome
      Go to the Browse page > Click on your chromosome
    • List all QTL from a particular publication
      Go to Search page, input a keyword > Click "GO" > Click on "List QTLs"
    • Find a pig QTL in NCBI Gene Database or LocusLink
      Search or browse to a QTL map > Click on a QTL symbol > Click on "LocusLink" or "GeneDB"
    • Find DNA sequences associated with certain markers
      Search or browse to a QTL map > Click on a marker name > Click on "UniSTS" link above the marker name > Click on "GenBank Accession" link
    • Find experiment/publication details that produced a QTL
      Search or browse to a QTL map > Click on a QTL symbol > Experiment detail in brief is in the upper right box; publication detail is in the lower right box
    • Find all locations that QTL for a trait may have been mapped to
      Search or browse to a QTL map > Click on a QTL symbol > Click on "Trait Name" > Click on "Find all QTL on this trait"
    • Find related traits from a known QTL trait
      Search or browse to a QTL map > Click on a QTL symbol > Click on a "Trait Name", "Trait Type" or "Trait Class" > Choose other traits from the returned trait ontology list to search further

  12. What structural genomics information is aligned in the QTLdb and how can I access it?

    Thanks to many collaborators, a number of useful structural genomics data sets have been provided for aligning to the QTL maps (acknowledgements are on the respective web pages). These data include radiation hybrid (RH) maps, BAC clone fingerprinted contig (FPC) maps, SNP maps, consensus linkage maps, genome maps, etc. For example, 6,500+ cattle SNPs and 1,300+ pig SNPs were aligned to the respective QTL maps via RH to human comparative maps, and 4,528 new porcine microsatellites from the Sino-Danish Pig Genome Sequencing Consortium were aligned to the pig QTL maps (see the following table for a summary).

    Table 1. Data alignment status summary

    Species   Genome        RH map        BAC FPC   SNPs   Microarray (Affy)   Microarray (Oligo)   Human map
    Pigs      In progress   Yes           Yes       Yes    Yes                 Yes                  Yes
    Cattle    Yes           Yes           Yes       Yes    Yes                 Yes                  Yes
    Chicken   Yes           In progress   Yes       Yes    Planned             Planned              Planned

    (With reference to the paper by Hu et al., "Animal QTLdb: a livestock QTL database tool set for positional QTL information mining and beyond", Nucleic Acids Research, 2007, 35 [Database issue]: D604-D609.) Note: With the GBrowse function that we implemented in October 2008, alignment to more genome features is now possible outside of the QTLdb. See FAQ #18 for more details.

    All aligned data can be accessed via either pop-up links or web forms on the "chromosome view" page of the QTLdb (see Figure below). For data mining, users can carry a QTL region (or other map locations of interest) across the aligned maps by using either the QTL bars or the web forms, where map locations in cM must be provided.

    Please be aware that more data types are continually being added, and data updates are actively going on. Don't be surprised if you see things new.

  13. Which traits have the most QTL?

    Backfat, loin-eye area and Meat Color-L are the top three pig traits by number of QTL reported. In chicken, body weight has far more QTL than any other trait. Fat yield, milk yield and twinning are the top three cattle traits for which QTL have been reported. For more details, see the "summary" of the respective species QTL database.

  14. Are data within QTLdb static?

    No. The QTLdb, as an online tool, is kept up to date with the most current data found in the public domain. Its first release was made in June 2004, the 2nd release in December 2004, and we plan to have a 3rd release before the end of 2005 (see Notes on the Animal QTL Database Development and Releases for details).

    If you see any new data that have not been included in the QTLdb, please drop us a note with the source of the publication, and we will curate it into the database as soon as we can. Or better yet, you can register to become a QTL data curator for the QTLdb yourself. In this way you can enter your data, update your data, and also use the curator tools as a research platform (see FAQ #8 above).

  15. I wish to search a cytogenetic band region of interest for QTL. How can I do that?

    The QTLdb is based on linkage maps. Until cytogenetic band alignments to the linkage maps are added to the QTLdb, users have to make that alignment with other tools, such as the Arkdb (http://www.thearkdb.org/anubis); that is, translate your cytogenetic band locations into linkage map locations, and then use the QTLdb to look for QTL. In the Arkdb, you will want to build a within-species comparative map between the "Cytogenetic" map and the "USDA-MARC_v.2" map on the same chromosome.

  16. I wish to "zoom in" to examine a local region of interest. Can I do that, and how?

    The QTL map does not offer "zooming" capabilities. However, you can choose the size of the view from a pull-down menu on the top tool bar. On a larger picture, you can move around to see your region of interest in more detail.

    In addition, by using the "Marker density" pull-down menu together with a larger picture, you can see more markers and so achieve a "zooming" effect.

  17. Some chromosomes have so many QTL that the chromosome view becomes a very wide picture (extending far off the right edge of the screen), making it hard to compare some alignments. Is there any way I can see a more manageable view?

    On the "chromosome view" of the QTL, there is a "Display QTL" search box in the top tool menu. If you type the abbreviations of the QTL you are interested in and click on "Go", the database will return a new chromosome view with only the QTL of your choice, making it easier to make comparisons.

  18. For genomic mining of a QTL region, we often need to align a QTL against the genome to find the underlying genes. Is this possible in the QTLdb?

    Yes, it is possible.

    Previously, we aligned some genomic features, such as SNPs, microarray elements, microsatellites and RH map markers, against QTL by their genomic locations within the QTLdb (see FAQ #12).

    In October 2008, we implemented GBrowse for QTL alignments against multiple genomic features. We are now able to align the QTL locations against all genome features stored in GenBank, such as locations of transcripts, mRNA, CDS, annotated repeats, etc. We also add custom elements for alignment. The most recent addition to the alignments is the 60K SNP chip elements for cattle and pig.

    The link to the GBrowse view can be found on the Animal QTLdb main pages for the respective QTLdb species, and at the GBrowse directory page: http://www.animalgenome.org/gbrowse/. The following link also works: http://www.animalgenome.org/cgi-bin/gbrowse, where you can choose a species to view from the "Data Source" pull-down menu.

  19. How accurate is the QTL location alignment to, say, transcript locations on the most recent genome assembly?

    The alignment of QTL genome locations against those of transcripts or genes is accomplished by converting the linkage map QTL locations (cM) to genome sequence assembly locations (bp). This is done by reference to anchoring markers that are mapped on both the linkage and genome maps.

    Often, QTL boundaries are not exactly at the anchoring marker locations. In such cases, the relative genome location of the QTL is calculated with an algorithm that takes into account each chromosome's length, the cM versus bp ratio for each individual chromosome, and the offset of the QTL location from that of the anchoring marker. As such, the "bp" location of a QTL derived from its "cM" location is only an approximation. Considering that the size of a QTL is on the scale of cM, which translates into a few hundred kilo- or mega-base pairs, and that the error range of a QTL location is often a few cM, we consider the current "bp" conversion reasonably close to the "real" locations, and it provides useful landmarks for structural genome mining.

    Users are therefore cautioned about the accuracy of the exact "bp" locations found in GBrowse. It would be safe to allow an error offset of a few kilo-base pairs when the focus of interest is around the QTL boundaries.
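
    The conversion described above can be sketched as follows. This only illustrates the three ingredients named in the text (chromosome length, a per-chromosome bp per cM ratio, and the offset from an anchoring marker); the function name, its parameters and the numbers are hypothetical, and the actual QTLdb algorithm may differ.

```python
def cm_to_bp(qtl_cm: float, anchor_cm: float, anchor_bp: int,
             chrom_len_cm: float, chrom_len_bp: int) -> int:
    """Approximate a bp location from a cM location using one anchoring marker."""
    if chrom_len_cm <= 0:
        raise ValueError("chromosome length in cM must be positive")
    bp_per_cm = chrom_len_bp / chrom_len_cm      # ratio for this chromosome
    offset_bp = (qtl_cm - anchor_cm) * bp_per_cm  # offset from the anchor, in bp
    estimate = anchor_bp + offset_bp
    # Keep the estimate within the chromosome.
    return int(round(min(max(estimate, 0), chrom_len_bp)))


# Hypothetical chromosome: 150 cM long, 150,000,000 bp in the assembly.
# The anchoring marker sits at 40 cM and 42,500,000 bp; the QTL boundary is at 43.5 cM.
boundary_bp = cm_to_bp(43.5, anchor_cm=40.0, anchor_bp=42_500_000,
                       chrom_len_cm=150.0, chrom_len_bp=150_000_000)
```

    Because a single chromosome-wide ratio ignores local variation in recombination rate, the result is best read as a landmark rather than an exact coordinate, in line with the caution above.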

  20. Can I download the raw data from the QTLdb?

    Yes. There are multiple ways to download data from the QTLdb. (1) Links for downloading QTL coordinates (in cM or in bp) within a species can be found on the respective species main pages of the QTLdb; (2) QTL coordinates and related data within a chromosome can be downloaded from the chromosomal view of a species; (3) Subsets of QTL data on a chromosome can be downloaded when term searches are applied to limit the view to the QTL of interest.

    Several file formats are available for the downloads: (1) Tab-delimited plain text files containing QTL chromosomal locations in cM; (2) GFF files in which the QTL locations are in bp. The GFF download allows you to use the downloaded data file directly with other tools that take GFF files as input.
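
    As an example of working with a GFF download, the sketch below reads the standard nine tab-separated GFF columns and selects features that overlap a region given in bp. The sample lines, sequence names and attribute text are invented for illustration and do not reproduce an actual QTLdb file; the attributes column is kept as raw text because its layout depends on the GFF version.

```python
from typing import Iterable, List, NamedTuple


class Feature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int  # 1-based, inclusive
    end: int    # inclusive
    attributes: str


def parse_gff(lines: Iterable[str]) -> List[Feature]:
    """Parse GFF lines, skipping blank lines, comments and malformed rows."""
    features = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        features.append(Feature(cols[0], cols[1], cols[2],
                                int(cols[3]), int(cols[4]), cols[8]))
    return features


def overlapping(features: List[Feature], seqid: str, start: int, end: int) -> List[Feature]:
    """Return features on seqid that overlap the closed interval [start, end]."""
    return [f for f in features if f.seqid == seqid and f.start <= end and f.end >= start]


# Hypothetical file content.
SAMPLE = [
    "# made-up example lines",
    "chr3\texample\tQTL\t1000000\t5000000\t.\t.\t.\tName=example_qtl_1",
    "chr3\texample\tQTL\t9000000\t12000000\t.\t.\t.\tName=example_qtl_2",
    "chr7\texample\tQTL\t2000000\t3000000\t.\t.\t.\tName=example_qtl_3",
]

features = parse_gff(SAMPLE)
hits = overlapping(features, "chr3", 6_000_000, 10_000_000)
```

    For a real download, the lines would come from an opened file, for example by passing the file object from open() to parse_gff.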

  21. Terminology

    • Flanking markers - Genetic markers that bound either side of a hypothesized QTL. A flanking marker can also represent the level of statistical significance at which the QTL is detected. ( see FAQ #5 above for more details )

    • LocusLink - LocusLink organizes information around genes to generate a central hub for accessing gene-specific information for multiple species. It provides a single query interface to curated sequence and descriptive information about genetic loci and presents information on official nomenclature, aliases, sequence accessions, phenotypes, EC numbers, MIM numbers, UniGene clusters, homology, map locations, and related web sites.

    • Quantitative Trait Loci - Genetic loci contributing to variation in quantitative traits. ( see FAQ #1 above for more info.)

    • Suggestive linkage - ( see FAQ #5 above )

    • Significant linkage - ( see FAQ #5 above )

    • Trait Ontology - ( see FAQ #6 above )

    • Trait Class - Category of traits that describes one aspect of the pork product or process in which the product is made. e.g. Meat Quality.

    • Trait Type - A group of traits that describes a specified property of the pork products or feature(s) that can influence the process in which pork product is made. Can also be called SuperTrait. e.g. Meat Color.

    • Trait Name - A defined name for traits by the measurement locations, time, methods and measuring units. e.g. 24hr post mortem pH. Each trait is distinguished by its characteristics, methods of measurement, and product merit.


First draft: January 5, 2005
Version 2: August 12, 2006
Version 3: January 11, 2007
Version 4: May 17, 2007
Version 5: January 6, 2009
By Zhiliang Hu
Associate Scientist
Dept of Animal Science
Iowa State University
Most recent update: May 12, 2009 (04:01:10 PM)

© 2003-2009 NAGRP - Bioinformatics Coordination Program.
Contact: NAGRP Bioinformatics Team