Steps |
Check points |
Operations |
Step 0: Overview of new data in the curation pipeline |
- Spot checks of data with admin web tools
- Monitor data flow/motion
- Monitor data statistics
|
Overview of the data in a web environment similar to the one editors use,
with limited execution power for batch processes. This is part of the
DB admin's routine prior to the database release stage.
|
Step 1: Run check points |
- Re-populate 'breed' table with QTL/association information.
- Update gene info from NCBI (where only Gene ID is curated)
- Check for any missing statistics
- Check for any missing map info.
- Check if SNPs are available where coordinates are manually entered.
- Fix 1: Populate empty coordinate fields where a SNP is available
- Fix 2: Convert 'bp' to 'cM' where applicable
- Fix 3: Fill 'peak'/'span' by their linkage marker locations
- Fix 4: Convert 'cM' to 'bp' where applicable
- Fix 5: Fill missing symbols/names in QTLdata table
- Fix 6: Find and fix inverted bp locations
- Fix 7: Look up 'rs' numbers for 'ss' SNPs
- Fix 8: Find missing or conflicting QTL symbols
|
Each operation is performed by running a script developed for that specific
purpose. Operations require human verification of the input, output, and error
reports to ensure the processes are valid and to identify new problems and
exceptions. Scripts are modified for fixes where applicable.
|
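Two of the fixes above (Fix 4 and Fix 6) can be sketched in a few lines. This is only an illustration, not the pipeline's actual scripts: the function names, the (cM, bp) anchor list, and the choice of linear interpolation between flanking markers are all assumptions made for the example.

```python
from bisect import bisect_right


def fix_inverted_span(start_bp, end_bp):
    """Order a bp span so that start <= end (Fix 6, hypothetical helper).

    Returns (start, end, swapped). Spans with a missing end are left alone.
    """
    if start_bp is not None and end_bp is not None and start_bp > end_bp:
        return end_bp, start_bp, True
    return start_bp, end_bp, False


def cm_to_bp(cm, anchors):
    """Estimate a bp position from a cM position (Fix 4, hypothetical helper).

    anchors: list of (cM, bp) pairs for markers known on both maps,
    sorted by cM with strictly increasing cM values (an assumption).
    Uses linear interpolation between the two flanking anchors and
    returns None when cm lies outside the anchored range.
    """
    cms = [a[0] for a in anchors]
    if not anchors or cm < cms[0] or cm > cms[-1]:
        return None
    i = bisect_right(cms, cm)
    if i == len(anchors):  # cm sits exactly on the last anchor
        return anchors[-1][1]
    (c0, b0), (c1, b1) = anchors[i - 1], anchors[i]
    return round(b0 + (cm - c0) * (b1 - b0) / (c1 - c0))
```

Returning a "swapped" flag or None, rather than silently changing or guessing a value, keeps each change reportable for the human verification described above.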
Step 2: Verify new reference PDF files |
- Find all physical PDF files, “touch” db
- Identify missing PDF files, “touch” db
- Move PDF files into place from the upload pool; check for errors.
|
This creates the backend links from curated data to their sources (the PDF
files in which the data were published) for future data quality control
checks.
|
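The PDF check in Step 2 amounts to comparing the files on disk with the file names the database expects. A minimal sketch, assuming the expected names have already been read from the database into a list; the function name and the flat one-directory layout are hypothetical:

```python
from pathlib import Path


def check_pdfs(pdf_dir, expected_names):
    """Compare PDF files on disk with the names recorded in the database.

    pdf_dir: directory holding the reference PDFs (assumed flat layout).
    expected_names: iterable of file names the database refers to.
    Returns three sorted lists: found, missing, and unreferenced files.
    """
    on_disk = {p.name for p in Path(pdf_dir).glob("*.pdf")}
    expected = set(expected_names)
    return {
        "found": sorted(expected & on_disk),
        "missing": sorted(expected - on_disk),
        "unreferenced": sorted(on_disk - expected),
    }
```

The "found" and "missing" lists correspond to the two “touch” db check points; "unreferenced" files are candidates for the upload-pool review.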
Step 3: Do the "release" |
- Database: List data by curators, species, verification status
- Web site: Publish release statistics
- Release summary: Compose release data summary
|
- Run scripts; issue option to release; log is kept automatically
- Semi-automated data updates on web
- Add descriptions of tool updates
|
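The release listing by curator, species, and verification status is essentially a grouped count. A small sketch under the assumption that each record is a dict; the field names "curator", "species", and "status" are hypothetical and not taken from the actual schema:

```python
from collections import Counter


def release_counts(records, keys=("curator", "species", "status")):
    """Count records per combination of the given fields.

    records: iterable of dicts, one per curated data entry.
    keys: field names to group by (hypothetical names for this sketch).
    Returns a Counter keyed by tuples of field values.
    """
    return Counter(tuple(r[k] for k in keys) for r in records)
```

Passing a shorter `keys` tuple, for example `("species",)`, gives per-species totals of the kind that could feed the published release statistics.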
Step 4: Post-release operations |
- Prepare data for download
- for NCBI (pre-agreed data format)
- for Routure (pre-agreed data format)
- for Public users (with updated format)
- GBrowse: Re-set up
- JBrowse: Re-set up
- Biomart: data re-import
|
Data refresh on "other" data portals.
|
Step 5: Post-release updates |
- Update “QTL Gene” IDs from NCBI
|
This completes the new QTL/association data entries with the "Gene IDs"
assigned by NCBI GeneDB.
|