Cloud-based Taxonomy Analysis Table
Overview
The Taxonomy Analysis Table (tax_analysis) contains the taxonomy information of the run as determined by the SRA Taxonomy Analysis Tool.
New records are added hourly. The table is useful when you want to filter hundreds of thousands of Runs by specific taxonomic content.
It is also possible to build local taxonomy trees using the ilevel, ileft, and iright values.
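As a minimal sketch of how these three values can describe a tree, the Python snippet below assumes that ileft and iright follow the common nested-set convention, in which a descendant's (ileft, iright) interval lies strictly inside its ancestor's interval. That convention is an assumption rather than something stated on this page, and the TaxRow type, the helper names, and all row values are hypothetical.

```python
from typing import List, NamedTuple


class TaxRow(NamedTuple):
    """A hypothetical, trimmed-down view of one tax_analysis record."""
    name: str
    ilevel: int
    ileft: int
    iright: int


def is_descendant(node: TaxRow, ancestor: TaxRow) -> bool:
    """Return True if node's interval lies strictly inside ancestor's.

    Assumes nested-set numbering for ileft/iright.
    """
    return ancestor.ileft < node.ileft and node.iright < ancestor.iright


def descendants(rows: List[TaxRow], ancestor: TaxRow) -> List[str]:
    """Names of all rows below ancestor, sorted by ileft."""
    below = [r for r in rows if is_descendant(r, ancestor)]
    return [r.name for r in sorted(below, key=lambda r: r.ileft)]


# Made-up values for illustration only; real ilevel/ileft/iright values differ.
ROWS = [
    TaxRow("Viruses", 1, 1, 8),
    TaxRow("Riboviria", 2, 2, 7),
    TaxRow("Coronaviridae", 3, 3, 4),
    TaxRow("Picornaviridae", 3, 5, 6),
]

if __name__ == "__main__":
    print(descendants(ROWS, ROWS[1]))
```

Under that assumption, finding everything below a node needs only two integer comparisons per row, with no recursive walk through parent links.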
Linking to other tables:
- to Metadata Table (sra.metadata) by acc column
- to Taxonomy Analysis Information (tax_analysis_info) table by acc column
- to Taxonomy (tax) table by tax_id column
- to kmer (kmer) table by tax_id column
Column name | Type | Description |
---|---|---|
acc | STRING | SRA Run accession in the form of SRR######## (ERR or DRR for INSDC partners) |
tax_id | INTEGER | Integer ID of the taxonomy record |
rank | STRING | The taxonomic rank |
name | STRING | Scientific name of the rank/organism |
total_count | INTEGER | The number of kmer hits for the record and all its children |
self_count | INTEGER | The count of kmer hits for the record itself |
ilevel | INTEGER | Level of the tree |
ileft | INTEGER | Left mapping value |
iright | INTEGER | Right mapping value |
Example queries for the BigQuery UI
Search for the SRA accessions of all records identified as having a kmer hit to Coronaviridae (tax_id 11118):
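-- Return the run accession plus the identifiers used for ordering
SELECT m.acc, m.sample_acc, m.biosample, m.sra_study, m.bioproject
FROM `nih-sra-datastore.sra.metadata` AS m, `nih-sra-datastore.sra_tax_analysis_tool.tax_analysis` AS tax
WHERE m.acc = tax.acc AND tax.tax_id = 11118
ORDER BY m.bioproject, m.sra_study, m.biosample, m.sample_acc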
Build a local taxonomic tree by ordering the data based on ileft and ilevel for a metagenomic data set:
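As a minimal sketch of this ordering, the Python snippet below runs an ORDER BY ileft, ilevel query against an in-memory SQLite stand-in for the table and indents each name by its ilevel. The accession SRR0000001, the helper names, and every row value are hypothetical. The local table keeps only a few of the columns listed above.

```python
import sqlite3
from typing import List

# Hypothetical toy rows: (acc, name, total_count, ilevel, ileft, iright).
# Deliberately stored out of order so the ORDER BY does the work.
TOY_ROWS = [
    ("SRR0000001", "Picornaviridae", 5, 3, 5, 6),
    ("SRR0000001", "Viruses", 40, 1, 1, 8),
    ("SRR0000001", "Coronaviridae", 30, 3, 3, 4),
    ("SRR0000001", "Riboviria", 38, 2, 2, 7),
]


def make_toy_db() -> sqlite3.Connection:
    """Create an in-memory table that mimics a subset of tax_analysis."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tax_analysis ("
        "acc TEXT, name TEXT, total_count INTEGER, "
        "ilevel INTEGER, ileft INTEGER, iright INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tax_analysis VALUES (?, ?, ?, ?, ?, ?)", TOY_ROWS
    )
    return conn


def render_tree(conn: sqlite3.Connection, acc: str) -> List[str]:
    """Return one indented line per taxon for a run, ordered by ileft."""
    rows = conn.execute(
        "SELECT name, ilevel, total_count FROM tax_analysis "
        "WHERE acc = ? ORDER BY ileft, ilevel",
        (acc,),
    ).fetchall()
    if not rows:
        return []
    # Indent relative to the shallowest level present in the result.
    base = min(level for _, level, _ in rows)
    return [
        "  " * (level - base) + f"{name} ({total})"
        for name, level, total in rows
    ]


if __name__ == "__main__":
    print("\n".join(render_tree(make_toy_db(), "SRR0000001")))
```

Against BigQuery, the same SELECT, WHERE, and ORDER BY clauses would presumably target `nih-sra-datastore.sra_tax_analysis_tool.tax_analysis`, the table named in the Coronaviridae example, with a real Run accession in place of the placeholder.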
Search for SRA Runs by taxonomic name:
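As a minimal sketch of a name-based search, the Python snippet below filters an in-memory SQLite stand-in for the table on the name column and returns the matching Run accessions. The accessions, the counts, the min_total threshold, and the helper names are all hypothetical. The local table keeps only three of the columns listed above.

```python
import sqlite3
from typing import List

# Hypothetical toy rows: (acc, name, total_count).
TOY_ROWS = [
    ("SRR0000001", "Coronaviridae", 30),
    ("SRR0000002", "Coronaviridae", 2),
    ("SRR0000003", "Picornaviridae", 50),
]


def make_toy_db() -> sqlite3.Connection:
    """Create an in-memory table that mimics a subset of tax_analysis."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tax_analysis (acc TEXT, name TEXT, total_count INTEGER)"
    )
    conn.executemany("INSERT INTO tax_analysis VALUES (?, ?, ?)", TOY_ROWS)
    return conn


def runs_with_taxon(
    conn: sqlite3.Connection, name: str, min_total: int = 1
) -> List[str]:
    """Accessions whose record for `name` has at least min_total kmer hits."""
    rows = conn.execute(
        "SELECT DISTINCT acc FROM tax_analysis "
        "WHERE name = ? AND total_count >= ? ORDER BY acc",
        (name, min_total),
    ).fetchall()
    return [acc for (acc,) in rows]


if __name__ == "__main__":
    print(runs_with_taxon(make_toy_db(), "Coronaviridae", min_total=10))
```

Filtering on total_count rather than self_count also counts hits assigned to the taxon's children, which matches the column descriptions in the table above.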
Contact SRA
Contact SRA staff for assistance at sra@ncbi.nlm.nih.gov