accession

Request genome data by accessions.

Name

datasets download virus genome accession - Request genome data by accessions.

Synopsis

datasets download virus genome accession <accession ...> [flags]

Description

Download a coronavirus genome dataset by nucleotide accessions. Coronavirus genome data packages include genome, CDS, and protein sequences, annotation, and a detailed data report. Each dataset is downloaded as a single zip file.

The default coronavirus genome dataset includes the following files (if available):

  • genomic.fna (genomic sequences)
  • cds.fna (nucleotide coding sequences)
  • protein.faa (protein sequences)
  • data_report.jsonl (data report with viral metadata)
  • virus_dataset.md (README describing the contents of the sequence files and other information)
  • dataset_catalog.json (a list of files and file types included in the dataset)
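
To check which of these files a given download actually contains, the archive can be listed without extracting it. The following is a minimal sketch, not part of the datasets tool: list_package is a hypothetical helper name, and it assumes the standard unzip utility is installed.

```shell
# Hypothetical helper: list a package's contents without extracting it.
# Takes the zip path as $1 and falls back to the documented default
# file name, ncbi_dataset.zip, when no argument is given.
list_package() {
    zip_path="${1:-ncbi_dataset.zip}"
    if [ ! -f "$zip_path" ]; then
        echo "no package found at $zip_path" >&2
        return 1
    fi
    # unzip -l prints the archive listing only; nothing is written to disk.
    unzip -l "$zip_path"
}
```

Calling list_package with no argument inspects ncbi_dataset.zip in the current directory; pass a path if the download used --filename.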

Refer to NCBI’s download and install documentation for information about getting started with the command-line tools.

Examples

  datasets download virus genome accession NC_045512.2
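
Several accessions and flags can be combined in one call. The sketch below wraps such a call in a shell function; fetch_virus_genomes is a hypothetical name, and the choice of flags (sequence-only output, no progress bar) is just one plausible combination taken from the Options list. It assumes the datasets CLI is installed and on PATH.

```shell
# Hypothetical helper: download the given accessions into a named zip,
# keeping genomic.fna but leaving out cds.fna and protein.faa.
# Usage: fetch_virus_genomes <output.zip> <accession> [accession ...]
fetch_virus_genomes() {
    out="$1"
    shift
    # Flags follow the accessions, matching the synopsis:
    #   datasets download virus genome accession <accession ...> [flags]
    datasets download virus genome accession "$@" \
        --exclude-cds \
        --exclude-protein \
        --no-progressbar \
        --filename "$out"
}
```

For example, fetch_virus_genomes sars2_reference.zip NC_045512.2 would write the reference genome package to sars2_reference.zip instead of the default ncbi_dataset.zip.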

Options

      --annotated               limit to annotated coronavirus genomes
      --api-key string          NCBI Datasets API Key
      --complete-only           limit to complete coronavirus genomes
      --exclude-cds             exclude cds.fna (CDS sequence file)
      --exclude-protein         exclude protein.faa (protein sequence file)
      --exclude-seq             exclude genomic.fna (genomic sequence file)
      --filename string         specify a custom file name for the downloaded dataset (default "ncbi_dataset.zip")
      --geo-location string     limit to coronavirus genomes isolated from a specified geographic location (continent, country or U.S. state)
  -h, --help                    help for accession
      --host string             limit to coronavirus genomes isolated from a specified host (NCBI Taxonomy ID, scientific or common name at any taxonomic rank)
      --input-file string       read a list of nucleotide accessions from a text file; the file should have one identifier per row and no spaces or quotes
      --lineage string          limit to SARS-CoV-2 genomes classified as the specified lineage (variant) by pangolin using the pangoLEARN algorithm
      --no-progressbar          hide progress bar
      --refseq                  limit to RefSeq coronavirus genomes
      --released-since string   limit to coronavirus genomes released after a specified date (MM/DD/YYYY)
      --updated-since string    limit to coronavirus genomes updated after a specified date (MM/DD/YYYY)
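
The --input-file format and the date filters can be illustrated with a short sketch. The helper names check_accession_list and download_from_list are hypothetical, MN908947.3 is only an illustrative second entry, and the sketch assumes that the positional accessions may be omitted when --input-file is given, which the synopsis does not state explicitly.

```shell
# Hypothetical helper: check that a list file meets the --input-file rules
# (no spaces or quotes anywhere in the file).
check_accession_list() {
    if grep -q "[ \"']" "$1"; then
        echo "$1: contains spaces or quotes" >&2
        return 1
    fi
}

# Hypothetical helper: download complete genomes named in a list file
# that were released after January 1, 2020 (dates use MM/DD/YYYY).
# Usage: download_from_list <list.txt> <output.zip>
download_from_list() {
    check_accession_list "$1" || return 1
    datasets download virus genome accession \
        --input-file "$1" \
        --complete-only \
        --released-since 01/01/2020 \
        --filename "$2"
}

# Write one accession per row, as --input-file expects.
printf '%s\n' NC_045512.2 MN908947.3 > accessions.txt
```

With the list in place, download_from_list accessions.txt listed_genomes.zip runs the download, provided the datasets CLI is installed.
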
Generated August 11, 2022