accession

Download a gene data package by RefSeq nucleotide or protein accession

Name

datasets download gene accession - Download a gene data package by RefSeq nucleotide or protein accession

Synopsis

datasets download gene accession <refseq-accession ...> [flags]

Description

Download a gene data package by RefSeq nucleotide or protein accession. Gene data packages include gene, transcript and protein sequences and one or more data reports. Data packages are downloaded as a zip archive.

The default gene data package for NM, NR, NP, XM, XR, XP and YP accessions includes:

  • rna.fna (transcript sequences)
  • protein.faa (protein sequences)
  • data_report.jsonl (data report with gene metadata)
  • dataset_catalog.json (a list of files and file types included in the data package)

The default gene data package for WP accessions includes:

  • gene.fna (gene sequences for all genomes on which the WP is annotated)
  • protein.faa (protein sequences)
  • data_report.jsonl (data report with gene metadata)
  • dataset_catalog.json (a list of files and file types included in the data package)
  • annotation_report.jsonl (annotated locations of WP proteins on bacterial genomes)
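
To check which of the files listed above a downloaded package actually contains, the archive can be listed without extracting it. The following is a minimal sketch, assuming the Info-ZIP unzip tool is installed. The helper name list_package is hypothetical and is not part of the datasets CLI. This page does not describe the directory layout inside the archive, so the sketch only prints member names as they are stored.

```shell
#!/bin/sh
# list_package is a hypothetical helper, not part of the datasets CLI.
# It prints the member names of a downloaded gene data package.
# "ncbi_dataset.zip" is the default file name given under Options (--filename).
list_package() {
    pkg="${1:-ncbi_dataset.zip}"
    if [ ! -f "$pkg" ]; then
        echo "package not found: $pkg" >&2
        return 1
    fi
    # unzip -Z1 (zipinfo mode) prints one archive member name per line.
    unzip -Z1 "$pkg"
}
```

Calling list_package with no argument looks for ncbi_dataset.zip in the current directory. Passing a path checks a package saved under a custom file name.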

Examples

  datasets download gene accession NP_000483.3
  datasets download gene accession NM_000546.6 NM_000492.4
  datasets download gene accession WP_000769114.1
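
For longer lists, the accessions can be read from a file with the --inputfile flag described under Options. The following is a sketch only. The file names accessions.txt and genes.zip are arbitrary, and the one-accession-per-line format is an assumption, since this page does not specify the input file format.

```shell
#!/bin/sh
# Write the accessions from the examples above to a file.
# One accession per line is an assumption about the expected format.
printf '%s\n' NM_000546.6 NM_000492.4 > accessions.txt

# Only attempt the download when the datasets CLI is installed.
if command -v datasets >/dev/null 2>&1; then
    datasets download gene accession --inputfile accessions.txt --filename genes.zip
else
    echo "datasets CLI not found; skipping download" >&2
fi
```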

Options

      --api-key string             Specify an NCBI API key
      --debug                      Emit debugging info
      --fasta-filter strings       Limit protein and RNA sequence files to the specified RefSeq nucleotide and protein accessions
      --fasta-filter-file string   Limit protein and RNA sequence files to the specified RefSeq nucleotide and protein accessions included in the specified file
      --filename string            Specify a custom file name for the downloaded data package (default "ncbi_dataset.zip")
      --help                       Print detailed help about a datasets command
      --include string(,string)    Specify the data files to include (comma-separated).
                                     * gene:           gene sequence
                                     * rna:            transcript sequences
                                     * protein:        amino acid sequences
                                     * cds:            nucleotide coding sequences
                                     * 5p-utr:         5'-UTR
                                     * 3p-utr:         3'-UTR
                                     * product-report: gene, transcript and protein locations and metadata
                                     * none:           do not retrieve any sequence files
                                      (default [])
      --include-flanks-bp int      Specify the length of flanking nucleotides (WP accessions only)
      --inputfile string           Read a list of RefSeq nucleotide or protein accessions from a file to use as input
      --no-progressbar             Hide progress bar
      --ortholog strings           Retrieve data for an ortholog set. Provide one or more taxa (any rank) to filter results, or 'all' for the complete set.
      --preview                    Show information about the requested data package
      --taxon-filter string        Limit gene sequences and annotation report file to the specified taxon (any rank, only available for WP accessions)
      --version                    Print version of datasets
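
The options above can be combined in a single command. The following sketch shows a few combinations using only flags listed on this page. The run wrapper and the DRY_RUN variable are hypothetical conveniences and are not part of the datasets CLI. The output file names and the flank length of 200 are arbitrary illustrative values. By default the script only prints each command. Set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# DRY_RUN and run are hypothetical helpers for this sketch.
# With DRY_RUN=1 (the default) each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Show information about the package before downloading it.
run datasets download gene accession NM_000546.6 --preview

# Download only protein and coding sequences, under a custom file name.
run datasets download gene accession NM_000546.6 --include protein,cds --filename nm_000546.zip

# WP accession: request flanking nucleotides (200 is an arbitrary length).
run datasets download gene accession WP_000769114.1 --include-flanks-bp 200 --filename wp_000769114.zip
```

Printing first makes it easy to check how flags are combined before any data is requested.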
Generated April 19, 2024