genome

Download a genome data package

Name

datasets download genome - Download a genome data package

Synopsis

datasets download genome [flags]

Description

Download a genome data package. Genome data packages may include genome, transcript and protein sequences, annotation and one or more data reports. Data packages are downloaded as a zip archive.

The default genome data package includes the following files:

  • <accession>_<assembly_name>_genomic.fna (genomic sequences)
  • assembly_data_report.jsonl (data report with genome assembly and annotation metadata)
  • dataset_catalog.json (a list of files and file types included in the data package)
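
Because the package arrives as a zip archive, its contents can be checked without extracting it. The following is a minimal sketch, assuming `unzip` is installed and the package was saved under the default name ncbi_dataset.zip. The helper name `list_package` is hypothetical and not part of the datasets CLI.

```shell
#!/bin/sh
# list_package is a hypothetical helper, not a datasets subcommand.
# It prints the table of contents of a downloaded data package.
list_package() {
  pkg="${1:-ncbi_dataset.zip}"   # default file name used when --filename is not given
  if [ ! -f "$pkg" ]; then
    echo "data package not found: $pkg" >&2
    return 1
  fi
  # unzip -l lists archive entries without extracting them
  unzip -l "$pkg"
}
```

Calling `list_package` with no argument inspects ncbi_dataset.zip in the current directory. Pass a path to inspect a package saved under a custom name.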

Examples

  datasets download genome accession GCF_000001405.40 --chromosomes X,Y --include genome,gff3,rna
  datasets download genome taxon "bos taurus" --dehydrated
  datasets download genome taxon human --assembly-level chromosome,complete --dehydrated
  datasets download genome taxon mouse --search C57BL/6J --search "Broad Institute" --dehydrated
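
Several of the examples above use --dehydrated, which defers retrieval of the data files. The sketch below shows one way the full round trip might look: download the dehydrated archive, extract it, then rehydrate. It assumes `unzip` is installed and that the rehydrate command takes the extracted directory through a --directory flag, which this page does not document. The names `run` and `fetch_dehydrated` are hypothetical.

```shell
#!/bin/sh
# run is a hypothetical wrapper: with DRY_RUN=1 it prints the command
# instead of executing it (quoting of arguments is not preserved in the output).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# fetch_dehydrated TAXON NAME (hypothetical helper)
# Downloads a dehydrated package for TAXON into NAME.zip, extracts it into
# the directory NAME, then retrieves the data files.
fetch_dehydrated() {
  taxon="$1"
  name="$2"
  run datasets download genome taxon "$taxon" --dehydrated --filename "$name.zip" || return 1
  run unzip "$name.zip" -d "$name" || return 1
  # Assumption: rehydrate accepts --directory pointing at the extracted package.
  run datasets rehydrate --directory "$name"
}
```

For example, `DRY_RUN=1 fetch_dehydrated "bos taurus" cattle` prints the three commands without contacting NCBI. Dropping DRY_RUN runs them.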

Options

      --annotated                Limit to annotated genomes
      --api-key string           Specify an NCBI API key
      --assembly-level string    Limit to genomes at one or more assembly levels (comma-separated):
                                   * chromosome
                                   * complete
                                   * contig
                                   * scaffold
                                    (default "[]")
      --assembly-source string   Limit to 'RefSeq' (GCF_) or 'GenBank' (GCA_) genomes (default "all")
      --chromosomes strings      Limit to a specified, comma-delimited list of chromosomes, or 'all' for all chromosomes
      --debug                    Emit debugging info
      --dehydrated               Download a dehydrated zip archive including the data report and locations of data files (use the rehydrate command to retrieve data files).
      --exclude-atypical         Exclude atypical assemblies
      --filename string          Specify a custom file name for the downloaded data package (default "ncbi_dataset.zip")
      --help                     Print detailed help about a datasets command
      --mag string               Limit to metagenome-assembled genomes ('only') or remove them from the results ('exclude') (default "all")
      --no-progressbar           Hide progress bar
      --preview                  Show information about the requested data package
      --reference                Limit to reference genomes
      --released-after string    Limit to genomes released on or after a specified date (MM/DD/YYYY)
      --released-before string   Limit to genomes released on or before a specified date (MM/DD/YYYY)
      --search strings           Limit results to genomes with specified text in the searchable fields:
                                 species and infraspecies, assembly name and submitter.
                                 To search multiple strings, use the flag multiple times.
      --version                  Print version of datasets
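
The filter flags above can be combined, and --preview offers a way to see what a request would return before committing to a large download. The following is a minimal sketch using only flags listed on this page. The names `run` and `preview_refseq` are hypothetical, and the assumption that these flags are accepted after the taxon argument is taken from the examples above.

```shell
#!/bin/sh
# run is a hypothetical wrapper: with DRY_RUN=1 it prints the command
# instead of executing it, so the sketch can be checked offline.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# preview_refseq TAXON DATE (hypothetical helper)
# Requests information about annotated RefSeq genomes for TAXON that were
# released on or after DATE (MM/DD/YYYY).
preview_refseq() {
  taxon="$1"
  since="$2"
  run datasets download genome taxon "$taxon" \
    --annotated \
    --assembly-source RefSeq \
    --released-after "$since" \
    --preview
}
```

Running `DRY_RUN=1 preview_refseq mouse 01/01/2020` prints the assembled command. Once the preview looks right, the same flags can be reused without --preview to download the package.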

Commands


accession

Download a genome data package by Assembly or BioProject accession

taxon

Download a genome data package by taxon (NCBI Taxonomy ID, or scientific or common name at any taxonomic rank)

Generated April 19, 2024