download an ortholog dataset


datasets download ortholog - download an ortholog dataset


Download an ortholog dataset that includes gene, transcript and protein sequences, a data table and a data report. NCBI calculates ortholog data for vertebrates and insects. Ortholog datasets can be specified by NCBI Gene ID, gene symbol or RefSeq accession. Datasets are downloaded as a zip file.

The default ortholog dataset includes the following files:

  • gene.fna (gene sequences)
  • rna.fna (transcript sequences)
  • protein.faa (protein sequences)
  • data_report.jsonl (data report with gene metadata)
  • data_table.tsv (data table with gene metadata, one transcript per row)
  • dataset_catalog.json (a list of files and file types included in the dataset)
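
To confirm that a downloaded archive holds the default files listed above, a small check along these lines may help. This is only a sketch: `check_dataset_files` is a hypothetical helper, not part of the datasets tool, and it compares file names only because the directory layout inside the zip is not described here.

```shell
# Hypothetical helper (not part of the datasets CLI): report which of the
# default files appear in a listing of archive members, one path per line.
check_dataset_files() {
  # Strip any leading directories so that only file names are compared.
  names=$(printf '%s\n' "$1" | sed 's|.*/||')
  for expected in gene.fna rna.fna protein.faa \
                  data_report.jsonl data_table.tsv dataset_catalog.json; do
    if printf '%s\n' "$names" | grep -qxF "$expected"; then
      printf 'found   %s\n' "$expected"
    else
      printf 'missing %s\n' "$expected"
    fi
  done
}

# Usage (assumes unzip is installed and the default archive name was kept):
#   check_dataset_files "$(unzip -Z1 ncbi_dataset.zip)"
```

A file reported as missing is not necessarily an error: the --exclude-gene, --exclude-rna and --exclude-protein options leave the corresponding sequence files out on purpose.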

Refer to NCBI’s command line quickstart documentation for information about getting started with the command-line tools.


  datasets download ortholog gene-id 672


      --api-key string         NCBI Datasets API Key
      --exclude-gene           exclude gene.fna (gene sequence file)
      --exclude-protein        exclude protein.faa (protein sequence file)
      --exclude-rna            exclude rna.fna (transcript sequence file)
      --filename string        specify a custom file name for the downloaded dataset (default "ncbi_dataset.zip")
  -h, --help                   help for ortholog
      --include-3p-utr         include 3p_utr.fna (3'-UTR sequence file)
      --include-5p-utr         include 5p_utr.fna (5'-UTR sequence file)
      --include-cds            include cds.fna (CDS sequence file)
      --no-progressbar         hide progress bar
      --taxon-filter strings   limit results to ortholog data for a specified taxonomic group
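
The options above can be combined in a single call. The sketch below wraps one such combination in a function that prints the command by default and only runs it when `DRY_RUN=0` is set. The `run` and `download_orthologs` helpers, the `primates` taxon value and the output file name are illustrative assumptions; only the subcommand and the flags come from the listing above.

```shell
# Hypothetical dry-run wrapper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Download orthologs for one Gene ID, limited to a taxonomic group, adding
# the CDS file and leaving out gene.fna.
#   $1 = NCBI Gene ID
#   $2 = taxonomic group (the accepted values are an assumption here)
#   $3 = output file name
download_orthologs() {
  run datasets download ortholog gene-id "$1" \
    --taxon-filter "$2" \
    --include-cds \
    --exclude-gene \
    --filename "$3"
}

# Usage: preview the command first, then run it for real.
#   download_orthologs 672 primates brca1_orthologs.zip
#   DRY_RUN=0 download_orthologs 672 primates brca1_orthologs.zip
```

Printing the command first makes it easy to review the flag combination before starting a download.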



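datasets download ortholog gene-id - download an ortholog dataset by NCBI Gene ID

datasets download ortholog symbol - download an ortholog dataset by gene symbol

datasets download ortholog accession - download an ortholog dataset by RefSeq nucleotide or protein accession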

Generated October 18, 2021