Module: ncbi.datasets.openapi.api.genome_api

NCBI Datasets API

NCBI Datasets is a resource that lets you easily gather data from NCBI. The Datasets API is still in alpha, and we’re updating it often to add new functionality, iron out bugs and enhance usability. For some larger downloads, you may want to download a [dehydrated bag](https://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/datasets/docs/rehydrate/) and retrieve the individual data files at a later time.

The version of the OpenAPI document: v1. Generated by: https://openapi-generator.tech

class ncbi.datasets.openapi.api.genome_api.GenomeApi(api_client=None)

Bases: object

NOTE: This class is auto-generated by OpenAPI Generator. Ref: https://openapi-generator.tech

Do not edit the class manually.

assembly_descriptors_by_accessions(accessions, **kwargs)

Get genome metadata by accession

Get detailed metadata for assembled genomes by accession in a JSON output format. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.assembly_descriptors_by_accessions(accessions, async_req=True)
>>> result = thread.get()
Parameters

accessions ([str]) – NCBI genome assembly accessions

Keyword Arguments
  • filters_reference_only (bool) – If true, only return reference and representative (GCF_ and GCA_) genome assemblies. [optional] if omitted the server will use the default value of False

  • filters_assembly_source (V1AssemblyDatasetDescriptorsFilterAssemblySource) – Return only RefSeq (GCF_) or GenBank (GCA_) genome assemblies. [optional]

  • filters_has_annotation (bool) – Return only annotated genome assemblies. [optional] if omitted the server will use the default value of False

  • filters_assembly_level ([V1AssemblyDatasetDescriptorsFilterAssemblyLevel]) – Only return genome assemblies that have one of the specified assembly levels. By default, do not filter. [optional]

  • filters_first_release_date (datetime) – Only return genome assemblies that were released on or after the specified date. By default, do not filter. [optional]

  • filters_last_release_date (datetime) – Only return genome assemblies that were released on or before the specified date. By default, do not filter. [optional]

  • filters_search_text ([str]) – Only return results whose fields contain the specified search terms in their taxon, infraspecific, assembly name or submitter fields. By default, do not filter. [optional]

  • page_size (int) – The maximum number of genome assemblies to return. Default is 20 and maximum is 1000. If the number of results exceeds the page size, page_token can be used to retrieve the remaining results. [optional] if omitted the server will use the default value of 20

  • page_token (str) – A page token is returned from an AssemblyMetadataRequest call with more than page_size results. Use this token, along with the previous AssemblyMetadataRequest parameters, to retrieve the next page of results. When page_token is empty, all results have been retrieved. [optional]

  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

V1AssemblyMetadata

If the method is called asynchronously, returns the request thread.
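
The page_size and page_token arguments above describe a token-based paging loop. The following is a minimal sketch of that loop, not the library's own helper: it assumes the response object exposes `assemblies` and `next_page_token` attributes (names this page does not confirm), and it uses a hypothetical `fake_fetch` stand-in so that it runs without network access.

```python
from types import SimpleNamespace


def iter_assemblies(fetch_page, page_size=20):
    """Yield assembly records from every page of a paginated metadata call.

    `fetch_page` is any callable that accepts `page_size` and, for follow-up
    pages, `page_token` keyword arguments.
    """
    token = None
    while True:
        kwargs = {"page_size": page_size}
        if token:
            kwargs["page_token"] = token
        reply = fetch_page(**kwargs)
        # Assumed attribute names; check them against V1AssemblyMetadata.
        yield from (getattr(reply, "assemblies", None) or [])
        token = getattr(reply, "next_page_token", None)
        if not token:  # an empty token means all results were retrieved
            break


# Hypothetical stand-in for the API call so the sketch runs offline.
_FAKE_PAGES = {
    None: SimpleNamespace(assemblies=["a1", "a2"], next_page_token="t1"),
    "t1": SimpleNamespace(assemblies=["a3"], next_page_token=""),
}


def fake_fetch(page_size, page_token=None):
    return _FAKE_PAGES[page_token]


collected = list(iter_assemblies(fake_fetch, page_size=2))
print(collected)
```

With a real client, `fetch_page` could be something like `functools.partial(api.assembly_descriptors_by_accessions, accessions)`, so that the accessions and filters stay identical across pages, as the page_token description requires.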

assembly_descriptors_by_bioproject(accessions, **kwargs)

Get genome metadata by bioproject accession

Get detailed metadata for assembled genomes by bioproject accession in a JSON output format. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.assembly_descriptors_by_bioproject(accessions, async_req=True)
>>> result = thread.get()
Parameters

accessions ([str]) – NCBI BioProject accessions

Keyword Arguments
  • filters_reference_only (bool) – If true, only return reference and representative (GCF_ and GCA_) genome assemblies. [optional] if omitted the server will use the default value of False

  • filters_assembly_source (V1AssemblyDatasetDescriptorsFilterAssemblySource) – Return only RefSeq (GCF_) or GenBank (GCA_) genome assemblies. [optional]

  • filters_has_annotation (bool) – Return only annotated genome assemblies. [optional] if omitted the server will use the default value of False

  • filters_assembly_level ([V1AssemblyDatasetDescriptorsFilterAssemblyLevel]) – Only return genome assemblies that have one of the specified assembly levels. By default, do not filter. [optional]

  • filters_first_release_date (datetime) – Only return genome assemblies that were released on or after the specified date. By default, do not filter. [optional]

  • filters_last_release_date (datetime) – Only return genome assemblies that were released on or before the specified date. By default, do not filter. [optional]

  • filters_search_text ([str]) – Only return results whose fields contain the specified search terms in their taxon, infraspecific, assembly name or submitter fields. By default, do not filter. [optional]

  • returned_content (V1AssemblyMetadataRequestContentType) – Return either assembly accessions, or entire assembly-metadata records. [optional]

  • page_size (int) – The maximum number of genome assemblies to return. Default is 20 and maximum is 1000. If the number of results exceeds the page size, page_token can be used to retrieve the remaining results. [optional] if omitted the server will use the default value of 20

  • page_token (str) – A page token is returned from an AssemblyMetadataRequest call with more than page_size results. Use this token, along with the previous AssemblyMetadataRequest parameters, to retrieve the next page of results. When page_token is empty, all results have been retrieved. [optional]

  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

V1AssemblyMetadata

If the method is called asynchronously, returns the request thread.
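
The `thread.get()` idiom used with async_req=True has the same shape as the `AsyncResult` interface of Python's `multiprocessing.pool.ThreadPool`. The sketch below illustrates that pattern in isolation. It assumes, without confirmation from this page, that the generated client dispatches asynchronous calls in a comparable way; `slow_lookup` and the accession string are hypothetical placeholders.

```python
from multiprocessing.pool import ThreadPool


def slow_lookup(accessions):
    """Hypothetical stand-in for a blocking API call."""
    return {"requested": list(accessions)}


pool = ThreadPool(processes=1)
try:
    # apply_async returns a handle immediately, much like async_req=True.
    thread = pool.apply_async(slow_lookup, (["PRJNA000000"],))
    # ... other work can happen here while the call runs ...
    result = thread.get(timeout=30)  # blocks until the call has finished
finally:
    pool.close()
    pool.join()

print(result)
```

Passing a timeout to `get()` keeps a stalled request from blocking the caller indefinitely; an exception raised inside the worker is re-raised by `get()`.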

assembly_descriptors_by_taxon(taxon, **kwargs)

Get genome metadata by taxonomic identifier

Get detailed metadata on all assembled genomes for a specified NCBI Taxonomy ID or name (common or scientific) at any taxonomic rank. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.assembly_descriptors_by_taxon(taxon, async_req=True)
>>> result = thread.get()
Parameters

taxon (str) – NCBI Taxonomy ID or name (common or scientific) at any taxonomic rank

Keyword Arguments
  • filters_reference_only (bool) – If true, only return reference and representative (GCF_ and GCA_) genome assemblies. [optional] if omitted the server will use the default value of False

  • filters_assembly_source (V1AssemblyDatasetDescriptorsFilterAssemblySource) – Return only RefSeq (GCF_) or GenBank (GCA_) genome assemblies. [optional]

  • filters_has_annotation (bool) – Return only annotated genome assemblies. [optional] if omitted the server will use the default value of False

  • filters_assembly_level ([V1AssemblyDatasetDescriptorsFilterAssemblyLevel]) – Only return genome assemblies that have one of the specified assembly levels. By default, do not filter. [optional]

  • filters_first_release_date (datetime) – Only return genome assemblies that were released on or after the specified date. By default, do not filter. [optional]

  • filters_last_release_date (datetime) – Only return genome assemblies that were released on or before the specified date. By default, do not filter. [optional]

  • filters_search_text ([str]) – Only return results whose fields contain the specified search terms in their taxon, infraspecific, assembly name or submitter fields. By default, do not filter. [optional]

  • tax_exact_match (bool) – If true, only return assemblies with the given NCBI Taxonomy ID or name. Otherwise, assemblies from the taxonomy subtree are included, too. Ignored for assembly_accession requests. [optional] if omitted the server will use the default value of False

  • returned_content (V1AssemblyMetadataRequestContentType) – Return either assembly accessions, or entire assembly-metadata records. [optional]

  • page_size (int) – The maximum number of genome assemblies to return. Default is 20 and maximum is 1000. If the number of results exceeds the page size, page_token can be used to retrieve the remaining results. [optional] if omitted the server will use the default value of 20

  • page_token (str) – A page token is returned from an AssemblyMetadataRequest call with more than page_size results. Use this token, along with the previous AssemblyMetadataRequest parameters, to retrieve the next page of results. When page_token is empty, all results have been retrieved. [optional]

  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

V1AssemblyMetadata

If the method is called asynchronously, returns the request thread.
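
Because every filter is optional, it can be convenient to collect only the ones that are set into a keyword dictionary before calling the method. The sketch below shows one way to do that. `build_filters` is a hypothetical helper, not part of the library; only the keyword names it emits come from the parameter list above.

```python
from datetime import datetime


def build_filters(released_after=None, released_before=None,
                  reference_only=False, has_annotation=False,
                  search_text=None):
    """Collect only the filters that were actually set into a kwargs dict."""
    if released_after and released_before and released_after > released_before:
        raise ValueError("released_after must not be later than released_before")
    filters = {}
    if released_after is not None:
        filters["filters_first_release_date"] = released_after
    if released_before is not None:
        filters["filters_last_release_date"] = released_before
    if reference_only:
        filters["filters_reference_only"] = True
    if has_annotation:
        filters["filters_has_annotation"] = True
    if search_text:
        filters["filters_search_text"] = list(search_text)
    return filters


kwargs = build_filters(
    released_after=datetime(2020, 1, 1),
    released_before=datetime(2020, 12, 31),
    reference_only=True,
)
print(sorted(kwargs))
```

The resulting dictionary could then be unpacked into a call such as `api.assembly_descriptors_by_taxon(taxon, **kwargs)`. Omitted filters are left to the server defaults listed above.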

check_assembly_availability(accessions, **kwargs)

Check the validity of genome accessions

The ‘GET’ version of check is limited by the size of the GET URL (2KB, which works out to about 140 genomic accessions). The POST operation is provided to allow users to supply a larger number of accessions in a single request. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.check_assembly_availability(accessions, async_req=True)
>>> result = thread.get()
Parameters

accessions ([str]) – NCBI genome assembly accessions

Keyword Arguments
  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

V1AssemblyDatasetAvailability

If the method is called asynchronously, returns the request thread.

check_assembly_availability_post(v1_assembly_dataset_request, **kwargs)

Check the validity of many genome accessions in a single request

The ‘GET’ version of check is limited by the size of the GET URL (2KB, which works out to about 140 genomic accessions). The POST operation is provided to allow users to supply a larger number of accessions in a single request. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.check_assembly_availability_post(v1_assembly_dataset_request, async_req=True)
>>> result = thread.get()
Parameters

v1_assembly_dataset_request (V1AssemblyDatasetRequest) –

Keyword Arguments
  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

V1AssemblyDatasetAvailability

If the method is called asynchronously, returns the request thread.
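
The GET and POST variants differ mainly in how many accessions they can carry. The following sketch picks between them by count, using the approximate figure of 140 accessions quoted above as a default threshold. `pick_http_method` is a hypothetical helper, and the accession strings are made-up placeholders rather than real assemblies.

```python
def pick_http_method(accessions, max_get_accessions=140):
    """Return "GET" for short accession lists and "POST" for longer ones."""
    if not accessions:
        raise ValueError("at least one accession is required")
    return "GET" if len(accessions) <= max_get_accessions else "POST"


# Made-up placeholder accessions, used only to exercise the helper.
few = [f"GCF_{i:09d}.1" for i in range(5)]
many = [f"GCF_{i:09d}.1" for i in range(500)]

print(pick_http_method(few), pick_http_method(many))
```

Since the real limit is the 2KB URL size rather than a fixed count, a conservative threshold below 140 may be safer when accessions are unusually long.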

download_assembly_package(accessions, **kwargs)

Get a genome dataset by accession

Download a genome dataset including FASTA sequence, annotation and a detailed data report by accession. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.download_assembly_package(accessions, async_req=True)
>>> result = thread.get()
Parameters

accessions ([str]) – NCBI genome assembly accessions

Keyword Arguments
  • chromosomes ([str]) – The default setting is all chromosomes. Specify individual chromosomes by string (1,2,MT or chr1,chr2,chrMT). Unplaced sequences are treated like their own chromosome (‘Un’). The filter only applies to FASTA sequence. [optional]

  • exclude_sequence (bool) – Set to true to omit the genomic sequence. [optional] if omitted the server will use the default value of False

  • include_annotation_type ([V1AnnotationForAssemblyType]) – Select additional types of annotation to include in the data package. If unset, no annotation is provided. [optional]

  • hydrated (V1AssemblyDatasetRequestResolution) – Set to DATA_REPORT_ONLY to retrieve only data reports. [optional]

  • filename (str) – Output file name. [optional] if omitted the server will use the default value of “ncbi_dataset.zip”

  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

file_type

If the method is called asynchronously, returns the request thread.

download_assembly_package_post(v1_assembly_dataset_request, **kwargs)

Get a genome dataset by POST

The ‘GET’ version of download is limited by the size of the GET URL (2KB, which works out to about 140 genomic accessions). The POST operation is provided to allow users to supply a larger number of accessions in a single request. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.download_assembly_package_post(v1_assembly_dataset_request, async_req=True)
>>> result = thread.get()
Parameters

v1_assembly_dataset_request (V1AssemblyDatasetRequest) –

Keyword Arguments
  • filename (str) – Output file name. [optional] if omitted the server will use the default value of “ncbi_dataset.zip”

  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

file_type

If the method is called asynchronously, returns the request thread.
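
When `_preload_content=False` is passed, the download methods return a urllib3.HTTPResponse, which can be read like a binary file. The sketch below shows one way to stream such an object to disk in chunks instead of holding the whole package in memory. `save_stream` is a hypothetical helper, and an in-memory zip stands in for the real response so the example runs offline. How the default `file_type` return value is delivered is not specified on this page.

```python
import io
import shutil
import tempfile
import zipfile
from pathlib import Path


def save_stream(stream, destination, chunk_size=64 * 1024):
    """Copy a binary file-like object to `destination` in fixed-size chunks."""
    destination = Path(destination)
    with destination.open("wb") as out:
        shutil.copyfileobj(stream, out, chunk_size)
    return destination


# Build a tiny in-memory zip as a stand-in for the downloaded package.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("README.md", "placeholder")
buffer.seek(0)

with tempfile.TemporaryDirectory() as workdir:
    saved = save_stream(buffer, Path(workdir) / "ncbi_dataset.zip")
    with zipfile.ZipFile(saved) as archive:
        names = archive.namelist()

print(names)
```

With a real response object, remember to release the connection after copying (for urllib3, `release_conn()`), so the underlying connection can be reused.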

genome_download_summary(accessions, **kwargs)

Preview genome dataset download

Get a download summary by accession in a JSON output format. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.genome_download_summary(accessions, async_req=True)
>>> result = thread.get()
Parameters

accessions ([str]) – NCBI genome assembly accessions

Keyword Arguments
  • chromosomes ([str]) – The default setting is all chromosomes. Specify individual chromosomes by string (1,2,MT or chr1,chr2,chrMT). Unplaced sequences are treated like their own chromosome (‘Un’). The filter only applies to FASTA sequence. [optional]

  • exclude_sequence (bool) – Set to true to omit the genomic sequence. [optional] if omitted the server will use the default value of False

  • include_annotation_type ([V1AnnotationForAssemblyType]) – Select additional types of annotation to include in the data package. If unset, no annotation is provided. [optional]

  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

V1DownloadSummary

If the method is called asynchronously, returns the request thread.

genome_download_summary_by_post(v1_assembly_dataset_request, **kwargs)

Preview genome dataset download by POST

The ‘GET’ version of download summary is limited by the size of the GET URL (2KB, which works out to about 140 genomic accessions). The POST operation is provided to allow users to supply a larger number of accessions in a single request. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.genome_download_summary_by_post(v1_assembly_dataset_request, async_req=True)
>>> result = thread.get()
Parameters

v1_assembly_dataset_request (V1AssemblyDatasetRequest) –

Keyword Arguments
  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

V1DownloadSummary

If the method is called asynchronously, returns the request thread.

genome_metadata_by_post(v1_assembly_metadata_request, **kwargs)

Get genome metadata by a variety of identifiers

Get detailed metadata for assembled genomes. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.genome_metadata_by_post(v1_assembly_metadata_request, async_req=True)
>>> result = thread.get()
Parameters

v1_assembly_metadata_request (V1AssemblyMetadataRequest) –

Keyword Arguments
  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

V1AssemblyMetadata

If the method is called asynchronously, returns the request thread.

genome_tax_name_query(taxon_query, **kwargs)

Get a list of taxonomy names and IDs found in the assembly dataset given a partial taxonomic name

This endpoint retrieves a list of taxonomy names and IDs found in the assembly dataset given a partial taxonomic name of any rank. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.genome_tax_name_query(taxon_query, async_req=True)
>>> result = thread.get()
Parameters

taxon_query (str) – NCBI Taxonomy ID or name (common or scientific) at any taxonomic rank

Keyword Arguments
  • tax_rank_filter (V1OrganismQueryRequestTaxRankFilter) – Set the scope of searched tax ranks. [optional]

  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

V1SciNameAndIds

If the method is called asynchronously, returns the request thread.

genome_tax_tree(taxon, **kwargs)

Get a taxonomic subtree by taxonomic identifier

Using an NCBI Taxonomy ID or name (common or scientific) at any rank, get a subtree filtered for species with assembled genomes. This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, pass async_req=True:

>>> thread = api.genome_tax_tree(taxon, async_req=True)
>>> result = thread.get()
Parameters

taxon (str) – NCBI Taxonomy ID or name (common or scientific) at any taxonomic rank

Keyword Arguments
  • children_only (bool) – Only report the children of the requested taxon and not their descendants. [optional] if omitted the server will use the default value of False

  • _return_http_data_only (bool) – return only the response data, without the HTTP status code and headers. Default is True.

  • _preload_content (bool) – if False, the urllib3.HTTPResponse object will be returned without reading/decoding response data. Default is True.

  • _request_timeout (int/float/tuple) – timeout setting for this request. If a single number is provided, it is the total request timeout. It can also be a pair (tuple) of (connection, read) timeouts. Default is None.

  • _check_input_type (bool) – specifies if type checking should be done on the data sent to the server. Default is True.

  • _check_return_type (bool) – specifies if type checking should be done on the data received from the server. Default is True.

  • _host_index (int/None) – specifies the index of the server that we want to use. Default is read from the configuration.

  • async_req (bool) – execute request asynchronously

Returns

V1Organism

If the method is called asynchronously, returns the request thread.
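
A taxonomic subtree is naturally processed with a depth-first walk. The sketch below illustrates the idea on a hypothetical `Node` class with `name` and `children` fields. This is a stand-in for illustration only: the actual attribute names of V1Organism are not documented on this page and should be checked before adapting the code.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a tree node; not the real V1Organism model."""
    name: str
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, name) pairs in depth-first, parent-before-child order."""
    yield depth, node.name
    for child in node.children:
        yield from walk(child, depth + 1)


tree = Node("Hominidae", [Node("Homo", [Node("Homo sapiens")]), Node("Pan")])
visited = list(walk(tree))
for depth, name in visited:
    print("  " * depth + name)
```

When children_only is set, the returned tree would contain only one level below the requested taxon, so the same walk simply stops at depth 1.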

Generated October 22, 2021