dbVar Help & FAQ

  1. Introduction
  2. dbVar Study Browser
  3. dbVar Study Page
  4. dbVar Variant Page
  5. Variant Call and Region Types
  6. dbVar Variant Rendering
  7. dbVar Placements
  8. FTP Site
  9. dbVar Entrez Search
  10. FAQ

Last updated June 2020


Introduction

dbVar is a database of human genomic structural variation where users can search, view, and download data from submitted studies. dbVar stopped supporting data from non-human organisms on November 1, 2017; however existing non-human data remains available via FTP download. In keeping with the common definition of structural variation, most variants are larger than 50 basepairs in length - however a handful of smaller variants may also be found. dbVar provides access to the raw data whenever available, as well as links to additional resources, from both NCBI and elsewhere. For more information on structural variation see the Overview of Structural Variation page. For frequently asked questions see the dbVar Help & FAQ page.

dbVar is a free resource that is developed and maintained by the National Center for Biotechnology Information (NCBI) at the U.S. National Library of Medicine (NLM), located at the National Institutes of Health (NIH) in Bethesda, Maryland.


dbVar Study Browser

Data in dbVar is organized by Study. "Study" typically refers to a publication, however some Studies are community resources which are updated with new data on a regular basis (e.g., Clinical Genome Resource - ClinGen, nstd37). Each Study is given an accession that begins with -std:  nstd if the study was submitted to NCBI, estd if submitted to EBI, and dstd if submitted to DDBJ. The dbVar Study Browser (Figure 1) provides a summary of the studies in dbVar. The browser can be sorted by Study accession, Organism, Study Type, Method, Number of Variant Regions, Number of Variant Calls, or Publication. Links to the Study Page and PubMed summary for each study are provided. Filters to the right of the browser allow you to narrow content by Study Type, Method, or Number of Variant Regions.

study browser

Figure 1: dbVar Study Browser


dbVar Study Page

Each Study in dbVar has a Study Page that displays basic information about the study. At the top is a general information section describing basic information about the study, including links to BioProjectsPubMed, dbGaP, and the submitter's lab page (if available). This is followed by a Detailed Information section where you can download variant data for the current study (Variant Regions, Variant Calls, Both, or all via FTP) or browse details about Variant Summary, Samplesets, Experimental Details, or Validations in a tabbed format.

study page

Figure 2a: General Information and Variant Summary Tab (Study Page)  -   The Variant Summary tab displays the number of Variant Calls and Variant Regions for the current study on each chromosome, Placement type (i.e., whether data is available on the Submitted assembly and/or Remapped assemblies), and links to a graphical display of the variants on NCBI's Sequence Viewer. A pull-down menu above the table allows you to display the current study's data on the submitted assembly, or on any assemblies to which the data has been remapped.

Study Page - Samplsets tab

Figure 2b: Samplesets Tab (Study Page)  -  The Samplesets tab describes any logical division of samples in the study. Details include Name, Description, Size, and relevant Phenotypes, if any. Note that additional details about individual subjects in a sampleset can be found by downloading the Samples data. If subjects have not been consented to have their data displayed publicly (as in the example above), clinical information and links between genetic variants are stored behind controlled access at NCBI's dbGaP. Samples and variants are then anonymized before being forwarded to dbVar.

Figure 2c: Experimental Details Tab (Study Page)

Figure 2c: Experimental Details Tab (Study Page)  -  The Experimental Details tab provides information on the methods and analyses that were used in the study. Each unique combination of method and analysis is named an "Experiment." A complete list of allowable Method Types and Analysis Types can be found in the Excel submission template, or by clicking here. Experiments can be one of three types: Discovery, Validation, or Genotyping. The Experimental Details tab is also where you can access links to the study's raw data (e.g., sequence traces or array data stored in an external database such as Trace or Array Express).

Figure 2d: Validations Tab (Study Page)

Figure 2d: Validations Tab (Study Page)  -  If validation experiments were performed to confirm variant calls generated by discovery experiments, the validation experiments are assigned unique IDs and are listed along with the relevant method in the Validations tab. Individual validation results can be accessed by downloading the variant data itself from the Study Page. If no validation experiments were performed for a given study, the Validations tab will indicate that no validation data were submitted to dbVar.


dbVar Variant Page

Each dbVar Variant page displays detailed information about an accessioned Variant Region. At the top of the page is a section with general information about the Variant Region (Study, number of supporting Variant Calls, Region size, etc. - see Figure 3a), and a chromosome ideogram displaying the variant's location on the genome. This is followed by a tabbed section with complete variant details:  Genome View, Variant Region Details and Evidence, Validation Information, Clinical Assertions, and Genotype Information (see figure legends below for information on each of these tabs).

Variant Page

Figure 3a: General Information (top) and Genome View Tab (Variant Page)  -  The Genome View Tab shows the current variant in NCBI’s Sequence Viewer in the context of known genes and other variant data from the current study. Use the pull-down menu to select the assembly you want to view. Each black bar represents a Variant Region (the current region is centered and highlighted), while the colored bars represent Variant Calls that support each region.  The display default is to collapse all supporting variant calls, but this can be adjusted (for example, to show all individual calls) by clicking Configure… Variations… Show All. Clicking on any Variant Region or Variant Call will reveal a pop-up tooltip with information about the variant and a link to its dbVar page.

Variant Page – VRDE tab

Figure 3b: Variant Region Details and Evidence Tab (Variant Page)  -  This tab contains placement coordinates for the variant region on the assembly on which the region was originally submitted as well as assemblies to which it was subsequently mapped.  In the case of remapped placements, the Score column indicates the quality of the remap – Perfect, Good, or Pass (for details see dbVar Placements section below). Below the Variant Region information are details on the supporting Variant Calls that were merged to define the Region, including their placements (on Submitted and/or Remapped assemblies). Complete details can always be downloaded using the links above the tabbed section of the Variant Page.

Variant Page – Validation Info tab

Figure 3c: Validation Information Tab (Variant Page)  -  If any Variant Call results were validated with additional methods, this tab provides details on the methods and analyses used in the validation, and the results (Pass or Fail for each call tested).

Variant Page – Clinical Assertions tab

Figure 3d: Clinical Assertions Tab (Variant Page)  -  If in the course of a study an association was established between a Variant Call and a phenotype observed in a Subject, details are provided here.  The Variant Call ID is followed by the sample in which it was observed (with a link if the sample is publicly available), the type of event (insertion, deletion, copy number gain, etc.), the parental origin of the variant if it was determined, the phenotype associated with the variant, and the authors’ assessment of its likely pathogenicity. For a reminder of how studies involving sensitive clinical and genetic information are processed, please refer to the Figure 2b legend above.


Variant Call and Region Types

dbVar supports the types of structural variation listed in Table 1 below. Each variant call is submitted using one of the Variant Call Types in the first column; corresponding Sequence Ontology IDs and definitions are listed in the middle column; and Variant Region Types (for regions formed by merging similar calls) are indicated in the last column. (For an explanation of the differences between calls and regions, see our Overview of Structural Variation page.)

Variant Call Type Sequence Ontology ID Variant Region Type

complex substitution

SO:1000005 (complex substitution) When no simple or well defined DNA mutation event describes the observed DNA change, the keyword "complex" should be used. Usually there are multiple equally plausible explanations for the change.

complex substitution

copy number gain

SO:0001742 A sequence alteration whereby the copy number of a given region is greater than the reference sequence.

copy number variation

copy number loss

SO:0001743 A sequence alteration whereby the copy number of a given region is less than the reference sequence.

copy number variation

copy number variation

SO:0001019 A variation that increases or decreases the copy number of a given region.

copy number variation

deletion

SO:0000159 The point at which one or more contiguous nucleotides were excised.

copy number variation

duplication

SO:0001742 (copy number gain) A sequence alteration whereby the copy number of a given region is greater than the reference sequence.

copy number variation

delins

SO:1000032 A sequence alteration which included an insertion and a deletion, affecting 2 or more bases.

delins

insertion

SO:0000667 The sequence of one or more nucleotides added between two adjacent nucleotides in the sequence.

insertion

interchromosomal translocation

SO:0002060 A translocation where the regions involved are from different chromosomes.

translocation, complex chromosomal rearrangement

intrachromosomal translocation

SO:0002061 A translocation where the regions involved are from the same chromosome.

translocation, complex chromosomal rearrangement

inversion

SO:1000036 A continuous nucleotide sequence is inverted in the same position.

inversion

mobile element insertion

SO:0001837 A kind of insertion where the inserted sequence is a mobile element.

mobile element insertion

Alu insertion

SO:0002063 An insertion of sequence from the Alu family of mobile elements.

mobile element insertion

LINE1 insertion

SO:0002064 An insertion from the Line1 family of mobile elements.

mobile element insertion

SVA insertion

SO:0002065 An insertion of sequence from the SVA family of mobile elements.

mobile element insertion

HERV insertion

SO:0002187 An insertion of sequence from the HERV family of mobile elements.

mobile element insertion

novel sequence insertion

SO:0001838 An insertion the sequence of which cannot be mapped to the reference genome.

novel sequence insertion

mobile element deletion

SO:0002066 A deletion of a mobile element when comparing a reference sequence (has mobile element) to a individual sequence (does not have mobile element).

mobile element deletion

Alu deletion

SO:0002070 A deletion of an Alu mobile element with respect to a reference.

mobile element deletion

LINE1 deletion

SO:0002069 A deletion of a LINE1 mobile element with respect to a reference.

mobile element deletion

SVA deletion

SO:0002068 A deletion of an SVA mobile element with respect to a reference.

mobile element deletion

HERV deletion

SO:0002067 A deletion of the HERV mobile element with respect to a reference.

mobile element deletion

sequence alteration

SO:0001059 A sequence_alteration is a sequence_feature whose extent is the deviation from another sequence.

sequence alteration

short tandem repeat variation

SO:0002096 A kind of sequence variant whereby a tandem repeat is expanded or contracted with regard to a reference.

short tandem repeat variation

tandem duplication

SO:1000173 A duplication consisting of 2 identical adjacent regions.

tandem duplication
Table 1: Variant Call and Region Types


dbVar Variant Rendering

Rendering Breakpoint Ambiguity

Figure 4a: Rendering breakpoint ambiguity. There are four common scenarios for most variants (either SVs or SSVs) as shown in the table above. However, mixed cases with a defined breakpoint at one end and an undefined breakpoint range at the other end are possible as well. Here, we use (CNV SV) as examples.

Variant Rendering - Examples

Figure 4b: Rendering variant types. Variants are visually represented in several places: the Genome View tab of variant pages; the dbVar Genome Browser; and NCBI's Sequence Viewer. Colors distinguish types of structural variant (copy number gain/loss, insertion, inversion, etc.). Breakpoint ambiguity is represented by translucency and/or arrows at variant ends.


dbVar Placements

Most variants are submitted to dbVar as asserted locations on a particular assembly; this assembly is not always the current assembly. Additionally, all studies have not used the same assembly to do their analysis. In order to be able to compare data from different studies, it is necessary to obtain placements for all variants on the same set of assemblies. For most variants, we do this using the NCBI Remap Service using the following parameters:

  • Minimum ratio of bases that must be remapped: 0.5
  • Maximum ratio for difference between source length and target length: 4.0
  • The 'Merge' function is turned on.
  • Multiple locations can be returned.

We use relatively permissive parameters as many structural variants fall within regions of the genome that are likely to change from assembly to assembly. Additionally, NCBI Remap produces a coverage score that is calculated by taking the length of the feature in the target assembly and dividing it by the length of the feature in the source assembly. A score of 1 suggests the region is relatively unchanged between assemblies (note: single bases aren't assessed); a score of >1 suggests an insertion in the target assembly and a score of <1 suggests a deletion in the target assembly. We provide a qualitative assessment of the remapping status on our web pages:

  • Perfect: coverage score equal to 1
  • Good: coverage score within 5% of 1
  • Pass: coverage deviates from 1 by greater than 5%

A relatively small number of variants submitted to dbVar are defined by sequence. These are typically insertion sequences that could not be placed on the assembly used in the analysis of the study. Such sequences are submitted to GenBank/EMBL/DDBJ and tracked in dbVar using the assigned accession.version. These sequences are aligned to newer assemblies using a process developed in-house called the NG Aligner (unpublished). Sequences that are aligned with >95% coverage and 98% identity are considered placed.

It is in the nature of evolving assemblies that variants called on an earlier assembly sometimes will not map cleanly to a newer (presumably more accurate) assembly. In these situations the Remapper may output multiple alternative placements and, even though they are scored, one cannot know with certainty exactly where, and with what degree of certainty, a variant can be placed on the newer assembly. Therefore dbVar implements a set of rules for filtering remap results, with the goal of providing users our best estimates based on imperfect remapping data. In some cases variants may fail to remap altogether – these are indicated in dbVar FTP download files as having failed remap – and should be considered suspect. More information about assembly-assembly alignments and the meaning of 'First Pass' versus 'Second Pass' can be found on this Remapper page.

Assemblies in scope

dbVar displays placement data for the assembly used in the analysis of a study (the 'Submitted' placement). For most organisms, we will also attempt to find a placement for a variant on the 'current' assembly, if available. When an assembly is updated we will support the 'current' and 'previous' assembly for up to a year. For example, we currently support placements for human data on GRCh37 and GRCh38.


FTP Site

All data in dbVar can be downloaded from our FTP site: https://ftp.ncbi.nlm.nih.gov/pub/dbVar/data/. Data is organized by assembly and by study.  A complete listing of available downloads is available online.  If you download by assembly, please be aware that some Variant Calls or Variant Regions may have been submitted on an assembly other than the one being downloaded; these variants will have been remapped to reflect their corresponding placements on the downloaded assembly, placements for which can be found in a separate file in the FTP directory. For example, the GRCh37 assembly download includes all variants that were submitted on GRCh37, plus variants submitted on other assemblies and that were remapped to GRCh37. Remapped status is indicated in the download files.

More information about FTP site organization can be found here: https://ftp.ncbi.nlm.nih.gov/pub/dbVar/data/README.txt.


dbVar Entrez Search

A search bar is provided on the top of each dbVar page with a drop-down menu to search dbVar and other NCBI resources using Entrez (See Entrez Help for further details).

A dbVar search returns Studies, with links to the Study page, and Variants, with links to the Variant page. Filters on the left side of the page allow the user to narrow their results based on a wide variety of data fields.

The dbVar Advanced search page allows the user to build Entrez search queries by selecting search fields and values from drop-down menus.

dbVar Search Fields

The following fields are available to search dbVar:

Field name Field aliases Description
Accession ACC, ACCN Accession of any internal or external identifier. Versions removed from GENBANK accessions.
Age of Subject AGE Age of subject in years
All Fields ALL, * All terms from all searchable fields
Assembly ASSM Accession or name of placement assembly
Author AUTH, AU All authors included in journal
BioProject PROJ Accession or Name of associated BioProject
Chr CH, CHROM, CHROMOSOME Chromosome of placement
ChrEnd CHR_END, CHR_STOP, END End of placement on chromosome
ChrPos BASE, BASE_POSITION, CHR_POS, CHR_START, START Start of placement on chromosome
Clinical Interpretation CLIN, CLINICAL_ASSERTION Clinical interpretation of a variant (controlled vocabulary)
Clinical Phenotype DISEASE, PHENO, PHENOTYPE, PTYPE Phenotype of sample/subject study or reference specimen
Filter FILT, FLTR, SUBSET, SB, FIL Limits the records
Frequency FREQ, GLOB_FREQ Global Allele Frequency
Gene GENE_NAME, SYM Name or alias of gene at same location as variant
Has Clinical HASCLIN Flag indicating subject has Clinical Interpretation: 0=no; 1 = yes
MeSH MH Medical Subject Headings assigned to publication
MeSH ID MHID NLM MeSH Browser Unique ID
Method Type METH, METHOD Method type (controlled vocabulary)
OMIM MIM Online Mendelian Inheritance in Man
Object Type OT Object type in dbVar (STUDY, VARIANT)
Organism ORG, ORGN Organism name (exploded)
Origin ALLELE_ORIGIN Allele origin (controlled vocabulary), including Both=Germline+Somatic
Pathogenic Overlap Range PATHO_RNG Range of Reciprocal Overlap with a Pathogenic Variant
Population POP Subject population (major continental group) of calls with frequency data
Publication PMID, PUBMED_ID Unique identifier from PubMed
Publication Date PDAT, PUBDATE Journal Publication date
Sample SMPL Sample/subject ID of study or reference specimen
Sample Count SC, SCOUNT Number of samples in study
Sex of Subject GENDER, SEX Sex of Subject (controlled vocabulary)
Study STDY Study ID, alias, display name, or publication name
Study Type STYPE Study type assigned by NCBI
Subject Phenotype Status SUBPSTAT Flag indicating subject has a phenotype: 0=no; 1 = yes
Taxonomy ID TAX, TAX_ID, TAXID, TAXONOMY Taxonomy ID
UID UID Unique number assigned to study or variant in Entrez
Variant Size VLEN, VSIZE Size of variant
Variant Type VT, VTYPE, VARTYPE Type of variant region or call (controlled vocabulary)
Zygosity ZYG Zygosity of a variant (controlled vocabulary)


dbVar FAQs

Frequently asked questions (FAQs):

  1. What is 'dbVar'?
  2. How does dbVar differ from the Database of Genomic Variants (DGV)?
  3. What is ‘structural variation’?
  4. What types of structural variation data does dbVar accept?
  5. Does it matter how I detected the structural variation?
  6. Can I submit structural variation data from any organism?
  7. Can I submit structural variation data from human clinical or cancer studies?
  8. Does dbVar distinguish between pathogenic and benign variants?
  9. Does dbVar accept genotype data?
  10. Will I get a unique dbVar ID to use in publications?
  11. What is the difference between the dbVar accessions?
  12. Why is the data in dbVar different from that provided in the publication?
  13. Why do some dbVar variants have >1 location?
  14. What's the difference between an nsv and an nssv?
  15. Can I download the whole database?
  16. How do I know if a given variant(s) in dbVar is real (or of high quality)?
  17. How do I download all the data from a given region or gene?
  18. Is there a way to integrate dbVar data with dbSNP data?
  19. Can I include any clinical information in my data submission?
  20. My submission contains sensitive private clinical information. How does dbVar guarantee the information will remain private?
  21. What is the smallest size structural variant dbVar accepts?
  22. My study has identified novel insertion sequences and I don't have a genomic coordinate for these. How can I submit these to dbVar?
  23. The number of "Other Calls in this Sample" in the Variant Call Information table of the Variant Page is different from the number of variants that are returned when I do a dbVar search for the same study and sample ID. What's going on?
  24. How does dbVar place data submitted on one assembly (e.g. NCBI36) on other assemblies (e.g. GRCh37)?
  25. How do I submit to dbVar?
  26. How should I cite dbVar?


  1. What is 'dbVar'?

    dbVar is the NCBI database of human genomic structural variation. For information on how to navigate dbVar see the dbVar Help page.

  2. How does dbVar differ from the Database of Genomic Variants (DGV)?

    DGV has been a useful resource for the human genetics community with respect to collecting and curating structural variation data for human. DGV, dbVar and its European counter-part DGVa, contain data for healthy control human samples, while the latter two also include clinically relevant structural variation data from ClinVar.

  3. What is ‘structural variation’?

    Structural variation (SV) is generally defined as any region of DNA involved in inversions and balanced translocations or genomic imbalances (insertions and deletions), commonly referred to as copy number variants (CNVs). For more information see the Overview of Structural Variation page.

  4. What types of structural variation data does dbVar accept?

    dbVar is a structural variation database designed to store data on variant DNA ≥ 1 bp in size. Practically speaking, we recommend submitting variation data that is > 50bp to dbVar and variation data that is ≤ 50bp to dbSNP. We can accept diverse types of events, including inversions, insertions and translocations. We discourage the submission of somatic and cancer-related variants, as they tend to be complex and sample-specific, and are more appropriately stored in custom databases.

  5. Does it matter how I detected the structural variation?

    dbVar accepts submissions based on analysis of whole-genome or whole-exome next-generation sequencing (NGS) (including paired-end mapping, split-read pair, and read depth) as well as microarray-based experiments including array-CGH and SNP genotyping. A full list of supported methods and analyses can be found on the dbVar Help page.

  6. Can I submit structural variation data from any organism?

    Beginning on September 1, 2017 dbVar stopped accepting submissions for any non-human organisms. Non-human SV data can be submitted to DGVa.

  7. Can I submit structural variation data from clinical or cancer studies?

    No. All clinically relevant structural variation should be submitted to ClinVar or dbGaPWe discourage the submission of somatic and cancer-related SV to dbVar.

  8. Does dbVar distinguish between pathogenic and benign variants?

    Yes, dbVar will clearly mark and allow searching for variants that are known to be pathogenic, providing links to OMIM when available.

  9. Does dbVar accept genotype data?

    Yes, dbVar can accept genotype data. 

  10. Will I get a unique dbVar ID to use in publications?

    Yes, dbVar will provide a unique accession number for each study, each submitted variant region and each supporting level variant.

  11. What is the difference between the dbVar accessions?

    dbVar collaborates with DGVa at EBI and with JVar-SV at DDBJ to accession genomic structural variants. Accessions prefixed with 'n' have been processed by NCBI (dbVar, with 'e' by EBI (DGVa), and with 'd' by DDBJ (JVar-SV. NCBI, EBI, and DDBJ provide three levels of accessions:

    1. (n|e|d)std: the study id - this identifies a submitted study
    2. (n|e|d)sv: the structural variant id - this identifies the submitted region of variation
    3. (n|e|d)ssv: the supporting structural variant id - this identifies the supporting regions of variation (often sample-specific) that were used to call the submitted region of variation
  12. Why is the data in dbVar sometimes different from that provided in the publication?

    The loading of studies that are submitted to dbVar after publication may highlight errors during our quality control checks. In these cases submitters will be contacted and the errors corrected. In other cases a submitter may detect errors before submitting or may decide to edit their data. Often this will be documented in the study record.

  13. Why do some dbVar variants have more than one location?

    There are two reasons:

    When a submitter gives us data on an assembly obtained from UCSC, we translate this into native assembly coordinates and map sequences (chromosomes and unplaced/unlocalized sequences) to their accession.versions. UCSC concatenates unplaced/unlocalized sequences into pseudo-scaffold objects they call chr*_random. In some cases, submitters have provided data that cross gaps on the 'chr*_random' sequences, meaning that the feature actually maps to two different, unrelated sequences.

    More than one location may also be provided if the variant is the result of a transposition event. In this case, coordinates from both the donor and recipient sites are provided, to retain as much information about the variant event as possible.

  14. What's the difference between an nsv and an nssv?

    nsv and nssv are accession prefixes for variant regions and variant calls (or instances), respectively. Typically, one or more variant instances (nssv – variant calls based directly on experimental evidence) are merged into a variant region (nsv – a pair of start-stop coordinates reflecting the submitters’ assertion of the region of the genome that is affected by the variant instances). The ‘n’ preceding sv or ssv indicates that the variants were submitted to NCBI (dbVar). esv and essv represent the same variant entities, but those that were submitted to EBI (DGVa); similarly, dsv and dssv were submitted to DDBJ (JVar-SV).

    Please see Overview of Structural Variation for more information.

  15. Can I download the whole database?

    All data is available on our FTP site. Data are available on a per study and per assembly basis. Additionally, we have used the NCBI Remap Service to project submitted data onto current assemblies: GRCh37 (hg19) and GRCh38 (hg38) for human. 

  16. How do I know if a given variant(s) in dbVar is real / of high quality?

    dbVar is an archive. We report variants as they are submitted to us, usually in association with a peer-reviewed publication. Responsibility for data reproducibility lay with the submitting investigator. dbVar encourages, but does not require, the validation of all variants by at least one alternative method, and we provide a mechanism to include validation results in the submission template. Any submitted validation data are presented as an integral part of the study data. To be listed as “validated” a variant must have been confirmed with at least one, or possibly more, additional independent methods. If, as a consumer of dbVar data, have concerns regarding a particular variant or data set, we recommend you contact the relevant submitter for more supporting information.

  17. How do I find data from a given region or gene?

    You can perform searches using gene names. The results that are returned will include all variants that overlap the gene, and the studies with which the variants are associated. 

  18. Is there a way to integrate dbVar data with dbSNP data?

    Yes, Variation Viewer provides some functionality in this regard. Alternatively, the user can integrate the data manually with avilable bioinformatics tools. 

  19. Can I include any clinical information in my data submission?

    No, all submissions with clinically relevant structural variation should be submitted to ClinVar.

  20. My submission contains sensitive private clinical information. How does dbVar guarantee the information will remain private?

    Submisions with sensitive patient information should be submitted to dbGaP

  21. What is the smallest size structural variant dbVar accepts?

    There are no size restrictions on structural variation data. We recommend that variants smaller than 50 bp be submitted to dbSNP but we will accept variants as small as a single basepair as long as the variant is an insertion or deletion, not a single nucleotide change.

  22. My study has identified novel insertion sequences and I don't have a genomic coordinate for these. How can I submit these to dbVar?

    If you have novel insertion sequence data, please submit it first as a WGS Project. This will give all of your novel insertion sequences unique identifiers that can then be tracked. You can then submit your data to dbVar and these novel insertion sequences can reference the sequence identifiers obtained from the WGS submission. With stable sequence identifiers, we may be able to map the sequence to updated assemblies and obtain a chromosome context for this sequence.

  23. The number of "Other Calls in this Sample" in the Variant Call Information table of the Variant Page is different from the number of variants that are returned when I do a dbVar search for the same study and sample ID. What's going on?

    The number in the table on the Variant Page indicates the number of Variant Calls (SSVs) in the sample. A search for the study and sample ID returns the number of Variant Regions (SVs) after similar calls have been merged into regions.

  24. How does dbVar place data submitted on one assembly (e.g., GRCh37) on other assemblies (e.g., GRCh38)?

    dbVar uses NCBI's Remap tool to map variants between assemblies. All variants reported in human-based studies are automatically remapped to both GRCh37 and GRCh38. More information on the usage of the tool and a list of supported assemblies can be found on the Remap website.

  25. How do I submit to dbVar?

    We encourage you to submit data to dbVar by completing one of the templates we provide (Excel, Tab-delimited, XML) or by VCF, and emailing it to dbvar@ncbi.nlm.nih.gov. Please see dbVar Submission Information for more information.

  26. How should I cite dbVar?

    Please reference the following publication when citing dbVar:

    Lappalainen I, Lopez J, Skipper L, Hefferon T, Spalding JD, Garner J, Chen C, Maguire M, Corbett M, Zhou G, Paschall J, Ananiev V, Flicek P, Church DM. DbVar and DGVa: public archives for genomic structural variation. Nucleic Acids Res. 2013 Jan;41(Database issue):D936-41. doi: 10.1093/nar/gks1213. Epub 2012 Nov 27. PMID: 23193291; PMCID: PMC3531204.

    If you wish to reference a specific submission, cite the study accession, e.g., nstd166; if you wish to reference a specific variant, cite the variant region or variant call accession, e.g., nsv4136077 or nssv15910013.

Support Center

Last updated: 2021-03-28T11:25:56Z