Searching dbSNP in Entrez


How to construct queries

dbSNP is part of NCBI's network of Entrez databases. As with these other databases, data of interest may be located simply by entering keywords into the dbSNP search box. The Advanced Search page, linked below the dbSNP search box, can assist in the construction of complex queries. To construct a complex query, specify the search terms, their fields, and the Boolean operations to perform on the terms using the following syntax:

term[field] OPERATOR term[field]

where term is the search term, field is the search field, and OPERATOR is a Boolean operator ('AND', 'OR', or 'NOT'; must be capitalized).
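The syntax above can be assembled programmatically. The following is a minimal sketch, not part of dbSNP itself: the helper names field_term and combine are hypothetical, and quoting multi-word terms is an assumption based on the quoted examples on this page (such as "likely pathogenic"[CLIN]).

```python
def field_term(term, field):
    """Format one search term as term[field]; multi-word terms are quoted (assumption)."""
    text = f'"{term}"' if " " in term else term
    return f"{text}[{field}]"


def combine(operator, *clauses):
    """Join clauses with a Boolean operator, upper-cased because Entrez requires capitals."""
    op = operator.upper()
    if op not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return f" {op} ".join(clauses)


query = combine("and", field_term("likely pathogenic", "CLIN"), field_term("7", "CHR"))
print(query)  # "likely pathogenic"[CLIN] AND 7[CHR]
```

The resulting string can be pasted into the dbSNP search box as is.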

Common Query fields and examples

| Field full name | Field aliases | Description | Search term values and rules | Example |
|---|---|---|---|---|
| All Fields | ALL, * | Search all searchable (indexed) fields | Asterisk (*) in the search term is not interpreted as a wildcard | SNV AND pathogenic |
| Base Position | POSITION, SNPPOS | Chromosome base position on GRCh38 (current) | A natural number representing the SNP's start coordinate on its chromosome on the latest assembly (i.e., GRCh38). Most useful when searched in combination with the CHR field. | 19956018[POSITION] AND 8[CHR] |
| Base Position Previous | POSITION_GRCH37, CHRPOS_PREV_ASSM | Chromosome base position on GRCh37 (previous) | A natural number representing the SNP's start coordinate on its chromosome on the previous assembly (i.e., GRCh37). Most useful when searched in combination with the CHR field. | 19813529[POSITION_GRCH37] AND 8[CHR] |
| Chromosome | CHR, CHRNUM | Chromosomes | One of 1-22, X, Y, MT | 7[CHR] |
| Clinical Significance | CLIN | Variations with defined clinical effects or significances | 16 search term values, defined for a relatively small subset of SNPs. | "likely pathogenic"[CLIN] |
| Filter | FILT, FLTR, SUBSET, SB, FIL | Limits the records returned | A variety of filters is available, including functional, positional, source, etc. | "all[sb]" (all dbSNP records) or "splice 5 snp"[Filter] (a subset) |
| Function Class | FXN, Function_class, FUNC, FUNCTION, FUNCTION_CLASS | Function class | 21 function classes are defined | "frameshift"[Function Class] |
| Gene Name | GENE, GENE_SYMBOL | Entrez Gene symbol | Corresponds to the Official Symbol field in the Entrez Gene resource | MAPK1[GENE] |
| Gene ID | GENE_ID | Entrez Gene UID | The numeric ID referencing the Entrez Gene ID | 5594[GENE_ID] |
| Global Minor Allele Frequency | GMAF | Minor Allele Frequency derived from global population (i.e., 1000G); can also be study-wide MAF that is not from global population | Most useful when entered as a range, as in the example | (0.0[GMAF] : 0.01[GMAF]) |
| Project or Submitter Handle | HAN, PROJECT | Submitter Handle or Project Name | Submitter lab or project name, including 1000Genomes, GnomAD, and DebNick | 1000genomes[Submitter Handle] or 1000genomes[PROJECT] |
| Reference SNP ID | RS, SNPID | Clustered SNP ID (rs) | The numeric ID must be prefixed with "rs". Also retrieves SNPs that have been merged into the specified SNP. | rs328[RS] |
| SNP Class | SCLS, SNPCLASS | SNP class | Possible values are: "del", "delins", "ins", "mnv", and "snv". | del[SNPCLASS] |
| Submitter SNP ID | SS, SSNUM | The ID assigned to each report of a SNP at submission time | Must be prefixed with "ss". Note that the query still returns Reference SNPs rather than Submitter SNPs. | ss329[SS] |
| Validation Status | VALI, VALIDATE, VALIDATION | Validation status | Possible values are: "by cluster" or "by frequency" | "by cluster"[Validation Status] |
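Queries built from these fields can also be submitted outside the web search box. The sketch below is an illustration rather than something this page describes: it assumes NCBI's E-utilities ESearch endpoint with the database name "snp", and the helper name esearch_url is hypothetical. It only builds the request URL and performs no network call.

```python
from urllib.parse import urlencode

# Assumption: the public E-utilities ESearch endpoint; dbSNP is addressed as db=snp.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(query, retmax=20):
    """Return an ESearch URL for a dbSNP query; brackets and spaces are URL-encoded."""
    params = {"db": "snp", "term": query, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


# Example taken from the Base Position row of the table.
url = esearch_url("19956018[POSITION] AND 8[CHR]")
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the matching record IDs; check NCBI's usage guidelines before sending many requests.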

Complex queries and others

| Description | Query | Note |
|---|---|---|
| Variant allele with MAF = 0 | "00000.0000"[Global Minor Allele Frequency] | Variant allele is homozygous and may be due to differences between assembly versions |
| Pathogenic variants in BRCA1 with MAF < 0.01 | "pathogenic"[Clinical Significance] AND BRCA1 AND 00000.0000:00000.00999[GLOBAL_MAF] | Set the GLOBAL_MAF range between 0 and 0.00999 for MAF < 0.01 |
| Common variant (MAF >= 0.01) | 00000.0100[GLOBAL_MAF] : 00001.0000[GLOBAL_MAF] | Set the GLOBAL_MAF range from 0.0100 to 1.0000 for MAF >= 0.01 |
| 1000Genomes common variant (MAF >= 0.01) not found by TOPMED | "1000genomes"[Submitter Handle] NOT "topmed"[Submitter Handle] AND 00000.0100: 00001.0000[GLOBAL_MAF] | Set the GLOBAL_MAF range from 0.0100 to 1.0000 for MAF >= 0.01 |
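The zero-padded frequency values in these queries are easy to mistype. The following sketch formats a GLOBAL_MAF range from ordinary numbers. The helper names gmaf_value and gmaf_range are hypothetical, and the layout of five integer digits plus four decimals is an assumption inferred from most of the examples above; the 00000.00999 bound uses five decimals and is not reproduced by this helper.

```python
def gmaf_value(frequency):
    """Zero-pad a frequency to the 00000.0000 layout (assumed from the examples)."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    # Width 10 including the decimal point, four decimals, padded with leading zeros.
    return f"{frequency:010.4f}"


def gmaf_range(low, high):
    """Build a low:high range clause on the GLOBAL_MAF field."""
    return f"{gmaf_value(low)}:{gmaf_value(high)}[GLOBAL_MAF]"


print(gmaf_range(0.01, 1.0))  # 00000.0100:00001.0000[GLOBAL_MAF]
```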

Users can search dbSNP with HGVS names, as shown with the example: NM_000237.3:c.1421C>G OR NG_013007.1:g.7147G>A

ALFA Frequency

All SNPs with ALFA frequency

To retrieve the complete list of variations with ALFA frequency information, we can use this query: all[sb] AND by alfa[Validation]

'by-ALFA' Facet

There is a 'by-ALFA' facet under the 'Validate Status' filters, which can be used to narrow the search results further.

For each RefSNP in the search result, there is an 'ALFA' link that leads to frequency by population.

Protein Position

It's also possible to search by amino acid variation at a protein sequence level. The image below shows the result for a search string 'Glu7Gly'.


Last updated: 2020-06-19T15:59:44Z