1000 Genomes Browser

The 1000 Genomes Browser allows users to explore variant calls, genotype calls and supporting sequence read alignments produced by the 1000 Genomes project. Users can access genotype data from the Phase 3 May 2013 call set. As of August 2016, the browser no longer supports the Phase 1 March 2012 call set, though the data remain available from the project FTP site. The SNP data in the genotypes table were updated in June 2015 to dbSNP build 144.

Using the Page

The 1000 Genomes Browser page consists of a series of page 'widgets' that show data from the 1000 Genomes project. The widgets interact, so an action in one widget causes other widgets on the page to update. For instance, clicking on a chromosome in the 'Genome Overview' will update all other widgets on the page. Release notes are available for each browser version.

Page layout

The following links are available in the browser header:

Page Overview

Below is an overview of the 1000 Genomes Browser page:

NCBI 1000G browser homepage

Figure 1: A page overview with individual page widgets highlighted in different colors.

Page Widgets

Below is an overview for each widget on the page.

Ideogram View

Ideogram widget

Figure 2. Ideogram View

The Ideogram View (figure 2) provides viewing context. The chromosome that is currently in view is surrounded by a green highlight. This widget can also be used for navigation. Clicking on a chromosome will make it the selected chromosome and update the rest of the page to show data for this chromosome. After performing term-based searches in the Search widget (figure 6), the locations of Search results will appear as annotations on the ideogram.

Chromosome Overview

Chromosome overview widget

Figure 3A. Chromosome Overview widget.

Chromosome overview pop-up

Figure 3B. Chromosome Overview Region Tooltip.

The Chromosome Overview widget (figure 3A) provides context and navigation for the page. The blue overlay shown on the ideogram image covers the amount of the chromosome shown in the Sequence Viewer (described below). The sides of this box (outside of the dark blue line) can be selected with the mouse and moved to adjust the size of the box. Making the box smaller is the equivalent of 'zooming' in on a region. You can also select the center of the blue box and drag it to another location on the chromosome. Alternatively, if you adjust the location shown in the sequence viewer using another widget on the page, the blue box will adjust to reflect the location. The thin line beneath the ideogram shows regions of the chromosome for which there are alternate loci or patch scaffold sequence representations. Mousing over any of the filled portions of the line will open a region-specific tooltip (figure 3B) containing the region's name and coordinates, as well as the sequence identifiers of associated alternate loci or patches.

Exon Navigator

Exon navigator widget

Figure 4. Exon Navigator

The Exon Navigator is the blue bar located below the Chromosome Overview widget. When a region of sequence is displayed that includes one or more genes, the symbols for those genes are provided in the menu labeled Gene. You can navigate quickly to a gene by clicking on its symbol. When a single gene is selected, you can move up and down the chromosome, one gene at a time, by clicking on the double arrows (circled in red above) on either side of the Gene selector region: left for the previous gene, right for the next gene. Clicking on the 'Region' link provides a menu that you can use to configure how the gene is displayed in the Sequence Viewer.

When a gene is selected there will be one or more open circles at the right representing the exons of the gene (highlighted in yellow above). Navigate to a particular exon by clicking on the appropriate circle; move to the upstream or downstream exon by clicking on the appropriate arrow flanking the exon selector.

Sequence Viewer

Sequence Viewer widget

Figure 5. NCBI Sequence Viewer

This is the NCBI Sequence Viewer (figure 5), an embeddable widget that provides a graphical representation of features annotated on individual sequences. Access to other NCBI tools (such as BLAST) is also available within this widget. The 'Tracks' button (in the upper right corner) allows you to add or remove tracks from the display, configure how tracks are displayed, and upload your own data. Additional information on using the Sequence Viewer (which is embedded on many NCBI pages) can be found here: https://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/tools/sviewer/ .

There are also some videos available on the NCBI YouTube channel that can give you a quick introduction to using the Sequence Viewer.

Note: There are many display options for alignment tracks. By default, forward and reverse alignments are shown in separate sub-tracks. Alignment tracks are listed in the 'Genomic SRA Alignments' tab in the 'Tracks' dialog. Select the track you wish to adjust and the display options will be shown in the bottom part of the window. You can configure all alignment tracks at once by configuring one of them and then clicking 'Configure Group cSRA'. Alignments with a solid pink background represent PCR duplicates, while reads with a solid deep-red background are PCR errors.

Search Examples

Figure 6. Search widget shown with Search Examples

The Search widget (figure 6) will accept a location directive, such as chr1:1,500,000-2,000,000 or a search term (such as 'PTEN' or 'rs13432'). The 'Search examples' link provides a list of acceptable formats for providing location information. Location information can be provided using a range, a single point or a cytogenetic band. If you search for specified coordinates, the other widgets on the page will update to that location. If you perform a term search, a new window will pop up providing a list of results and their locations (figure 7). The 'Genes' tab lists genes matching the search term. The 'Other Features' tab lists transcripts, phenotypes and Sequence Tagged Sites (STS) associated with the search term. If the first search result is an exact match to your search term, it will be automatically selected. Otherwise, click on a row in either tab of the search results and the rest of the page will update to go to that location. Additionally, the Ideogram View panel shows the genome-wide locations of your search terms.

Search widget

Figure 7: Search Results widget. Clicking on a row will update the page to the corresponding location.
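The distinction the Search widget draws between a location directive and a search term can be sketched in Python. This is only an illustration of the idea, not the browser's own parser: the function name parse_location and the accepted syntax are assumptions based on the chr1:1,500,000-2,000,000 example above, and cytogenetic bands are not handled.

```python
import re

# Matches a range ("chr1:1,500,000-2,000,000") or a single point ("chr1:1,500,000").
# The accepted syntax is an assumption modeled on the example in the text.
_LOCATION = re.compile(
    r"^(?:chr)?(?P<chrom>[0-9]{1,2}|X|Y|MT?)"
    r":(?P<start>[0-9][0-9,]*)"
    r"(?:-(?P<end>[0-9][0-9,]*))?$",
    re.IGNORECASE,
)


def parse_location(text):
    """Return (chrom, start, end) for a location string, or None for a search term."""
    match = _LOCATION.match(text.strip())
    if match is None:
        return None  # treat as a term search such as 'PTEN' or 'rs13432'
    start = int(match.group("start").replace(",", ""))
    end_text = match.group("end")
    # A single point is represented as a range of length one.
    end = int(end_text.replace(",", "")) if end_text else start
    if end < start:
        raise ValueError(f"end {end} precedes start {start}")
    return match.group("chrom").upper(), start, end
```

Anything that does not parse as a location returns None, which mirrors the widget's behaviour of falling back to a term search.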

Subjects Selection

Subjects Widget

Figure 8. Subjects Selection widget.

The Subjects Selection widget (figure 8) allows you to add tracks of SRA alignment files (1000 Genomes submitted BAM files converted to cSRA format) to the Sequence Viewer so that you can view the read alignments generated for a particular sample. When you first come to the page, this section will be minimized. To see all of the data available, click on the arrow symbol at top right of the section to maximize the widget. You can further resize the widget by dragging its edges. Once maximized, there are three tables. You can use the icons flanking the center table (blue arrows, figure 8) to resize the table heights. The uppermost table (Tracks in view) lists the alignment tracks currently displayed in the Sequence Viewer. The bottom-most table (Available Tracks) is a list of all alignment tracks associated with the phase 3 data release. You can search the "Available Tracks" by entering terms into the Search box located at the top right of this table. (Note: Quotes and wild-cards are recognized, and space-separated terms are treated as "OR"). Use the Pencil icon at the far right of the table headers to select or reorder the columns displayed in both tables.

The center table is used to filter the list of tracks shown in the "Available Tracks" table. The left-most column of this table lists the filters. Individual filters are shown in the remaining columns and list the filter values and counts (red arrow, figure 8). Note: Values with count = 0 are not shown. Clicking a "+" icon in the leftmost column adds a column for the corresponding filter. Clicking a "-" icon removes the column for that filter. To apply filters to the Available Tracks table, check the boxes next to the filter values. The filter values and "Available Tracks" table will then update. To clear filter value selections, click the "x" icon in each filter column. To clear all filters and filter value selections, click the "Reset" button in the "Select Filters" column. The filters and their descriptions are found below in Table 1 of this help documentation.

To add an alignment to the Sequence Viewer window, click on the box in the first column of each row of interest in the "Available Tracks" table. You can shift-click to check/uncheck a range of tracks. When you have selected all of the alignment tracks you want to see, click on the "Update tracks" link below the "Available Tracks" table. Selected alignments will be listed in the "Tracks in view" table at the top of this widget. To remove tracks, uncheck the box in the first column of either the "Tracks in view" or "Available Tracks" table and click on the "Update tracks" link. You can also remove the tracks from within Sequence Viewer, either by clicking on the "x" symbol at the right side of each track or through the Configure menu.

Add Tracks

Add Tracks widget

Figure 9. Add Tracks widget.

The "Add Tracks" widget (Figure 9) accepts GEO, SRR, or dbGaP accessions as input. If the accession is associated with an existing track in the NCBI database, that track will be added to the graphical display. Note: The track must refer to the same assembly as the one currently in the display in order to view data.

Region Details

Region details widget

Figure 10. Region Details widget.

The 'Region Details' widget (figure 10) provides information about other features of interest relevant to the sequence region shown in the Sequence viewer display. The 'Other sequence representations' table provides the sequence identifiers of alternate loci and patch scaffolds that are associated with the displayed sequence. Clicking on any row in this table will update the Sequence viewer display to the selected sequence, and a red arrow will appear beneath the chromosome navigator showing the chromosome region to which the sequence corresponds. To return to the chromosome sequence, click on the named region in the widget.

The Genome Reference Consortium (GRC) is the group responsible for the improvement of the human reference genome assembly. The widget also shows the number of GRC issues corresponding to the region shown in the sequence viewer display. Clicking the 'Add Track' link will add a track to the Sequence viewer display that includes the names and locations of these GRC issues. Clicking on the link showing the count of issues in the displayed region will take you to the GRC website, where you will find issue-specific pages describing GRC efforts to update the assembly in this region.

History

History widget

Figure 11. History widget.

The History widget (figure 11) displays your recent gene searches. Select a gene from the drop down menu to navigate to it.

Genotypes

Genotypes table overview

Figure 12. Genotype widget.

The Genotype table (figure 12) provides access to individual level genotypes and population allele frequencies for the phase 3 callset. The entire table can be hidden from view by clicking the icon to the left of the table title. Within the table, individuals are grouped by 1000 Genomes population, and by default each population section is closed. When the population section is closed the population allele frequencies or the allele counts are displayed (this option is configurable at the top of the table). To see individual level genotypes, click on the arrow next to the population abbreviation. In the figure above, the CEU (Utah Residents (CEPH) with North and Western European Ancestry) population section has been opened to reveal the individual genotypes. The checkbox next to each subject ID in this table can also be used to add BAM files to the Sequence Viewer display. Clicking on the checkbox will produce a new dialog allowing you to select the BAM files of interest available for that Subject (figure 13).

Genotype BAM track addition dialog

Figure 13. Dialog for BAM track addition from Genotypes table.

Data within the genotype table is organized as follows:

Typically, there will be more SNPs in a given region than can be displayed in the genotype table. When the graphical view is on a large region with many SNPs, the SNP track will be a histogram by default. As you zoom in, the SNP track will automatically adjust to show individual features. You can also force the SNP track to display individual features by using the 'Tracks' button in the upper right hand corner of the Sequence Viewer. If you select an individual SNP feature in the sequence view, the corresponding column in the genotype table will be highlighted (if it exists). Additionally, as you zoom in to a smaller region, SNPs displayed in the genotype table that are no longer in the Sequence Viewer range will be 'greyed' out so that it is clearer which SNPs are actually in your view. You can lock a column in the genotype table to keep it in view even when it falls outside the range shown in the Sequence Viewer.

The Genotype table can be configured to show the alleles or samples as frequencies or actual counts. In addition, display options for the table can be user-set. Figure 14 shows how to change the settings.

Genotypes table configuration options

Figure 14. Frequency and Counts setting.

Genotypes shown with a vertical bar ('|') are phased genotypes while genotypes shown with a slash ('/') are unphased. If the genotype for all individuals in a population matches the reference genotype, the genotypes will be shown in gray.
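For readers who copy genotype values out of the table, the separator convention above can be checked with a few lines of Python. This is a minimal sketch: the function name describe_genotype is hypothetical, and it assumes that a cell holds two or more alleles joined by a single separator character.

```python
def describe_genotype(genotype):
    """Split a genotype string such as 'A|G' or 'A/G' into alleles and phase status."""
    if "|" in genotype:
        separator, phased = "|", True   # vertical bar: phased genotype
    elif "/" in genotype:
        separator, phased = "/", False  # slash: unphased genotype
    else:
        raise ValueError(f"no allele separator in {genotype!r}")
    return {"alleles": genotype.split(separator), "phased": phased}
```

For example, describe_genotype("A|G") reports the alleles A and G as phased, while describe_genotype("C/T") reports C and T as unphased.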

Genotype Table Tooltips

Hovering over individual cells in the first row of the genotypes table opens a tooltip with more data for that variant. Clicking on an individual genotype for any variant will also open a tooltip with more information.

Genotype Tooltip

Figure 15. Genotype table, variant tooltip.

Genotype Tooltip Detail

Figure 16. Genotype table, genotype tooltip.

The tooltips for the variant columns and genotype cells contain any extra information known about that entity (figures 15 and 16). A given feature may only have data for a subset of the possible labels available. The amount of additional information differs among variants and the ph1/ph3 call sets. Note that the tooltip for the genotype will also include the data labels from the variant column to which it belongs. A description of the data labels may be found in Table 2 below.

Abbreviation  Definition
LDAF          MLE Allele Frequency Accounting for LD
AVGPOST       Average posterior probability from MaCH/Thunder
RSQ           Genotype imputation quality from MaCH/Thunder
ERATE         Per-marker Mutation rate from MaCH/Thunder
THETA         Per-marker Transition rate from MaCH/Thunder
CIEND         Confidence interval around END for imprecise variants
CIPOS         Confidence interval around POS for imprecise variants
END           End position of the variant described in this record
HOMLEN        Length of base pair identical micro-homology at event breakpoints
HOMSEQ        Sequence of base pair identical micro-homology at event breakpoints
SVLEN         Difference in length between REF and ALT alleles
SVTYPE        Type of structural variant
AC            Alternate Allele Count
AN            Total Allele Count
AF            Global Allele Frequency based on AC/AN
AMR_AF        Allele Frequency for samples from AMR based on AC/AN
ASN_AF        Allele Frequency for samples from ASN based on AC/AN
AFR_AF        Allele Frequency for samples from AFR based on AC/AN
EAS_AF        Allele Frequency for samples from EAS based on AC/AN
EUR_AF        Allele Frequency for samples from EUR based on AC/AN
VT            Indicates what type of variant the line represents

Table 2: data labels for variant and genotype tooltips.
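Several of these labels are related by simple arithmetic: AF is described as AC/AN. The Python sketch below shows that calculation on a semicolon-separated key=value string in the style of a VCF INFO field. It assumes the labels appear in that form in downloaded VCF data and that the variant has a single alternate allele; the function names parse_info and allele_frequency are illustrative only.

```python
def parse_info(info_field):
    """Parse a VCF-style INFO string ('AC=2;AN=10;VT=SNP') into a dict."""
    info = {}
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        # Flag entries carry no '=value' part; record them as True.
        info[key] = value if sep else True
    return info


def allele_frequency(info):
    """Compute AF as AC/AN, assuming a single alternate allele."""
    allele_number = int(info["AN"])
    if allele_number == 0:
        raise ValueError("AN is zero; frequency is undefined")
    return int(info["AC"]) / allele_number
```

With AC=2 and AN=10, allele_frequency returns 0.2. Multi-allelic records, where AC holds a comma-separated list, would need one division per alternate allele.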

Custom Tracks

Your Data

The 'Your Data' widget (figure 17) allows you to add custom tracks for display in the Sequence Viewer alongside NCBI-provided tracks, by either uploading files or streaming data from remotely-hosted files. To start, click the "+" button. Uploaded tracks will expire 60 days after they are last touched in the browser; streamed tracks will persist until you remove them. If you are logged in with your My NCBI account and uploading/streaming human data, your custom tracks will also be available to you in other NCBI browsers displaying the same assembly version as your data (e.g. Variation Viewer or GeT-RM).

Your Data Widget

Figure 17. Your Data widget

Uploading files: You can upload files by selecting one of the following menu options: "Add Files", "Add URL" or "Add Text", or by dragging the file into the widget. The following file types are supported: BED, GFF3, GTF, GVF, VCF, HGVS, ASN.1 (text and binary). Concurrent upload of multiple files is supported. The per-file upload limit is currently 4 GB. The file name will be used as the track's display name, unless an (optional) alternate name is provided. Once uploaded, the files will appear in the 'select tracks' drop-down menu in this widget.

Remote streaming files: BAM files hosted on HTTP can be streamed for display in the 1000 Genomes browser. To add these data as tracks, select “Add Remote Track” from the supported files menu, and enter the corresponding URL in the display. Note that an index file with the .bai extension must be located at the same location as the BAM file. The file name will be used as the track’s display name, unless an (optional) alternate name is provided. A progress bar will indicate the status of the connection and validation processes. Once connected, remotely hosted files will appear in the graphical display and be listed in the 'select tracks' drop-down menu of the “Your Data” widget, designated by “(R)”. Other file types and BAM files hosted on FTP are not yet supported. To request streaming support for additional file types, click the "Support Center" link located at the lower right of the browser page.
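Before adding a remote track, it can help to work out where the index file is expected to sit. The Python sketch below derives candidate .bai URLs in the same directory as the BAM file. The help text does not say which naming pattern the browser looks for, so both common ones (sample.bam.bai and sample.bai) are returned; accepting https as well as http is also an assumption, and the function name index_url_candidates is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def index_url_candidates(bam_url):
    """Return likely .bai URLs that sit alongside a remotely hosted BAM file."""
    parts = urlsplit(bam_url)
    # FTP-hosted BAM files are not supported for streaming.
    if parts.scheme not in ("http", "https"):
        raise ValueError("only HTTP-hosted BAM files can be streamed")
    if not parts.path.endswith(".bam"):
        raise ValueError("expected a URL ending in .bam")
    # Two common index naming conventions, both in the BAM file's directory.
    candidates = [parts.path + ".bai", parts.path[:-4] + ".bai"]
    return [urlunsplit(parts._replace(path=path)) for path in candidates]
```

Checking that one of the returned URLs resolves before entering the BAM URL in the widget should avoid a failed connection at the validation step.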

BLAST: You can also enter the RID from a BLAST search into the Your Data widget. The results will appear as an alignment track in the display.

If your track has discrete features (e.g. SNPs or gene annotations, rather than graphs or alignments) on the currently displayed sequence, you can display those features in a paginated table by selecting that track from the "Your Data" drop-down menu (figure 18). Hovering the mouse over a table row opens a tooltip with feature details. Click on a row to go to the location of that feature in Sequence Viewer. Use the menus below the table for additional table navigation. To remove uploaded or remote tracks, select the track from this drop-down menu and then click on the 'minus' icon next to the track name. Note: A selected track will be marked by a check-mark in this menu, regardless of whether it has data suitable for tabular display.

Your Data Table

Figure 18. Your Data feature table

The progress bar at the bottom of the window indicates the status of the upload or initial connection to the data file (figure 18). If validation detects the presence of target sequences in your uploaded or streamed remote file that are not part of the assembly displayed in the 1000 Genomes browser, those sequences will be reported as errors, but you will still be connected and able to view tracks for all target sequences in the file that are part of the assembly. To see any warning or error messages associated with uploading or streaming your custom track data, click on the "Details" option that will be a part of the status message.

File uploads, streaming of remotely hosted files and BLAST results can also be managed via the track configuration menu found under the 'Tracks' button at the top right of the Sequence Viewer display. Files uploaded or connected by this mechanism will also appear in the 'Your Data' widget.

Downloading Data

Phase 3 Dataset Download Widget

Figure 19. Phase 3 dataset Download widget.

Expanding the "Downloads" widget opens a new dialog box for downloads of alignment and genotype data (figure 19).

Phase 3 dataset

The Download widget is used to download SRA (alignment) and genotype data from the browser (figure 19). The SRA toolkit, accessed via this widget, is used for downloads of SRA (alignment) data for the displayed region. The Download widget also provides a link to documentation for installing and configuring the toolkit. The 'Show command link parameters' link in the Download widget provides you with the parameters needed for download via the SRA toolkit. Note: In order to download SRA data, you must have alignment data displayed from selected samples (in the 'Tracks in view' table in the Subjects Selection widget). The Download widget also provides a link to the SRA Run Selector to access all alignment data for selected samples (not restricted to the displayed region).

Using the Download widget, you can also download genotype data in VCF format for the displayed region by clicking the link in the 'Download genotype data' section. You can choose to download data containing either the individual genotypes or the population level aggregate statistics. By selecting the filter option, you can restrict the download to selected samples or populations with selected samples. Downloads are limited to the first 1 million positions for the selected range.

You can also download genotype data for a single position using the Genotypes table. The tool tips for the variants in the table include a link to "Download data for this position" (figure 15).

Controlling the view with URL Parameters

There are several parameters that can be added to the URL for the 1000 Genomes browser to control the view's position, add markers and specify SNP locations. For constructing URLs, the base URL is: https://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/variation/tools/1000genomes/?<List of parameters>.

Available Tracks

Several data tracks are available for display in the sequence viewer (in addition to the BAM files that can be added as described above).

For more information on rendering options for particular tracks and feature types, please see this website.

Last updated: 2018-04-26T20:47:33Z