
NA12878_prelim_2.1

Description:
MGI Reference Genomes Project NA12878 preliminary assembly version 2
Organism name:
Homo sapiens (human)
Isolate:
NA12878
Sex:
female
BioSample:
SAMN05181962
BioProject:
PRJNA323611
Submitter:
The Genome Institute at Washington University School of Medicine
Date:
2017/08/10
Assembly level:
Contig
Genome representation:
full
GenBank assembly accession:
GCA_002077035.2 (replaced)
RefSeq assembly accession:
n/a
RefSeq assembly and GenBank assembly identical:
n/a
WGS Project:
NBMU01
Assembly method:
Falcon v. November 2016
Expected final version:
no
Genome coverage:
75x
Sequencing technology:
PacBio RSII

IDs: 1167961 [UID] 4912718 [GenBank]

There are 1094 assemblies for this organism

Comment

This chromosome-level assembly of the NA12878 genome, NA12878_prelim_3.0, is a draft and represents a work in progress. It will subsequently be re-submitted with BACs incorporated into regions of the genome that are difficult to assemble. Also, single allelic representations ... of specific regions will be added when available.
Sequence Assembly Release Notes for Homo sapiens NA12878_prelim_3.0
 Background:
 DNA used for shotgun sequencing is derived from blood (B-lymphocyte cells) of an adult female, identified as NA12878 (Coriell Institute for Medical Research). The NA12878 genome is diploid and comes from CEPH/Utah pedigree 1463. Sequence from this project will be used to improve the contiguity of the human reference sequence and add diverse allelic variation.
 Total sequence (subreads) input coverage on the PacBio RS II instrument was 70x prior to error correction, using a genome size estimate of 3 Gb. The combined sequence reads were assembled using the Falcon software and then error corrected using the Quiver and Pilon algorithms. Contigs of 200 bp or less have been excluded from NA12878_prelim_3.0.
 This work was supported by the NHGRI 'Improving The Human Reference Genome Resource' grant no. 5U41HG007635 to Richard K. Wilson, at the McDonnell Genome Institute, Washington University School of Medicine.
 DNA Source Contact: Dr. Fedik Rahimov, at the Coriell Institute for Medical Research.

24 contigs have been split and replaced by 52 new contigs, so there are only 3663 live contigs.

Global statistics

Total sequence length: 2,858,850,980
Total ungapped length: 2,858,850,980
Number of contigs: 3,663
Contig N50: 14,520,880
Contig L50: 65
Total number of chromosomes and plasmids: 0
Number of component sequences (WGS or clone): 3,663
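
The Contig N50 and Contig L50 figures above follow the usual definitions: with contigs sorted from longest to shortest, N50 is the length of the contig at which the running total first reaches half of the total assembly length, and L50 is the number of contigs needed to get there. The sketch below illustrates that calculation in Python. The function name `n50_l50` and the toy lengths are hypothetical and chosen only for illustration; this is not NCBI's implementation, and the real contig lengths would have to be taken from the full sequence report.

```python
def n50_l50(lengths):
    """Return (N50, L50) for an iterable of contig lengths.

    Hypothetical helper for illustration only; not NCBI's code.
    """
    ordered = sorted(lengths, reverse=True)   # longest contig first
    half_total = sum(ordered) / 2             # half of the total assembly length
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50; 'count' contigs were needed, which is the L50
            return length, count
    raise ValueError("no contig lengths supplied")


# Toy example with made-up lengths (total 290, half 145):
# 80 + 70 = 150 reaches the halfway point at the second contig.
toy_lengths = [20, 80, 40, 70, 30, 50]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)
```

Read this way, the record's values say that the 65 longest contigs, each at least 14,520,880 bp long, together cover at least half of the 2,858,850,980 bp assembly.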

Global assembly definition

Assembly Unit Name
Primary Assembly
The primary assembly unit does not have any assembled chromosomes or linkage groups.
Please download the full sequence report for information on the scaffolds.

Assembly statistics

Molecule  Total Length   Contig Count  Ungapped Length  Contig N50  Spanned Gaps  Unspanned Gaps
unplaced  2,858,850,980  3,663         2,858,850,980    14,520,880  0             0