mArvAmp1.2

Organism name: Arvicola amphibius (Eurasian water vole)
BioSample: SAMEA994740
BioProject: PRJEB39550
Submitter: Wellcome Sanger Institute
Date: 2021/05/15
Assembly level: Chromosome
Genome representation: full
RefSeq category: representative genome
GenBank assembly accession: GCA_903992535.2 (latest)
RefSeq assembly accession: GCF_903992535.2 (latest)
RefSeq assembly and GenBank assembly identical: yes
WGS Project: CAJEUG02
Genome coverage: 45x

IDs: 10049691 [UID] 26733828 [GenBank] 27360728 [RefSeq]

There are 2 assemblies for this organism

Comment

The assembly mArvAmp1.2 is based on 45x PacBio data, 52x 10X Genomics Chromium data, BioNano data and 6x Dovetail Hi-C data generated at the Wellcome Sanger Institute. The assembly process included the following sequence of steps: initial PacBio assembly ... generation with Falcon-unzip, retained haplotig separation with purge_dups, 10X based scaffolding with scaff10x, BioNano hybrid-scaffolding with Solve, Hi-C based scaffolding with SALSA2, Arrow polishing, and two rounds of FreeBayes polishing. Finally, the assembly was analysed and manually improved using gEVAL. Chromosome-scale scaffolds confirmed by the Hi-C data have been named in order of size.

Global statistics

Total sequence length: 2,297,766,297
Total ungapped length: 2,291,739,355
Gaps between scaffolds: 0
Number of scaffolds: 215
Scaffold N50: 158,924,400
Scaffold L50: 7
Number of contigs: 1,085
Contig N50: 5,392,280
Contig L50: 127
Total number of chromosomes and plasmids: 18
Number of component sequences (WGS or clone): 215
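
Two derived figures follow directly from the global statistics: the number of gap bases (total length minus ungapped length) and the number of gaps inside scaffolds (contigs minus scaffolds). The sketch below is only an illustration of that arithmetic. The function names are hypothetical, and the second calculation assumes every scaffold is a simple chain of contigs, so that a scaffold built from k contigs holds k - 1 gaps.

```python
# Values copied from the Global statistics listing for mArvAmp1.2.
TOTAL_LENGTH = 2_297_766_297
UNGAPPED_LENGTH = 2_291_739_355
NUM_SCAFFOLDS = 215
NUM_CONTIGS = 1_085


def gap_bases(total_length: int, ungapped_length: int) -> int:
    """Bases that belong to gaps: total length minus ungapped length."""
    return total_length - ungapped_length


def gaps_within_scaffolds(num_contigs: int, num_scaffolds: int) -> int:
    """Gaps inside scaffolds, assuming each scaffold is a chain of contigs.

    A scaffold made of k contigs contains k - 1 gaps, so summing over all
    scaffolds gives (number of contigs) - (number of scaffolds).
    """
    return num_contigs - num_scaffolds


gapped = gap_bases(TOTAL_LENGTH, UNGAPPED_LENGTH)
print(f"Gap bases: {gapped:,} ({gapped / TOTAL_LENGTH:.3%} of the assembly)")
print(f"Gaps within scaffolds: {gaps_within_scaffolds(NUM_CONTIGS, NUM_SCAFFOLDS)}")
```

Under that assumption the gap count comes out at 870, and the gap bases at 6,026,942.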

Global assembly definition

Assembly Unit Name: Primary Assembly
Assembly Unit: Primary Assembly (GCF_903992534.2)

Molecule name    GenBank sequence         RefSeq sequence    Unlocalized sequences count
Chromosome 1     LR862380.1          =    NC_052047.1                                  0
Chromosome 2     LR862381.2          =    NC_052048.2                                  0
Chromosome 3     LR862382.1          =    NC_052049.1                                  0
Chromosome 4     LR862383.1          =    NC_052050.1                                  0
Chromosome 5     LR862384.1          =    NC_052051.1                                  0
Chromosome 6     LR862385.2          =    NC_052052.2                                  0
Chromosome 7     LR862386.1          =    NC_052053.1                                  0
Chromosome 8     LR862388.1          =    NC_052054.1                                  0
Chromosome 9     LR862389.2          =    NC_052055.2                                  0
Chromosome 10    LR862390.1          =    NC_052056.1                                  0
Chromosome 11    LR862391.2          =    NC_052057.2                                  0
Chromosome 12    LR862392.2          =    NC_052058.2                                  0
Chromosome 13    LR862393.1          =    NC_052059.1                                  0
Chromosome 14    LR862394.1          =    NC_052060.1                                  0
Chromosome 15    LR862395.1          =    NC_052061.1                                  0
Chromosome 17    LR862397.2          =    NC_052063.2                                  0
Chromosome 18    LR862398.1          =    NC_052064.1                                  0
Chromosome X     LR862387.1          =    NC_052065.1                                  0
unplaced         n/a                 n/a  n/a                                        197

Assembly statistics

Molecule         Total Length   Scaffold Count   Ungapped Length   Scaffold N50   Spanned Gaps   Unspanned Gaps
All             2,297,766,297              215     2,291,739,355    158,924,400            870                0
Chromosome 1      200,529,155                1       199,847,721    200,529,155             77                0
Chromosome 2      193,958,709                1       193,418,769    193,958,709             75                0
Chromosome 3      189,600,082                1       189,370,107    189,600,082             66                0
Chromosome 4      161,325,571                1       161,127,733    161,325,571             63                0
Chromosome 5      160,717,644                1       160,020,179    160,717,644             59                0
Chromosome 6      158,924,400                1       158,802,281    158,924,400             35                0
Chromosome 7      138,658,583                1       137,992,532    138,658,583             44                0
Chromosome 8      131,410,784                1       131,231,252    131,410,784             41                0
Chromosome 9      125,825,491                1       125,739,044    125,825,491             34                0
Chromosome 10     125,086,424                1       124,185,360    125,086,424             46                0
Chromosome 11     123,987,292                1       123,891,611    123,987,292             36                0
Chromosome 12     166,747,489                1       165,970,833    166,747,489             65                0
Chromosome 13      75,712,621                1        75,632,373     75,712,621             31                0
Chromosome 14      63,161,238                1        62,924,584     63,161,238             22                0
Chromosome 15      55,452,225                1        55,449,836     55,452,225             22                0
Chromosome 17      42,645,200                1        42,520,623     42,645,200             34                0
Chromosome 18      33,208,378                1        33,206,494     33,208,378             17                0
Chromosome X      137,701,950                1       137,484,027    137,701,950             57                0
unplaced           13,113,061              197        12,923,996         96,741             46                0
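
The scaffold N50 and L50 reported for the whole assembly can be reproduced from the per-molecule lengths in the assembly statistics table. The sketch below is a minimal illustration and not NCBI's own code; `n50_l50` and the constant names are hypothetical. It uses only the 18 chromosome-scale scaffolds. That shortcut is valid here because the 197 unplaced scaffolds total just 13,113,061 bases, so none of them can be longer than the scaffold at which the running sum passes half the assembly length.

```python
# Total lengths copied from the assembly statistics table (one scaffold each).
CHROMOSOME_LENGTHS = {
    "1": 200_529_155, "2": 193_958_709, "3": 189_600_082,
    "4": 161_325_571, "5": 160_717_644, "6": 158_924_400,
    "7": 138_658_583, "8": 131_410_784, "9": 125_825_491,
    "10": 125_086_424, "11": 123_987_292, "12": 166_747_489,
    "13": 75_712_621, "14": 63_161_238, "15": 55_452_225,
    "17": 42_645_200, "18": 33_208_378, "X": 137_701_950,
}
UNPLACED_TOTAL = 13_113_061   # combined length of the 197 unplaced scaffolds
TOTAL_LENGTH = 2_297_766_297  # "All" row of the table


def n50_l50(lengths, total_length):
    """Return (N50, L50) for scaffold lengths against a known total length.

    Scaffolds are visited from longest to shortest. N50 is the length of the
    scaffold at which the running sum first reaches half of total_length, and
    L50 is how many scaffolds were needed to get there. If `lengths` omits
    some scaffolds, the result is only correct when every omitted scaffold is
    shorter than the returned N50.
    """
    half = total_length / 2
    running = 0
    for rank, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= half:
            return length, rank
    raise ValueError("listed lengths cover less than half of the total length")


# The per-molecule lengths should add up to the assembly total.
assert sum(CHROMOSOME_LENGTHS.values()) + UNPLACED_TOTAL == TOTAL_LENGTH

n50, l50 = n50_l50(CHROMOSOME_LENGTHS.values(), TOTAL_LENGTH)
print(f"Scaffold N50: {n50:,}  Scaffold L50: {l50}")
```

With these inputs the result is an N50 of 158,924,400 (chromosome 6) and an L50 of 7, matching the values in the record.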