Transcriptome analysis of the mud crab (Scylla paramamosain) by 454 deep sequencing: assembly, annotation, and marker discovery

PLoS One. 2014 Jul 23;9(7):e102668. doi: 10.1371/journal.pone.0102668. eCollection 2014.

Abstract

In this study, we report the first transcriptome characterization of the mud crab (Scylla paramamosain). Pooled cDNAs of four tissue types from twelve wild individuals were sequenced using the Roche 454 FLX platform. The analysis included de novo assembly of transcriptome sequences, functional annotation, and molecular marker discovery. A total of 1,314,101 high-quality reads with an average length of 411 bp were generated by 454 sequencing of a mixed cDNA library. De novo assembly of these reads produced 76,778 contigs (consisting of 818,154 reads) with 5.4-fold average sequencing coverage. The remaining 495,947 reads were singletons. A total of 78,268 unigenes were identified based on sequence similarity with known proteins (E≤0.00001) in the UniProt and non-redundant protein databases. In addition, 44,433 sequences were identified (E≤0.00001) using a BLASTN search against the NCBI nucleotide database. Gene Ontology (GO) analysis indicated that biosynthetic process, cell part, and ion binding were the most abundant terms in the biological process, cellular component, and molecular function categories, respectively. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that 4,878 unigenes were distributed across 281 different pathways. Furthermore, 19,011 microsatellites and 37,063 potential single nucleotide polymorphisms were detected in the transcriptome of S. paramamosain. Finally, thirty polymorphic microsatellite markers were developed and used to assess the genetic diversity of a wild population of S. paramamosain. Existing sequence resources for S. paramamosain have so far been extremely limited. The present study provides a characterization of the transcriptome from multiple tissues and individuals, as well as an assessment of the genetic diversity of a wild population. These sequence resources will facilitate the investigation of population genetic diversity, the development of genetic maps, and molecular marker-assisted breeding in S. paramamosain and related crab species.
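
To illustrate the kind of microsatellite screening mentioned above, the following is a minimal sketch of a perfect tandem-repeat scan over a single sequence. It is not the tool or the thresholds used in the study, which the abstract does not state. The function name `find_microsatellites` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen only for illustration.

```python
import re

# Assumed thresholds (motif length -> minimum number of tandem copies).
# These are illustrative values, not the criteria used in the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_microsatellites(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of fixed length followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves built from a shorter unit
            # (e.g. "ACAC" is really the dinucleotide "AC").
            is_compound = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if is_compound:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


example = "TTG" + "AC" * 8 + "GGT"
print(find_microsatellites(example))
```

In practice such a scan would be run over every contig and singleton, and candidate loci would still need primer design and genotyping before being treated as polymorphic markers.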

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Arthropod Proteins / genetics*
  • Brachyura / genetics*
  • Gene Expression Profiling*
  • Gene Library
  • Gene Ontology
  • High-Throughput Nucleotide Sequencing / methods*
  • Microsatellite Repeats / genetics
  • Molecular Sequence Annotation
  • Polymorphism, Single Nucleotide
  • Signal Transduction / genetics

Substances

  • Arthropod Proteins

Grants and funding

This study was supported by the National Natural Science Foundation of China (Grant No. 31001106), the Natural Science Foundation of Shanghai (Grant No. 14ZR1449700), the Science and Technology Commission of Shanghai Municipality (Grant No. 10JC1418600), and the National Non-Profit Institutes (East China Sea Fisheries Research Institute, Grant No. 2011M05). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.