Transposable element annotation in non-model species: The benefits of species-specific repeat libraries using semi-automated EDTA and DeepTE de novo pipelines

Mol Ecol Resour. 2022 Feb;22(2):823-833. doi: 10.1111/1755-0998.13489. Epub 2021 Sep 2.

Abstract

Transposable elements (TEs) are significant genomic components that can be detected either through sequence homology against existing databases or de novo, with the latter potentially reducing the risk of underestimating TE abundance. Here, we describe the semi-automated generation of a de novo TE library using the newly developed EDTA pipeline and DeepTE classifier in a non-model teleost (Corydoras fulleri). Using both genomic and transcriptomic data, we assess this de novo pipeline's performance across four TE-based metrics: (i) abundance, (ii) composition, (iii) fragmentation, and (iv) age distribution. We then compare the results to those obtained with a curated teleost library (Danio rerio). We identify quantitative differences in these metrics and highlight how TE library choice can have major impacts on TE-based estimates in non-model species.
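
To make the first three metrics concrete, the following is a minimal sketch of how abundance, composition, and a simple fragmentation proxy might be computed from a set of TE annotations. It is not the authors' code: the function names (`merge_intervals`, `te_summary`), the input layout (tuples of sequence ID, half-open start and end coordinates, and TE class), and the use of mean annotated fragment length as a fragmentation proxy are all assumptions made for illustration. Age distribution is not covered, since it depends on divergence estimates that are not described in the abstract.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_bases(per_seq):
    """Count bases covered at least once, summed over all sequences."""
    return sum(
        end - start
        for intervals in per_seq.values()
        for start, end in merge_intervals(intervals)
    )


def te_summary(annotations, genome_size):
    """Summarise TE annotations (hypothetical input layout).

    annotations: iterable of (seq_id, start, end, te_class) tuples with
    half-open coordinates. genome_size: total assembly length in bases.
    """
    all_tes = defaultdict(list)
    by_class = defaultdict(lambda: defaultdict(list))
    raw_length = defaultdict(int)
    copies = defaultdict(int)

    for seq_id, start, end, te_class in annotations:
        all_tes[seq_id].append((start, end))
        by_class[te_class][seq_id].append((start, end))
        raw_length[te_class] += end - start
        copies[te_class] += 1

    return {
        # Abundance: fraction of the genome covered by any TE annotation.
        "abundance": covered_bases(all_tes) / genome_size,
        # Composition: fraction of the genome covered by each TE class.
        "composition": {
            cls: covered_bases(per_seq) / genome_size
            for cls, per_seq in by_class.items()
        },
        # Fragmentation proxy (assumption): mean annotated fragment length.
        "mean_fragment_length": {
            cls: raw_length[cls] / copies[cls] for cls in copies
        },
    }


# Toy annotations, invented purely for illustration.
annotations = [
    ("chr1", 0, 100, "DNA"),
    ("chr1", 50, 150, "LTR"),
    ("chr1", 300, 400, "DNA"),
    ("chr2", 0, 50, "LINE"),
]
summary = te_summary(annotations, genome_size=1000)
print(summary)
```

Running the same summary on annotations produced with two different repeat libraries would give a side-by-side view of the kind of quantitative differences the abstract refers to. Overlapping annotations are merged before counting covered bases so that nested or overlapping elements are not counted twice in the abundance figure.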

Keywords: Corydoras; de novo; genomics; teleost; transcriptomics; transposon annotation.

MeSH terms

  • DNA Transposable Elements*
  • Edetic Acid
  • Genomics*
  • Sequence Homology

Substances

  • DNA Transposable Elements
  • Edetic Acid