TSA Frequently Asked Questions

Can I submit single-pass reads to TSA?

Single-pass reads may be submitted as part of a TSA project, provided they are in the minority and add information to the transcriptome project.

Can I submit assemblies with internal Ns?

TSA does not accept assemblies in which Ns have been inserted to represent gaps of unknown length. Sequences containing such Ns need to be split into individual assemblies. Internal Ns representing ambiguous bases or gaps of known length can be submitted. If the Ns represent ambiguous bases, they should not make up more than 10% of the sequence length, and there should be no more than 14 Ns in a row. If the Ns represent a gap of known length, an assembly_gap feature must be used.
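
The two limits on ambiguous-base Ns can be checked before submission. The following is a minimal Python sketch, not an official NCBI tool: the function name and constants are hypothetical, and it assumes every N in the sequence is an ambiguous base, since gaps cannot be told apart from ambiguous bases by looking at the sequence alone.

```python
import re

# Limits stated in the answer above; the names are illustrative only.
MAX_N_FRACTION = 0.10  # Ns should not exceed 10% of the sequence length
MAX_N_RUN = 14         # no more than 14 Ns in a row


def check_ambiguous_ns(seq):
    """Return a list of problems; an empty list means both limits are met.

    Assumes all Ns are ambiguous bases (not gaps).
    """
    seq = seq.upper()
    if not seq:
        return ["empty sequence"]

    problems = []
    n_count = seq.count("N")
    # Length of the longest uninterrupted run of Ns (0 if there are none).
    longest_run = max((len(m.group()) for m in re.finditer(r"N+", seq)), default=0)

    if n_count > MAX_N_FRACTION * len(seq):
        problems.append(f"{n_count} Ns in {len(seq)} bases exceeds 10% of the length")
    if longest_run > MAX_N_RUN:
        problems.append(f"run of {longest_run} Ns exceeds {MAX_N_RUN} in a row")
    return problems
```

A sequence that returns a non-empty list would need to be trimmed, split, or annotated with an assembly_gap feature, depending on what its Ns actually represent.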

Is annotation recommended?

Annotation is not recommended for a non-targeted study. Annotation is only required if you are submitting a small, targeted subset of transcriptome data. If annotation is included, the product names should follow the International Protein Nomenclature Guidelines. All annotation must be biologically valid.

Can I submit an assembly of EST/SRA data generated by another group?

No. All submitted assemblies must be derived from primary data generated by the same group.

Where should clonally derived sequences be submitted?

These sequences should be submitted to GenBank. Only computationally assembled sequences should be submitted to TSA.

Should TSA submissions be submitted directly to GenBank via email?

All TSA submissions need to be submitted using the TSA Submission Portal. Sequences submitted via email or SequinMacroSend will not be accepted.

Can a moltype other than transcribed RNA be used for TSA submissions?

Yes. If a targeted data set is being submitted in which the focus was isolating a specific other RNA molecule type, that molecule type should be used, for example noncoding RNA.

Can I submit different assemblies as one submission through the Submission Portal?

No. Each submission in the portal should represent a single assembly from the same organism and should have the following information in common:

  • Assembly data structured comment
  • SRA run accessions (SRRXXXXXY)
  • BioProject accession
  • BioSample accession

Are TSA sequences available by a BLAST search?

TSA sequences can be retrieved using the Transcriptome Shotgun Assembly (TSA) BLAST. The TSA database is available from the BLAST home page under Basic BLAST at the nucleotide, tblastn, and tblastx links. TSA sequences are not included in the nt database.

Can I run VecScreen before submitting?

The TSA submission portal will automatically run VecScreen on your submission. If you would like to screen your sequences prior to submitting then please review the UniVec instructions.

Use the following command line:

blastn -task blastn -reward 1 -penalty -5 -gapopen 3 -gapextend 3 -dust yes -soft_masking true \
   -evalue 700 -searchsp 1750000000000 -db UniVec -query sequence.fa -out vs.test.out
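
When many files need screening, the same command can be assembled programmatically. The following Python sketch builds the argument list with exactly the options shown above; the function names are hypothetical, and it assumes blastn is on the PATH and that a local BLAST database named UniVec has been set up as described in the UniVec instructions.

```python
import subprocess


def univec_blastn_args(query, out, db="UniVec"):
    """Build the blastn argument list with the options given above."""
    return [
        "blastn", "-task", "blastn",
        "-reward", "1", "-penalty", "-5",
        "-gapopen", "3", "-gapextend", "3",
        "-dust", "yes", "-soft_masking", "true",
        "-evalue", "700", "-searchsp", "1750000000000",
        "-db", db, "-query", query, "-out", out,
    ]


def run_univec_screen(query, out, db="UniVec"):
    """Run the screen; raises CalledProcessError if blastn exits non-zero."""
    # Assumption: blastn is installed and the UniVec database is available locally.
    subprocess.run(univec_blastn_args(query, out, db), check=True)
```

Calling run_univec_screen("sequence.fa", "vs.test.out") is then equivalent to running the command line above.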

Last updated: 2019-02-05T13:10:48Z