Using the Text View to Review and Edit a Submission

Genome Workbench now includes the Sequence Editing Package, which allows users to prepare data for submission to GenBank and to modify their data for their own purposes.

NCBI stores sequence data in the ASN.1 format, which is effective for parsing with computer programs but difficult for humans to read. Genome Workbench makes it possible for users to create files in ASN.1 and edit content that has been stored in this format without knowledge of the ASN.1 format.

Text View 1

With the Text View, Genome Workbench provides an interactive view of the GenBank Flat File representation, which most users are already familiar with or can learn to read without investing much time.

Text View 2

The Text View can be configured by clicking on the icon to the right of the binoculars. The user can select the Format (Flat File, FastA, or ASN.1). When viewing the Flat File, the user has several additional options. The mode can be either “Editing” or “Public Preview”. The “Public Preview” mode shows the file as it would appear on the NCBI website, while the “Editing” mode lists separately some fields that would be condensed in the public version, to make it easier to see where changes should be made.

The Text View allows the user to expand and contract sections of the Flat File to make it easier to skip past items in a large file. The “Open Expanded” checkbox controls whether all items are expanded by default when a Text View is first opened in Flat File format. This option is enabled by default.

The user can choose to hide Variation features and Sequence-Tagged Site (STS) features. In records retrieved from the NCBI database, these features may be numerous and may slow down the rendering of the Flat File.

When viewing a sequence that is an assembly of other sequences, the user has the option to show either the nucleotide content of the sequence or the list of contigs that form the scaffold of the assembly. This option is controlled by the “Show sequence instead of scaffold instructions” checkbox.

When viewing such a sequence, the user also has the option to see only the features that have been annotated directly on the assembled sequence, or also features that have been annotated on the individual contigs. This option is controlled by the “Show component features” option.

Text View 3

When the Text View is showing the Flat File format and the Sequence Editing Package is enabled, the user can edit the data represented by the Flat File. The pen icon in the left margin launches a dialog for editing the data that generates that portion of the Flat File. Note that some text cannot be edited directly because it is calculated from the content of the sequence. For example, the “BASE COUNT” line shows the number of A, T, G, C, and ambiguity characters present and cannot be edited directly. The user can, however, edit the sequence itself by double-clicking on the lines of sequence after “ORIGIN”, which changes the calculated values.

The X icon in the left margin deletes the data that generates that portion of the Flat File. Note that some portions of the Flat File have default values that are shown when data is not present, so deleting the item will not necessarily remove the section from the Flat File completely.
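To illustrate why a line such as “BASE COUNT” is derived rather than editable, the following minimal Python sketch recomputes the counts from the sequence text. It is not Genome Workbench code: the function name `base_count` and the `others` key for ambiguity characters are assumptions made for this example only.

```python
from collections import Counter


def base_count(sequence: str) -> dict[str, int]:
    """Count A, C, G and T, and group all other letters as 'others'.

    Hypothetical helper for illustration; not part of Genome Workbench.
    """
    # Normalize case so that 'a' and 'A' are counted together.
    counts = Counter(sequence.upper())
    result = {base: counts.get(base, 0) for base in "ACGT"}
    # Ambiguity codes (N, R, Y, ...) are letters other than A, C, G, T.
    # Digits and whitespace, such as position numbers, are ignored.
    result["others"] = sum(
        n for ch, n in counts.items() if ch.isalpha() and ch not in "ACGT"
    )
    return result


if __name__ == "__main__":
    print(base_count("ACGTNNRYacgt"))
```

Because the counts are a pure function of the sequence, the only way to change them is to change the sequence, which is what the editing behavior described above reflects.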

By default, the Flat File shows all the nucleotide sequences, but the user can also choose to view the Flat File for a specific sequence or for all nucleotide and protein sequences by using the Sequence(s) control next to the configuration icon. Users can also select a sequence to view using the Submission->Tools->Select Specific Sequence by Sequence ID menu item. When using Submission menu items that act on a specific sequence, the action will apply to either the single sequence being viewed or the only nucleotide sequence in the record.

For more information, please see the full documentation for the NCBI Genome Workbench Editing Package.

Current Version is 3.6.0 (released March 04, 2021)

Last updated: 2019-06-14T15:43:03Z