We see you! Viruses hiding in the Transcriptome Shotgun Assembly database

Read the full article on ScienceDirect, open access.

A +1 ribosomal frameshifting motif prevalent among plant amalgaviruses

Text by Max L. Nibert, Jesse D. Pyle and Andrew E. Firth

Only a few plant amalgaviruses, from 4 plant host species, have been sequenced and described in papers to date. We recently became interested in these nonsegmented, seemingly dsRNA viruses, which are commonly cryptic and persistent in their hosts, and decided to poke around in sequence databases to see whether others were hiding there, unrecognized by the investigators who had originally determined and deposited sequences from plant-derived samples. Happily, we found many such accessions in the Transcriptome Shotgun Assembly (TSA) database at GenBank, including 16 that appear to encompass the complete protein-coding sequences of novel plant amalgaviruses (see phylogram in figure), from 12 additional plant host species. We then used these and other sequences in a variety of comparisons. Some of these comparisons seem to corroborate a slippery sequence motif for +1 programmed ribosomal frameshifting, widely shared among plant amalgaviruses, which the viruses use to express their RNA-dependent RNA polymerase from the downstream open reading frame.

One problem we encountered was that several of the newly uncovered amalgavirus sequences appeared to be truncated at one or both termini, leaving their protein-coding sequences almost, but not quite, complete. What we then found, though, was that we could consult the short sequence reads from which the depositors had assembled the TSA accessions (available in the Sequence Read Archive, or SRA, data linked to each TSA accession). By mapping additional reads to the TSA accessions, we were able to extend them so that their protein-coding sequences now appear to be complete. Reassembling TSA accessions of interest from their associated SRA data is thus one of our recommendations for future such studies.
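To illustrate the idea of extending a truncated accession with overlapping reads, here is a minimal Python sketch. It is a toy, not the procedure used in the study: the function names (`revcomp`, `extend_right`, `extend_both`), the `min_overlap` threshold, and the requirement of exact overlaps with reads on the same strand as the contig are all assumptions made for illustration. Real work of this kind would rely on an established read mapper or assembler that handles sequencing errors, both strands, and read coverage.

```python
def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def extend_right(contig: str, reads: list[str], min_overlap: int = 8) -> str:
    """Greedily extend the 3' end of `contig`.

    A read qualifies when its prefix exactly matches the contig's suffix over
    at least `min_overlap` bases. Each read is used at most once, so the loop
    always terminates.
    """
    used: set[int] = set()
    while True:
        best_gain, best_idx = "", None
        for i, read in enumerate(reads):
            if i in used:
                continue
            # Try the longest possible overlap first; stop at the first match.
            max_k = min(len(contig), len(read) - 1)
            for k in range(max_k, min_overlap - 1, -1):
                if contig.endswith(read[:k]):
                    gain = read[k:]  # bases the read adds beyond the contig
                    if len(gain) > len(best_gain):
                        best_gain, best_idx = gain, i
                    break
        if best_idx is None:
            return contig
        contig += best_gain
        used.add(best_idx)


def extend_both(contig: str, reads: list[str], min_overlap: int = 8) -> str:
    """Extend the 3' end, then the 5' end by working on reverse complements."""
    contig = extend_right(contig, reads, min_overlap)
    rc_reads = [revcomp(r) for r in reads]
    return revcomp(extend_right(revcomp(contig), rc_reads, min_overlap))
```

Calling `extend_both(contig, reads)` with a truncated contig and the reads from the matching SRA data set returns the contig lengthened at whichever ends have overlapping reads. Exact matching keeps the sketch short; it would stall at the first sequencing error in a real data set.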

One surprise, in addition to finding so many new amalgavirus sequences in the first place, concerns the protein encoded by the upstream open reading frame of these viruses (ORF1p). Whether or not ORF1p forms an icosahedral capsid remains unclear, in part because plant amalgavirus virions have not been visualized to date. In this study, we were surprised to find that ORF1p, from the 22 plant amalgaviruses that we were now able to compare, is consistently predicted to possess a moderately long central region of α-helical coiled coil. Since such coiled-coil regions are not commonly found in icosahedral shell-forming proteins, this finding may weigh in favor of ORF1p having some other form and function, perhaps a filamentous nucleocapsid or a more amorphous matrix-like structure involved in amalgavirus replication and maintenance inside host cells.


Figure legend

See Fig. 3 in the full article for additional details. This excerpt shows our newly reported plant amalgaviruses (gray) phylogenetically clustering with the 4 previously characterized plant amalgaviruses (black) in genus Amalgavirus. Putative fungal amalgavirus ZbV-Z, in proposed genus Zybavirus, is also shown. ZbV-Z shares the slippery sequence motif for +1 ribosomal frameshifting as well as a predicted central region of coiled coil in ORF1p.

Introducing the authors


Left panel: Max Nibert (left) is a Professor of Microbiology and Immunobiology, Harvard Medical School (Boston, MA, USA). Jesse Pyle (right) is a student in the Virology Ph.D. Program at Harvard and contributed to this work while doing a first-year rotation in the Nibert lab. Right panel: Andrew Firth is a Wellcome Trust Senior Research Fellow in the Division of Virology, Department of Pathology, University of Cambridge (Cambridgeshire, UK).

About the research

A +1 ribosomal frameshifting motif prevalent among plant amalgaviruses

Max L. Nibert, Jesse D. Pyle, and Andrew E. Firth

Virology, Volume 498, November 2016, Pages 201-208, open access