Allowing phage synteny browsing and protein function prediction
Text by Marie-Agnès Petit
The recent renewed interest in phages, and in viruses in general, has led to the sequencing and tentative annotation of hundreds of new phage genomes. However, this huge wave of new information has also given rise to the phage “dark matter” concept, which refers to the fact that the vast majority of phage genes resist annotation. Our earlier work, as well as the work of many others, had nevertheless suggested that part of the problem lay in homology search tools that are ill-suited to phages. Because phage proteins are remarkably divergent, simple BLAST searches are often unproductive. This paper describes the setting up of “Phagonaute”, a web interface for navigating among complete phage genomes that takes this difficulty into account. Its purpose is to allow “module” comparisons across genomes, based on distant protein homologies. Within a window of 6-12 genes around a specific query gene, all homology relationships with related phages are displayed graphically, using a color code. Synteny conservation serves to strengthen the new function predictions suggested by distant homology. The tool is therefore designed to help experimentalists pick the right gene for the right experiment.
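To make the idea of a gene window concrete, here is a minimal Python sketch. It is not Phagonaute's code: the function names, the gene identifiers, the family labels and the choice to split the window evenly on both sides of the query are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def gene_window(genes: List[str], query: str, size: int = 6) -> List[str]:
    """Return the query gene plus up to size // 2 neighbours on each side.

    `genes` is the ordered gene list of one genome (hypothetical identifiers).
    The window is truncated at the ends of the genome.
    """
    half = size // 2
    i = genes.index(query)
    return genes[max(0, i - half): i + half + 1]


def shared_families(window_a: List[str], window_b: List[str],
                    family: Dict[str, str]) -> Set[str]:
    """Homology families present in both windows.

    `family` maps a gene to a homology family label (assumed to come from
    some distant-homology search; genes without a label are ignored).
    """
    fam_a = {family[g] for g in window_a if g in family}
    fam_b = {family[g] for g in window_b if g in family}
    return fam_a & fam_b


# Hypothetical genome of nine genes, queried around its fifth gene.
phage_a = ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9"]
window = gene_window(phage_a, "a5", size=6)
```

The more families two windows share, the stronger the case that the genes belong to the same conserved module, which is the sense in which synteny supports a prediction based on distant homology.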
The motivation to build this site came from the fact that such tools already exist for the bacterial and eukaryotic worlds, where they greatly help experimental research. One of the first websites serving this purpose was created by Ross Overbeek a long time ago and still serves today: it is the ‘show neighborhood’ function on the Integrated Microbial Genomes (IMG) site of the Joint Genome Institute. Genomicus is its counterpart (with a different design) for eukaryotes.
Before publishing this work, we used the site for our own research, mainly interrogating genes with functions related to homologous recombination, and were surprised by the number of fruitful guesses it permitted (these are given in the paper as examples of use).
The main problem we had to overcome was the treatment of protein fusions, which could easily muddle the results. Suppose a query starts with a protein resulting from the fusion of an exonuclease with a recombinase: the search will indiscriminately display as ‘homologs’ many exonucleases and recombinases, even though these are distinct proteins. We solved this by splitting genes into ‘domains’ (which are not functional domains, but domains of homology). This gives the graphical output an additional level of refinement: the mapping of the homology region.
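The fusion problem can be illustrated with a small Python sketch. It uses plain interval merging as a stand-in, which is not necessarily how Phagonaute defines its homology domains. The hit names and coordinates are invented, and the functions `homology_domains` and `assign_hits` are hypothetical helpers.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end) positions on the query protein


def homology_domains(hits: List[Interval]) -> List[Interval]:
    """Merge overlapping hit intervals into 'domains of homology'."""
    domains: List[Interval] = []
    for start, end in sorted(hits):
        if domains and start <= domains[-1][1]:
            # Overlaps the current domain: extend it if needed.
            last_start, last_end = domains[-1]
            domains[-1] = (last_start, max(last_end, end))
        else:
            # No overlap: a new domain begins here.
            domains.append((start, end))
    return domains


def assign_hits(hits: Dict[str, Interval],
                domains: List[Interval]) -> Dict[Interval, List[str]]:
    """Group each hit under the domain that fully contains it."""
    grouped: Dict[Interval, List[str]] = {d: [] for d in domains}
    for name, (start, end) in hits.items():
        for dom_start, dom_end in domains:
            if dom_start <= start and end <= dom_end:
                grouped[(dom_start, dom_end)].append(name)
    return grouped


# Invented hits on a hypothetical exonuclease-recombinase fusion protein.
hits = {
    "exo_A": (1, 200),
    "exo_B": (10, 190),
    "rec_A": (230, 450),
    "rec_B": (220, 440),
}
domains = homology_domains(list(hits.values()))
grouped = assign_hits(hits, domains)
```

In this toy case the exonuclease and recombinase hits fall into two separate domains, so they are no longer reported as homologs of one another. A hit spanning the whole fusion would merge the two domains under this simple scheme, so a real implementation would need a more careful rule.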
We hope this new phage tool will lead to many exciting discoveries and will help, with time, to shrink the phage dark matter.
About the research
Phagonaute: A web-based interface for phage synteny browsing and protein function prediction
Virology, Volume 496, September 2016, Pages 42–50
Hadrien Delattre, Oussema Souiai, Khema Fagoonee, Raphaël Guerois, Marie-Agnès Petit