Viral sequences in human cancer

From contamination to discovery: the cancer virome of 3,000 TCGA participants

Next-generation sequencing (NGS) databases are a rich source for virus discovery. We developed viral detection software, Pickaxe, to enable this discovery. We surveyed >3,000 patients from 22 different cancers for the presence of viruses using RNA-seq, exome and whole genome datasets generated by The Cancer Genome Atlas (TCGA). After intensive efforts to remove artifactual detections arising from physical contamination or computational artifacts, we identified 34 different viruses in 12 cancers (Figure 1). As expected, we found human papillomavirus (HPV) in cervical cancers, and hepatitis B and C viruses (HBV/HCV) in liver cancers. We also found HPV and HPV integration events in bladder cancer, and we show that viral oncogenes were expressed in these tumors. Taken together with an analysis of the mutational profiles of bladder cancer, this suggests that HPV may drive a small subset of bladder cancers. In addition, we detected several herpesviruses, primarily throughout the gastrointestinal tract.

This study grew out of our earlier work determining the virome of raw sewage (1). Since our lab is interested in the molecular mechanisms of cancer and how viruses impact host systems, we naturally asked: what is the virome of human cancer and how does it impact human biology? What new viral associations in cancer await discovery, and what novel viruses are present? At that time, the TCGA project had accumulated rich genomic data on thousands of tumor samples. It provided the next-generation sequencing data that we needed to answer our questions. Wonderful, we thought; this should be easy.

But then we started finding viruses in almost every tumor sample that we analyzed. Why were HCMV, HSV1 and SV40 apparently in the same tumor (see Figure S2)?! After countless hours of manual examination of the data, we realized that most of these detections were artifacts, either computational or physical. We showed that the HPV18 detected in non-cervical cancers was not genuinely present in the tumors but instead arose from HeLa cell nucleic acid contamination (2). Leaving the detritus of artifacts behind us, we proceeded to assess the impact of the cancer virome on tumor biology.

Tumor viruses such as HPV and BKV express oncogenes that block the activity of the tumor suppressors pRb and p53. We hypothesized that, in the presence of viral oncogenes, the genes encoding these tumor suppressors (RB1 and TP53) would not be under selective pressure to mutate. Focusing on a small subset of bladder tumors harboring HPV or BKV and using TCGA mutational data, we determined that RB1 and TP53 were not mutated in virus-positive bladder tumors. This novel finding supports the notion that HPV and BKV are drivers of a small subset of bladder cancers. Another novel and intriguing finding was the number of cancers that harbored various herpesviruses. However, because the herpesviruses were present at relatively low abundance in these tumors, we suspect that these viruses are passengers, not drivers, of these tumors.
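
For readers who want to try this kind of reasoning on their own data, one simple way to ask whether virus-positive tumors carry fewer RB1/TP53 mutations than expected by chance is a one-sided Fisher's exact test on a 2x2 table. The sketch below is only an illustration and is not the analysis performed in the paper: the function name and the counts in the usage line are hypothetical placeholders, not TCGA results.

```python
from math import comb


def fisher_exact_less(a, b, c, d):
    """One-sided Fisher's exact test for a 2x2 table.

    Table layout (hypothetical counts supplied by the caller):
                   mutated   wild-type
        virus+        a          b
        virus-        c          d

    Returns P(X <= a) under the hypergeometric null, i.e. the chance of
    seeing this few (or fewer) mutated tumors among the virus-positive ones
    if mutation status were independent of viral status.
    """
    row1 = a + b          # number of virus-positive tumors
    col1 = a + c          # number of mutated tumors
    n = a + b + c + d     # all tumors
    lowest = max(0, row1 + col1 - n)  # smallest feasible top-left count
    total = comb(n, row1)
    favorable = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(lowest, a + 1)
    )
    return favorable / total


# Placeholder counts for illustration only (not data from the study).
p_value = fisher_exact_less(0, 5, 5, 5)
print(f"one-sided p = {p_value:.4f}")
```

With only a handful of virus-positive tumors, a test like this has little power, so a small p-value is suggestive rather than conclusive.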

With our experience of studying the virome of cancer, we are continually honing our Pickaxe to mine for novel viruses in various NGS databases and biomes around the world. Stay tuned!


Figure 1. Viruses detected in human cancer. Thirty-four viruses from 5 virus families were detected across 707 TCGA samples (tumor, orange; normal, purple) in 12 cancers (alternating grey bar at the top). The number of alignments to each virus is indicated by intensity of red color. BLCA, bladder; CESC, cervical; COAD, colon; GBM, brain (multi.); HNSC, head & neck; KICH, kidney (chrom.); LGG, brain (lower); LIHC, liver; PAAD, pancreas; READ, rectum; SKCM, skin; STAD, stomach.


  1. Cantalupo, P. G., B. Calgua, G. Zhao, A. Hundesa, A. D. Wier, J. P. Katz, M. Grabe, R. W. Hendrix, R. Girones, D. Wang, and J. M. Pipas. 2011. Raw sewage harbors diverse viral populations. mBio 2.
  2. Cantalupo, P. G., J. P. Katz, and J. M. Pipas. 2015. HeLa nucleic acid contamination in The Cancer Genome Atlas leads to the misidentification of human papillomavirus 18. Journal of Virology 89:4051-4057.

Introducing the authors

James M. Pipas, Department of Biological Sciences, University of Pittsburgh (right) and Paul G. Cantalupo, Department of Biological Sciences, University of Pittsburgh (left)

About the research

Viral sequences in human cancer
Paul G. Cantalupo, Joshua P. Katz, James M. Pipas
Virology, Volume 513, 1 January 2018, Pages 208-216