
Developing Technologies Support Proteogenomics
May 2018 - Vol. 7 No. 4 - Page #2

Proteogenomics at its essence is the application of biology’s central dogma to clinical diagnostics. In the classical view of that dogma, the genetic information encoded in DNA is transcribed into messenger RNA (mRNA), which contains all the information necessary for cells to translate proteins. In other words, proteogenomics combines genomic data from DNA and/or RNA, obtained by next-generation sequencing (NGS) or other molecular biology techniques, with proteomic data from mass spectrometry (MS) or other biochemical techniques (see FIGURE 1).

There is little doubt that many clinical laboratories will seek to incorporate proteogenomic techniques by following the large, research-based, proof-of-concept proteogenomic results.1 The drive to provide such results is ultimately fueled by efforts to provide better patient care, as was demonstrated by the 2017 Memorandum of Understanding (MoU) between the National Cancer Institute (NCI) and the US Food and Drug Administration (FDA) in partnership with the Department of Health and Human Services (DHHS). This MoU formalized the agencies’ agreement to apply their expertise and resources toward the advancement of proteogenomic cancer research.2

Many clinical labs already are performing proteogenomics in its basic form (albeit in a somewhat disjointed manner with distinct separation and interpretation of NGS and protein-based tests), but are poised to readily adopt developing technologies as they become clinically viable. In anticipation of this, it is important to understand the growth history of proteogenomics that has led to the current clinical trials with a primary focus toward oncology.

Clinical Drive for Proteogenomics: A Proof of Concept

The progression toward practical implementation of powerful proteogenomic approaches likely will require a strategically paced plan, as initiating such complex workflows necessitates a high degree of expertise. However, proteogenomics has the potential to burst into the clinical space if recent history is taken into account (see FIGURE 2).

Coined as recently as 2004, the term proteogenomics arose from the effort to refine gene annotation processes as the number of published genomes increased. The gene annotations that led this effort originally came from research-based model organisms with large messenger RNA datasets, represented as short sub-sequences of cloned mRNA and termed expressed sequence tags (ESTs). Matching these ESTs to both the DNA in published genomes and protein peptide sequences demonstrated the diversity and quantity of protein isoforms that any single gene could produce. Then in 2009, with more large RNA sequencing (RNA-Seq) datasets coming online, it became apparent that the protein isoform predictions retained in entire reference genomic DNA sequences were limited and did not reflect the diversity and quantity of protein isoforms seen in the RNA-Seq data. It was this work-in-progress annotation of genomic data from RNA and protein-based MS that led to proteogenomic workflows that follow the progression of information from DNA to RNA to protein.

With respect to protein-based detection methods, two developments have been essential in the march toward clinical usage: first, the refinement of liquid chromatography tandem MS instruments, which greatly increased instrument sensitivity; and second, the ability to bar code protein samples for multiplex analysis on the same instrument using isobaric tags for relative and absolute quantitation (iTRAQ) peptide labels. With those two developments, publication of a draft map of the human proteome was quick to follow in 2014.3 In that same year, a proof-of-concept paper predicting immunogenic tumor-specific mutations, or neoantigens, by combined MS and exome sequencing revealed the power of the combined approach.1 In this seminal paper, a database of 28,439 exome variations from known mouse tumors was generated from NGS data to enable the development of a predictive peptide database of the translated DNA. This database was then used to match peptide MS results, from which seven neoantigen peptides were identified. Further modeling and testing revealed that three of those peptides could effectively immunize mice against growth of tumor implants; in essence, this revealed a peptide-based vaccine toward tumor-specific neoantigens that prevented tumor growth. With such potential, the MoU between the FDA and NCI is set to bolster research and commercialize the process.
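The matching step described above (variant calls turned into a predictive peptide database, which is then compared with MS peptide identifications) can be illustrated with a minimal Python sketch. It is a deliberate simplification and not the pipeline used in the cited paper: variants are applied directly as single amino acid substitutions rather than translated from exome DNA, trypsin is modeled with a simple cut-after-K/R-unless-followed-by-P rule, and matching is by exact peptide string rather than by spectral search. The sequences, variants, and function names are all made up for illustration.

```python
def tryptic_digest(protein):
    """Cut after K or R unless the next residue is P (simplified trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def variant_peptide_db(reference, variants):
    """Map each variant-specific peptide to a label such as 'S7F'.

    variants: iterable of (position, ref_residue, alt_residue), 1-based positions.
    Only peptides absent from the reference digest are kept.
    """
    reference_peptides = set(tryptic_digest(reference))
    db = {}
    for position, ref, alt in variants:
        if reference[position - 1] != ref:
            raise ValueError(f"reference residue mismatch at position {position}")
        mutant = reference[:position - 1] + alt + reference[position:]
        for peptide in tryptic_digest(mutant):
            if peptide not in reference_peptides:
                db[peptide] = f"{ref}{position}{alt}"
    return db


def match_observed(db, observed_peptides):
    """Keep only the variant peptides that were also seen in the MS results."""
    return {p: label for p, label in db.items() if p in observed_peptides}


# Toy, invented data: a reference protein, two hypothetical variants,
# and a set of peptides assumed to have been identified by MS.
REFERENCE = "MAGTKLSEVRPLDWKAAGHR"
VARIANTS = [(7, "S", "F"), (18, "G", "D")]
OBSERVED = {"MAGTK", "LFEVRPLDWK", "AAGHR"}

db = variant_peptide_db(REFERENCE, VARIANTS)
matches = match_observed(db, OBSERVED)
print(matches)  # {'LFEVRPLDWK': 'S7F'}
```

In this toy case both variants yield a candidate peptide, but only the S7F peptide is supported by the observed MS data, which mirrors how a large variant list narrows to a handful of confirmed candidates.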

Beyond this timeline and proof of concept in mice, nascent paradigms for clinical proteogenomics are emerging. For example, since 2007, nearly every patient receiving crizotinib has benefitted from a proteogenomic approach in the discovery of EML4-ALK fusions. Two independent papers published in 2007, when combined, formed the basis of a proteogenomic approach for neoantigen and fusion protein discovery. In one paper,4 the researchers isolated EML4-ALK neoplastic clones from patient cells and showed through sequence analysis the gene fusion event between EML4 and ALK, thereby providing the sequence needed for future fusion protein analysis. In a subsequent paper,5 the protein products from lung cancer cell lines were identified by MS, because the sequence from the previous paper allowed for identification of the unique peptide spanning the junction of the newly fused proteins.
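The idea of a unique peptide spanning a fusion junction can be sketched in a few lines of Python. This is an illustrative assumption-laden example, not the method of either cited paper: the two partner sequences and breakpoints are invented placeholders (not real EML4 or ALK sequence), trypsin is modeled with the simple cut-after-K/R-unless-followed-by-P rule, and the function names are hypothetical.

```python
def tryptic_spans(protein):
    """Return (start, end) index pairs of tryptic peptides.

    Simplified rule: cut after K or R unless the next residue is P.
    """
    spans, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            spans.append((start, i + 1))
            start = i + 1
    if start < len(protein):
        spans.append((start, len(protein)))
    return spans


def peptides(protein):
    """Set of tryptic peptide strings for one protein."""
    return {protein[s:e] for s, e in tryptic_spans(protein)}


def junction_peptides(partner_a, partner_b, keep_a, start_b):
    """Peptides of the fusion that span the junction and occur in neither parent.

    keep_a:  number of N-terminal residues retained from partner_a
    start_b: 0-based index of the first residue retained from partner_b
    """
    fusion = partner_a[:keep_a] + partner_b[start_b:]
    junction = keep_a  # index of the first partner_b residue in the fusion
    parental = peptides(partner_a) | peptides(partner_b)
    return [
        fusion[s:e]
        for s, e in tryptic_spans(fusion)
        if s < junction < e and fusion[s:e] not in parental
    ]


# Invented placeholder sequences and breakpoints, for illustration only.
PARTNER_A = "MDGFAGSLKDTVRELQK"
PARTNER_B = "GSPWKVYRRKHQELQAMK"

spanning = junction_peptides(PARTNER_A, PARTNER_B, keep_a=11, start_b=5)
print(spanning)  # ['DTVYR']
```

Because the spanning peptide contains residues from both partners, it cannot arise from either parental protein alone, which is why detecting it by MS is evidence that the fusion protein is present.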

Click here to see FIGURE 2.

Proteogenomic and Neoantigen Clinical Trials

As of today, there are five completed or ongoing clinical trials listed for “proteogenomics” and 59 listed for “neoantigens” in the clinical trial registry. Two studies published in 2017 that developed successful vaccines directed toward neoantigens have led the charge. The first study6 resulted in the therapeutic, long-peptide neoantigen vaccine NeoVax, based on the investigators’ neoantigen discovery pipeline. With NeoVax, long peptides of 15-30 residues representing up to 20 patient-specific neoantigens were injected into six patients. Four of those patients, who had enrolled with stage III melanoma, showed no tumor recurrence after 32 months. The remaining two patients, who experienced tumor recurrence, had complete tumor regression after anti-PD1 therapy.7

An additional study led to the neoantigen mRNA-based vaccine IVAC MUTANOME.8 In that study, 13 melanoma patients received a personalized vaccine developed from 10 neoantigens. Eight patients remained tumor free throughout a 23-month follow-up. Of the five patients who had metastatic disease at vaccination, one patient showed a complete response in combination with anti-PD1 therapy.

Proteogenomics in the Clinical Lab

Going back to the central dogma, protein detection has a long history in the clinical lab space, whether involving a neoantigen or a mutant peptide, such as BRAF V600E. Pathologists have characterized tumors via tumor-infiltrating lymphocytes (TILs) for decades. Thus, it is perhaps expected that cancers with the highest mutational burden and likely high neoantigen burden, such as those with DNA mismatch repair deficiencies (eg, Lynch syndrome and MSI-High), or those with high UV light exposure (eg, melanoma), have associated TILs.

Beyond TILs, immunohistochemistry (IHC) detection of mutated or overexpressed proteins, such as HER2, PD-1, or PD-L1, is a mainstay of many current treatment algorithms. In this context, proteogenomics has the potential to improve detection and quantification of these biomarker mainstays, and to do so with high multiplexing capabilities. Digital biomarker detection of both protein and nucleic acids is now available on the market. Similarly, optical bar codes for nucleic acids are in development, as are photocleavable bar codes on custom antibodies that can be used with standard formalin-fixed paraffin-embedded (FFPE) slides. As these platforms are validated and adopted by more labs, the quantified detection of protein and nucleic acids in a proteogenomic manner will continue to improve patient care and open new avenues for treatment. Likewise, as MS platforms bring more and more MS-based approaches to the clinical setting, clinical laboratories are expected to follow suit and adopt more proteogenomic workflows.


  1. Yadav M, Jhunjhunwala S, Phung QT, et al. Predicting Immunogenic Tumor Mutations by Combining Mass Spectrometry and Exome Sequencing. Nature. 2014;515(7528):572-576.
  2. US Food and Drug Administration. Domestic MOUs. MOU 225-17-014. Clinical Proteogenomics Cancer Research. Page last updated: 12/18/2017. Accessed 4/5/2018.
  3. Kim MS, Pinto SM, Getnet D, et al. A Draft Map of the Human Proteome. Nature. 2014;509(7502):575-581.
  4. Soda M, Choi YL, Enomoto M, et al. Identification of the Transforming EML4-ALK Fusion Gene in Non-Small-Cell Lung Cancer. Nature. 2007;448(7153):561-566.
  5. Rikova K, Guo A, Zeng Q, et al. Global Survey of Phosphotyrosine Signaling Identifies Oncogenic Kinases in Lung Cancer. Cell. 2007;131(6):1190-1203.
  6. Ott PA. A Phase I Study with a Personalized NeoAntigen Cancer Vaccine in Melanoma. Identifier – NCT01970358. Accessed 4/5/2018.
  7. Ott PA, Hu Z, Keskin DB, et al. An Immunogenic Personal Neoantigen Vaccine for Patients with Melanoma. Nature. 2017;547(7662):217-221.
  8. Sahin U. IVAC MUTANOME Phase I Clinical Trial. Identifier – NCT02035956. Accessed 4/5/2018.

Aaron E. Atkinson, PhD, is an instructor of pathology and laboratory medicine at the Laboratory for Clinical Genomics and Advanced Technology (CGAT) in the department of pathology and laboratory medicine, Dartmouth-Hitchcock Medical Center, Lebanon, New Hampshire.

Laura J. Tafe, MD, is an associate professor of pathology and laboratory medicine at Dartmouth-Hitchcock Medical Center and the Geisel School of Medicine at Dartmouth, and is assistant director of the Laboratory for Clinical Genomics and Advanced Technology (CGAT). She completed fellowship training in oncologic surgical pathology and molecular genetic pathology at Memorial Sloan Kettering Cancer Center, and her academic interests focus on thoracic and gynecologic neoplasms and molecular diagnostics.

Edward G. Hughes, PhD, is a genomic analyst in the Laboratory for Clinical Genomics and Advanced Technology (CGAT) at Dartmouth-Hitchcock Medical Center.

Gregory J. Tsongalis, PhD, is director of the Laboratory for Clinical Genomics and Advanced Technology (CGAT) in the department of pathology and laboratory medicine, Dartmouth-Hitchcock Medical Center. He also is a professor in pathology and laboratory medicine at Geisel School of Medicine at Dartmouth in Hanover.
