Spectral Library Construction and Matching of MS/MS Spectra at Repository Scales

Apr 15, 2017
CSE Ph.D. candidate Mingxun Wang

CSE Ph.D. candidate Mingxun Wang is set to mount the final defense of his doctoral dissertation on Monday, April 24. He works in bioinformatics, specifically computational mass spectrometry, with applications to proteomics, metabolomics and natural product discovery. Wang's dissertation focuses on spectral library construction and matching of MS/MS spectra at repository scales. The dissertation committee is led by Wang's advisor, CSE Prof. Nuno Bandeira, and also includes CSE professors Pavel Pevzner and Vineet Bafna, Pieter Dorrestein of the Skaggs School of Pharmacy and Pharmaceutical Sciences, and Elizabeth Komives of the Department of Chemistry and Biochemistry.

Ph.D. candidate Mingxun Wang

According to Wang's abstract, the "characterization of proteins, peptides, metabolites, and natural products are crucial to the understanding of biological processes, discovering biomarkers, and uncovering new therapeutic molecules. Tandem mass spectrometry (MS/MS) has proven to be a high-throughput and sensitive tool to assay these molecules, whereby the fragmentation observed in the MS/MS spectra functions as a reproducible signature for each molecule. Thus, any acquisition of a molecule's MS/MS fragmentation can be aggregated into a reusable collection of observed and annotated MS/MS spectra known as a spectral library. 

"Due to the reproducibility of a molecule's MS/MS spectrum, spectral libraries have gained traction as a resource for the sensitive identification of newly-acquired MS/MS spectra. Thus, the utility of spectral libraries rests on the reliability of MS/MS similarity metrics as well as the quality and size of the libraries themselves."

In his thesis, Wang highlights the computational methods he developed to enable the creation of spectral libraries for proteomics, metabolomics, and natural products discovery. "These methods include the aggregation and analysis of the entire community's mass spectrometry data along with online computational resources that crowdsource the annotation and curation of specialized spectral libraries," noted Wang. Further, he added, "by leveraging repository-scale mass spectrometry data, we have developed methods to assign statistical significance to spectral similarity metrics in order to enable the automated identification of MS/MS data by matching to spectral libraries."

Wang will defend his dissertation at 1 p.m. in the CSE auditorium on Monday, April 24, 2017.