CSE Colloquium Lecture
Controlling the Rate of False Discoveries in Tandem Mass Spectrum Identifications
The final speaker scheduled for the Fall 2016 CSE Colloquium Lecture Series is Uri Keich, a Professor at the University of Sydney, Australia. He will talk about controlling the rate of false discoveries in tandem mass spectrum identifications.
Date: Tuesday, December 6, 2016
Time: 11:30am - 1:00pm
Location: Room 4140, CSE Building
Host: CSE Prof. Pavel Pevzner
Abstract: A typical shotgun proteomics experiment produces thousands of tandem mass spectra, each of which can be tentatively assigned a corresponding peptide by using a database search procedure that looks for a peptide-spectrum match (PSM) that optimizes the score assigned to a matched pair. Some of the resulting PSMs will be correct while others will be false, and we have no way to verify which is which. The statistical problem we face is that of controlling the false discovery rate (FDR), that is, the expected proportion of false PSMs among all reported pairings. While there is a rich statistical literature on controlling the FDR in the multiple hypothesis testing context, controlling the FDR in the PSM context is mostly done through the "home grown" method called target-decoy competition (TDC).
After a brief introduction to the problem of tandem mass spectrum identification, we will explore the reasons why the mass spec community has been using this non-standard approach to controlling the FDR. We will then discuss how calibration can increase the number of correct discoveries and offer an alternative method for controlling the FDR in the presence of calibrated scores. We will conclude by arguing that our analysis extends to a more general setup than the mass spectrum identification problem. [This is joint work with Bill Noble from the University of Washington.]
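For readers unfamiliar with TDC, the following is a minimal sketch of how a target-decoy FDR estimate is commonly computed. It is not taken from the talk. It assumes each spectrum has already been searched against both a target database and a decoy database and that only the better-scoring of the two matches was kept. The function name `tdc_discoveries`, the `(score, is_decoy)` input layout, and the use of the `+1` correction in the estimate are illustrative assumptions, and tied scores are not handled specially.

```python
def tdc_discoveries(psms, fdr_level):
    """Return the target scores reported at the given estimated FDR level.

    psms: list of (score, is_decoy) pairs, one winning PSM per spectrum
          after target-decoy competition (assumed input layout).
    fdr_level: desired FDR threshold, e.g. 0.01.
    """
    # Rank the winning PSMs from best to worst score.
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    decoys = 0
    targets = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Decoy wins above the cutoff stand in for the unknown number of
        # false target wins; the +1 is a conservative correction (assumed).
        estimated_fdr = (decoys + 1) / max(targets, 1)
        if estimated_fdr <= fdr_level:
            cutoff = i  # keep the largest prefix that meets the threshold

    # Only target PSMs are reported as discoveries.
    return [score for score, is_decoy in ranked[:cutoff] if not is_decoy]
```

In this sketch the reported list grows as `fdr_level` is relaxed, and it is empty when no score cutoff gives an estimate at or below the requested level.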
Bio: Professor Uri Keich began his research career as a mathematician. He completed his PhD under the supervision of Henry McKean at the Courant Institute of NYU, and his thesis on stationary approximations of non-stationary stochastic processes was awarded the Wilhelm T. Magnus Memorial Prize for Significant Contributions to the Mathematical Sciences. He continued working on this problem, first as a Von Karman Instructor in the Applied Mathematics group at Caltech and later as an assistant professor in the math department at UC Riverside.
While at Riverside, Dr. Keich met Pavel Pevzner, who was just moving to UC San Diego, and the course of Keich's research career changed abruptly as he learned about the rapidly growing area of bioinformatics. Keich made his transition into the area under Pavel's supervision, first developing tools for the discovery of sequence motifs and later working on their statistical analysis. While at UCSD, Keich filled the gaps in his statistical education under the guidance of Ian Abramson. He then spent several years as an assistant professor in the CS department at Cornell University before joining the statistics group at the University of Sydney in January 2009, where he is currently an associate professor.
Professor Keich has worked on a variety of other problems arising in computational biology, including optimal seed design for sequence similarity search tools, the design of motif database search tools, the mapping of sequence motifs involved in DNA replication initiation, and, more recently, the statistical analysis of tandem mass spectrometry data. In addition, he is developing novel computational approaches for exact statistical tests.