UC San Diego Computer Science and Engineering Professor Pavel Pevzner has been honored by the Association for Computing Machinery (ACM) with the Paris Kanellakis Theory and Practice Award, which recognizes theoretical work that significantly advances computing. The award includes a $10,000 prize. ACM noted:
“Dr. Pevzner has made fundamental contributions to the theoretical study of string algorithms and to their application to scalable reconstruction of genomes and other biological sequences… In a series of papers starting in 1989, Pevzner developed models based on de Bruin graphs that led to algorithms that were both accurate and significantly more efficient. His work has tremendous practical impact on modern biology: his algorithms underlie almost all sequence assemblers used today and were used to reconstruct the vast majority of genomic sequences available in databases.”
Although genome sequencing was invented more than forty years ago, biologists still cannot read genomes letter-by-letter from beginning to end, like a book. Instead, sequencing machines generate tiny overlapping segments, called reads. Putting them together is an enormously difficult algorithmic task.
“The traditional overlap approach to assemble genomes is similar to the way we assemble jigsaw puzzles – adding a matching piece to the already assembled pieces at each step,” says Pevzner. “The problem is that, while the overlap approach works for a puzzle with a few thousand pieces, it becomes intractable for huge genomic puzzles.”
Pevzner described a counterintuitive way to assemble these puzzles. His approach built on the Bridges of Konigsberg puzzle, which asks whether there’s a way to walk through Konigsberg and visit each bridge exactly once.
In his Ph.D. thesis, thirty years ago, Pevzner described a genomic analog to Konigsberg that represents a network of roads with millions of bridges called an assembly graph. A few years later, he and Vineet Bafna, his student at the time and now a professor in the Computer Science and Engineering Department at UC San Diego, came up with a similar concept, a breakpoint graph, which is now a workhorse for genome rearrangement studies that analyze differences between genomes.
In addition to his pioneering algorithms, Pevzner also co-developed popular online bioinformatics and algorithms specializations on Coursera and edX, which have been adopted worldwide.
“I am humbled to receive the Kanellakis award,” says Pevzner, “but I did not do it alone. This award goes to my teachers, students, and postdocs over the last quarter of a century.”