Making Pipelines and Large Processing Microbial Community Studies Available to Any User, Any Time, Any Place

Jan 23, 2018

CSE Ph.D. Candidate José Navas Molina is set to be the first Computer Science student to stage the final defense of his doctoral dissertation in 2018. His advisor, CSE and Pediatrics professor Rob Knight, will chair a faculty committee that includes fellow CSE professors Vineet Bafna, Larry Smarr and Nuno Bandeira, as well as Skaggs School of Pharmacy and Pharmaceutical Sciences professor Pieter Dorrestein. The title of Navas Molina's dissertation is "Making Pipelines and Large Processing Microbial Community Studies Available to Any User, Any Time, Any Place."

Computer Science Ph.D. Candidate José Navas Molina

Date: Monday, January 29, 2018
Time: 2:00 p.m.
Location: Room 2109, CSE Building

Navas Molina joined the Knight Lab in Spring 2012. He was looking to merge his interests in high-performance computing and data visualization, and Knight's lab allowed him to apply those interests to improving people's lives. "I found computational biology the perfect mix of my two passions," explains the Ph.D. candidate. His research focused mainly on the development of the software package Qiita, sequence clustering, new high-performance visualization tools, and the analysis of datasets that push the limits of current analysis techniques. The research abstract of his doctoral dissertation follows.

Abstract: Advances in ’omics technologies are producing vast amounts of data, bringing microbiome research to a whole new level. This increase in data is pushing the limits of existing analysis tools, creating a rapidly changing environment in which new tools are constantly being released. This presents a challenge to researchers, who need to constantly learn new analytical tools, expose themselves to new environments such as cloud computing or supercomputers, and deal with the problems resulting from a heterogeneous environment lacking the enforcement of standards. This thesis demonstrates how computational optimizations, enforcement of standards, and minimizing the learning curve for analytical tools and computational environments empower researchers to push the microbiome field forward.

Chapter 1 motivates and contextualizes the thesis, exposing the challenges and opportunities that current microbiome research faces as it presents itself as a big data field. Next, Chapter 2 presents the first gold standard approach for analyzing microbiome data, improvements in analytical tools, and examples of how these improvements move microbiome research forward. Chapter 3 describes a system that lowers the access barrier to cloud computing that researchers without a computational background face. Chapter 4 exposes the importance of meta-analyses in increasing researchers’ ability to discover new findings, and how much effort is currently spent performing such meta-analyses. This chapter also presents Qiita, a web-based system focused on facilitating meta-analyses by enforcing standards, normalizing data representation and processing, and providing a common interface to current state-of-the-art analysis tools. Chapter 5 describes how using the tool improvements and data standardizations presented in Chapters 2 and 4, respectively, together with a novel system that aids the recording of sample handling information, speeds up the process of analyzing microbiome samples to levels never reached before. Finally, the concluding chapter of [Navas Molina's dissertation] discusses the results and the opportunities opened up by these advances, paying special attention to precision medicine, a topic in which the microbiome is becoming key.