Doctoral Candidate Defends Thesis on Learning Information from Data While Preserving Differential Privacy

Jun 19, 2017
Ph.D. candidate Zhanglong Ji prepares for his final dissertation defense on June 23.

CSE Ph.D. candidate Zhanglong Ji (M.S. '13) -- whose research interests include data privacy, machine learning, statistics and data mining -- is scheduled to defend his doctoral dissertation this Friday. The topic: "Learning Information from Data while Preserving Differential Privacy."

The panel is co-chaired by his advisors -- CSE Prof. Charles Elkan and Department of Medicine Prof. Lucila Ohno-Machado. Rounding out the panel are CSE professors Kamalika Chaudhuri and Sanjoy Dasgupta, along with Prof. Kevin Patrick of the Department of Family Medicine and Public Health and the Qualcomm Institute.

Date: Friday, June 23, 2017
Time: 10:00am
Location: Room 5A03, Biomedical Research Facility II

In this dissertation, Zhanglong Ji will introduce his work on differentially private data mining. Privacy-preserving data mining algorithms generally fall into two types: query answering algorithms and data publishing algorithms. He will present three algorithms for two differentially private query answering problems. The first problem is to return the SNPs (single nucleotide polymorphisms) most correlated with a disease, and he will detail two efficient algorithms that make an accurate but previously inefficient algorithm feasible. The second problem is to learn a model that minimizes empirical risk while selecting features. "Our algorithm beats state-of-the-art algorithms," contends Ji in his abstract.
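
For readers unfamiliar with the first problem, the following is a minimal sketch of one generic way to select the top-scoring items under differential privacy. It is not Ji's algorithm, which the announcement does not describe in detail. It uses a standard textbook construction: add Gumbel noise to each score and keep the k largest, which is equivalent to running the exponential mechanism k times with a privacy budget of epsilon/k per round. The function names, the `scores` input, and the `sensitivity` parameter (how much one person's data can change any score) are all hypothetical and chosen for illustration.

```python
import math
import random


def gumbel(scale, rng):
    """Draw one sample from a Gumbel(0, scale) distribution."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() can return exactly 0.0
        u = rng.random()
    return -scale * math.log(-math.log(u))


def dp_top_k(scores, k, epsilon, sensitivity, rng=None):
    """Return indices of k items chosen by a noisy top-k selection.

    Illustrative only: a generic exponential-mechanism construction,
    not the method from the dissertation. `sensitivity` is assumed to
    bound the change in any score when one record changes.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()

    # Each of the k selections gets budget epsilon / k, so the
    # Gumbel noise scale is 2 * sensitivity / (epsilon / k).
    scale = 2.0 * k * sensitivity / epsilon

    noisy = [(s + gumbel(scale, rng), i) for i, s in enumerate(scores)]
    noisy.sort(reverse=True)
    return [i for _, i in noisy[:k]]


if __name__ == "__main__":
    # Hypothetical association scores for five SNPs.
    example_scores = [1.0, 50.0, 3.0, 40.0, 2.0]
    print(dp_top_k(example_scores, k=2, epsilon=1.0, sensitivity=1.0))
```

With a small epsilon the noise is large and the selected indices may differ from the true top k; as epsilon grows, the output approaches the exact ranking. That trade-off between accuracy and privacy is what makes efficient, accurate algorithms for this problem valuable.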

Ji will also present three algorithms for differentially private data publishing under two scenarios. The first assumes the existence of public data and assigns weights to the public data so that they are statistically similar to the private data. According to Ji, analysis of the weighted public data is more accurate than analysis performed directly on the private data. The two subsequent algorithms assume that the published data will be used for supervised learning (one for classification, the other for regression). They also assume that the prediction rules are continuous with respect to the predictors. The data resulting from these algorithms, noted Ji, "perform very well on learning tasks."

In addition to completing his M.S. in Computer Science at UC San Diego, Ji earned a B.S. in Statistics from Peking University in 2011. He is one of several CSE Ph.D. students affiliated with the Biomedical Informatics group in the School of Medicine. Others include Zachary Lipton (also graduating in 2017) and Steven Rick (expecting to complete his doctorate in 2020).