Clement Canone (Theory Seminar)

"When Fourier SIIRVs: Fourier-Based Testing for Families of Distributions" Image removed.
Clément Canonne (Columbia)
Monday, October 30, 2017, 2:00 pm
EBU3B, Room 4258

Abstract: We study the general problem of testing whether an unknown discrete probability distribution belongs to a specified family of distributions. More specifically, given a distribution family $\mathcal{P}$ and sample access to an unknown discrete distribution $p$, we want to distinguish (with high probability) between the case that $p$ is in $\mathcal{P}$ and the case that $p$ is eps-far, in total variation distance, from every distribution in $\mathcal{P}$. This is the archetypal hypothesis testing problem that has received significant attention in statistics and, over the past decade and a half, in theoretical computer science.

Of course, the sample complexity of this general inference task depends on the underlying family  $\mathcal{P}$. The gold standard in distribution testing is to design sample-optimal and computationally efficient algorithms for this task. The main contribution of this work is a simple and general testing technique that is applicable to all distribution families, and is particularly suited to those whose /Fourier spectrum/ satisfies a certain approximate /sparsity/ property. To the best of our knowledge, ours is the first use of the Fourier transform in the context of distribution testing.

We apply our Fourier-based framework to obtain near sample-optimal and computationally efficient testers for the following fundamental distribution families: Sums of Independent Integer Random Variables (SIIRVs), Poisson Multinomial Distributions (PMDs), and Discrete Log-Concave Distributions. For the first two, ours are the first non-trivial testers in the literature, vastly generalizing previous work on testing Poisson Binomial Distributions. For the third, our tester improves on prior work in both sample and time complexity.

Joint work with Ilias Diakonikolas (USC) and Alistair Stwart (USC).