Problem 36: Learning an $f$-Transformed Product Distribution

From Open Problems in Sublinear Algorithms
Revision as of 05:29, 16 November 2012 by Krzysztof Onak (talk | contribs)
Jump to: navigation, search
Suggested by Rocco A. Servedio
Source Bertinoro 2011
Short link https://sublinear.info/36

In this learning setting there are $n$ independent Bernoulli random variables $X_1,...,X_n$ with unknown $E[X_i]=p_i$. There is a known transformation function $f:\{0,1\}^n \mapsto R$, where $R$ is some range. The learner has access to independent draws from $f(X_1,...,X_n)$; i.e., each example for the learner is obtained by independently drawing $X_1,...,X_n$, applying $f$, and giving the result to the learner. Call this distribution $D_f$. The learner's job is to construct a hypothesis distribution $D'$ over the range set such that the variation distance between $D_f$ and $D'$ is at most $\epsilon$, with high probability.

Question: Give some necessary or sufficient conditions on $f$ that make the “learn an $f$-transformed product distribution” problem solvable using $O_{\epsilon}(1)$ queries, independent of $n$.

Background: The following is known [DaskalakisDS-11]:

  1. For $f(X)=X_1 + ... + X_n$, there's a learning algorithm using poly$(1/\epsilon)$ queries independent of $n$.
  2. For $f(X) = \sum_{i=1}^n i\cdot X_i$, any algorithm for learning to constant accuracy must make $\Omega(n)$ queries.