# Problem 36: Learning an $f$-Transformed Product Distribution

Suggested by | Rocco A. Servedio |
---|---|

Source | Bertinoro 2011 |

Short link | https://sublinear.info/36 |

In this learning setting there are $n$ independent Bernoulli random variables
$X_1,...,X_n$ with *unknown* $E[X_i]=p_i$. There is a *known*
transformation function $f:\{0,1\}^n \mapsto R$, where $R$ is some range.
The learner has access to independent draws from $f(X_1,...,X_n)$;
i.e., each example for the learner is obtained by independently drawing
$X_1,...,X_n$, applying $f$, and giving the result to the learner. Call
this distribution $D_f$.
The learner's job is to construct a hypothesis distribution $D'$ over the
range set such that the variation distance between $D_f$ and $D'$ is
at most $\epsilon$, with high probability.

**Question:** Give some necessary or sufficient conditions on $f$
that make the “learn an $f$-transformed product distribution” problem
solvable using $O_{\epsilon}(1)$ queries, independent of $n$.

**Background:** The following is known [DaskalakisDS-11]:

- For $f(X)=X_1 + ... + X_n$, there's a learning algorithm using poly$(1/\epsilon)$ queries independent of $n$.
- For $f(X) = \sum_{i=1}^n i\cdot X_i$, any algorithm for learning to constant accuracy must make $\Omega(n)$ queries.