Difference between revisions of "Open Problems:66"

From Open Problems in Sublinear Algorithms
Jump to: navigation, search
m (Krzysztof Onak moved page Waiting:Eldar Fischer to Open Problems:66)
Line 9: Line 9:
  
 
What can we say if both distributions are unknown? The best known upper bound is $\tilde O\left( \frac{\log^5 n}{\epsilon^4} \right)$ {{cite|CanonneRS-14}}.
 
What can we say if both distributions are unknown? The best known upper bound is $\tilde O\left( \frac{\log^5 n}{\epsilon^4} \right)$ {{cite|CanonneRS-14}}.
 +
 +
==Update==
 +
It was shown by Acharya, Canonne, and Kamath {{cite|AcharyaCK-14}} that, for a constant $\epsilon > 0$,  $\Omega(\sqrt{\log \log n})$ conditional queries are needed to test equivalence if both distributions are unknown.

Revision as of 00:13, 9 December 2014

Suggested by Eldar Fischer
Source Bertinoro 2014
Short link https://sublinear.info/66

Suppose we are given access to two distributions $P$ and $Q$ over $\{1,2, \ldots, n\}$ and wish to test if they are the same or are at least $\epsilon$ apart under the $\ell_1$ distance. Assume that we have access to conditional samples: a query consists of a set $S \subseteq \{1,2, \ldots, n\}$ and the output is a sample drawn from the conditional distribution on $S$ [ChakrabortyFGM-13,CanonneRS-14]. In other words, if $p_j$ is the probability of drawing an element $j$ from $P$, a conditional sample from $P$ restricted to $S$ is drawn from the distribution where $$ \text{Pr}(j) = \begin{cases} \frac{p_j}{\sum_{i \in S} p_i} & \mbox{if }j \in S, \\ 0 & \mbox{otherwise.}\end{cases} $$ It is known that if one of the distributions is fixed, then the testing problem requires at most $\tilde O(1/\epsilon^4)$ queries, which is independent of $n$ [CanonneRS-14].

What can we say if both distributions are unknown? The best known upper bound is $\tilde O\left( \frac{\log^5 n}{\epsilon^4} \right)$ [CanonneRS-14].

Update

It was shown by Acharya, Canonne, and Kamath [AcharyaCK-14] that, for a constant $\epsilon > 0$, $\Omega(\sqrt{\log \log n})$ conditional queries are needed to test equivalence if both distributions are unknown.