Difference between revisions of "Open Problems:66"

From Open Problems in Sublinear Algorithms
Jump to: navigation, search
Line 6: Line 6:
 
   
 
   
 
Suppose we are given access to two distributions $P$ and $Q$ over $\{1,2, \ldots, n\}$ and wish to test if they are the same or are at least $\epsilon$ apart under the $\ell_1$ distance. Assume that we have access to ''conditional samples'': a query consists of a set $S \subseteq \{1,2, \ldots, n\}$ and the output is a sample drawn from the conditional distribution on $A$. In other words, a conditional sample on $P$ given $S$ is drawn from the distribution where  
 
Suppose we are given access to two distributions $P$ and $Q$ over $\{1,2, \ldots, n\}$ and wish to test if they are the same or are at least $\epsilon$ apart under the $\ell_1$ distance. Assume that we have access to ''conditional samples'': a query consists of a set $S \subseteq \{1,2, \ldots, n\}$ and the output is a sample drawn from the conditional distribution on $A$. In other words, a conditional sample on $P$ given $S$ is drawn from the distribution where  
 
+
$$ \text{Pr}(j) = \begin{cases} \frac{p_j}{\sum_{i \in A} p_i} & \mbox{if }j \in A, \\ 0 & \mbox{otherwise.}\end{cases} $$
$$ \text{Pr}(j) = \begin{cases} \frac{p_j}{\sum_{i \in A} p_i} & j \in A \\ 0 & \text{otherwise} \end{cases} $$
 
 
 
 
It is known that if one of the distributions is fixed, then a constant number ($f(1/\epsilon)$, where $f$ is an arbitrary function) of queries suffice to test the distributions.  
 
It is known that if one of the distributions is fixed, then a constant number ($f(1/\epsilon)$, where $f$ is an arbitrary function) of queries suffice to test the distributions.  
  
 
What can we say if both distributions are unknown?
 
What can we say if both distributions are unknown?

Revision as of 04:27, 4 June 2014

Suggested by Eldar Fischer
Source Bertinoro 2014
Short link https://sublinear.info/66

Suppose we are given access to two distributions $P$ and $Q$ over $\{1,2, \ldots, n\}$ and wish to test if they are the same or are at least $\epsilon$ apart under the $\ell_1$ distance. Assume that we have access to conditional samples: a query consists of a set $S \subseteq \{1,2, \ldots, n\}$ and the output is a sample drawn from the conditional distribution on $A$. In other words, a conditional sample on $P$ given $S$ is drawn from the distribution where $$ \text{Pr}(j) = \begin{cases} \frac{p_j}{\sum_{i \in A} p_i} & \mbox{if }j \in A, \\ 0 & \mbox{otherwise.}\end{cases} $$ It is known that if one of the distributions is fixed, then a constant number ($f(1/\epsilon)$, where $f$ is an arbitrary function) of queries suffice to test the distributions.

What can we say if both distributions are unknown?