<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://sublinear.info/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=18.189.42.114</id>
	<title>Open Problems in Sublinear Algorithms - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://sublinear.info/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=18.189.42.114"/>
	<link rel="alternate" type="text/html" href="https://sublinear.info/index.php?title=Special:Contributions/18.189.42.114"/>
	<updated>2026-04-22T20:11:01Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.31.10</generator>
	<entry>
		<id>https://sublinear.info/index.php?title=Open_Problems:66&amp;diff=865</id>
		<title>Open Problems:66</title>
		<link rel="alternate" type="text/html" href="https://sublinear.info/index.php?title=Open_Problems:66&amp;diff=865"/>
		<updated>2015-05-20T23:10:50Z</updated>

		<summary type="html">&lt;p&gt;18.189.42.114: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Header&lt;br /&gt;
|source=bertinoro14&lt;br /&gt;
|who=Eldar Fischer&lt;br /&gt;
}}&lt;br /&gt;
 &lt;br /&gt;
Suppose we are given access to two distributions $P$ and $Q$ over $\{1,2, \ldots, n\}$ and wish to test if they are the same or are at least $\epsilon$ apart under the $\ell_1$ distance. Assume that we have access to ''conditional samples'': a query consists of a set $S \subseteq \{1,2, \ldots, n\}$ and the output is a sample drawn from the conditional distribution on $S$ {{cite|ChakrabortyFGM-13|CanonneRS-14}}. In other words, if $p_j$ is the probability of drawing an element $j$ from $P$, a conditional sample from $P$ restricted to $S$ is drawn from the distribution where &lt;br /&gt;
$$ \text{Pr}(j) = \begin{cases} \frac{p_j}{\sum_{i \in S} p_i} &amp;amp; \mbox{if }j \in S, \\ 0 &amp;amp; \mbox{otherwise.}\end{cases} $$&lt;br /&gt;
It is known that if one of the distributions is fixed, then the testing problem requires at most $\tilde O(1/\epsilon^4)$ queries, which is independent of $n$ {{cite|CanonneRS-14}}.&lt;br /&gt;
&lt;br /&gt;
What can we say if both distributions are unknown? The best known upper bound is $\tilde O\left( \frac{\log^5 n}{\epsilon^4} \right)$ {{cite|CanonneRS-14}}.&lt;br /&gt;
&lt;br /&gt;
==Update==&lt;br /&gt;
Acharya, Canonne, and Kamath {{cite|AcharyaCK-14}} showed that $\Omega(\sqrt{\log \log n})$ conditional queries are needed in this case for some constant $\epsilon &amp;gt; 0$. Contrary to the case of only one distribution unknown, if both distributions are unknown, the required number of queries is a function of $n$. Concurrently, in 2015, Falahatgar, Jafarpour, Orlitsky, Pichapathi, and Suresh showed that $O(\frac{\log{\log{n}}}{\varepsilon^5})$ queries are sufficient, so that the query complexity of this problem is determined up to a factor of $\sqrt{\log \log n}$.&lt;/div&gt;</summary>
		<author><name>18.189.42.114</name></author>
		
	</entry>
</feed>