Editing Open Problems:83
Latest revision | Your text | ||
Line 1: | Line 1: | ||
{{Header | {{Header | ||
+ | |title=Instance-optimal Hellinger testing | ||
|source=focs17 | |source=focs17 | ||
|who=Clément Canonne | |who=Clément Canonne | ||
Line 5: | Line 6: | ||
Given the full description of a fixed distribution $q$ over a discrete domain (say $[n]=\{1,\dots,n\}$), as well as access to i.i.d. samples from an unknown probability distribution $p$ over $[n]$ and distance parameter $\varepsilon\in(0,1]$, the identity testing problem asks to distinguish w.h.p. between (i) $p=q$ and (ii) $\operatorname{d}_{\rm TV}(p,q)>\varepsilon$. | Given the full description of a fixed distribution $q$ over a discrete domain (say $[n]=\{1,\dots,n\}$), as well as access to i.i.d. samples from an unknown probability distribution $p$ over $[n]$ and distance parameter $\varepsilon\in(0,1]$, the identity testing problem asks to distinguish w.h.p. between (i) $p=q$ and (ii) $\operatorname{d}_{\rm TV}(p,q)>\varepsilon$. | ||
− | The sample complexity of this question as a function of $n$ and $\varepsilon$ is fully understood by now: $\Theta(\sqrt{n}/\varepsilon^2)$ samples are necessary and sufficient, the worst-case lower bound following from taking $q$ to be the uniform distribution on $[n]$. Valiant and Valiant {{cite|ValiantV-14}} showed an ''instance- | + | The sample complexity of this question as a function of $n$ and $\varepsilon$ is fully understood by now: $\Theta(\sqrt{n}/\varepsilon^2)$ samples are necessary and sufficient, the worst-case lower bound following from taking $q$ to be the uniform distribution on $[n]$. Valiant and Valiant {{cite|ValiantV-14}} showed an ''instance-optimal'' bound on this problem, where the sample complexity $\Psi_{\rm TV}$ now only depends on $\varepsilon$ and the (massive) parameter $q$ instead of $n$: namely, that |
$$\Psi_{\rm TV}(q,\varepsilon) = \Theta\left(\max\left( \frac{\Phi(q,\Theta(\varepsilon))}{\varepsilon^2}, \frac{1}{\varepsilon}\right)\right)$$ | $$\Psi_{\rm TV}(q,\varepsilon) = \Theta\left(\max\left( \frac{\Phi(q,\Theta(\varepsilon))}{\varepsilon^2}, \frac{1}{\varepsilon}\right)\right)$$ | ||
samples were necessary and sufficient, where $\Phi$ is the functional defined by taking the $2/3$-pseudonorm of the vector of probabilities of $q$, once both the biggest element and $\varepsilon$ total mass of the smallest elements had been removed: | samples were necessary and sufficient, where $\Phi$ is the functional defined by taking the $2/3$-pseudonorm of the vector of probabilities of $q$, once both the biggest element and $\varepsilon$ total mass of the smallest elements had been removed: | ||
$ | $ | ||
\Phi(q,\varepsilon) = \lVert q^{-\max}_{-\varepsilon} \rVert_{2/3} | \Phi(q,\varepsilon) = \lVert q^{-\max}_{-\varepsilon} \rVert_{2/3} | ||
− | $. Using different techniques, Blais, Canonne, and Gur {{cite|BlaisCG-17}} then established a similar instance- | + | $. Using different techniques, Blais, Canonne, and Gur {{cite|BlaisCG-17}} then established a similar instance-optimal bound, with regard to a different functional, the "K-functional" $\kappa$ between $\ell_1$ and $\ell_2$ spaces:
$ | $ | ||
\Psi_{\rm TV}(q,\varepsilon)=\Omega\left({\kappa_p(1-\Theta(\varepsilon))}/{\varepsilon}\right), \Psi_{\rm TV}(q,\varepsilon)=O\left({\kappa_p(1-\Theta(\varepsilon))}/{\varepsilon^2}\right) | \Psi_{\rm TV}(q,\varepsilon)=\Omega\left({\kappa_p(1-\Theta(\varepsilon))}/{\varepsilon}\right), \Psi_{\rm TV}(q,\varepsilon)=O\left({\kappa_p(1-\Theta(\varepsilon))}/{\varepsilon^2}\right) | ||
Line 19: | Line 20: | ||
\operatorname{d}_{\rm H}(p,q) = \frac{1}{\sqrt{2}}\lVert\sqrt{p}-\sqrt{q}\rVert_2\,. | \operatorname{d}_{\rm H}(p,q) = \frac{1}{\sqrt{2}}\lVert\sqrt{p}-\sqrt{q}\rVert_2\,. | ||
$$ | $$ | ||
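As a small illustration of the Hellinger distance defined above, the following Python sketch computes $\operatorname{d}_{\rm H}$ and $\operatorname{d}_{\rm TV}$ for two explicit distributions and checks the standard relation $\operatorname{d}_{\rm H}^2 \leq \operatorname{d}_{\rm TV} \leq \sqrt{2}\,\operatorname{d}_{\rm H}$ on that pair. The function names `hellinger` and `total_variation` and the example vectors are illustrative choices, not part of the problem statement.

```python
import math

def hellinger(p, q):
    """Hellinger distance (1/sqrt(2)) * ||sqrt(p) - sqrt(q)||_2."""
    sq = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(sq / 2)

def total_variation(p, q):
    """Total variation distance (1/2) * ||p - q||_1."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Illustrative example (assumed values): q uniform on 4 elements, p a perturbation.
q = [0.25, 0.25, 0.25, 0.25]
p = [0.40, 0.30, 0.20, 0.10]

d_h = hellinger(p, q)
d_tv = total_variation(p, q)

# Quadratic relation between the two distances, checked on this one pair.
assert d_h ** 2 <= d_tv <= math.sqrt(2) * d_h
print(d_h, d_tv)
```

A single pair of course proves nothing in general; the check only makes the direction of the two inequalities concrete.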
− | Results of Daskalakis, Kamath, and Wright {{cite|DaskalakisKW-18}} show that the ''worst-case'' sample complexity remains $\Theta(\sqrt{n}/\varepsilon^2)$. Moreover, due to the quadratic dependence between Hellinger and total variation distances, both instance- | + | Results of Daskalakis, Kamath, and Wright {{cite|DaskalakisKW-18}} show that the ''worst-case'' sample complexity remains $\Theta(\sqrt{n}/\varepsilon^2)$. Moreover, due to the quadratic dependence between Hellinger and total variation distances, both instance-optimal bounds mentioned above apply, yet with possibly a quadratic gap between upper and lower bounds in terms of $\varepsilon$. This leads to the following bounds on the instance-optimal sample complexity $\Psi_{\rm H}$ of Hellinger identity testing:
− | $$\Psi_{\rm TV}(q,\varepsilon) \leq \Psi_{\rm H}(q,\varepsilon) \leq \Psi_{\rm TV}(q,\varepsilon^2) | + | $$\Psi_{\rm TV}(q,\varepsilon) \leq \Psi_{\rm H}(q,\varepsilon) \leq \Psi_{\rm TV}(q,\varepsilon^2)\,.$$
What is the right dependence on $\varepsilon$ of $\Psi_{\rm H}$? | What is the right dependence on $\varepsilon$ of $\Psi_{\rm H}$? | ||
− | ''Note that in both instance- | + | ''Note that in both instance-optimal bounds obtained for $\Psi_{\rm TV}$, there exist (simple) examples of $q$ where $\varepsilon$ ends up ''in the exponent,'' so this quadratic gap is not innocuous even for constant $\varepsilon$.'' |
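To make the functional $\Phi(q,\varepsilon) = \lVert q^{-\max}_{-\varepsilon} \rVert_{2/3}$ from the total variation bound more concrete, here is a minimal Python sketch. It rests on an assumption about the truncation step: after dropping one largest entry, the smallest entries are removed greedily for as long as the removed mass stays at most $\varepsilon$. The precise tie-breaking in {{cite|ValiantV-14}} may differ, and the name `phi` is hypothetical.

```python
def phi(q, eps):
    """Sketch of Phi(q, eps): 2/3-pseudonorm of q after removing one largest
    entry and (assumption) the smallest entries of total mass at most eps."""
    rest = sorted(q)[:-1]          # ascending order, one largest entry dropped
    removed = 0.0
    start = 0
    for x in rest:                 # greedily strip the smallest entries
        if removed + x > eps:
            break
        removed += x
        start += 1
    kept = rest[start:]
    # 2/3-pseudonorm: (sum_i x_i^(2/3))^(3/2)
    return sum(x ** (2 / 3) for x in kept) ** 1.5

# Illustrative example: uniform distribution on 4 elements with eps = 0.3.
# One entry goes as the maximum and one as small mass, leaving two entries of 0.25.
value = phi([0.25] * 4, 0.3)
print(value)
```

Evaluating `phi(q, eps)` and `phi(q, eps ** 2)` for the same $q$ gives a rough feel for how much the two sides of the inequality above can differ on a particular instance, up to the constants hidden in the $\Theta(\cdot)$ notation.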