Problem 25: Communication Complexity and Metric Spaces

From Open Problems in Sublinear Algorithms
Revision as of 17:02, 8 November 2019 by Andoni (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search
Suggested by T. S. Jayram
Source Kanpur 2009
Short link https://sublinear.info/25

Poincaré Inequalities[edit]

Alice and Bob are given two points $x$ and $y$, respectively, from a specific metric space $\mathcal M$. We are interested in deciding whether $d_{\mathcal M}(x,y) \le R$ or $d_{\mathcal M}(x,y) \ge \alpha R$, where $d_{\mathcal M}$ is the distance function of $\mathcal M$, $R > 0$, and $\alpha > 1$. What amount of information must be exchanged in order to solve this problem? Answering this question is interesting in any standard communication model: unrestricted communication between the players, one-way communication, sketching, etc.

The above question can partially be answered if the metric satisfies a specific “gap” Poincaré inequality [AndoniJP-10]. It is known that another kind of Poincaré inequality is equivalent to non-embeddability into $\ell_2^2$ [Matousek-02], but it is not known if non-embeddability into $\ell_2^2$ implies lower bounds for communication complexity. Can one show a formal connection between the communication complexity for approximating the distance between two points and non-embeddability into $\ell_2^2$?

Product Metrics[edit]

We are also interested in the following general class of metrics. Let each $\mathcal M_i= \langle S_i, d_i\rangle$, $1 \le i \le k$, be a metric space on a set $S_i$ with a distance function $d_i$. A product metric space $\bigoplus_{i=1}^{k} \mathcal M_i$ is defined on the product $S_1 \times \ldots \times S_k$ with the distance function $$d\left((x_1,\ldots,x_k),(y_1,\ldots,y_k)\right) = \operatorname{\bf op}\left(d_1(x_1,y_1),\ldots,d_k(x_k,y_k)\right),$$ where ${\bf op}$ is a symmetric operator. For instance, $\bigoplus_{i=1}^{k} \mathcal M_i$ is a proper metric space if ${\bf op}$ is the maximum operator or the $p$-th norm for any $p \in [1,\infty)$. The case when $\bigoplus_{i=1}^{k} \mathcal M_i$ is not necessarily a metric space also finds applications.

Applications of product metric spaces include a nearest neighbor data structure for Ulam distance [AndoniIK-09], and a near-linear time subpolynomial-approximation algorithm for edit distance [AndoniO-09].

The following questions arise in the context of product spaces:

  1. Can one design efficient communication protocols for computing the distance between a pair of points? Suppose that there is an efficient communication protocol for each $\mathcal M_i$. What is the communication complexity for computing the distance between two points in $\bigoplus_{i=1}^{k}\mathcal M_i$? Andoni, Jayram, and Pătraşcu [AndoniJP-10] prove lower bounds for some product metrics. Jayram and Woodruff [JayramW-09] show streaming algorithms which yield communication protocols.
  2. Can one design efficient streaming algorithms and data structures for product metric spaces? In particular, can one efficiently compute the distance between a pair of points? Jayram and Woodruff [JayramW-09] consider the related question of computing cascaded norms.

Update[edit]

It has been shown [AndoniKR-14] that for normed spaces the above implication is true: if a snowflake of a normed space does not embed into $\ell_2$ (in fact, more generally, does not uniformly embed into a Hilbert space), then, there is a non-trivial communication lower bound for distinguishing small and large distances. Furthermore, [KhotN-19] show that a similar statement is not true for metrics in general: there exist metrics which are sketchable with constant approximation and space, but do not $O(1)$-embed into $\ell_{1-\epsilon}$ for any $\epsilon<1/2$.