Problem 49: Sketching Earth Mover Distance

From Open Problems in Sublinear Algorithms
Revision as of 23:37, 6 December 2012 by Krzysztof Onak (talk | contribs) (Created page with "{{Header |title=Sketching Earth Mover Distance |source=bertinoro11 |who=Piotr Indyk }} For any two subsets of the planar grid [n]^2, |A|=|B|, define \[ \operatornam...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search
Suggested by Piotr Indyk
Source Bertinoro 2011
Short link https://sublinear.info/49

For any two subsets A,B of the planar grid [n]^2, |A|=|B|, define \operatorname{EMD}(A,B)=\min_{\pi: A \to B} \sum_{a \in A} \|a-\pi(a)\|_1, where \pi ranges over one-to-one mapping from A to B.

Question: What is the sketching complexity of c-approximating EMD? That is, consider a distribution over mappings L_c that maps subset of [n]^2 to \{0,1\}^s, such that for any sets A,B with |A|=|B|, given L_c(A), L_c(B), one can estimate \operatorname{EMD}(A,B) up to a factor of c, with probability \ge 2/3. Is it possible to construct such a distribution for constant c and s=\operatorname{polylog} n?

Background: It is known that one can achieve s=O(\log n) for c=O(\log n) by embedding EMD into \ell_1 [IndykT-03,Charikar-02], and s=n^{O(1/c)} \operatorname{polylog} n for any c \ge 1 [AndoniDIW-09].