Problem 7: Estimating Earth-Mover Distance

From Open Problems in Sublinear Algorithms
Revision as of 05:00, 16 November 2012 by Krzysztof Onak (talk | contribs)
Jump to: navigation, search
Suggested by Piotr Indyk
Source Kanpur 2006
Short link https://sublinear.info/7

Consider a stream of red points $R$ and blue points $B$ from a 2-dimensional grid $[\Delta]^2$, in an arbitrary order. We assume $|R|=|B|=n$. The Earth-Mover Distance (EMD) between $R$ and $B$ is the value of the min-cost matching between $R$ and $B$, i.e., \[ \mbox{EMD}(R,B)= \min_{\pi: R \to B} \sum_{p \in R} \|p-\pi(p)\|, \] where $\pi$ is a one-to-one mapping, and $\|\cdot\|$ is (say) the $L_1$ norm.

What are the space vs. approximation tradeoffs achievable by streaming algorithms for this problem? In particular, is there an $O(1)$-approximation algorithm using $(\log n+\log \Delta)^{O(1)}$ space?

It is known that that there is an $O(\log \Delta)$-approximation algorithm using that much space [Indyk-04]. That algorithm proceeds essentially by embedding EMD into $L_1$ [Charikar-02,IndykT-03]. However, any such embedding must incur at least $\Omega( \sqrt{\log \Delta})$ distortion [NaorS-06]. So one would need to do something else to get $O(1)$-approximation.