Problem 21: Deterministic Heavy-Hitters & Fast Matrix Algorithms

From Open Problems in Sublinear Algorithms
Jump to: navigation, search
Suggested by Martin Strauss
Source Kanpur 2006
Short link

An important ingredient in many recent algorithms for heavy hitters is the Restricted Isometry Property, defined in [CandesRT-06] and, equivalently, in [Donoho-06]. A matrix $\Phi$ with $d$ columns has the $m$-RIP if any submatrix $\Phi_0$ of $m$ columns has low-distortion, i.e., for all $x$, we have $\|x\|_2 \le \|\Phi_0 x\|_2\le 2\|x\|_2$ (after appropriate normalization). The identity matrix has this property; we want to minimize the number of rows in matrices with the $m$-RIP against a lower bound of $\Omega(m\log d)$ rows. It is known that an $O(m\log d)\times d$ matrix of independent Gaussian entries has this property with high probability. Because it is expensive to store fully random numbers, researchers have also looked at pseudorandom and deterministic constructions. It is also known [RudelsonV-06] that a random collection of $m\log^4(d)$ rows of a $d\times d$ Fourier matrix (or any unitary matrix whose entries have bounded magnitude) has the $m$-RIP. A technique in [CormodeM-06,Muthukrishnan-06a] gives a deterministic construction of a matrix with $m^2\operatorname{polylog}(d)$ rows with the $m$-RIP.

This leads to the following open questions:

  1. Give a polynomial-time deterministic construction of a $m\operatorname{polylog}(d)\times d$ matrix with the $m$-RIP. One possibility is constructing a set of $m\operatorname{polylog} (d)$ rows of the Fourier matrix.
  2. Give a zero-error randomized construction of such a matrix. Equivalently, give a deterministic polynomial-time test for such matrices (which can be applied to randomized constructions).
  3. Improve the number of rows for the Fourier construction from $O(m\log^4 d)$ to $O(m\log d)$ (or show a larger lower bound for Fourier matrices in particular). If necessary, substitute another unitary, bounded-magnitude matrix of your choice for Fourier.

Some related open questions are as follows. A bottleneck in the runtime of [GilbertSTV-07] is the time to multiply an $m\times m$ submatrix $F_{RC}$ of the the $d\times d$ Fourier matrix $F$ by a vector $v$ of length $m$.

  1. Provide a $o(m^2)$-time algorithm to multiply $F_{RC}$ by $v$, given as worst case input $v$, the subset $R$ of rows and the subset $C$ of columns.
  2. Provide a $o(m^2)$-time algorithm to multiply $F_{RC}$ by $v$, given as worst case input $v$ and the subset $C$ of columns, but given random set $R$ of rows. The algorithm should take time $o(m^2)$ in expectation or with high probability with respect to $R$.
  3. Same questions, but with Fourier replaced by another unitary, bounded-magnitude matrix of your choice.