Open Problems:57
Latest revision as of 02:00, 7 March 2013
Suggested by | Atri Rudra |
---|---|
Source | Dortmund 2012 |
Short link | https://sublinear.info/57 |
Consider the problem of “codeword testing” in the data stream model. In particular, consider a code $C:\Sigma^k\rightarrow\Sigma^n$ with distance[1] $d$. The specific problem is the following:
The input to the problem is a vector $\mathbb{y}\in\Sigma^n$ and integer parameters $0\le \tau_1<\tau_2\le n$. The algorithm has to decide whether
\[\Delta(\mathbb{y},C)\le \tau_1 \quad\text{or}\quad \Delta(\mathbb{y},C)\geq \tau_2,\]
where $\Delta(\mathbb{y},C)$ is the Hamming distance of $\mathbb{y}$ from the closest codeword in $C$.
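As a reference point for the definition above, the decision problem can be written out by brute force for a toy code. This is only a sketch of the promise problem itself, not a streaming algorithm: it enumerates every codeword and therefore uses exponential time and space. The toy code (3-fold repetition of a 2-bit message) and the names `hamming`, `distance_to_code`, `gap_test`, `encode`, and `CODE` are assumptions made for illustration and are not a good code in the sense used here.

```python
from itertools import product


def hamming(u, v):
    """Number of positions in which the equal-length vectors u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def distance_to_code(y, codewords):
    """Delta(y, C): Hamming distance from y to the closest codeword."""
    return min(hamming(y, c) for c in codewords)


def gap_test(y, codewords, tau1, tau2):
    """Brute-force answer to the promise problem.

    Returns "close" if Delta(y, C) <= tau1, "far" if Delta(y, C) >= tau2,
    and None when y falls in the gap, where any answer is acceptable.
    """
    assert 0 <= tau1 < tau2 <= len(y)
    delta = distance_to_code(y, codewords)
    if delta <= tau1:
        return "close"
    if delta >= tau2:
        return "far"
    return None


def encode(msg):
    """Hypothetical toy encoder: repeat every message symbol three times."""
    return tuple(b for b in msg for _ in range(3))


# All codewords of the toy code C: {0,1}^2 -> {0,1}^6 (distance d = 3).
CODE = [encode(m) for m in product((0, 1), repeat=2)]
```

For instance, `gap_test((0, 1, 0, 1, 1, 1), CODE, 0, 1)` reports "far" because the input is one symbol away from the nearest codeword, while the same input with `tau1 = 1` and `tau2 = 2` is "close".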
Ideally, we want a one-pass, $\log^{O(1)}{n}$-space algorithm that solves the problem above for some good code $C$ (that is, a code with $k\ge \Omega(n)$ and $d\ge \Omega(n)$). Alternatively, if the goal is a hardness result, one would like it to hold for every good code $C$. (For the sake of simplicity, assume that the algorithm has access to some succinct description of the code $C$.)
The main technical motivation comes from the case when $\tau_1=0$ and $\tau_2\ge \epsilon n$ for any fixed $\epsilon>0$, but with a constant number of queries to $\mathbb{y}$ (i.e., in the property testing world). This is perhaps the main open question in the codeword testing literature. The case of $\tau_1>0$ also makes sense in the property testing world and has been studied [GuruswamiR-05]. (See the paper for some potential practical motivations.)
One of the original motivations (in [RudraU-10]) for studying the data-stream version of the question was the possibility of using communication complexity results to prove the impossibility of good locally testable codes.
It was shown in [RudraU-10] that for the well-known Reed-Solomon codes, the data stream version of the problem can be solved for $\tau_1=0$ and $\tau_2=1$ with one pass and logarithmic space. It can also be shown that the classical Berlekamp-Massey algorithm for decoding Reed-Solomon codes implies a solution for the case $\tau_2=\tau_1+1$ with one pass and space $\tilde{O}(\tau_1)$[2]. Finally, [McGregorRU-11] showed how to solve this problem in one pass and $O(k\log{n})$ space. The following question is wide open:
Solve the problem above with one pass and $\tilde{O}(\min(k,\tau_1))$ space.
In fact, even the very special case of the problem above with $k=\tau_1=\sqrt{n}$, one pass, and space $o(\sqrt{n})$ is open. It remains open even when $C$ is restricted to Reed-Solomon codes.
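To give a feel for the easiest case mentioned above ($\tau_1=0$, $\tau_2=1$ for Reed-Solomon codes), here is a minimal sketch of a one-pass membership test based on fingerprinting the syndrome vector. It is an illustration under stated assumptions and not necessarily the construction of [RudraU-10]. The assumptions: the code is over the prime field $\mathbb{F}_p$ with $p=257$, the evaluation points are the powers of an element $\omega$ of order $n=16$, and $k=4$; all names (`P`, `N`, `K`, `OMEGA`, `rs_encode`, `geometric_sum`, `streaming_fingerprint`) are hypothetical. With these evaluation points, $\mathbb{y}$ is a codeword exactly when the syndromes $S_j=\sum_i y_i\omega^{ij}$ vanish for $j=1,\dots,n-k$, and the sketch maintains the single field element $\sum_{j} r^j S_j$ for a random $r$.

```python
import random

P = 257                            # toy prime field size (assumption)
N = 16                             # block length; must divide P - 1
K = 4                              # message length (polynomials of degree < K)
OMEGA = pow(3, (P - 1) // N, P)    # element of order N (3 is a primitive root mod 257)


def rs_encode(msg):
    """Evaluate the polynomial with coefficient list msg at OMEGA^0, ..., OMEGA^(N-1)."""
    assert len(msg) == K
    return [sum(c * pow(OMEGA, i * m, P) for m, c in enumerate(msg)) % P
            for i in range(N)]


def geometric_sum(z, t):
    """Return z + z^2 + ... + z^t modulo P."""
    z %= P
    if z == 1:
        return t % P
    # Closed form z (z^t - 1) / (z - 1); the inverse uses Fermat's little theorem.
    return z * (pow(z, t, P) - 1) * pow(z - 1, P - 2, P) % P


def streaming_fingerprint(stream, r):
    """One pass over the symbols y_0, y_1, ...; keeps only two field elements.

    Computes sum_{j=1}^{N-K} r^j S_j, which is 0 for every codeword.
    """
    acc = 0   # running fingerprint
    x = 1     # current evaluation point OMEGA^i
    for y_i in stream:
        acc = (acc + y_i * geometric_sum(r * x, N - K)) % P
        x = x * OMEGA % P
    return acc


def looks_like_codeword(stream, rng=random):
    """Accept iff the fingerprint at a random nonzero r is zero (one-sided error)."""
    return streaming_fingerprint(stream, rng.randrange(1, P)) == 0
```

Codewords are always accepted. For a non-codeword the fingerprint is a nonzero polynomial in $r$ of degree at most $n-k$ that vanishes at $r=0$, so at most $n-k-1$ of the $p-1$ nonzero choices of $r$ lead to a false accept. In this toy setting that bound is $11/256$; a meaningful guarantee would need a field much larger than $n-k$, which this sketch does not address.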
Notes
- [1] The distance of a code $C$ is the minimum Hamming distance between any two codewords, i.e., $\min_{\mathbb{x}\neq \mathbb{y}\in\Sigma^k} |\{i\in [n] \mid C(\mathbb{x})_i\neq C(\mathbb{y})_i\}|$.
- [2] There is a small catch: the algorithm actually computes the location of errors if the number of errors is at most $\tau_1$. However, results in [RudraU-10] can be used to verify if the returned error locations are indeed correct.