Open Problems:57

From Open Problems in Sublinear Algorithms

Suggested by Atri Rudra
Source Dortmund 2012
Short link https://sublinear.info/57

Consider the problem of “codeword testing” in the data stream model. In particular, consider a code $C:\Sigma^k\rightarrow\Sigma^n$ with distance[1] $d$. The specific problem is the following:

The input to the problem is a vector $\mathbb{y}\in\Sigma^n$ and integer parameters $0\le \tau_1<\tau_2\le n$. The algorithm has to decide whether

\[\Delta(\mathbb{y},C)\le \tau_1~ \mathrm{ or }~ \Delta(\mathbb{y},C)\geq \tau_2,\]

where $\Delta(\mathbb{y},C)$ is the Hamming distance of $\mathbb{y}$ from the closest codeword in $C$.
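
To make the definitions above concrete, here is a toy brute-force illustration of $\Delta(\mathbb{y},C)$ and of the gap decision problem, written in Python for the 3-fold repetition code. It simply enumerates all $|\Sigma|^k$ messages, so it only pins down what is being asked and is nothing like a one-pass, small-space algorithm; all names in it are ad hoc.

from itertools import product

def hamming(u, v):
    """Hamming distance between two equal-length tuples."""
    return sum(a != b for a, b in zip(u, v))

def delta(y, encode, k, alphabet):
    """Delta(y, C): distance from y to the closest codeword, by brute force."""
    return min(hamming(y, encode(x)) for x in product(alphabet, repeat=k))

def gap_codeword_test(y, encode, k, alphabet, tau1, tau2):
    """Report whether Delta(y, C) <= tau1 or Delta(y, C) >= tau2.
    Inputs in the gap tau1 < Delta(y, C) < tau2 may be answered either way."""
    d = delta(y, encode, k, alphabet)
    return "close" if d <= tau1 else ("far" if d >= tau2 else "either answer is acceptable")

# Example: the 3-fold repetition code over {0, 1}, with k = 1, n = 3, distance d = 3.
def encode(x):
    return x * 3                                      # (b,) -> (b, b, b)

y = (0, 1, 0)                                         # one flip away from (0, 0, 0)
print(delta(y, encode, 1, (0, 1)))                    # -> 1
print(gap_codeword_test(y, encode, 1, (0, 1), 1, 3))  # -> close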

Ideally, we want a one-pass, $\log^{O(1)}{n}$-space algorithm that solves the problem above for some good code $C$ (that is, $k\ge \Omega(n)$ and $d\ge \Omega(n)$). Alternatively, if one proves a hardness result, it should hold for every good code $C$. (For the sake of simplicity, assume that the algorithm has access to some succinct description of the code $C$.)

The main technical motivation comes from the case where $\tau_1=0$ and $\tau_2\ge \epsilon n$ for any fixed $\epsilon>0$, but with only a constant number of queries to $\mathbb{y}$ (i.e., in the property testing world). This question is perhaps the central open question in the codeword testing literature. The case $\tau_1>0$ also makes sense in the property testing world and has been studied in [GuruswamiR-05]. (See that paper for some potential practical motivations.)

One of the original motivations (in [RudraU-10]) for studying the data-streaming version of the question was the possibility of using communication complexity results to prove the impossibility of good locally testable codes.

It was shown in [RudraU-10] that for the well-known Reed-Solomon codes, the data stream version of the problem can be solved for $\tau_1=0$ and $\tau_2=1$ with one pass and logarithmic space. It can also be shown that the classical Berlekamp-Massey algorithm for decoding Reed-Solomon codes implies a solution for the case $\tau_2=\tau_1+1$ with one pass and space $\tilde{O}(\tau_1)$[2]. Finally, [McGregorRU-11] showed how to solve this problem in one pass and $O(k\log{n})$ space.
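
To give a feel for how a one-pass, logarithmic-space test is possible in the easiest regime, here is a minimal Python sketch of a syndrome-fingerprinting test for exact membership (the $\tau_1=0$, $\tau_2=1$ case), written for the special case of a full-length Reed-Solomon code over a prime field $\mathbb{F}_p$ with evaluation points $0,1,\dots,p-1$ (so $n=p$). It relies on the classical fact that the dual of this code is again a Reed-Solomon code, so $\mathbb{y}$ is a codeword iff the syndromes $s_j=\sum_i i^j y_i$ vanish for all $j=0,\dots,n-k-1$; instead of storing all $n-k$ syndromes, the test streams the random combination $\sum_j r^j s_j$, which is identically zero on codewords and, on a non-codeword, is a nonzero polynomial in $r$ of degree at most $n-k-1$ and hence rarely vanishes at a random $r$. This is only an illustration under these simplifying assumptions and is not claimed to be the algorithm of [RudraU-10]; all names in it are ad hoc.

import random

def geometric_sum(x, m, p):
    """1 + x + ... + x^(m-1) mod p, in O(log m) field operations."""
    if x % p == 1:
        return m % p
    return (pow(x, m, p) - 1) * pow((x - 1) % p, -1, p) % p

def rs_membership_fingerprint(stream, n, k, p, trials=3):
    """One pass over stream = y_0, ..., y_(n-1) for the full-length Reed-Solomon
    code {(f(0), f(1), ..., f(p-1)) : deg f < k} over F_p (so n = p).
    Returns True for every codeword; for a non-codeword it returns False with
    probability at least 1 - ((n - k - 1) / p) ** trials.
    Working space: O(trials) field elements plus the stream position."""
    rvals = [random.randrange(p) for _ in range(trials)]  # random fingerprint points
    sums = [0] * trials
    for i, y_i in enumerate(stream):                      # the single pass
        for t, r in enumerate(rvals):
            # contribution of y_i to sum_j r^j * s_j, where s_j = sum_i i^j * y_i
            sums[t] = (sums[t] + y_i * geometric_sum(r * i % p, n - k, p)) % p
    return all(s == 0 for s in sums)

# Tiny usage example over F_7 with k = 3 and f(x) = 1 + 2x + 3x^2:
p, k = 7, 3
codeword = [(1 + 2 * x + 3 * x * x) % p for x in range(p)]
corrupted = list(codeword)
corrupted[4] = (corrupted[4] + 1) % p
print(rs_membership_fingerprint(iter(codeword), p, k, p))   # -> True
print(rs_membership_fingerprint(iter(corrupted), p, k, p))  # -> False with probability >= 1 - (3/7)^3

Each trial stores $O(1)$ field elements (about $\log n$ bits here, since $p=n$) and errs on non-codewords with probability at most $(n-k-1)/p$, so a few independent trials, or a single $r$ drawn from a larger extension field, suffice for small constant error while keeping the space logarithmic.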

This question is wide open:

Solve the problem above with one pass and $\tilde{O}(\min(k,\tau_1))$ space.

In fact, even the very special case of the problem above with $k=\tau_1=\sqrt{n}$, one pass, and space $o(\sqrt{n})$ is open. It remains open even for the special case of Reed-Solomon codes.

Notes

  1. The distance of a code $C$ is the minimum Hamming distance between any two codewords, i.e., $\min_{\mathbb{x}\neq \mathbb{x}'\in\Sigma^k} |\{i\in [n] : C(\mathbb{x})_i\neq C(\mathbb{x}')_i\}|$.
  2. There is a small catch: the algorithm actually computes the location of errors if the number of errors is at most $\tau_1$. However, results in [RudraU-10] can be used to verify if the returned error locations are indeed correct.