Difference between revisions of "Open Problems:4"

Suggested by	Sumit Ganguly
Source	Kanpur 2006
Short link	https://sublinear.info/4

Latest revision as of 01:37, 7 March 2013

Given a stream of elements of the form $(i,\delta)$ where $i\in [n]$ and $\delta\in \{-1,1\}$ define the frequency of an element to be $f_i=\sum_{(i,\delta)} \delta$. We wish to find estimates $\hat{f}_i$ for each $f_i$ such that \[|\hat{f}_i-f_i|\leq \epsilon L_1\] where $L_1=\sum_i |f_i|$. The Count-Min algorithm is a randomized $O(\epsilon^{-1} \log (mn) \log \delta^{-1})$-space algorithm that returns such estimates with probability $1-\delta$ [CormodeM-05]. This is nearly optimal as the space lower bound is $O(\epsilon^{-1} \log (m) \log \epsilon n)$ [Ganguly-06].

However, in practice it is desirable to have deterministic algorithms rather than randomized algorithms. Using a deterministic collection of primes [Muthukrishnan-06], Ganguly [Ganguly-06] devised a deterministic $O(\phi^{-2}\epsilon^{-1} \log^2(mn))$-space algorithm that returned all items $i$ with $|f_i|\geq \phi L_1$ and no $j$ satisfying $|f_j|\leq (1-\epsilon)\phi L_1$. While this algorithm has the advantage of being deterministic, it uses more space than the Count-Min algorithm. Does there exist a deterministic algorithm that uses the same amount of space as Count-Min? Such an algorithm would lead to space-efficient algorithms for a range of problems including hierarchical heavy hitters, estimating inner product sizes, approximately optimal $B$-bucket histograms etc. Unfortunately, we conjecture that no such algorithm exists. Either an algorithm or lower bound would be very interesting.

@@ Line 1: / Line 1: @@
-{{DISPLAYTITLE:Problem 4: Deterministic Summary Structures}}
+{{Header
-{{Infobox
+|source=kanpur06
-|label1 = Proposed by
+|who=Sumit Ganguly
-|data1 = Sumit Ganguly
-|label2 = Source
-|data2 = [[Workshops:Kanpur_2006|Kanpur 2006]]
-|label3 = Short link
-|data3 = http://sublinear.info/4
 }}
 Given a stream of elements of the form $(i,\delta)$ where $i\in [n]$ and $\delta\in \{-1,1\}$ define the frequency of an element to be $f_i=\sum_{(i,\delta)} \delta$. We wish to find estimates $\hat{f}_i$ for each $f_i$ such that

Difference between revisions of "Open Problems:4"

Latest revision as of 01:37, 7 March 2013

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools