Difference between revisions of "Open Problems:17"

From Open Problems in Sublinear Algorithms
Jump to: navigation, search
m (1 revision)
m (updating header)
 
(One intermediate revision by the same user not shown)
Line 1: Line 1:
{{DISPLAYTITLE:Problem 17: The Massive, Unordered, Distributed-Data Model}}
+
{{Header
{{Infobox
+
|source=kanpur06
|label1 = Proposed by
+
|who=S. Muthkrishnan
|data1 = S. Muthkrishnan
 
|label2 = Source
 
|data2 = [[Workshops:Kanpur_2006|Kanpur 2006]]
 
|label3 = Short link
 
|data3 = http://sublinear.info/17
 
 
}}
 
}}
 
The Massive, Unordered, Distributed-data (MUD) model was recently introduced by Feldman et al. {{cite|FeldmanMSSS-06}} as an abstraction of part of the infrastructure used at Google. It is  related to the MapReduce framework presented in {{cite|DeanG-04}}. In the multi-round, multi-key MUD model, $n$ data records are distributed arbitrarily between $M$ machines. Each machine maps each record to (key, value) pairs. All pairs corresponding to the same key are then “reduced”  to a single record. This reduction is performed by an $O(\operatorname{polylog} n)$-space streaming computation. The process repeats for a total of $l$ rounds.
 
The Massive, Unordered, Distributed-data (MUD) model was recently introduced by Feldman et al. {{cite|FeldmanMSSS-06}} as an abstraction of part of the infrastructure used at Google. It is  related to the MapReduce framework presented in {{cite|DeanG-04}}. In the multi-round, multi-key MUD model, $n$ data records are distributed arbitrarily between $M$ machines. Each machine maps each record to (key, value) pairs. All pairs corresponding to the same key are then “reduced”  to a single record. This reduction is performed by an $O(\operatorname{polylog} n)$-space streaming computation. The process repeats for a total of $l$ rounds.
  
 
The model is very powerful and it was proven that any EREW-PRAM algorithm can be simulated in the multi-round, multi-key MUD model if the number of keys and rounds is sufficiently large {{cite|FeldmanMSSS-06}}. In practice we are primarily interested in computing with a small number of keys and rounds. What can be computed given $k$ keys and $l$ rounds?
 
The model is very powerful and it was proven that any EREW-PRAM algorithm can be simulated in the multi-round, multi-key MUD model if the number of keys and rounds is sufficiently large {{cite|FeldmanMSSS-06}}. In practice we are primarily interested in computing with a small number of keys and rounds. What can be computed given $k$ keys and $l$ rounds?

Latest revision as of 01:41, 7 March 2013

Suggested by S. Muthkrishnan
Source Kanpur 2006
Short link https://sublinear.info/17

The Massive, Unordered, Distributed-data (MUD) model was recently introduced by Feldman et al. [FeldmanMSSS-06] as an abstraction of part of the infrastructure used at Google. It is related to the MapReduce framework presented in [DeanG-04]. In the multi-round, multi-key MUD model, $n$ data records are distributed arbitrarily between $M$ machines. Each machine maps each record to (key, value) pairs. All pairs corresponding to the same key are then “reduced” to a single record. This reduction is performed by an $O(\operatorname{polylog} n)$-space streaming computation. The process repeats for a total of $l$ rounds.

The model is very powerful and it was proven that any EREW-PRAM algorithm can be simulated in the multi-round, multi-key MUD model if the number of keys and rounds is sufficiently large [FeldmanMSSS-06]. In practice we are primarily interested in computing with a small number of keys and rounds. What can be computed given $k$ keys and $l$ rounds?