Problem 67: Difficult Instance for Max-Cut in the Streaming Model

From Open Problems in Sublinear Algorithms
Jump to: navigation, search
Suggested by Robert Krauthgamer
Source Bertinoro 2014
Short link https://sublinear.info/67

We are interested in Max-Cut in the streaming model, and specifically in the tradeoff between approximation and space (storage) complexity. Formally, in the Max-Cut problem, the input is a graph $G$, and the goal is to compute the maximum number of edges that cross any single cut $(V_1,V_2)$. This is clearly equivalent to computing the least number of edges that need to be removed to make the graph bipartite. In the streaming model, we assume that the input graph is seen as a sequence of edges, in an arbitrary order, and the goal is to compute the Max-Cut value, i.e., the number of edges (or approximate it). There is no need to report the cut itself. For instance, it is easy to approximate Max-Cut within factor $1/2$ using $O(\log n)$ space, by simply counting the total number edges in the input and reporting $|E(G)|/2$.

Here is a concrete suggestion for a hard input distribution, which is known to be a hard instance for bipartiteness testing in sparse graphs [GoldreichR-02]. Let $G'$ be a graph consisting of a cycle of length $n$ (where $n$ is even) and a random matching. It is known that with high probability, $G'$ is an expander and at least $0.01 n$ edges have to be removed to turn it into a bipartite graph. Let $G$ be a graph consisting of a cycle of length $n$ and a random matching, with the constraint that the matching must consist only of odd chords: these are chords that are an odd number of vertices apart on the cycle. It is easy to see that $G$ is always bipartite.

The total number of edges in both $G'$ and $G$ is exactly $3n/2$. It is easy to see that

  • $G$ has a cut of size $3n/2$,
  • with high probability, $G'$ has no cut of size greater than $(3/2 - 0.01)n$.

How much space is required to distinguish between these two graphs in the streaming model? Is it $\Omega(n)$? And what about the (multi-round) communication complexity of the problem, namely, the edges of the input graph are split between two parties, Alice and Bob, who need to estimate the Max-Cut?