Problem 61: RNA Folding

From Open Problems in Sublinear Algorithms
(Redirected from Waiting:Qin Zhang)
Jump to: navigation, search
Suggested by Qin Zhang
Source Bertinoro 2014
Short link https://sublinear.info/61

An RNA sequence is a string of letters from the alphabet , where \mbox{A---U} and \mbox{C---G} form pairings. A set of pairings in such a string is said to be non-crossing if there are no pairs of the form (i, j) and (k, l), where i < k < j < l.

A maximum non-crossing matching is a pairing of \mbox{A---U} and \mbox{C---G} of maximum cardinality that is non-crossing. Given a string of length n, such a matching can be computed in O(n^2) space and O(n^3) time via dynamic programming [Eddy-04].

Note that there is a trivial 2-approximation to the optimal matching. Find the optimal matchings on the \{\mbox{A}, \mbox{U}\} and \{\mbox{C}, \mbox{G}\} subsequences, and take the larger one. In particular, this implies that a 2-approximation to the size of the optimal non-crossing matching can be computed in O(1) space.

How well can the size of the optimal non-crossing matching be approximated in \operatorname{polylog}(n) space and a small number of passes?

(Additional question from Alex Andoni: Can the problem be solved in the RAM model with running time better than the trivial dynamic programming?)