-
Thursday, October 4, 2012:
Speaker:
Liu Yang
(CMU).
Topic:
Active Property Testing
One motivation for property testing of boolean functions is the idea that
testing can provide a fast preprocessing step before
learning. However, in most machine learning applications, it is not possible
to request labels for arbitrary examples constructed by an algorithm.
Instead, the dominant query paradigm
in applied machine learning, called active learning, is one where the algorithm
may query for labels, but only on points in a given (polynomial-sized)
unlabeled sample, drawn from some underlying distribution D.
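As a rough interface sketch of the two query models (hypothetical code, not from the paper): a membership-query tester may request the label of any point it constructs, whereas an active tester may only request labels of points from an unlabeled pool drawn from D.

# A minimal sketch contrasting the two query models described above; the domain,
# the oracle, and all parameters are made up for illustration.
import random

def membership_query_tester(label_oracle, budget):
    # free to fabricate query points anywhere in the domain (here: the unit interval)
    queries = [random.random() for _ in range(budget)]
    return [(x, label_oracle(x)) for x in queries]

def active_tester(draw_from_D, label_oracle, pool_size, budget):
    pool = [draw_from_D() for _ in range(pool_size)]   # unlabeled sample from D
    chosen = random.sample(pool, budget)               # labels only for pooled points
    return [(x, label_oracle(x)) for x in chosen]

# Toy usage: D = uniform on [0,1], target is a threshold function (made-up example).
print(active_tester(random.random, lambda x: int(x > 0.5), pool_size=100, budget=5))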
In this work, we bring this well-studied model to the domain of testing.
We develop both general results for this active testing model as
well as efficient testing algorithms for several important properties for learning,
demonstrating that testing can still yield substantial benefits in
this restricted setting. For example, we show that testing unions of
d intervals can be done with O(1) label requests in our setting,
whereas it is known to require Omega(d) labeled examples for
learning (and Omega(sqrt{d}) for passive testing [KR00] where
the algorithm must pay for every example drawn from D). In
fact, our results for testing unions of intervals also yield
improvements on prior work in both the classic query model (where any
point in the domain can be queried) and the passive testing model as
well. For the problem of testing linear separators in R^n over the
Gaussian distribution, we show that both active and passive testing
can be done with O(sqrt{n}) queries, substantially less than the
Omega(n) needed for learning, with near-matching lower bounds. We
also present a general combination result in this model for building
testable properties out of others, which we then use to provide
testers for a number of assumptions used in semi-supervised learning.
In addition to the above results, we also develop a general
notion of the testing dimension of a given property with respect
to a given distribution, which we show characterizes (up to
constant factors) the intrinsic number of label requests needed to
test that property. We develop such notions for both the active
and passive testing models. We then use these dimensions to prove a
number of lower bounds, including for linear separators and the class
of dictator functions.
Joint work with Nina Balcan, Eric Blais and Avrim Blum.
-
Thursday, November 29, 2012:
Speaker:
Dan Feldman
(MIT).
Topic:
Learning patterns in Big data from small data using core-sets
When we need to solve an optimization problem we usually use the best available
algorithm/software or try to improve it. In recent years we have started
exploring a different approach: instead of improving the algorithm, reduce the
input data and run the existing algorithm on the reduced data to obtain the
desired output much faster, on a streaming input, using a manageable amount of
memory, and in parallel (say, using Hadoop, a cloud service, or GPUs).
A core-set for a given problem is a semantic compression of its input, in the
sense that a solution for the problem with the (small) coreset as input yields
a provable approximate solution to the problem with the original (Big) data.
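As a toy illustration of this definition (assumptions: uniform sampling and a single fixed candidate solution; real coresets use importance sampling so that the guarantee holds for all candidate solutions at once), a small weighted sample can stand in for the big data when evaluating a clustering cost:

# A minimal sketch of the coreset idea: replace the big point set by a small weighted
# set on which the same cost function is (approximately) evaluated. Uniform sampling
# with weight n/|sample| gives an unbiased cost estimate for a FIXED set of centers;
# the example data and centers below are made up.
import random

def kmeans_cost(points, centers, weights=None):
    w = weights or [1.0] * len(points)
    return sum(wi * min((px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centers)
               for wi, (px, py) in zip(w, points))

random.seed(0)
big = [(random.gauss(0, 1), random.gauss(5, 1)) for _ in range(100000)]   # "Big" data
centers = [(0.0, 5.0), (3.0, 3.0)]                                        # fixed solution

sample = random.sample(big, 500)                      # small weighted proxy ("coreset")
weights = [len(big) / len(sample)] * len(sample)
print(kmeans_cost(big, centers), kmeans_cost(sample, centers, weights))   # close values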
In this talk I will describe how we applied this magical paradigm to obtain
algorithmic achievements with performance guarantees in iDiary: a system that
combines robotics, sensor networks, computer vision, differential privacy, and
text mining. It turns large signals collected from smartphone or robot
sensors into textual descriptions of the trajectories. The system features a
user interface similar to Google Search that allows users to type text queries
on their activities (e.g., "Where did I buy books?") and receive textual answers.
-
Thursday, December 6, 2012:
Speaker:
Madhav Jha
(Penn State).
Topic:
Testing and Reconstruction of Lipschitz Functions with Applications to
Data Privacy.
A function f : D → R is Lipschitz if d_R(f(x), f(y)) ≤ d_D(x, y) for
all x, y in D, where d_R and d_D denote the distance metrics on the
range and domain of f, respectively. We initiate the study of testing
and local reconstruction of the Lipschitz property of functions. A
property tester has to distinguish functions with the property (in
this case, Lipschitz) from functions that differ from every function
with the property on many values. A local filter reconstructs a
desired property (in this case, Lipschitz) in the following sense:
given an arbitrary function f and a query x, it returns g(x), where
the resulting function g satisfies the property, changing f only when
necessary. If f has the property, g must be equal to f.
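To make the definition concrete, here is a brute-force check of the Lipschitz property on {1, ..., n}^d with ell_1 distance (a minimal sketch with a made-up example; it reads the whole function, which is exactly what the testers and filters below avoid):

# Brute-force check of |f(x) - f(y)| <= ||x - y||_1 over all pairs of grid points.
from itertools import product

def is_lipschitz(f, n, d):
    points = list(product(range(1, n + 1), repeat=d))
    for x in points:
        for y in points:
            if abs(f(x) - f(y)) > sum(abs(a - b) for a, b in zip(x, y)):
                return False
    return True

# Example on {1, 2, 3}^2: g is Lipschitz in ell_1, h is not (it changes by 2 per step).
g = lambda x: x[0] + x[1]
h = lambda x: 2 * x[0]
print(is_lipschitz(g, 3, 2), is_lipschitz(h, 3, 2))   # True False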
We design efficient testers and local reconstructors for functions
over domains of the form {1, ..., n}^d, equipped with ell_1
distance, and give corresponding impossibility results. The algorithms
we design have applications to program analysis and data privacy. The
application to privacy is based on the fact that a function f of
entries in a database of sensitive information can be released with
noise of magnitude proportional to a Lipschitz constant of f, while
preserving the privacy of individuals whose data is stored in the
database (Dwork, McSherry, Nissim and Smith, TCC 2006). We give a
differentially private mechanism, based on local filters, for
releasing a function f when a purported Lipschitz constant of f is
provided by a distrusted client.
Based on a FOCS 2011 paper with S. Raskhodnikova and RANDOM 2012
papers with P. Awasthi, M. Molinaro and S. Raskhodnikova.
-
Wednesday, December 12, 2012:
Speaker:
Grigory Yaroslavtsev
(Penn State).
Topic:
Beating the Direct Sum Theorem in Communication Complexity
A direct sum theorem for two parties and a function f states that the communication cost of solving k copies of f simultaneously with error probability 1/3 is at least k * R_{1/3}(f), where R_{1/3}(f) is the communication required to solve a single copy of f with error probability 1/3. We improve this for a natural family of functions f, showing that the 1-way communication required to solve k copies of f simultaneously with error probability 1/3 is Omega(k * R_{1/k}(f)). Since R_{1/k}(f) may be as large as R_{1/3}(f) * log k, we asymptotically beat the standard direct sum bound for such functions, showing that the trivial upper bound of solving each of the k copies of f with probability 1 - O(1/k) and taking a union bound is optimal.
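For intuition, the trivial upper bound mentioned above comes from standard amplification plus a union bound (background reasoning, not an additional claim of the paper). Writing R_{1/3}(f^k) for the 1-way cost of solving all k copies with overall error 1/3,

    R_{1/3}(f^k)  <=  k * R_{c/k}(f)  <=  O(log k) * k * R_{1/3}(f),

since error c/k per copy makes the union bound over the k copies work, and error c/k is reachable from error 1/3 by O(log k) independent repetitions followed by a majority vote. The result above shows that the first step is essentially tight for the functions considered.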
Our results imply optimal communication/space lower bounds for several sketching problems in a setting when the algorithm should be correct on a sequence of k queries.
-
Thursday, January 3, 2013:
Speaker:
Morteza Zadimoghaddam
(MIT).
Topic:
Learning Disjunctions: Near-Optimal Trade-off Between Mistakes and "I Don't Know"s
We develop polynomial-time online algorithms for learning
disjunctions. In this model, we are given an online adversarial
sequence of inputs for an unknown disjunction function of the form
f(x_1, x_2, ..., x_n) = OR_{i \in S} x_i , and for each such input, we
must guess "true", "false", or "I don't know". On the algorithm side,
we show how to make at most \epsilon n mistakes while answering "I don't
know" at most (1/\epsilon)^{2^{O(1/\epsilon)}} n times. Furthermore,
we show how to make O(n log(log(n)) / log(n)) mistakes while answering "I
don't know" O(n^2 log(log(n))) times. We also show that any algorithm
making o(n/log(n)) mistakes must answer "I don't know" a
superpolynomial number of times.
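For concreteness, here is a naive strategy in this model (a hypothetical baseline, not the paper's algorithm): maintain the set of variables that could still belong to S, answer "false" only when that is certainly correct, and say "I don't know" otherwise, so it makes zero mistakes at the cost of many "I don't know"s.

# A minimal sketch of the online model with a third answer. S is always a subset of
# the candidate set C, so the learner's "false" answers are never wrong.
def make_learner(n):
    C = set(range(n))                      # variables that could still be in S

    def predict(x):                        # x is a 0/1 tuple of length n
        if all(x[i] == 0 for i in C):      # no candidate variable is on
            return "false"                 # then surely OR_{i in S} x_i = 0
        return "I don't know"

    def update(x, label):                  # label is the true value of f(x)
        if label is False:                 # every variable that was on is not in S
            C.difference_update(i for i in range(n) if x[i] == 1)

    return predict, update

# Tiny usage example with a hidden disjunction over x_0 and x_2 (made-up instance).
n, S = 4, {0, 2}
predict, update = make_learner(n)
for x in [(0, 1, 0, 0), (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 0, 0)]:
    truth = any(x[i] for i in S)
    print(x, "->", predict(x), "(truth:", truth, ")")
    update(x, truth)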
This is joint work with Erik Demaine.
-
Wednesday, February 6, 2013:
Speaker:
Raghu Meka
(IAS).
Topic:
Beating the Union Bound via Geometric Techniques
The union bound is one of the mainstays of the probabilistic
method and analysis of randomized algorithms. However, this simplistic
approach does not give the full picture for many important cases, with
the Lovasz local lemma being a particularly salient example. In this talk I
will discuss two recent results that go beyond the union bound via
geometric techniques:
1) Optimal size, explicit eps-nets for computing Gaussian integrals.
Besides being interesting on its own, our result has applications to
computing the supremum of Gaussian processes and cover times of graphs.
2) A new elementary and constructive proof of Spencer's celebrated six
standard deviations suffice theorem. We introduce a new 'entropic'
rounding technique that can be used in a variety of algorithmic settings and
appears to have much broader potential.
For both problems, our methods critically exploit certain geometric and
symmetry properties of the Gaussian space.
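As background for the second result (standard facts, not claims from the abstract): for a system of n sets over n elements, a uniformly random coloring x in {-1,+1}^n together with a Chernoff bound and a union bound over the n sets gives discrepancy

    max_j | sum_{i in S_j} x_i |  =  O(sqrt(n log n))

with high probability, whereas Spencer's theorem guarantees a coloring with discrepancy at most 6*sqrt(n).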
-
Wednesday, February 20, 2013:
Speaker:
Andrew V. Goldberg
(Microsoft Research -- Silicon Valley).
Topic:
Highway Dimension: From Practice to Theory and Back
Computing driving directions has motivated many shortest path heuristics that answer queries on continental-scale networks, with tens of millions of intersections, in real time. The underlying algorithms are highly practical and intuitive, but until recently there was no theoretical explanation of their practical performance. We review some of these algorithms and give the first theoretical analysis for them on a non-trivial class of networks.
For our analysis, we introduce the notion of highway dimension, which is a strengthening of the doubling dimension. The analysis gives good bounds for networks with low highway dimension. Our analysis explores an unexpected relationship to VC-dimension, which leads to better algorithms.
We also show that the hub labeling algorithm achieves a better theoretical time bound. This motivates a heuristic implementation of this algorithm. Our experiments show that the implementation outperforms the fastest previous codes, validating the theoretical prediction.
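For context, the query side of hub labeling is simple (a minimal sketch of the standard technique with a hand-built toy label set, not the talk's implementation): every vertex stores distances to a small set of hubs, chosen so that every pair of vertices shares a hub on some shortest path between them.

def hub_query(L, s, t):
    # distance query: minimize dist(s, h) + dist(h, t) over hubs h common to both labels
    common = L[s].keys() & L[t].keys()
    if not common:
        return float("inf")
    return min(L[s][h] + L[t][h] for h in common)

# Toy labels for the path graph 0 - 1 - 2 - 3 (hand-built, just to show the query).
L = {
    0: {0: 0, 1: 1},
    1: {1: 0},
    2: {1: 1, 2: 0},
    3: {1: 2, 3: 0},
}
print(hub_query(L, 0, 3))   # 3, the length of the path 0 - 1 - 2 - 3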
Joint work with Ittai Abraham, Daniel Delling, Amos Fiat, and Renato Werneck.
-
Friday, March 8, 2013:
Speaker:
Sidharth (Sid) Jaggi
(The Chinese University of Hong Kong).
Topic:
Robust sparse recovery and applications: order-optimal measurements and complexity
Sparse recovery problems usually concern a setting in which a vector x with ambient dimension n has only k "significant" entries. This vector x is "compressively measured" (with possibly "noisy" measurements) as y, where y has just m ≪ n components, and the goal is to reconstruct (a good approximation to) x.
We consider three sparse recovery problems:
- Compressive Sensing: A linear problem in which y is a (possibly noisy) linear transform of x, a vector in R^n.
- Group Testing: A non-linear problem in which y is a binary vector comprising (possibly noisy) disjunctive measurements of a binary vector x (see the sketch at the end of this abstract).
- Network tomography: A linear problem in which x, a vector in R^n, denotes network parameters (such as edge/node delays) to be estimated via constrained, end-to-end measurements (such as path delays).
In each of these cases we present sparse recovery algorithms that have good reconstruction performance, and have information-theoretically order-optimal decoding complexity and number of measurements (O(k) measurements and decoding complexity for compressive sensing and network tomography, and O(k log(n)) measurements and decoding complexity for group testing.)
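The disjunctive measurement model from the group testing bullet, in a minimal sketch (the pools and the signal are made up; the talk's algorithms recover x from m ≪ n such measurements, possibly with noise):

def group_test(x, pools):
    # each measurement is the OR of the entries of x that fall into that test's pool
    return [int(any(x[i] for i in pool)) for pool in pools]

x = [0, 1, 0, 0, 1, 0, 0, 0]                 # n = 8, k = 2 "defective" items
pools = [[0, 1, 2], [2, 3, 4], [5, 6, 7], [1, 4, 7]]
print(group_test(x, pools))                  # [1, 1, 0, 1]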
-
Thursday, March 21, 2013, at 2pm:
Speaker:
Vijay Vazirani
(Georgia Tech).
Topic:
Non-separable utilities are as easy as separable ones -- if they are Leontief-free
Conventional wisdom has it that computing an equilibrium for a
market under non-separable utilities is difficult -- even if
restricted to the piecewise-linear, concave (PLC) case, for which
irrationality sets in and no fast algorithms are known. In contrast,
recent results have shown that markets under separable PLC utilities
have rational equilibria and admit practical algorithms.
Using the powerful machinery of LCP formulations and complementary
pivot algorithms, we show that a subclass of non-separable PLC, which
we call Leontief-free PLC, is just as easy as separable PLC.
Furthermore, adding even one Leontief-type constraint renders it
difficult.
The same applies to production, where by separable PLC production we
mean using one raw good to produce one finished good in a
piecewise-linear, concave manner.
Based on joint work with Jugal Garg and Ruta Mehta.
-
Wednesday, April 10, 2013:
Speaker:
Ping Li
(Cornell).
Topic:
Efficient Compressed Sensing with L0 Projections
Many applications concern sparse signals, for example, detecting anomalies from the differences between consecutive images taken by surveillance cameras. In general, anomaly events are sparse. This talk focuses on the problem of recovering a K-sparse signal in N dimensions (coordinates). Classical theories in compressed sensing say the required number of measurements is M = O(K log N). In our most recent work on L0 projections, we show that an idealized algorithm needs about M = 5K measurements, regardless of N. In particular, 3 measurements suffice when K = 2 nonzeros. Practically, our method is very fast, accurate, and very robust against measurement noise. Even when there are not enough measurements, the algorithm can still accurately reconstruct a significant portion of the nonzero coordinates, without catastrophic failures (unlike popular methods such as linear programming). This is joint work with Cun-Hui Zhang at Rutgers University.
-
Wednesday, April 17, 2013:
Speaker:
Arnab Bhattacharyya
(DIMACS/Rutgers).
Topic:
Every locally characterized affine-invariant property is testable
Let F = F_p for any fixed prime p >= 2. An affine-invariant property is a property of functions on F^n that is closed under taking affine transformations of the domain. We prove that all affine-invariant properties having local characterizations are testable. In fact, we show a proximity-oblivious test for any such property P, meaning that there is a test that, given an input function f, makes a constant number of queries to f, always accepts if f satisfies P, and rejects with positive probability if the distance between f and P is nonzero. More generally, we show that any affine-invariant property that is closed under taking restrictions to subspaces and has bounded complexity is testable.
We also prove that any property that can be described as the property of decomposing into a known structure of low-degree polynomials is locally characterized and is, hence, testable. For example, whether a function is a product of two degree-d polynomials, whether a function splits into a product of d linear polynomials, and whether a function has low rank are all examples of degree-structural properties and are therefore locally characterized.
Our results depend on a new Gowers inverse theorem by Tao and Ziegler for low characteristic fields that decomposes any polynomial with large Gowers norm into a function of low-degree non-classical polynomials. We establish a new equidistribution result for high rank non-classical polynomials that drives the proofs of both the testability results and the local characterization of degree-structural properties.
Joint work with Eldar Fischer, Hamed Hatami, Pooya Hatami, and Shachar Lovett.
-
Wednesday, April 24, 2013 at 3pm:
Speaker:
Michael Forbes
(MIT).
Topic:
Explicit Noether Normalization for Simultaneous Conjugation via Polynomial Identity Testing
Polynomial identity testing asks whether a given algebraic circuit C computes the identically zero polynomial. Randomized algorithms are known for this problem, via the Schwartz-Zippel lemma, and in recent years there has been much effort to find a deterministic algorithm, even for restricted types of circuits C. Such algorithms would have several algorithmic applications, as well as implications for circuit lower bounds.
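For reference, the randomized test given by the Schwartz-Zippel lemma is simple (a minimal sketch with a made-up example circuit; the talk is about removing this randomness for restricted circuit classes): a nonzero polynomial of degree at most d, evaluated at a point drawn uniformly from a set S, vanishes with probability at most d/|S|.

import random

def schwartz_zippel_test(circuit, num_vars, degree_bound, trials=20):
    S = range(10 * degree_bound)              # |S| >= 10 * degree, so error <= 1/10 per trial
    for _ in range(trials):
        point = [random.choice(S) for _ in range(num_vars)]
        if circuit(point) != 0:
            return "nonzero"                  # any nonzero evaluation certifies nonzeroness
    return "probably identically zero"

# (x+y)^2 - x^2 - 2xy - y^2 is the zero polynomial; (x+y)^2 - x^2 - y^2 is not.
zero = lambda v: (v[0] + v[1]) ** 2 - v[0] ** 2 - 2 * v[0] * v[1] - v[1] ** 2
nonzero = lambda v: (v[0] + v[1]) ** 2 - v[0] ** 2 - v[1] ** 2
print(schwartz_zippel_test(zero, 2, 2), schwartz_zippel_test(nonzero, 2, 2))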
Mulmuley recently observed that the "Noether Normalization Lemma" of commutative algebra/algebraic geometry can be "derandomized" assuming a deterministic algorithm for polynomial identity testing, and argued that this is further evidence that constructing such algorithms is hard. He also conjectured that the specific case of Noether Normalization (for "matrices under simultaneous conjugation") would still be hard. In this work, we derandomize this specific case of Noether Normalization, using a recent polynomial identity testing algorithm of the authors for the class of read-once oblivious algebraic branching programs.
Joint work with Amir Shpilka.
-
Wednesday, May 8, 2013:
Speaker:
Zeyuan Allen Zhu
(MIT).
Topic:
A Simple, Combinatorial and Nearly-Linear-Time Algorithm for Solving Ax=b When A is Symmetric Diagonally Dominant
Fast algorithms for solving linear systems Ax=b when A is symmetric diagonally dominant (SDD) or graph Laplacian have found broad applications across both the theory and practice of computer science, playing a foundational role in scientific computing, machine learning, random processes, image processing, network analysis, and computational biology, among others. More recently, they have emerged as a powerful tool in the design of graph algorithms, enabling researchers to break longstanding algorithmic barriers for a wide range of fundamental graph problems, including finding maximum flows and multicommodity flows, generating random spanning trees, sparsifying graphs, routing in distributed systems, and finding sparse cuts and balanced separators.
Finding fast solvers for these systems has been an active area of research, culminating in algorithms that solve them in nearly-linear time. These solvers are quite complex, and developing them required multiple fundamental innovations in spectral and combinatorial graph theory, graph algorithms, and computational linear algebra, which were used to construct and analyze an intricate recursively preconditioned iterative solver.
In this talk, I will give a very simple combinatorial algorithm that solves SDD systems in nearly-linear time. It uses very little of the machinery that previously appeared to be necessary for such an algorithm. It does not require recursive preconditioning, or even the Chebyshev Method or Conjugate Gradient. After constructing a "nice" spanning tree of a graph associated to the linear system, the entire algorithm consists of the repeated application of a simple (non-recursive) update rule, which it implements using a lightweight data structure. The algorithm is numerically stable and can be implemented without the increased bit-precision required by previous solvers. As such, the algorithm presented has the fastest known running time under the standard unit-cost RAM model.
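To give a feel for what such an update rule can look like, here is a naive sketch in the electrical-flow view (made-up example data, O(n)-time cycle updates instead of the paper's lightweight data structure, and no claim to reproduce the authors' exact algorithm): route the demands on the spanning tree, then repeatedly cancel the net voltage drop around the cycle closed by an off-tree edge, and finally read potentials off the tree.

# Approximately solves Lx = b for the Laplacian of a small made-up graph.
import random

n = 4
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0), (0, 2, 1.0)]  # (u, v, resistance)
b = [1.0, 0.0, 0.0, -1.0]                      # demands; must sum to zero

parent = {0: None, 1: 0, 2: 1, 3: 2}           # spanning tree rooted at 0
tree_edges = {(1, 0), (2, 1), (3, 2)}
is_tree_edge = lambda u, v: (u, v) in tree_edges or (v, u) in tree_edges

resistance, flow = {}, {}
for u, v, r in edges:
    resistance[(u, v)] = resistance[(v, u)] = r
    flow[(u, v)] = 0.0                         # flow[(u, v)] is current flowing u -> v

def add_flow(u, v, amount):
    if (u, v) in flow: flow[(u, v)] += amount
    else: flow[(v, u)] -= amount

def signed_flow(u, v):
    return flow[(u, v)] if (u, v) in flow else -flow[(v, u)]

def tree_path(u):                              # vertices from u up to the root
    path = [u]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

# 1. Initialize a feasible flow by routing every demand to the root along the tree.
for s in range(n):
    path = tree_path(s)
    for a, c in zip(path, path[1:]):
        add_flow(a, c, b[s])

# 2. Repeatedly cancel the net voltage drop on the cycle closed by an off-tree edge.
off_tree = [(u, v) for u, v, _ in edges if not is_tree_edge(u, v)]

def cycle_of(u, v):                            # off-tree edge u->v, then tree path v->...->u
    pu, pv = tree_path(u), tree_path(v)
    common = next(x for x in pv if x in pu)    # lowest common ancestor (naive search)
    up = pv[:pv.index(common) + 1]
    down = list(reversed(pu[:pu.index(common) + 1]))
    verts = [u] + up + down[1:]
    return list(zip(verts, verts[1:]))

for _ in range(300):
    u, v = random.choice(off_tree)
    cyc = cycle_of(u, v)
    drop = sum(resistance[e] * signed_flow(*e) for e in cyc)
    total_r = sum(resistance[e] for e in cyc)
    for a, c in cyc:
        add_flow(a, c, -drop / total_r)

# 3. Read potentials off the tree; x solves Lx = b up to an additive constant.
x = [0.0] * n
for s in sorted(range(n), key=lambda s: len(tree_path(s))):
    if parent[s] is not None:
        p = parent[s]
        x[s] = x[p] - resistance[(p, s)] * signed_flow(p, s)
print(x)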
This is joint work with Jonathan Kelner, Lorenzo Orecchia and Aaron Sidford.
-
Wednesday, May 15, 2013:
Speaker:
Or Meir
(Stanford).
Topic:
IP = PSPACE using error correcting codes
The IP theorem, which asserts that IP = PSPACE (Lund et al. and
Shamir, in J. ACM 39(4)), is one of the major achievements of
complexity theory. The known proofs of the theorem are based on the
arithmetization technique, which transforms a quantified Boolean
formula into a related polynomial. The intuition that underlies the
use of polynomials is commonly explained by the fact that polynomials
constitute good error correcting codes. However, the known proofs seem
tailored to the use of polynomials, and do not generalize to arbitrary
error correcting codes.
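For background, the standard arithmetization rules map a quantified Boolean formula phi to a polynomial A_phi that agrees with phi on 0/1 inputs (stated here only for concreteness; this is textbook material, not part of the abstract):

    NOT phi        ->  1 - A_phi
    phi AND psi    ->  A_phi * A_psi
    phi OR psi     ->  1 - (1 - A_phi) * (1 - A_psi)
    forall x phi   ->  prod_{x in {0,1}} A_phi
    exists x phi   ->  1 - prod_{x in {0,1}} (1 - A_phi)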
In this work, we show that the IP theorem can be proved by using
general error correcting codes. We believe that this establishes a
rigorous basis for the aforementioned intuition, and sheds further
light on the IP theorem.
-
Wednesday, May 22, 2013:
Speaker:
Vladimir Braverman
(Johns Hopkins University).
Topic:
Approximating Large Frequency Moments with Pick-and-Drop Sampling
Given data stream $D = \{p_1,p_2,...,p_m\}$ of size $m$ of numbers from
$\{1,..., n\}$, the frequency of $i$ is defined as $f_i = |\{j: p_j =
i\}|$. The $k$-th \emph{frequency moment} of $D$ is defined as $F_k =
\sum_{i=1}^n f_i^k$. We consider the problem of approximating frequency
moments in insertion-only streams for $k\ge 3$. For any constant $c$ we
show an $O(n^{1-2/k}\log(n)\log^{(c)}(n))$ upper bound on the space
complexity of the problem. Here $\log^{(c)}(n)$ is the iterated $\log$
function. To simplify the presentation, we make the following assumptions:
$n$ and $m$ are polynomially far; approximation error $\epsilon$ and
parameter $k$ are constants. We observe a natural bijection between
streams and special matrices. Our main technical contribution is a
non-uniform sampling method on matrices. We call our method a
\emph{pick-and-drop sampling}; it samples a heavy element (i.e., element
$i$ with frequency $\Omega(F_k)$) with probability $\Omega(1/n^{1-2/k})$
and gives approximation $\tilde{f_i} \ge (1-\epsilon)f_i$. In addition,
the estimates never exceed the true values, that is, $\tilde{f_j} \le
f_j$ for all $j$. As a result, we reduce the space complexity of finding a
heavy element to $O(n^{1-2/k}\log(n))$ bits. We apply our method of
recursive sketches and resolve the problem with
$O(n^{1-2/k}\log(n)\log^{(c)}(n))$ bits.
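For reference, the quantity being approximated, computed here exactly by brute force (which of course ignores the streaming and space constraints that the talk is about):

from collections import Counter

def frequency_moment(stream, k):
    freq = Counter(stream)                   # f_i = |{j : p_j = i}|
    return sum(f ** k for f in freq.values())

D = [1, 2, 2, 3, 3, 3]                       # made-up stream
print(frequency_moment(D, 3))                # 1^3 + 2^3 + 3^3 = 36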
-
Friday, July 12, 2013:
Speaker:
Atri Rudra
(University at Buffalo).
Topic:
One algorithm to rule them all: One join query at a time
We present a recent algorithm (PODS 2012) that is the
first provably optimal (worst-case) algorithm to compute database joins.
As a special case, we show that this join algorithm implies (i) The
first algorithmic versions of some well-known geometric inequalities due
to Loomis and Whitney (and their generalizations by Bollobas and
Thomason); (ii) Algorithmically list recoverable codes that work with
parameters that no known algorithmic list recovery result works with
(e.g. those based on the Reed-Solomon codes) and an application of this
result in designing sublinear time decodable compressed sensing schemes;
(iii) Worst-case optimal algorithm to list all occurrences of any fixed
hypergraph H in a given large hypergraph G.
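To illustrate the flavor of worst-case optimal joins, here is a generic-join style sketch for the triangle query Q(a,b,c) = R(a,b), S(b,c), T(a,c), with made-up relations (not necessarily the exact PODS 2012 algorithm): intersect attribute by attribute instead of joining pairwise, so the work stays within the O(N^{3/2}) worst-case output bound rather than the Theta(N^2) intermediate results a pairwise plan can produce.

def triangles(R, S, T):
    R_by_a, S_by_b, T_by_a = {}, {}, {}
    for a, b in R: R_by_a.setdefault(a, set()).add(b)
    for b, c in S: S_by_b.setdefault(b, set()).add(c)
    for a, c in T: T_by_a.setdefault(a, set()).add(c)
    out = []
    for a in R_by_a.keys() & T_by_a.keys():        # candidate values for a
        for b in R_by_a[a] & S_by_b.keys():        # b must appear with a in R and in S
            for c in S_by_b[b] & T_by_a[a]:        # c must close the triangle
                out.append((a, b, c))
    return out

R = {(1, 2), (1, 3)}; S = {(2, 4), (3, 4)}; T = {(1, 4)}
print(triangles(R, S, T))    # e.g. [(1, 2, 4), (1, 3, 4)]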
We believe that this algorithm should find many more applications. (If
time permits, I'll also mention some followup work on instance optimal
join algorithms.)
This talk will focus on (i) and (ii) and is based on joint works with
Gilbert, Ngo, Nguyen, Porat, Re and Strauss.