Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, to Web search and advertising. In many of these application domains the learner may be constrained by one or more supply (or budget) limits, in addition to the customary limitation on the time horizon. The literature lacks a general model encompassing these sorts of problems. We introduce such a model, called "bandits with knapsacks", which combines bandit learning with aspects of stochastic integer programming. In particular, a bandit algorithm needs to solve a stochastic version of the well-known "knapsack problem", which is concerned with packing items into a limited-size knapsack. A distinctive feature of our problem, in comparison to the existing regret-minimization literature, is that the optimal policy for a given latent distribution may significantly outperform the policy that plays the optimal fixed arm. Consequently, achieving sublinear regret in the bandits-with-knapsacks problem is significantly more challenging than in conventional bandit problems. We present two algorithms whose reward is close to the information-theoretic optimum: one is based on a novel "balanced exploration" paradigm, while the other is a primal-dual algorithm that uses multiplicative updates. Further, we prove that the regret achieved by both algorithms is optimal up to polylogarithmic factors. We illustrate the generality of the problem by presenting applications in a number of different domains, including electronic commerce, routing, and scheduling. As one example of a concrete application, we consider the problem of dynamic posted pricing with limited supply and obtain the first algorithm whose regret, with respect to the optimal dynamic policy, is sublinear in the supply.
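To make the supply constraint concrete, here is a minimal simulation sketch (not from the paper): each arm yields a Bernoulli reward and consumes a fixed amount of a single resource, and an episode ends when either the time horizon or the budget runs out. The round-robin policy, arm parameters, and budget below are illustrative assumptions only.

```python
import random

def run_bwk_episode(arms, budget, horizon, seed=0):
    """Toy bandits-with-knapsacks episode.

    Each arm is a (reward_prob, cost) pair: a pull yields a Bernoulli
    reward and consumes a deterministic amount of the single resource.
    The policy here is naive round-robin (a placeholder, not one of the
    paper's algorithms); the point is only that the supply limit, not
    just the horizon, can end the episode.
    """
    rng = random.Random(seed)
    total_reward, remaining = 0.0, budget
    for t in range(horizon):
        p, cost = arms[t % len(arms)]      # round-robin over arms
        if cost > remaining:               # supply exhausted: stop early
            break
        remaining -= cost
        total_reward += 1.0 if rng.random() < p else 0.0
    return total_reward, remaining

# Budget of 30 ends the episode after 24 pulls, long before the horizon.
reward, left = run_bwk_episode(arms=[(0.9, 2.0), (0.5, 0.5)],
                               budget=30.0, horizon=1000)
print(reward, left)
```

Here the budget binds first: alternating pulls consume 2.5 units per pair, so the episode stops after 24 of the 1000 allowed rounds.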

#### Worst-case Optimal Join Algorithms

Hung Q. Ngo (University at Buffalo, SUNY); Ely Porat (Bar-Ilan University); Christopher Ré (Stanford University); Atri Rudra (University at Buffalo, SUNY)

Many of today's graph query languages are based on graph pattern matching. We investigate optimization of tree-shaped patterns that have transitive closure operators. Such patterns not only appear in the context of graph databases but were originally studied for querying tree-structured data, where they can perform child-, descendant-, node label-, and wildcard-tests. The minimization problem aims at reducing the number of nodes in patterns and goes back to the early 2000s. We provide an example showing that, in contrast to earlier claims, tree patterns cannot be minimized by deleting nodes only. The example resolves the M =?= NR problem, which asks whether a tree pattern is minimal if and only if it is nonredundant. The example can be adapted to prove that minimization is Pi2P-complete, which resolves another question that has been open since the early research on the problem. The latter result shows that, unless NP = Pi2P, more general approaches for minimizing tree patterns are also bound to fail in general.

We prove that any algorithm for learning parities requires either a memory of quadratic size or an exponential number of samples. This proves a recent conjecture of Steinhardt, Valiant and Wager and shows that for some learning problems a large storage space is crucial. More formally, in the problem of parity learning, an unknown string $x \in \{0,1\}^n$ is chosen uniformly at random. A learner tries to learn $x$ from a stream of samples $(a_1, b_1), (a_2, b_2), \ldots$, where each $a_t$ is uniformly distributed over $\{0,1\}^n$ and $b_t$ is the inner product of $a_t$ and $x$, modulo 2. We show that any algorithm for parity learning that uses less than $n^2/25$ bits of memory requires an exponential number of samples. Previously, there was no non-trivial lower bound on the number of samples needed for any learning problem, even if the allowed memory size is $O(n)$ (where $n$ is the space needed to store one sample). We also give an application of our result in the field of bounded-storage cryptography. We show an encryption scheme that requires a private key of length $n$, as well as time complexity of $n$ per encryption/decryption of each bit, and is provably and unconditionally secure as long as the attacker uses less than $n^2/25$ memory bits and the scheme is used at most an exponential number of times. Previous works on bounded-storage cryptography assumed that the memory size used by the attacker is at most linear in the time needed for encryption/decryption.
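The memory-rich side of the tradeoff is easy to illustrate: with $\Theta(n^2)$ bits of memory, Gaussian elimination over GF(2) recovers $x$ from roughly $n$ samples. The sketch below (illustrative, not from the paper) stores up to $n$ reduced equations as bitmasks, which is exactly the quadratic storage the lower bound says cannot be avoided without exponentially many samples.

```python
import random

def inner_mod2(a, x):
    """Inner product of two 0/1 vectors, modulo 2."""
    return sum(ai & xi for ai, xi in zip(a, x)) % 2

def recover_parity(n, samples):
    """Recover x by Gaussian elimination over GF(2).

    Stores up to n reduced rows (an n x (n+1) bit matrix), i.e.
    Theta(n^2) bits of memory; with that much memory only about n
    random samples are needed.
    """
    pivots = {}  # pivot bit index -> (row bitmask, rhs bit)
    for a, b in samples:
        mask = sum(ai << i for i, ai in enumerate(a))
        for i in sorted(pivots, reverse=True):   # reduce by known pivots
            if (mask >> i) & 1:
                rmask, rb = pivots[i]
                mask ^= rmask
                b ^= rb
        if mask:                                 # new independent equation
            pivots[mask.bit_length() - 1] = (mask, b)
        if len(pivots) == n:                     # full rank: x is determined
            break
    # Back-substitution (assumes full rank, so every bit is a pivot).
    x = [0] * n
    for i in sorted(pivots):                     # lower bits solved first
        rmask, rb = pivots[i]
        val = rb
        for j in range(i):
            if (rmask >> j) & 1:
                val ^= x[j]
        x[i] = val
    return x

rng = random.Random(1)
n = 16
x = [rng.randrange(2) for _ in range(n)]

def stream():
    # Unbounded stream of samples (a_t, <a_t, x> mod 2), as in the abstract.
    while True:
        a = [rng.randrange(2) for _ in range(n)]
        yield a, inner_mod2(a, x)

print(recover_parity(n, stream()) == x)  # True
```

The theorem says this memory/sample profile is essentially forced: any learner using fewer than $n^2/25$ bits needs exponentially many samples.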

We say an algorithm on n x n matrices with integer entries in [-M,M] (or n-node graphs with edge weights from [-M,M]) is *truly subcubic* if it runs in O(n^{3-δ} * polylog M) time for some δ > 0. We define a notion of *subcubic reducibility*, and show that many important problems on graphs and matrices solvable in O(n^3) time are *equivalent* under subcubic reductions. Namely, the following weighted problems either *all* have truly subcubic algorithms, or none of them do:

- The all-pairs shortest paths problem on weighted digraphs (APSP).
- Detecting if a weighted graph has a triangle of negative total edge weight.
- Listing up to $n^{2.99}$ negative triangles in an edge-weighted graph.
- Finding a minimum weight cycle in a graph of non-negative edge weights.
- The replacement paths problem on weighted digraphs.
- Finding the second shortest simple path between two nodes in a weighted digraph.
- Checking whether a given matrix defines a metric.
- Verifying the correctness of a matrix product over the $(\min,+)$-semiring.
- Finding a maximum subarray in a given matrix.
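One problem in this equivalence class, negative triangle detection, is easy to state in code. The following brute-force check (an illustrative baseline, not from the paper) runs in the trivial O(n^3) time that the equivalences relate to APSP; a truly subcubic version of it would make every problem in the list above truly subcubic.

```python
def has_negative_triangle(w):
    """Return True iff the complete weighted graph given by the symmetric
    n x n weight matrix w contains a triangle of negative total edge
    weight.  Trivial O(n^3) enumeration of all vertex triples."""
    n = len(w)
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                if w[i][j] + w[j][k] + w[i][k] < 0:
                    return True
    return False

# Triangle (0, 1, 2) has weight 1 + 1 - 3 = -1 < 0.
print(has_negative_triangle([[0, 1, -3],
                             [1, 0,  1],
                             [-3, 1, 0]]))  # True
```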

The 3SUM problem is to decide, given a set of n real numbers, whether
any three sum to zero. It is widely conjectured that a trivial O(n^{2})-time algorithm is optimal, and over the years the consequences of this conjecture have been revealed. This 3SUM conjecture implies Ω(n^{2}) lower bounds on numerous problems in computational geometry, and a variant of the conjecture implies strong lower bounds on triangle enumeration, dynamic graph algorithms, and string matching data structures.
In this paper we refute the 3SUM conjecture. We prove that the decision tree complexity of 3SUM is O(n^{3/2} √(log n)) and give two subquadratic 3SUM algorithms: a deterministic one running in O(n^{2} / (log n / log log n)^{2/3}) time and a randomized one running in O(n^{2} (log log n)^{2} / log n) time with high probability. Our results lead directly to improved bounds for *k*-variate linear degeneracy testing for all odd *k* ≥ 3.
Finally, we give a subcubic algorithm for a generalization of the (min,+)-product over real-valued matrices and apply it to the problem of finding zero-weight triangles in weighted graphs. We give a depth-O(n^{5/2} √(log n)) decision tree for this problem, as well as an algorithm running in O(n^{3} (log log n)^{2} / log n) time.
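For reference, the conjectured-optimal quadratic baseline is the classic sort-and-two-pointers algorithm, sketched below (a standard textbook routine, not one of the paper's subquadratic algorithms).

```python
def three_sum_exists(nums):
    """Classic O(n^2) test: do any three distinct elements of the given
    set of reals sum to zero?  Sort once, then for each anchor element
    sweep two pointers inward over the remainder."""
    a = sorted(nums)
    n = len(a)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[i] + a[lo] + a[hi]
            if s == 0:
                return True
            if s < 0:
                lo += 1       # total too small: grow the middle element
            else:
                hi -= 1       # total too large: shrink the largest element
    return False

print(three_sum_exists([-5.0, 1.0, 4.0, 7.0]))  # True: -5 + 1 + 4 = 0
print(three_sum_exists([1.0, 2.0, 3.0]))        # False
```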

The $(\Delta+1)$-coloring problem is a fundamental symmetry-breaking problem in distributed computing. We give a new randomized coloring algorithm for $(\Delta+1)$-coloring running in $O(\sqrt{\log \Delta}) + 2^{O(\sqrt{\log \log n})}$ rounds with probability $1-1/n^{\Omega(1)}$ in a graph with $n$ nodes and maximum degree $\Delta$. This implies that the $(\Delta+1)$-coloring problem is easier than the maximal independent set problem and the maximal matching problem, due to their lower bounds by Kuhn, Moscibroda, and Wattenhofer [PODC'04]. Our algorithm also extends to the list-coloring problem where the palette of each node contains $\Delta+1$ colors. Based on a local density measure, we decompose a graph into sparse and dense parts, whereas prior work is based on non-local properties [FOCS '89]. Our algorithm improves the state of the art [FOCS '12] by up to a factor of $\sqrt{\log \Delta}$. We also cover list-colorings for locally sparse graphs.
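The feasibility behind the problem is elementary: a vertex with at most $\Delta$ neighbors always has a free color among $\{0, \ldots, \Delta\}$, so sequential greedy coloring succeeds. The sketch below shows only this centralized fact; the abstract's contribution is achieving it in few *distributed* rounds, which the sketch does not capture.

```python
def greedy_coloring(adj):
    """Sequential greedy (Delta+1)-coloring.  Each vertex sees at most
    Delta colored neighbors, so some color in {0, ..., Delta} is free.
    adj maps each vertex to a list of its neighbors."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:   # first color unused by already-colored neighbors
            c += 1
        color[v] = c
    return color

# 5-cycle: Delta = 2, so colors {0, 1, 2} suffice.
c5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
coloring = greedy_coloring(c5)
print(coloring)
```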

#### Invited Article Foreword for 65.2

Eva Tardos

#### Invited Article Foreword for 65.3

Eva Tardos

We present a probabilistic version of PCF, a well-known simply typed universal functional language. The type hierarchy is based on a single ground type of natural numbers. Although the language is globally call-by-name, we allow call-by-value evaluation for ground-type arguments in order to provide the language with suitable algorithmic expressiveness. We describe a denotational semantics based on probabilistic coherence spaces, a model of classical Linear Logic developed in previous works. We prove an adequacy theorem and an equational full abstraction theorem showing that equality in the model coincides with a natural notion of observational equivalence.

We introduce the notion of ``non-malleable codes'' which relaxes the notion of error-correction and error-detection. Informally, a code is non-malleable if the message contained in a modified codeword is either the original message, or a completely unrelated value. In contrast to error-correction and error-detection, non-malleability can be achieved for very rich classes of modifications.
We construct an efficient code that is non-malleable with respect to modifications that effect each bit of the codeword arbitrarily (i.e. leave it untouched, flip it or set it to either 0 or 1), but independently of the value of the other bits of the codeword. Using the probabilistic method, we also show a very strong and general statement: there exists a non-malleable code for *every* ``small enough'' family *F* of functions via which codewords can be modified.
Although this probabilistic method argument does not directly yield efficient constructions, it gives us efficient non-malleable codes in the random-oracle model for very general classes of tampering functions --- e.g. functions where every bit in the tampered codeword can depend arbitrarily on any 99% of the bits in the original codeword.
As an application of non-malleable codes, we show that they provide an elegant algorithmic solution to the task of protecting functionalities implemented in hardware (e.g. signature cards) against ``tampering attacks''. In such attacks, the secret state of a physical system is tampered with, in the hope that future interaction with the modified system will reveal some secret information. This problem was previously studied in the work of Gennaro et al. in 2004 under the name ``algorithmic tamper proof security'' (ATP). We show that non-malleable codes can be used to achieve important improvements over the prior work. In particular, we show that any functionality can be made secure against a large class of tampering attacks, simply by encoding the secret state with a non-malleable code while it is stored in memory.

The NP-hard number-partitioning problem is to separate a multiset S of n positive integers into k subsets such that the largest sum of the integers assigned to any subset is minimized. The classic application is scheduling a set of n jobs with different run times on k identical machines such that the makespan, the elapsed time to complete the schedule, is minimized. The two-way number-partitioning decision problem is one of the original 21 problems Richard Karp proved NP-complete. It is also one of Garey and Johnson's six fundamental NP-complete problems, and the only one based on numbers. This article explores algorithms for solving multi-way number-partitioning problems optimally. We explore previous algorithms as well as our own algorithms, which fall into three categories: sequential number partitioning (SNP), a branch-and-bound algorithm; binary-search improved bin completion (BSIBC), a bin-packing algorithm; and cached iterative weakening (CIW), an iterative weakening algorithm. We show experimentally that for large random numbers, SNP and CIW are the state-of-the-art algorithms, depending on the values of n and k. Both algorithms outperform the previous state of the art by up to seven orders of magnitude in run time.
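For contrast with the optimal algorithms studied in the article, here is the classic longest-processing-time (LPT) greedy heuristic for the same objective: assign each number, largest first, to the currently lightest subset. It is a fast baseline only, not SNP, BSIBC, or CIW, and it can miss the optimum, as the example shows.

```python
import heapq

def lpt_partition(numbers, k):
    """LPT greedy heuristic for k-way number partitioning: place each
    number, in decreasing order, into the subset with the smallest
    current sum (tracked with a min-heap)."""
    heap = [(0, i) for i in range(k)]        # (subset sum, subset index)
    heapq.heapify(heap)
    subsets = [[] for _ in range(k)]
    for x in sorted(numbers, reverse=True):
        total, i = heapq.heappop(heap)       # lightest subset so far
        subsets[i].append(x)
        heapq.heappush(heap, (total + x, i))
    return subsets

# Optimal 2-way makespan for [8, 7, 6, 5, 4] is 15 ({8,7} vs {6,5,4});
# LPT gets 17 here, illustrating why exact algorithms are needed.
parts = lpt_partition([8, 7, 6, 5, 4], k=2)
print(parts, [sum(p) for p in parts])
```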

The celebrated Cheeger's inequality [AM85, A86] establishes a bound on the edge expansion of a graph via its spectrum. This inequality is central to a rich spectral theory of graphs, based on studying the eigenvalues and eigenvectors of the adjacency matrix (and other related matrices) of graphs. It has remained open to define a suitable spectral model for hypergraphs whose spectra can be used to estimate various combinatorial properties of the hypergraph. In this paper we introduce a new hypergraph Laplacian operator generalizing the Laplacian matrix of graphs. In particular, the operator is induced by a diffusion process on the hypergraph, such that within each hyperedge, measure flows from vertices having maximum weighted measure to those having minimum. Since the operator is non-linear, we have to exploit other properties of the diffusion process to recover a spectral property concerning the ``second eigenvalue'' of the resulting Laplacian. Moreover, we show that higher-order spectral properties cannot hold in general using the current framework. We consider a stochastic diffusion process, in which each vertex also experiences Brownian noise from outside the system. We show a relationship between the second eigenvalue and the convergence behavior of the process. We show that various hypergraph parameters, like multi-way expansion and diameter, can be bounded using this operator's spectral properties. Since higher-order spectral properties do not hold for the Laplacian operator, we instead use the concept of procedural minimizers to consider higher-order Cheeger-like inequalities. For any $k \in \mathbb{N}$, we give a polynomial-time algorithm to compute an $O(\log r)$-approximation to the $k$-th procedural minimizer, where $r$ is the maximum cardinality of a hyperedge. We show that this approximation factor is optimal under the small-set expansion (SSE) hypothesis (introduced by [RS10]) for constant values of $k$.
Moreover, using the factor preserving reduction from vertex expansion in graphs to hypergraph expansion, we show that all our results for hypergraphs extend to vertex expansion in graphs.

We show that equivalence of deterministic top-down tree-to-string transducers is decidable, thus solving a long standing open problem in formal language theory. We also present efficient algorithms for subclasses: polynomial time for total transducers with unary output alphabet (over a given top-down regular domain language), and co-randomized polynomial time for linear transducers; these results are obtained using techniques from multi-linear algebra. For our main result, we prove that equivalence can be certified by means of inductive invariants using polynomial ideals. This allows us to construct two semi-algorithms, one searching for a proof of equivalence, one for a witness of non-equivalence. Furthermore, we extend our result to deterministic top-down tree-to-string transducers which produce output not in a free monoid but in a free group.

For an even integer t >= 2, the Matching Connectivity matrix H_t is a matrix that has rows and columns both labeled by all perfect matchings of the complete graph K_t on t vertices; an entry H_t[M_1, M_2] is 1 if the union of M_1 and M_2 is a Hamiltonian cycle and 0 otherwise. Motivated by the computational study of the Hamiltonicity problem, we present three results on the structure of H_t. We first show that H_t has rank at most 2^{t/2 - 1} over GF(2) via an appropriate factorization that explicitly provides families of matchings X_t forming bases for H_t. Second, we show how to quickly change representation between such bases. Third, we notice that the sets of matchings X_t induce permutation matrices within H_t. Subsequently, we use the factorization to obtain a 1.888^n time Monte Carlo algorithm that solves the Hamiltonicity problem in directed bipartite graphs. Our algorithm also counts the number of Hamiltonian cycles modulo two in directed bipartite or undirected graphs in the same time bound. Moreover, we use the fast basis-change algorithm from the second result to present a Monte Carlo algorithm that, given an undirected graph on n vertices along with a path decomposition of width at most pw, decides Hamiltonicity in (2 + sqrt(2))^pw n^O(1) time. Finally, we use the third result to show that for every eps > 0 the base (2 + sqrt(2)) cannot be improved to (2 + sqrt(2) - eps)^pw n^O(1) time unless the Strong Exponential Time Hypothesis fails, i.e., a faster algorithm for this problem would imply the breakthrough result of a (2 - eps)^n time algorithm for CNF-Sat for some eps > 0.
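The definition of H_t and the rank bound can be checked directly for small t. The sketch below (an illustrative verification, not the paper's factorization) enumerates the perfect matchings of K_t, builds H_t from the Hamiltonian-union test, and computes its rank over GF(2); for t = 4 the 3x3 matrix has rank 2 = 2^{t/2 - 1}.

```python
def perfect_matchings(t):
    """All perfect matchings of K_t, each a list of (u, v) edges, u < v."""
    def rec(remaining, edges):
        if not remaining:
            yield list(edges)
            return
        v = remaining[0]                      # smallest unmatched vertex
        for u in remaining[1:]:
            rest = [w for w in remaining if w not in (v, u)]
            yield from rec(rest, edges + [(v, u)])
    return list(rec(list(range(t)), []))

def union_is_hamiltonian(m1, m2, t):
    """Does the union of two perfect matchings form one t-vertex cycle?
    Walk from vertex 0 alternating m1- and m2-edges; the union is a
    Hamiltonian cycle iff the walk closes after exactly t steps."""
    p1 = {u: v for (a, b) in m1 for (u, v) in ((a, b), (b, a))}
    p2 = {u: v for (a, b) in m2 for (u, v) in ((a, b), (b, a))}
    v, steps = 0, 0
    while True:
        v = p1[v]; steps += 1
        v = p2[v]; steps += 1
        if v == 0:
            return steps == t

def gf2_rank(rows):
    """Rank over GF(2) of a 0/1 matrix given as row bitmasks."""
    pivots = {}
    for mask in rows:
        while mask:
            b = mask.bit_length() - 1
            if b not in pivots:
                pivots[b] = mask
                break
            mask ^= pivots[b]
    return len(pivots)

t = 4
ms = perfect_matchings(t)
H = [sum(union_is_hamiltonian(m1, m2, t) << j for j, m2 in enumerate(ms))
     for m1 in ms]
print(len(ms), gf2_rank(H))  # 3 matchings, rank 2 = 2^(t/2 - 1)
```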

#### Rumor Spreading and Conductance

Flavio Chierichetti (Sapienza, University of Rome); George Giakkoupis (INRIA); Silvio Lattanzi (Google Research); Alessandro Panconesi (Sapienza, University of Rome)

#### Path ORAM: An Extremely Simple Oblivious RAM Protocol

Emil Stefanov (UC Berkeley); Marten van Dijk (University of Connecticut); Elaine Shi (Cornell University); T-H. Hubert Chan (University of Hong Kong); Christopher Fletcher (MIT CSAIL); Ling Ren (MIT CSAIL); Xiangyao Yu (MIT CSAIL); Srinivas Devadas (MIT CSAIL)

We show that the existence of a coin-flipping protocol safe against *any* non-trivial constant bias (e.g., .499) implies the existence of one-way functions. This improves upon a recent result of Haitner and Omri [FOCS '11], who proved this implication for protocols with bias (√2 − 1)/2 − o(1) ≈ .207. Unlike the result of Haitner and Omri, our result also holds for *weak* coin-flipping protocols.