Many of today's graph query languages are based on graph pattern matching. We investigate optimization of tree-shaped patterns that have transitive closure operators. Such patterns do not only appear in the context of graph databases but were originally studied for querying tree-structured data, where they can perform child-, descendant-, node label-, and wildcard-tests. The minimization problem aims at reducing the number of nodes in patterns and goes back to the early 2000's. We provide an example showing that, in contrast to earlier claims, tree patterns cannot be minimized by deleting nodes only. The example resolves the M =?= NR problem, which asks if a tree pattern is minimal if and only if it is nonredundant. The example can be adapted to prove that minimization is Pi2P-complete, which resolves another question that was open since the early research on the problem. The latter result shows that, unless NP = Pi2P, more general approaches for minimizing tree patterns are also bound to fail in general.

We prove that any algorithm for learning parities requires either a memory of quadratic size or an exponential number of samples. This proves a recent conjecture of Steinhardt, Valiant and Wager and shows that for some learning problems a large storage space is crucial. More formally, in the problem of parity learning, an unknown string $x \in \{0,1\}^n$ was chosen uniformly at random. A learner tries to learn $x$ from a stream of samples $(a_1, b_1), (a_2, b_2)... $, where each $a_t$ is uniformly distributed over $\{0,1\}^n$ and $b_t$ is the inner product of $a_t$ and $x$, modulo 2. We show that any algorithm for parity learning, that uses less than $n^2/25$ bits of memory, requires an exponential number of samples. Previously, there was no non-trivial lower bound on the number of samples needed, for any learning problem, even if the allowed memory size is $O(n)$ (where $n$ is the space needed to store one sample). We also give an application of our result in the field of bounded-storage cryptography. We show an encryption scheme that requires a private key of length $n$, as well as time complexity of $n$ per encryption/decryption of each bit, and is provenly and unconditionally secure as long as the attacker uses less than $n^2/25$ memory bits and the scheme is used at most an exponential number of times. Previous works on bounded-storage cryptography assumed that the memory size used by the attacker is at most linear in the time needed for encryption/decryption.

We give solutions to two fundamental computational problems in ontology-based data access with the W3C standard ontology language OWL2QL: the succinctness problem for first-order rewritings of ontology-mediated queries (OMQs), and the complexity problem for OMQ answering. We classify OMQs according to the shape of their conjunctive queries (treewidth, the number of leaves) and the existential depth of their ontologies. For each of these classes, we determine the combined complexity of OMQ answering, and whether all OMQs in the class have polynomial-size first-order, positive existential, and nonrecursive datalog rewritings. We obtain the succinctness results using hypergraph programs, a new computational model for Boolean functions, which makes it possible to connect the size of OMQ rewritings and circuit complexity.

We say an algorithm on n x n matrices with integer entries in [-M,M] (or n-node graphs with edge weights from [-M,M]) is * truly subcubic* if it runs in O(n^{3-´} * polylog M) time for some ´ > 0. We define a notion of *subcubic reducibility*, and show that many important problems on graphs and matrices solvable in O(n^3) time are * equivalent* under subcubic reductions. Namely, the following weighted problems either * all* have truly subcubic algorithms, or none of them do:

- The all-pairs shortest paths problem on weighted digraphs (APSP).
- Detecting if a weighted graph has a triangle of negative total edge weight.
- Listing up to $n^{2.99}$ negative triangles in an edge-weighted graph.
- Finding a minimum weight cycle in a graph of non-negative edge weights.
- The replacement paths problem on weighted digraphs.
- Finding the second shortest simple path between two nodes in a weighted digraph.
- Checking whether a given matrix defines a metric.
- Verifying the correctness of a matrix product over the $(\min,+)$-semiring.
- Finding a maximum subarray in a given matrix.

In Artificial Intelligence, a key question concerns how an agent may rationally revise its beliefs in light of new information. The standard (AGM) approach to belief revision assumes that the underlying logic contains classical propositional logic. This is a significant limitation, since many representation schemes in AI don't subsume propositional logic. In this paper we consider the question of what the minimal requirements are on a logic, such that the AGM approach to revision may be formulated. We show that AGM-style revision can be obtained even when extremely little is assumed of the underlying language and its semantics; in fact, one requires little more than a language with sentences that are satisfied at models, or possible worlds. The classical AGM postulates are expressed in this framework and a representation result is established between the postulate set and certain preorders on possible worlds. To obtain the representation result, we add a new postulate to the AGM postulates, and we add a constraint to preorders on worlds. Crucially, both of these additions are redundant in the original AGM framework, and so we extend, rather than modify, the AGM approach. As well, iterated revision is addressed and shown to be compatible with our approach. Various examples are given to illustrate the approach, including Horn clause revision, revision in extended logic programs, and belief revision in a very basic logic called literal revision.

This paper presents a wp-style calculus for obtaining bounds on the expected runtime of randomized algorithms. Its application includes determining the (possibly infinite) expected termination time of a randomized algorithm and proving positive almost-sure termination---does a program terminate with probability one in finite expected time? We provide several proof rules for bounding the runtime of loops, and prove the soundness of the approach with respect to a simple operational model. We show that our approach is a conservative extension of Nielson's approach for reasoning about the runtime of deterministic programs. We analyze the expected runtime of some example programs including the coupon collector's problem, a one--dimensional random walk and a randomized binary search.

The 3SUM problem is to decide, given a set of n real numbers, whether
any three sum to zero. It is widely conjectured that a trivial O(n^{2})-time algorithm is optimal and over the years the consequences of this conjecture have been revealed. This 3SUM conjecture implies &(n^{2}) lower bounds on numerous problems in computational geometry and a variant of the conjecture implies strong lower bounds on triangle enumeration, dynamic graph algorithms, and string matching data structures.
In this paper we refute the 3SUM conjecture. We prove that the decision tree complexity of 3SUM is O(n^{3/2}(log n)) and give two subquadratic 3SUM algorithms, a deterministic one running in O(n^{2} / (log n/ log log n)^{2/3}) time and a randomized one running in O(n^{2} (log log n)^{2} / log n) time with high probability. Our results lead directly to improved bounds for *k*-variate linear degeneracy testing for all odd *k*e3.
Finally, we give a subcubic algorithm for a generalization of the (min,+)-product over real-valued matrices and apply it to the problem of finding zero-weight triangles in weighted graphs. We give a depth-O(n^{5/2}(log n)) decision tree for this problem, as well as an algorithm running in time O(n^{3} (log log n)^{2}/log n).

In 1979 Valiant showed that the complexity class VP_e of families with polynomially bounded formula size is contained in the class VP_s of families that have algebraic branching programs (ABPs) of polynomially bounded size. Motivated by the problem of separating these classes we study the topological closure VP_e-bar, i.e. the class of polynomials that can be approximated arbitrarily closely by polynomials in VP_e. We describe VP_e-bar with a strikingly simple complete polynomial (in characteristic different from 2) whose recursive definition is similar to the Fibonacci numbers. Further understanding this polynomial seems to be a promising route to new formula lower bounds. Our methods are rooted in the study of ABPs of small constant width. In 1992 Ben-Or and Cleve showed that formula size is polynomially equivalent to width-3 ABP size. We extend their result (in characteristic different from 2) by showing that approximate formula size is polynomially equivalent to approximate width-2 ABP size. This is surprising because in 2011 Allender and Wang gave explicit polynomials that cannot be computed by width-2 ABPs at all! The details of our construction lead to the aforementioned characterization of VP_e-bar. As a natural continuation of this work we prove that the class VNP can be described as the class of families that admit a hypercube summation of polynomially bounded dimension over a product of polynomially many affine linear forms. This gives the first separations of algebraic complexity classes from their nondeterministic analogs.

For dynamic networks with {\em unknown diameter}, we prove novel lower bounds on the time complexity of a range of basic distributed computing problems. Together with trivial upper bounds under dynamic networks with {\em known diameter} for these problems, our lower bounds show that the complexities of all these problems are {\em sensitive} to whether the diameter is known to the protocol beforehand: Not knowing the diameter increases the time complexities by a large poly$(N)$ factor as compared to when the diameter is known, resulting in an exponential gap. Our lower bounds are obtained via communication complexity arguments and by reducing from the two-party {\sc DisjointnessCP} problem. We further prove that sometimes this large poly$(N)$ cost can be completely avoided if the protocol is given a good estimate on $N$. In other words, having such an estimate makes some problems no longer sensitive to unknown diameter.

We present a probabilistic version of PCF, a well-known simply typed universal functional language. The type hierarchy is based on a single ground type of natural numbers. Even if the language is globally call-by-name, we allow a call-by-value evaluation for ground type arguments in order to provide the language with a suitable algorithmic expressiveness. We describe a denotational semantics based on probabilistic coherence spaces, a model of classical Linear Logic developed in previous works. We prove an adequacy and an equational full abstraction theorem showing that equality in the model coincides with a natural notion of observational equivalence.

We introduce the notion of ``non-malleable codes'' which relaxes the notion of error-correction and error-detection. Informally, a code is non-malleable if the message contained in a modified codeword is either the original message, or a completely unrelated value. In contrast to error-correction and error-detection, non-malleability can be achieved for very rich classes of modifications.
We construct an efficient code that is non-malleable with respect to modifications that effect each bit of the codeword arbitrarily (i.e. leave it untouched, flip it or set it to either 0 or 1), but independently of the value of the other bits of the codeword. Using the probabilistic method, we also show a very strong and general statement: there exists a non-malleable code for *every* ``small enough'' family *F* of functions via which codewords can be modified.
Although this probabilistic method argument does not directly yield efficient constructions, it gives us efficient non-malleable codes in the random-oracle model for very general classes of tampering functions --- e.g. functions where every bit in the tampered codeword can depend arbitrarily on any 99% of the bits in the original codeword.
As an application of non-malleable codes, we show that they provide an elegant algorithmic solution to the task of protecting functionalities implemented in hardware (e.g. signature cards) against ``tampering attacks''. In such attacks, the secret state of a physical system is tampered, in the hopes that future interaction with the modified system will reveal some secret information. This problem, was previously studied in the work of Gennaro et al. in 2004 under the name ``algorithmic tamper proof security'' (ATP). We show that non-malleable codes can be used to achieve important improvements over the prior work. In particular, we show that any functionality can be made secure against a large class of tampering attacks, simply by encoding the secret-state with a non-malleable code while it is stored in memory.

The NP-hard number-partitioning problem is to separate a multiset S of n positive integers into k subsets, such that the largest sum of the integers assigned to any subset is minimized. The classic application is scheduling a set of n jobs with different run times on k identical machines such that the makespan, the elapsed time to complete the schedule, is minimized. The two-way number-partitioning decision problem is one of the original 21 problems Richard Karp proved NP-complete. It is also one of Garey and Johnson's six fundamental NP-complete problems, and the only one based on numbers. This article explores algorithms for solving multi-way number-partitioning problems optimally. We explore previous algorithms as well as our own algorithms which fall into three categories: sequential number partitioning (SNP), a branch-and-bound algorithm; binary-search improved bin completion (BSIBC), a bin-packing algorithm; and cached iterative weakening (CIW), an iterative weakening algorithm. We show experimentally that for large random numbers, SNP and CIW are state of the art algorithms depending on the values of n and k. Both algorithms outperform the previous state of the art by up to seven orders of magnitude in terms of run time.

We prove near-optimal trade-offs for quantifier depth versus number of variables in first-order logic by exhibiting pairs of n-element structures that can be distinguished by a k-variable first-order sentence but where every such sentence requires quantifier depth at least n^{©(k/logk)}. Our trade-offs also apply to first-order counting logic, and by the known connection to the k-dimensional Weisfeiler-Leman algorithm imply near-optimal lower bounds on the number of refinement iterations.
A key component in our proof is the hardness condensation technique recently introduced by [Razborov '16] in the context of proof complexity. We apply this method to reduce the domain size of relational structures while maintaining the minimal quantifier depth to distinguish them in finite variable logics.