Latest Articles

## Reconciliation With Nonbinary Gene Trees Revisited

By reconciling the phylogenetic tree of a gene family with the corresponding species tree, it is possible to infer lineage-specific duplications and... (more)

## Source Sets

Stateless model checking is a powerful method for program verification that, however, suffers from an exponential growth in the number of explored executions. A successful technique for reducing this number, while still maintaining complete coverage, is Dynamic Partial Order Reduction (DPOR), an algorithm originally introduced by Flanagan and... (more)

## The Tractability Frontier of Graph-Like First-Order Query Sets

The focus of this work is first-order model checking, by which we refer to the problem of deciding whether or not a given first-order sentence is... (more)

## Optimal Induced Universal Graphs and Adjacency Labeling for Trees

In this article, we show that there exists a graph G with O(n) nodes such that any forest of n nodes is an induced subgraph of G. Furthermore, for... (more)

## Bounds on Monotone Switching Networks for Directed Connectivity

We separate monotone analogues of L and NL by proving that any monotone switching network solving directed connectivity on n vertices must have size... (more)

##### NEWS

Important Note on P/NP: Some submissions purport to solve a long-standing open problem in complexity theory, such as the P/NP problem. Many of these turn out to be mistaken, and such submissions tax JACM volunteer editors and reviewers. JACM remains open to the possibility of eventual resolution of P/NP and related questions, and continues to welcome submissions on the subject. However, to mitigate the burden of repeated resubmissions due to incremental corrections of errors identified during editorial review, no author may submit more than one such paper to JACM, ACM Trans. on Algorithms, or ACM Trans. on Computation in any 24-month period, except by invitation of the Editor-in-Chief. This applies to resubmissions of previously rejected manuscripts. Please consider this policy before submitting a such a paper.

The Journal of the ACM (JACM) provides coverage of the most significant work on principles of computer science, broadly construed. The scope of research we cover encompasses contributions of lasting value to any area of computer science. To be accepted, a paper must be judged to be truly outstanding in its field.  JACM is interested  in work in core computer science and at the boundaries, both the boundaries of subdisciplines of computer science and the boundaries between computer science and other fields.

##### Forthcoming Articles
Separations in Query Complexity Based on Pointer Functions

In 1986, Saks and Wigderson conjectured that the largest separation between deterministic and zero-error randomized query complexity for a total boolean function is given by the function $f$ on $n=2^k$ bits defined by a complete binary tree of NAND gates of depth $k$, which achieves $R_0(f) = O(D(f)^{0.7537\ldots})$. We show this is false by giving an example of a total boolean function $f$ on $n$ bits whose deterministic query complexity is $\Omega(n/\log(n))$ while its zero-error randomized query complexity is $\tO(\sqrt{n})$. We further show that the quantum query complexity of the same function is $\tO(n^{1/4})$, giving the first example of a total function with a super-quadratic gap between its quantum and deterministic query complexities. We also construct a total boolean function $g$ on $n$ variables that has zero-error randomized query complexity $\Omega(n/\log(n))$ and bounded-error randomized query complexity $R(g) = \tO(\sqrt{n})$. This is the first super-linear separation between these two complexity measures. The exact quantum query complexity of the same function is $Q_E(g) = \tO(\sqrt{n})$. These functions show that the relations $D(f) = O(R_1(f)^2)$ and $R_0(f) = \tO(R(f)^2)$ are optimal, up to poly-logarithmic factors. Further variations of these functions give additional separations between other query complexity measures: a cubic separation between $Q$ and $R_0$, a $3/2$-power separation between $Q_E$ and $R$, and a 4th power separation between approximate degree and bounded-error randomized query complexity. All of these examples are variants of a function recently introduced by Goos, Pitassi, and Watson which they used to separate the unambiguous 1-certificate complexity from deterministic query complexity and to resolve the famous Clique versus Independent Set problem in communication complexity.

The matching polytope has exponential extension complexity

A popular method in combinatorial optimization is to express polytopes P, which may potentially have exponentially many facets, as solutions of linear programs that use few extra variables to reduce the number of constraints down to a polynomial. After two decades of standstill, recent years have brought amazing progress in showing lower bounds for the so called extension complexity, which for a polytope P denotes the smallest number of inequalities necessary to describe a higher dimensional polytope Q that can be linearly projected on P. However, the central question in this field remained wide open: can the perfect matching polytope be written as an LP with polynomially many constraints? We answer this question negatively. In fact, the extension complexity of the perfect matching polytope in a complete n-node graph is 2^Omega(n). By a known reduction this also improves the lower bound on the extension complexity for the TSP polytope from 2^Omega(n^1/2) to 2^Omega(n).

Parallel-Correctness and Transferability for Conjunctive Queries

A dominant cost for query evaluation in modern massively distributed systems is the number of communication rounds. For this reason, there is a growing interest in single-round multiway join algorithms where data is first reshuffled over many servers and then evaluated in a parallel but communication-free way. The reshuffling itself is specified as a distribution policy. We introduce a correctness condition, called parallel-correctness, for the evaluation of queries w.r.t. a distribution policy. We study the complexity of parallel-correctness for conjunctive queries as well as transferability of parallel-correctness between queries. We also investigate the complexity of transferability for certain families of distribution policies, including, for instance, the Hypercube distribution.

Polynomial Time Corresponds to Solutions of Polynomial Ordinary Differential Equations of Polynomial Length

The outcomes of this paper are twofold. Implicit Complexity: We provide an implicit characterization of polynomial time computation in terms of ordinary differential equations: we characterize the class PTIME of languages computable in polynomial time in terms of differential equations with polynomial right-hand side. This result gives a purely continuous elegant and simple characterization of PTIME. We believe it is the first time complexity classes are characterized using only ordinary differential equations. Our characterization extends to functions computable in polynomial time over the reals in the sense of computable analysis. Our results may provide a new perspective on classical complexity, by giving a way to define complexity classes, like PTIME, in a very simple way, without any reference to a notion of (discrete) machine. This may also provide ways to state classical questions about computational complexity via ordinary differential equations. Continuous-Time Models of Computation. Our results can also be interpreted in terms of analog computers or analog models of computation: As a side effect, we get that the 1941 General Purpose Analog Computer (GPAC) of Claude Shannon is provably equivalent to Turing machines both in terms of computability and complexity, a fact that has never been established before. This result provides arguments in favour of a generalised form of the Church-Turing Hypothesis, which states that any physically realistic (macroscopic) computer is equivalent to Turing machines both in terms of computability and complexity.

Proving the Incompatibility of Efficiency and Strategyproofness via SMT Solving

Two important requirements when aggregating the preferences of multiple agents are that the outcome should be economically efficient and the aggregation mechanism should not be manipulable. In this paper, we provide a computer-aided proof of a sweeping impossibility using these two conditions for randomized aggregation mechanisms. More precisely, we show that every efficient aggregation mechanism can be manipulated for all expected utility representations of the agents' preferences. This settles an open problem and strengthens a number of existing theorems, including statements that were shown within the special domain of assignment. Our proof is obtained by formulating the claim as a satisfiability problem over predicates from real-valued arithmetic, which is then checked using an SMT (satisfiability modulo theories) solver. In order to verify the correctness of the result, a minimal unsatisfiable set of constraints returned by the SMT solver was then translated back into a proof in higher-order logic, which was automatically verified by an interactive theorem prover. To the best of our knowledge, this is the first application of SMT solvers in computational social choice.

Which Is the Fairest (Rent Division) of Them All?

What is a fair way to assign rooms to several housemates, and divide the rent between them? This is not just a theoretical question: many people have used the Spliddit website to obtain envy-free solutions to rent division instances. But envy freeness, in and of itself, is insufficient to guarantee outcomes that people view as intuitive and acceptable. We therefore focus on solutions that optimize a criterion of social justice, subject to the envy freeness constraint, in order to pinpoint the "fairest" solutions. We develop a general algorithmic framework that enables the computation of such solutions in polynomial time. We then study the relations between natural optimization objectives, and identify the maximin solution, which maximizes the minimum utility subject to envy freeness, as the most attractive. We demonstrate, in theory and using experiments on real data from Spliddit, that the maximin solution gives rise to significant gains in terms of our optimization objectives. Finally, a user study with Spliddit users as subjects demonstrates that people find the maximin solution to be significantly fairer than arbitrary envy-free solutions; this user study is unprecedented in that it asks people about their real-world rent division instances. Based on these results, the maximin solution has been deployed on Spliddit since April 2015.

Communication Steps for Parallel Query Processing

We study the problem of computing conjunctive queries over large databases on parallel architectures without shared storage. Using the structure of such a query q and the skew in the data, we study tradeoffs between the number of processors, the number of rounds of communication, and the per-processor load -- the number of bits each processor can send or can receive in a single round -- that are required to compute q. Since each processor must store its received bits, the load is at most the number of bits of storage per processor. When the data is free of skew, we obtain essentially tight upper and lower bounds for one round algorithms and we show how the bounds degrade when there is skew in the data. In the case of skewed data, we show how to improve the algorithms when approximate degrees of the (necessarily small number of) heavy-hitter elements are available, obtaining essentially optimal algorithms for queries such as skewed simple joins and skewed triangle join queries. For queries that we identify as tree-like, we also prove nearly matching upper and lower bounds for multi-round algorithms for a natural class of skew-free databases. One consequence of these latter lower bounds is that for any µ>0, using p processors to compute the connected components of a graph, or to output the path, if any, between a specified pair of vertices of a graph with m edges and per-processor load that is O(m/p^(1-µ)) requires ©(log p) rounds of communication. Our upper bounds are given by simple structured algorithms using MapReduce. Our one-round lower bounds are proved in a very general model, which we call the Massively Parallel Communication (MPC) model, that allows processors to communicate arbitrary bits. Our multi-round lower bounds apply in a restricted version of the MPC model in which processors in subsequent rounds after the first communication round are only allowed to send tuples.

Estimating the Unseen: Improved Estimators for Entropy and other Properties

We show that a class of statistical properties of distributions, which includes such practically relevant properties as entropy, the number of distinct elements, and distance metrics between pairs of distributions, can be estimated given a sublinear sized sample. Specifically, given a sample consisting of independent draws from any distribution over at most k distinct elements, these properties can be estimated accurately using a sample of size O(k/log k). For these estimation tasks, this performance is optimal, to constant factors. Complementing these theoretical results, we also demonstrate that our estimators perform exceptionally well, in practice, for a variety of estimation tasks, on a variety of natural distributions, for a wide range of parameters. The key step in our approach is to first use the sample to characterize the unseen portion of the distributioneffectively reconstructing this portion of the distribution as accurately as if one had a logarithmic factor larger sample. This goes beyond such tools as the Good-Turing frequency estimation scheme, which estimates the total probability mass of the unobserved portion of the distribution: we seek to estimate the shape of the unobserved portion of the distribution. This work can be seen as introducing a robust, general, and theoretically principled framework that, for many practical applications, essentially amplifies the sample size by a logarithmic factor; we expect that it may be fruitfully used as a component within larger machine learning and statistical analysis systems.

#### Qualitative Determinacy and Decidability of Stochastic Games with Signals

The Complexity of Mean-Payoff Pushdown Games

Two-player games on graphs are central in many problems in formal verification and program analysis such as synthesis and verification of open systems. In this work we consider solving recursive game graphs (or pushdown game graphs) that model the control flow of sequential programs with recursion. While pushdown games have been studied before with qualitative objectives, such as reachability and \omega-regular objectives, in this work we study for the first time such games with the most well-studied quantitative objective, namely, mean-payoff objective. In pushdown games two types of strategies are relevant: (1) global strategies, that depend on the entire global history; and (2) modular strategies, that have only local memory and thus do not depend on the context of invocation, but only on the history of the current invocation of the module. Our main results are as follows: (1) One-player pushdown games with mean-payoff objectives under global strategies are decidable in polynomial time. (2) Two-player pushdown games with mean-payoff objectives under global strategies are undecidable. (3) One-player pushdown games with mean-payoff objectives under modular strategies are NP-hard. (4) Two-player pushdown games with mean-payoff objectives under modular strategies can be solved in NP (i.e., both one-player and two-player pushdown games with mean-payoff objectives under modular strategies are NP-complete). We also establish the optimal strategy complexity by showing that global strategies for mean-payoff objectives require infinite memory even in one-player pushdown games; and memoryless modular strategies are sufficient in two-player pushdown games. Finally we also show that all the problems have the same complexity if the stack boundedness condition is added, where along with the mean-payoff objective the player must also ensure that the stack height is bounded.

Characterizing Transactional Memory Consistency Conditions using Observational Refinement

Transactional memory (TM) facilitates the development of concurrent applications by letting a programmer designate certain code blocks as atomic. The common approach to stating TM correctness is through a consistency condition that restricts the possible TM executions. Unfortunately, existing consistency conditions fall short of formalizing the intuitive semantics of atomic blocks through which programmers use a TM. To close this gap, we formalize programmer expectations as observational refinement between TM implementations. This states that properties of a program using a concrete TM implementation can be established by analyzing its behavior with an abstract TM, serving as a specification of the concrete one. We show that a variant of transactional memory specification (TMS), a TM consistency condition, is equivalent to observational refinement for a programming language where local variables are rolled back upon a transaction abort. We thereby establish that TMS is the weakest acceptable condition for this case. We then propose a new consistency condition, called Strong TMS, and show that is equivalent to observational refinement for a language where local variables are not rolled back upon aborts. Finally, we show that, under certain natural assumptions on TM implementations, Strong TMS is equivalent to a variant of a well-known condition of opacity. Our results suggest a new approach to evaluating TM consistency conditions and enable TM implementors and language designers to make better-informed decisions.

#### Streaming Tree Transducers

The Applied Pi Calculus: Mobile Values, New Names, and Secure Communication

We study the interaction of the programming construct "new", which generates statically scoped names, with communication via messages on channels. This interaction is crucial in security protocols, which are the main motivating examples for our work; it also appears in other programming-language contexts. We define the applied pi calculus, a simple, general extension of the pi calculus in which values can be formed from names via the application of built-in functions, subject to equations, and be sent as messages. (In contrast, the pure pi calculus lacks built-in functions; its only messages are atomic names.) We develop semantics and proof techniques for this extended language and apply them in reasoning about security protocols. This paper essentially subsumes the conference paper that introduced the applied pi calculus in 2001. It fills gaps, incorporates improvements, and further explains and studies the applied pi calculus. Since 2001, the applied pi calculus has been the basis for much further work, described in many research publications and sometimes embodied in useful software, such as the tool ProVerif, which relies on the applied pi calculus to support the specification and automatic analysis of security protocols. Although this paper does not aim to be a complete review of the subject, it benefits from that further work and provides better foundations for some of it. In particular, the applied pi calculus has evolved through its implementation in ProVerif, and the present definition reflects that evolution.

###### All ACM Journals | See Full Journal Index

Search JACM
enter search term and/or author name