We study learning problems involving arbitrary classes of functions $F$, distributions $X$ and targets $Y$. Because \emph{proper} learning procedures, i.e., procedures that are only allowed to select functions in $F$, tend to perform poorly unless the problem satisfies some additional structural property (e.g., that $F$ is convex), we consider \emph{unrestricted learning procedures} that are free to choose functions outside the given class. We present a new unrestricted procedure that is optimal in a very strong sense: the required sample complexity is essentially the best one can hope for, and the estimate holds for (almost) any problem, including heavy-tailed situations. Moreover, the sample complexity coincides with what one would expect if $F$ were convex, even when $F$ is not. And if $F$ is convex, the procedure turns out to be proper. Thus, the unrestricted procedure is actually optimal in both realms, for convex classes as a proper procedure and for arbitrary classes as an unrestricted procedure.

Query answering over incomplete data invariably relies on the standard notion of certain answers which gives a very coarse classification of query answers into those that are certain and those that are not. Our goal is to refine it by measuring how close an answer is to certainty. This measure is defined as the probability that the query is true under a random interpretation of missing information in a database. Since there are infinitely many such interpretations, to pick one at random we adopt the approach used in the study of asymptotic properties and 0--1 laws for logical sentences, and define the measure as the limit of a sequence. We prove that without any restrictions imposed, the standard model of missing data admits the 0--1 law. That is, the limit always exists and can be only 0 or 1 for a very large class of queries. In other words, query answers are either almost certainly true, or almost certainly false. We show that almost certainly true answers are precisely those returned by the naive evaluation of the query. When restrictions are imposed and databases are required to satisfy constraints, the measure is the conditional probability of the query being true if the constraints are true. This too is defined as a limit; we prove that it always exists, can be an arbitrary rational number, and is computable. For some constraints, such as functional dependencies, the 0--1 law continues to hold. We also look at evaluation procedures based on many-valued logics, as used in relational database systems that handle incomplete information. We identify conditions when such evaluation procedures return almost certainly true answers, and explain reasons why real-life DBMSs break such conditions and can thus return arbitrarily bad answers. As another refinement of the notion of certainty, we introduce a comparison of query answers: an answer with a larger set of interpretations that make it true is better. We identify the precise complexity of such comparisons, and of finding sets of best answers, for first-order queries.
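
The measure described above can be illustrated on a toy instance. The following is a minimal sketch, not the construction used in the work itself: it assumes that nulls are interpreted uniformly at random over the first $k$ constants, and the relation `db`, the string encoding of nulls, and the two Boolean queries are hypothetical examples chosen for illustration.

```python
from fractions import Fraction
from itertools import product

def measure(db, query, k):
    """Fraction of valuations of the nulls over {1, ..., k} that make `query` true.

    Assumption: nulls are encoded as strings, constants as integers, and k is
    large enough to cover every constant occurring in `db`.
    """
    nulls = sorted({v for row in db for v in row if isinstance(v, str)})
    hits = 0
    total = 0
    for values in product(range(1, k + 1), repeat=len(nulls)):
        valuation = dict(zip(nulls, values))
        # Replace every null by its chosen constant to get a complete database.
        complete = [tuple(valuation.get(v, v) for v in row) for row in db]
        hits += bool(query(complete))
        total += 1
    return Fraction(hits, total)

# Hypothetical binary relation with two nulls.
db = [(1, "_a"), ("_b", 2)]

def some_equal(rows):
    # Boolean query: is there a tuple whose two components coincide?
    return any(x == y for x, y in rows)

def some_distinct(rows):
    # Boolean query: is there a tuple whose two components differ?
    return any(x != y for x, y in rows)

for k in (2, 10, 50):
    print(k, measure(db, some_equal, k), measure(db, some_distinct, k))

# Naive evaluation: run the query directly, treating nulls as fresh constants.
print(some_equal(db), some_distinct(db))
```

As $k$ grows, the first fraction tends to 0 and the second to 1, and in this toy case the limits agree with what naive evaluation returns on the incomplete relation.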

In this paper we study the problem of deterministic factorization of sparse polynomials. We show that if $f \in \mathbb{F}[x_1, x_2, \ldots, x_n]$ is a polynomial with $s$ monomials, with individual degrees of its variables bounded by $d$, then $f$ can be deterministically factored in time $s^{\mathrm{poly}(d)\cdot\log(n)}$. Prior to our work, the only efficient factoring algorithms known for this class of polynomials were randomized, and other than for the cases of $d=1$ and $d=2$, only exponential time deterministic factoring algorithms were known.
A crucial ingredient in our proof is a quasi-polynomial sparsity bound for factors of sparse polynomials of bounded individual degree. In particular, we show that if $f$ is an $s$-sparse polynomial in $n$ variables, with individual degrees of its variables bounded by $d$, then the sparsity of each factor of $f$ is bounded by $s^{O(d^2\cdot\log(n))}$. This is the first nontrivial bound on factor sparsity for $d>2$. Our sparsity bound uses techniques from convex geometry, such as the theory of Newton polytopes and an approximate version of the classical Carathéodory's Theorem.
Our work addresses and partially answers a question of von zur Gathen and Kaltofen (JCSS 1985) who asked whether a quasi-polynomial bound holds for the sparsity of factors of sparse polynomials.
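
To see why the individual degree bound $d$ appears in the sparsity bound, it may help to recall the classical univariate example $x^d - 1 = (x-1)(x^{d-1} + \cdots + 1)$, where a 2-sparse polynomial has a factor with $d$ monomials. The sketch below is an illustration only, not part of the factoring algorithm; the dictionary representation and the helper `multiply` are hypothetical choices.

```python
def multiply(p, q):
    """Multiply two univariate polynomials given as {exponent: coefficient} dicts."""
    result = {}
    for exp_p, coeff_p in p.items():
        for exp_q, coeff_q in q.items():
            exp = exp_p + exp_q
            result[exp] = result.get(exp, 0) + coeff_p * coeff_q
    # Drop monomials whose coefficients cancelled out.
    return {exp: coeff for exp, coeff in result.items() if coeff != 0}

d = 8
linear_factor = {1: 1, 0: -1}                 # x - 1
dense_factor = {exp: 1 for exp in range(d)}   # x^(d-1) + ... + x + 1
sparse_product = multiply(linear_factor, dense_factor)

print(len(dense_factor), "monomials in the factor")
print(len(sparse_product), "monomials in the product:", sparse_product)
```

Here the product has two monomials while one factor has $d$, so any bound on factor sparsity has to depend on the degree as well as on $s$.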

The geometric intersection number of a curve on a surface is the minimal number of self-intersections of any homotopic curve, i.e. of any curve obtained by continuous deformation. Given a curve $c$ represented by a closed walk of length at most $\ell$ on a combinatorial surface of complexity $n$ we describe simple algorithms to (1) compute the geometric intersection number of $c$ in $O(n+ \ell^2)$ time, (2) construct a curve homotopic to $c$ that realizes this geometric intersection number in $O(n+\ell^4)$ time, (3) decide if the geometric intersection number of $c$ is zero, i.e. if $c$ is homotopic to a simple curve, in $O(n+\ell\log\ell)$ time. The algorithms for (2) and (3) are restricted to orientable surfaces, but the algorithm for (1) is also valid on non-orientable surfaces. To our knowledge, no exact complexity analysis had yet appeared for these problems. An optimistic analysis of the complexity of the published algorithms for problems (1) and (3) gives at best an $O(n+g^2\ell^2)$ time complexity on a genus $g$ surface without boundary. No polynomial time algorithm was known for problem (2) for surfaces without boundary. Interestingly, our solution to problem (3) provides a quasi-linear algorithm for a problem raised by Poincar\'e more than a century ago. Finally, we note that our algorithm for problem (1) extends to computing the geometric intersection number of two curves of length at most $\ell$ in $O(n+ \ell^2)$ time.

Strassen's algorithm (1969) was the first sub-cubic matrix multiplication algorithm. Winograd (1971) improved the leading coefficient of its complexity from 7 to 6. Many asymptotic improvements followed. Unfortunately, most of them have done so at the cost of very large, often gigantic, hidden constants. Consequently, Strassen-Winograd's $O\left(n^{\log_{2}7}\right)$ algorithm often outperforms other fast matrix multiplication algorithms for all feasible matrix dimensions. The leading coefficient of Strassen-Winograd's algorithm was believed to be optimal for matrix multiplication algorithms with $2\times2$ base case, due to a lower bound by Probert (1976). Surprisingly, we obtain a faster matrix multiplication algorithm, with the same base case size and asymptotic complexity as Strassen-Winograd's algorithm, but with the leading coefficient reduced from 6 to 5. To this end, we extend Bodrato's (2010) method for matrix squaring, and transform matrices to an alternative basis. We prove a generalization of Probert's lower bound that holds under change of basis, showing that for matrix multiplication algorithms with a $2\times2$ base case, the leading coefficient of our algorithm cannot be further reduced, and hence is optimal. We apply our method to other fast matrix multiplication algorithms, improving their arithmetic and communication costs by significant constant factors.
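
The leading coefficients 7, 6 and 5 quoted above can be reproduced with a short operation count. The sketch below is an illustration under stated assumptions, not the analysis from the work itself: it assumes a recursion with 7 half-size multiplications and $q$ block additions per step, recursing down to scalars, and it takes $q = 18$, $15$ and $12$ as the addition counts usually attributed to Strassen's algorithm, Winograd's variant, and the alternative-basis algorithm. The lower-order cost of the basis transformations is ignored, and the function names are hypothetical.

```python
from fractions import Fraction

def arithmetic_ops(n, additions_per_step):
    """Arithmetic operations for n x n inputs (n a power of 2), recursing to 1 x 1.

    Recurrence: T(n) = 7 T(n/2) + q (n/2)^2 with T(1) = 1.
    """
    if n == 1:
        return 1
    half = n // 2
    return 7 * arithmetic_ops(half, additions_per_step) + additions_per_step * half * half

def leading_coefficient(additions_per_step):
    """Coefficient of n^(log2 7) in the closed form (1 + q/3) n^(log2 7) - (q/3) n^2."""
    return 1 + Fraction(additions_per_step, 3)

def closed_form(n, additions_per_step):
    levels = n.bit_length() - 1  # log2(n) for a power of 2
    q = Fraction(additions_per_step)
    return (1 + q / 3) * 7 ** levels - (q / 3) * n * n

# The addition counts 18, 15, 12 are assumptions, see the paragraph above.
for name, q in (("Strassen", 18), ("Winograd", 15), ("alternative basis", 12)):
    print(name, leading_coefficient(q), arithmetic_ops(64, q))
```

Under these assumptions, each saving of three additions per recursive step lowers the leading coefficient by one, which matches the progression from 7 to 6 to 5.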