Theory of Computing Blog Aggregator

We study the Boolean Hierarchy in the context of two-party communication complexity, as well as the analogous hierarchy defined with one-sided error randomness instead of nondeterminism. Our results provide a complete picture of the relationships among complexity classes within and across these two hierarchies. In particular, we prove a query-to-communication lifting theorem for all levels of the Boolean Hierarchy and use it to resolve an open problem stated in the paper by Halstenberg and Reischuk (CCC 1988) which initiated this topic.

at March 24, 2019 07:12 AM UTC

Authors: Henrique Becker, Luciana S. Buriol
Download: PDF
Abstract: This work presents an empirical analysis of exact algorithms for the unbounded knapsack problem, which includes seven algorithms from the literature, two commercial solvers, and more than ten thousand instances. The terminating step-off, a dynamic programming algorithm from 1966, presented the lowest mean time to solve the most recent benchmark from the literature. The threshold and collective dominances are properties of the unbounded knapsack problem first discussed in 1998, and exploited by the current state-of-the-art algorithms. The terminating step-off algorithm did not exploit such dominances, but has an alternative mechanism for dealing with dominances which does not explicitly exploits collective and threshold dominances. Also, the pricing subproblems found when solving hard cutting stock problems with column generation can cause branch-and-bound algorithms to display worst-case times. The authors present a new class of instances which favors the branch-and-bound approach over the dynamic programming approach but do not have high amounts of simple, multiple and collective dominated items. This behaviour illustrates how the definition of hard instances changes among algorithm approachs. The codes used for solving the unbounded knapsack problem and for instance generation are all available online.

at March 23, 2019 11:26 PM UTC

Authors: Fady Massarwi, Pablo Antolin, Gershon Elber
Download: PDF
Abstract: 3D objects, modeled using Computer Aided Geometric Design tools, are traditionally represented using a boundary representation (B-rep), and typically use spline functions to parameterize these boundary surfaces. However, recent development in physical analysis, in isogeometric analysis (IGA) in specific, necessitates a volumetric parametrization of the interior of the object. IGA is performed directly by integrating over the spline spaces of the volumetric spline representation of the object. Typically, tensor-product B-spline trivariates are used to parameterize the volumetric domain. A general 3D object, that can be modeled in contemporary B-rep CAD tools, is typically represented using trimmed B-spline surfaces. In order to capture the generality of the contemporary B-rep modeling space, while supporting IGA needs, Massarwi and Elber (2016) proposed the use of trimmed trivariates volumetric elements. However, the use of trimmed geometry makes the integration process more difficult since integration over trimmed B-spline basis functions is a highly challenging task. In this work, we propose an algorithm that precisely decomposes a trimmed B-spline trivariate into a set of (singular only on the boundary) tensor-product B-spline trivariates, that can be utilized to simplify the integration process in IGA. The trimmed B-spline trivariate is first subdivided into a set of trimmed B\'ezier trivariates, at all its internal knots. Then, each trimmed B\'ezier trivariate, is decomposed into a set of mutually exclusive tensor-product B-spline trivariates, that precisely cover the entire trimmed domain. This process, denoted untrimming, can be performed in either the Euclidean space or the parametric space of the trivariate. We present examples on complex trimmed trivariates' based geometry, and we demonstrate the effectiveness of the method by applying IGA over the (untrimmed) results.

at March 23, 2019 11:28 PM UTC

Authors: Michelle Blom, Peter J. Stuckey, Vanessa Teague
Download: PDF
Abstract: Risk-limiting post election audits guarantee a high probability of correcting incorrect election results, independent of why the result was incorrect. Ballot-polling audits select ballots at random and interpret those ballots as evidence for and against the reported result, continuing this process until either they support the recorded result, or they fall back to a full manual recount. For elections with digitised scanning and counting of ballots, a comparison audit compares randomly selected digital ballots with their paper versions. Discrepancies are referred to as errors, and are used to build evidence against or in support of the recorded result. Risk-limiting audits for first-past-the-post elections are well understood, and used in some US elections. We define a number of approaches to ballot-polling and comparison risk-limiting audits for Instant Runoff Voting (IRV) elections. We show that for almost all real elections we found, we can perform a risk-limiting audit by looking at only a small fraction of the total ballots (assuming no errors were made in the tallying and distribution of votes).

at March 23, 2019 11:20 PM UTC

Authors: Umut A. Acar, Daniel Anderson, Guy E. Blelloch, Laxman Dhulipala
Download: PDF
Abstract: With the rapid growth of graph datasets over the past decade, a new kind of dynamic algorithm, supporting the ability to ingest batches of updates and exploit parallelism is needed in order to efficiently process large streams of updates. In this paper, we study batch and parallel algorithms for the dynamic connectivity problem, a fundamental problem that has received considerable attention in sequential setting. Perhaps the best known sequential algorithm is the elegant level-set algorithm of Holm, de Lichtenberg and Thorup (HDT), which achieves $O(\log^2 n)$ amortized time per edge insertion or deletion, and $O(\log n)$ time per query.

In this paper, we design a parallel batch-dynamic connectivity algorithm that is work-efficient with respect to the HDT algorithm for small batch sizes, and is asymptotically faster when the average batch size is sufficiently large. Given a sequence of batched updates, where $\Delta$ is the average batch size of all deletions, our algorithm achieves $O(\log n \log(1 + n / \Delta))$ expected amortized work per edge insertion and deletion and $O(\log^3 n)$ depth w.h.p. Our algorithm answers a batch of $k$ connectivity queries in $O(k \log(1 + n/k))$ expected work and $O(\log n)$ depth w.h.p. To the best of our knowledge, our algorithm is the first parallel batch-dynamic algorithm for connectivity.

at March 23, 2019 11:26 PM UTC

Short Presburger arithmetic is hard!

This is a belated report on a remarkable breakthrough from 2017. The paper is Short Presburger arithmetic is hard, by Nguyen and Pak.

Danny Nguyen

Integer programming in bounded dimension: Lenstra’s Theorem

Algorithmic tasks are often intractable. But there are a few miracles where efficient algorithms exist: Solving systems of linear equations, linear programming, testing primality,  and solving integer programming problems when the number of variables is bounded. The last miracle is a historic 1983 theorem of Hendrik Lenstra (Here is the paper) and it is the starting point of this post.

Lensra’s theorem: Consider a system of linear inequalities

Ax ≤ b

where x=(x_1,x_2,\dots x_k) is a vector of k variables, A is an integral k by n matrix and b is an integral vector of length n.

Let k be a fixed integer. There is a polynomial time algorithm to determine if the system has an integral solution.

Of course, the full Integer Programming problem when k is part of the input, is NP-complete. This problem came already (second in Karp’s list!) in the Mayflower of NP-complete problems –  Karp’s paper.

Sasha Barvinok famously showed in 1993 that even counting the number of solutions is in P. (Barvinok utilized the short generating function approach pioneered by Brion, Vergne and others.)

Kannan’s theorem

Next,  I want to describe an amazing 1990 theorem of Ravi Kannan,

Kannan’s theorem considers formulas with one quantifier alternation in the Presburger arithmetic and it asserts that when the number of variables is fixed,  there is a polynomial time algorithm to decide if the formula is satisfiable.

(Here is a free version of Kannan’s paper.) Also here the counting problems were tackled with great success. Barvinok and Kevin Woods remarkably showed how to count projections of integer points in a (single) polytope in polynomial time, and subsequently Woods extended this approach to general Presburger expressions Φ with a fixed number of inequalities!

An important strengthening was achieved by Friedrich Eisenbrand and  Gennady Shmonin in the 2008 paper Parametric integer programming in fixed dimension. See also the survey chapter by Barvinok Lattice points and lattice polytopes.

You can find the formulation of Kannan’s theorem in full generality a little further but let me present now a special case related to the famous Frobenius coin problem. (See this post on GLL for more on Presburger arithmetic)

Frobenius coin problem

Given k coins with integral values, the Frobinius coin problem is to determine the largest integer that cannot be described as positive integer combinations of the values of the coins. (See also this post on GLL.)

Theorem (Kannan): There is a polynomial time algorithm to solve the Frobenius coin problem for every fixed number of coins.

The issue of the way theory meets practice for the problems discussed in this post is very interesting but we will not discuss it. Let me remark that Herb Scarf (who along with Kannan played a role in B-L (Before Lenstra) developments) offered another approach for the solution of the Frobenius coin problem and related IP (Integer Programming)  problems based on his theory of maximal lattice-free convex bodies. See this related post.

More than one quantifier

Given the result of Kannan and later that of Barvinok and Woods, many people expected that also for two alternations, or even for any other fixed number of alternations, Presburger arithmetic would be in polynomial time. Nguyen and Pak proved that the problem is NP-complete already for two quantifier alternations! Here is the link to the paper Short Presburger arithmetic is hard. Igor Pak’s homepage has a few other related papers.

Let me bring here Sasha Barvinok’s MathSciNet featured review of Nguyen and Pak’s paper which tells the story better than I could.

Barvinok’s featured review to Nguyen and Pak’s paper

Presburger arithmetic allows integer variables, integer constants, Boolean operations (&, ∧, ¬), quantifiers (∃, ∀), equations and inequalities (=, <, >, ≤, ≥), addition and subtraction (+, −) and multiplication by integer constants. It does not allow multiplication of variables (if we allow multiplication of variables, we get Peano arithmetic).

Geometrically, a quantifier-free formula of Presburger arithmetic describes the set of integer points in a Boolean combination of rational polyhedra (that is, in the set obtained from finitely many rational polyhedra by taking unions, intersections and complements). Similarly, a formula of Presburger arithmetic with existential quantifiers only describes the set of integer points obtained from the set of integer points in a Boolean combination of polyhedra by a projection along some coordinates.

Unlike Peano arithmetic, Presburger arithmetic is decidable. Here the authors zoom in on the computational complexity of Presburger arithmetic, once the combinatorial complexity of the formula is bounded in advance. If we fix the number of variables, the validity of a formula with no quantifier alternations (that is, of the type ∃x_1 . . . ∃x_kΦ(x_1, \dots , latex x_k) or of the type ∀x_1 . . . ∀x_kΦ(x_1, \dots , x_k)) can be established in polynomial time by Lenstra’s integer programming algorithm [see H. W. Lenstra Jr., Math. Oper. Res. 8 (1983), no. 4, 538–548; MR0727410].

For a fixed number of variables, formulas with one quantifier alternation (∃x_1 . . . ∃x_ky_1 . . . ∀y_mΦ(x_1, \dots , x_k, y_1, \dots , y_m)) can also be solved in polynomial time, as shown by R. Kannan [in Polyhedral combinatorics (Morristown, NJ, 1989), 39–47, DIMACS Ser. Discrete Math. Theoret. Comput. Sci., 1, Amer. Math. Soc., Providence, RI, 1990; MR1105115]. The decision procedure can be characterized as a polynomial time algorithm for parametric integer programming.

Suppose now that we fix the number of variables and the number of Boolean operations in advance (and hence get what is called a short formula of Presburger arithmetic). Thus the only parameters of the formula are the numerical values of the constants in the formula. The authors show that deciding validity becomes NP-complete if one allows
two quantifier alternations. Remarkably, they present an example of a formula

∃z ∈ \mathbb Z ∀y ∈ \mathbb Z^2 ∃x ∈ \mathbb Z^2 Φ(x, y, z)

with an NP-complete decision problem, even though Φ contains at most 10 inequalities.
Another remarkable example is an NP-complete decision problem for a formula of the

∃z ∈ \mathbb Z ∀y ∈ \mathbb Z^2 ∃x ∈ \mathbb Z^6: Ax + By + Cz ≤ b,  

with at most 24 inequalities.

As the number of quantifier alternations is allowed to increase, the computational complexity in the polynomial hierarchy also moves up. The authors also describe the computational complexity of corresponding counting problems.

The proof is very clever; it uses the continued fraction expansion of a rational number to encode a growing family of intervals, with the help of which the authors build an NP-complete problem.


by Gil Kalai at March 22, 2019 09:47 AM UTC

March 25-29, 2019 The International Centre for Theoretical Sciences (ICTS), Bengaluru (India) The primary objective of this workshop is to bring together experts in the field of algebraic complexity and related areas to present their research, initiate collaborations etc. The idea is to continue the tradition of having a Workshop on Algebraic Complexity Theory … Continue reading Workshop on Algebraic Complexity Theory

by shacharlovett at March 22, 2019 09:16 AM UTC

The Curse of Metric Island

The deadline was approaching without mercy and there was, of course, still some polishing to be done for our SODA paper. But then we run into an issue. To make things worse, this issue turned out to be a hard one, a fundamental known open problem in computational geometry. The good thing is, I liked the problem so much that I decided to dedicate it this post. This is the story about the Sum of Square Roots problem and how we bypassed (ignored) it without solving it.

Everything began in the haze of the 70's of the last millennium. It is nebulous who stumbled first upon this enigma. Some say that Ron Graham has discussed the problem in public lectures, some others say that Joseph O'Rourke has posed it as an open question in the American Mathematical Monthly, while some others suspect that the problem had been already hiding in older different formulations. However, it is a historical fact that one shinny/cloudy day, three computer scientists finished polishing a manuscript that became a classical paper known as "Some NP-complete geometric problems". In this paper, Michael Garey, Ron Graham and David Johnson showed the NP-hardness of two important problems in geometric metrics: Steiner Tree and Traveling Salesman. For the Euclidean plane, they showed only NP-hardness as they did not manage to show that these problems are contained in NP. Moreover, they accentuated that we cannot even rule out that the decision version of Euclidean Minimum Spanning Tree is outside of NP. What a seeming paradox given that we can compute such a minimum tree in polynomial time! So, whom did they blame? The short answer: The Euclidean metric. Garey and his coauthors explain that all these problems have a common hidden issue: They rely on comparing Euclidean lengths, that is, they rely on comparing irrational numbers based on square roots. Whereas this task is trivial if we just want to compare two line segments (e.g. by comparing the radicands), the problem starts when we want to compare two polygonal paths. Even assuming rational (or, after scaling, integer) coordinates, this problem translates into a question that is fundamental in computational geometry: Given two lists of integers, a1 ... and b1 ..., can we decide whether "∑ √ai ≥ ∑ √bi" in P? Put into words: Can we efficiently compare two sums of square roots over integers?

To emphasize the significance of this question, let me cite David Eppstein: "A major bottleneck in proving NP-completeness for geometric problems is a mismatch between the real-number and Turing machine models of computation: one is good for geometric algorithms but bad for reductions, and the other vice versa. Specifically, it is not known on Turing machines how to quickly compare a sum of distances (square roots of integers) with an integer or other similar sums, so even (decision versions of) easy problems such as the minimum spanning tree are not known to be in NP."

"Is this problem computable at all?" - might the curious reader ask after realizing that the approach of iteratively increasing the precision until the first difference in digits will never terminate if both numbers happen to be equal. Luckily, Allan Borodin et al. and Johannes Blömer showed that polynomial time is enough to decide whether two such sums have the same value. Therefore, it is astonishing that the problem of deciding the sign of their difference, a question that seems only slightly harder, has been only recently spotted by Eric Allender et al on an island of the counting hierarchy CH (3rd level) located in the wide ocean of PSPACE. Thus, it is still far far away from the needle head of P, where Gregorio Malajovich conjectured it to live.

Until this will be shown, Jeff Erickson proposes to study which problems are (non-boringly) Σ√-hard.

Oh, if we could only sufficiently bound the difference of the sums from below! Then we could bound from above the precision required to determine the correct sign of the difference: Just image that B is such a lower bound on the difference. Now, take a precision that allows us to compute an approximate difference being only B/2 away from the real one. Then it is easy to verify that the following procedure determines the correct sign: Output the sign of the approximate difference if it is outside of [-B/2,B/2], otherwise output 0. What is the run time of this procedure? It is proportional to the precision, which corresponds to the length of B, that is, to -log B. Hence, if -log B is bounded from above by a polynomial in the length of the input, then we are in P! So, is there any reasonable lower bound for the worst case? That is, what is the smallest positive difference between any two sums of k square roots of integers of size at most n? Welcome to Problem 33 of The Open Problems Project! Let r(n,k) denote such a minimum difference as a function of n and k. For instance, consider n=20 and k=2. Then r(n,k) is roughly 0.0002 and it is attained by √{10} + √{11} - √{5} - √{18}. As being an open problem, there is no tight bound known for r(n,k). Jianbo Qian and Cao An Wang as well as Qi Cheng and Yu-Hsin Li proved(*) that there are sums where -log r(n,k) is at least in the order of magnitude of k log n. However, these lower bounds constitute an exponentially gap to the best known upper bounds which are O(22k log n) [Christoph Burnikel et al.(**)] and 2O(n/log n) [Qi Cheng et al.]. So, maybe you, dear reader, will be the one to close the gap?

In the dramatic opening of this post, I mentioned our paper, where the problem of Sum of Square Roots appeared. There, we design a PTAS for the Traveling Salesman problem (TSP) with hyperplane neighborhoods, that is, we look for a (1+ε)-approximation of a minimum-length tour that, instead of points, visits a given set of hyperplanes in ℝd with d being fixed. The basis of our algorithm is an integer grid of constant size (depending on ε and d). Our key observation is that there is a translation and scaling of the grid such that snapping an (adequately sparsified) optimum solution to the grid results in a (1+ε)-approximation. Using this insight, we let our algorithm enumerate all (reasonable) TSP tours lying in our integer grid. With a simple LP, we translate and scale each tour such that it visits all the hyperplanes of the input whilst minimizing its tour length (i.e., the scaling factor). At the end, we obtain a set of feasible TSP tours and output the shortest one. And here we were confronted with the Sum of Square Roots problem. Which of the tours is the shortest one? Was all our work devastated, all our blood, sweat and tears in vain? Just because the very last line of our algorithm had unknown complexity? Our solution was simple: We just skipped the issue and our paper got accepted. Why? Well, the issue wasn't one for the following three reasons:

  • We could be lazy and assume the real RAM computational model with constant time for square root operations (as often done in computational geometry).
  • We could be industrious and approximate the real tour lengths with a sufficiently small error since we look anyway for an approximation in overall.
  • We could think and then realize that our candidate tours consist of only constant many vertices. Indeed, the best known lower bound on the difference r(n,k) implies the following nice take-away which shall conclude this epilogue:

We can compare two sums of square roots in polynomial time if the number of square roots is constant!

Krzysztof Fleszar


Proof sketches for the interested reader of the lower and upper bounds on r(n,k):

(*) Ω(k log n) [Cheng et al.]: The interval [k, k √{n}] contains so many distinct sums of square roots (over square-free integers) that, by a pigeonhole argument, there are two sums with very small distance.

(**) O(22k log n) [Burnikel et al.]: The difference is an algebraic integer and, consequently, the root of a polynomial with integer coefficients and degree at most 22k. A simple inequality yields the bound.

by Renata Czarniecka at March 22, 2019 08:54 AM UTC

While melding topology, geometry, and analysis

IAS page

Karen Uhlenbeck is a mathematician who has won a number of awards in the past and has just now been announced as winner of the 2019 Abel Prize.

Today Ken and I want to explain a tiny bit about what Uhlenbeck did.

The Abel Prize citation says that Uhlenbeck won for

“pioneering achievements in geometric partial differential equations, gauge theory, and integrable systems, and for the fundamental impact on analysis, geometry and mathematical physics.”

A story in Quanta and another in Scientific American are among those with readable summaries of the general nature of this work. The latter describes Uhlenbeck’s discovery with the mathematician Jonathan Sacks of a phenomenon called bubbling as follows:

Sacks and Uhlenbeck were studying ‘minimal surfaces,’ the mathematical theory of how soap films arrange themselves into shapes that minimize their energy. But the theory had been marred by the appearance of points at which energy appeared to become infinitely concentrated. Uhlenbeck’s insight was to “zoom in” on those points to that this were caused by a new bubble splitting off the surface.

Some of the coolest comments are by Uhlenbeck’s doctoral graduate Mark Haskins in the story in the current issue of Nature.

Haskins says Uhlenbeck is one of those mathematicians who have ‘an innate sense of what should be true,’ even if they cannot always explain why.

The story recounts his often being baffled by answers to his questions, thinking Uhlenbeck had misheard them. But

“maybe weeks later, you would realize that you had not asked the correct question.”

Calculus Of Variations

Simon Donaldson wrote a piece in the current issue of AMS Notices that explains Uhlenbeck’s research in the Calculus of Variations. The article starts with

\displaystyle  F(u) = \int \Phi(u,u') dx.

You can think of {F(u)} as assigning a cost to a function {u=u(x)}. The goal of the calculus of variations is to find the best {u} that minimizes {F(u)} subject to some conditions on {u}. This is a huge generalization of simple minimization problems that arise in basic calculus. He then goes on to explain that in order to study the minimum solutions of such a function one quickly needs to examine partial differential equations. The math gets complex and beautiful very quickly.

As computer scientists who like discrete structures this is not our sweet spot. We rarely use partial derivatives in our work. Well not very often. See these two posts for an example.

To get a taste of this area, we will consider a classic variation problem coming out of these helpful online notes. It leads to integrals such as

\displaystyle  \int_{0}^{a} (1+u'(x)^{2})^{1/2} dx.

Well, we take to integrals even less than partial derivatives.

Straight-line Shortest Path

We will change things up by starting with a discrete approach—as is our wont. Our given task is to prove in general that a straight line is the shortest path from the origin to a given point {p = (a,b)}. We first consider polygonal paths with {n} line segments.

First, if {n = 1} then the only option allowed is to go from {(0,0)} to {(a,b)} in one line segment. Thus the conclusion holds trivially: the Euclidean distance {d(0,p)} is the minimum length of a {1}-segment path.

Now let {n \geq 2}. Let

\displaystyle  P = (0,0) \rightarrow x_1 \rightarrow x_2 \rightarrow \cdots \rightarrow x_n = (a,b)

be a series of line segments that form the shortest path from {0} to {p}. Now by induction, the minimum length of a path of up to {n-1} segments from {x_1} to {p} is {d(x_1,p)} via a straight line from {x_1} to {p}. And the length of the segment from the origin {0} to {x_1} of course is {d(0,x_1)}. Now the Euclidean triangle inequality says that the length {d(0,x_1) + d(x_1,p)} which bounds the length of this path from below is not less than {d(0,p)}. Thus we have proved it for {n} and the induction goes through.

What we really want to do, however, is prove that {d(0,p)} is the shortest length for any path, period. The path need not have any straight segments. It may go in circular arcs, continually changing direction. The arcs need not be circular per-se; they could be anything.

The idea that occurs to us computer scientists is to let {n} go to infinity. That is, we want to consider any path as being a limit of polygonal paths. But is this really legitimate? We can certainly approximate any path by paths of segments. But real analysis is littered with examples of complicated curves—themselves defined by limits—that defeat many intuitive expectations about continuity and limits. So how can we make such an infinitistic proof go through rigorously? This is where the calculus of variations takes over.

Minimizers of Functionals

To set up the problem for fully general paths, we could represent them as functions {f(t) = (x(t),y(t))} such that {f(0) = 0} and {f(1) = p}. The length of the path is then obtained by integrating all the horizontal and vertical displacements:

\displaystyle  \ell(f)(0,p) = \int_{t=0}^{t=1} \sqrt{\left(\frac{dx(t)}{dt}\right)^2 + \left(\frac{dy(t)}{dt}\right)^2} dt. \ \ \ \ \ (1)

Wrangling this integral seems daunting enough, but the real action involving {\ell(f)} only begins after doing so. Both the length {\ell(f)} and the body of the integral are functionals—that is, functions of a function. We need to minimize {\ell(f)} over all functions {f}. This is a higher-order task than minimizing a function at a point.

Our source simplifies the problem by assuming without loss of generality that {x} increases from {0} to {a}, giving the function {f} as {y(x)} instead. Then the problem becomes to minimize

\displaystyle  \ell(f) = \int_{x=0}^{x=a} \sqrt{1 + \left(\frac{dy(x)}{dx}\right)^2} dx. \ \ \ \ \ (2)

The body can be abstracted as a functional {F(u,u')} where {u} and its derivative {u'} are functions of {x}. Here we have {u = y(x)} and {u' = \frac{dy}{dx}}. The condition for {F} to minimize {\mathcal{F} = \int_x F} was derived by Leonhard Euler and Joseph Lagrange:

\displaystyle  \frac{d}{dx}\left(\frac{\partial F}{\partial u'}\right) = \frac{\partial F}{\partial u}\;. \ \ \ \ \ (3)

We won’t reproduce here how our source derives this but give some interpretation. This is a kind of regularity property that {F} must obey in order to minimize {\mathcal{F}}. To quote Donaldson’s survey:

Then the condition that {F} is stationary with respect to compactly supported variations of {u} is a second order differential equation—the Euler-Lagrange equation associated to the functional.

However you slice it, the point is that the equation (3), when applied to cases like the above, is attackable. In the minimum-length path example, our source—after doing eight more equation lines of work—deduces that {u' = \frac{dy}{dx}} must be constant. Any function argument {F = y(x)} that yields this must be a straight line. The initial conditions force this to be the straight line from {0} to {p}.

Some of Uhlenbeck’s Work

The point we are emphasizing is that this simple case of paths in the plane—and its abstraction via functionals {\mathcal{F}} that are ultimately founded on one variable {x}—have a ready-made minimization scheme, thanks to Euler and Lagrange. The scheme is fully general—not subject to the caveats about our simple approximation by line segments.

What happens in higher-dimensional cases? We can quote from the wonderful two-page essay accompanying the Abel Prize citation. It first notes the importance of a condition on functionals and their ambient spaces named for Richard Palais and Stephen Smale, which however fails for many cases of interest including harmonic maps.

[T]he Palais-Smale compactness condition … guarantees existence of minimizers of geometric functionals and is successful in the case of 1-dimensional domains, such as closed geodesics. Uhlenbeck realized that the condition of Palais-Smale fails in the case of surfaces due to topological reasons.

The papers with Sacks explored the roots of these breakdowns and found a way to patch them. The violation of the Palais-Smale condition allows minimizing sequences of functionals to converge with dependence on points outside the space being analyzed. But those loci are governed by a finite set of singular points within the space. This enables the calculus outside the space to be treated as a re-scaling of what goes on inside the space.

In general cases the view of the process from inside to outside can be described and analyzed as bubbles emerging from the singular locations. More than this picture and interpretation, the Sacks-Uhlenbeck papers produced a now-standard tool-set for higher-dimensional minimization of functionals. It is also another successful marriage of topology—determining the singularities—and analysis.

This work was extensible to more-general kinds of functionals such as a central one of Yang-Mills theory in physics. Geometric properties of a Riemannian manifold {M} are expressed via the concept of a connection {A} and the functional associates to {A} its curvature {F(A)}. This is the body for the Yang-Mills functional

\displaystyle  \mathcal{F} = \int_M |F(A)|^2.

There is a corresponding lifting of the Euler-Lagrange equation. This led to developments very much along lines of the previous work with Sacks and more besides. There was particular success analyzing cases where {M} has dimension 4 that were soon relevant to Donaldson’s own Fields Medal-winning research on these spaces. Most in particular, Uhlenbeck working solo proved that these cases were immune to the “bubbling” issue—with the consequence as related in Quanta that

any finite-energy solution to the Yang-Mills equations that is well-defined in the neighborhood of a point will also extend smoothly to the point itself.

Open Problems

We’ve been happy to report that Uhlenbeck has won the prestigious Abel Prize. We have avoided referencing one aspect—despite giving numerous quotes verbatim—that can be appreciated in subsequent fullness here and here and in this. By so doing we’ve abided the desire stated in the twelfth paragraph of this essay. We wonder if this is the right way to do things. What do you think?

by RJLipton+KWRegan at March 22, 2019 02:45 AM UTC

Participants and Their Research Interests

This week I'm in Germany for the Dagstuhl workshop on Computational Complexity of Discrete Problems well timed for Georgia Tech spring break. No Bill, so no typecast, no family, just a bunch of fellow theorists. New this year, beer.

Dagstuhl always had bottled beer (and wine), after all this is Germany. However, Ronen Shaltiel is living his lifelong dream of bringing a keg to Dagstuhl. Turns out Dagstuhl had a refrigerator/tap for a beer keg, one only needs to order and pay for the beer. Ronen's daughter designed a special logo, though the keg contains Bitburger, a fine German pilsner. Prost to Ronen and thanks for the beer.

Of course the fun is hanging with colleagues old and new. Talking about open problems old and new. Used to solve more of them back in the day, now it seems harder.

I did learn a new old theorem, planarity testing, whether you can embed a given graph in the plane so no two edges cross, is computable in log space. In 2000 Eric Allender and Meena Mahajan showed that you can test for planarity in symmetric log space, basically the complexity class whose complete problem is undirected connectivity. In 2005, Omer Reingold famously showed that undirected connectivity is computable in log-space. Thus planarity testing is in log-space, a result you might have missed if you didn't know both papers.

This came out in Eric's talk on his work with Archit Chauhan, Samir Datta and Anish Mukherjee showing that checked whether there is a directed path from a given node s to a given node t in planar graphs can be computed by concurrent-read exclusive-write on parallel random-access machines in logarithmic time, and thus likely weaker than directed s-t paths on general graphs.

by Lance Fortnow ( at March 21, 2019 10:11 AM UTC

Semantic representations (aka semantic embeddings) of complicated data types (e.g. images, text, video) have become central in machine learning, and also crop up in machine translation, language models, GANs, domain transfer, etc. These involve learning a representation function $f$ such that for any data point $x$ its representation $f(x)$ is “high level” (retains semantic information while discarding low level details, such as color of individual pixels in an image) and “compact” (low dimensional). The test of a good representation is that it should greatly simplify solving new classification tasks, by allowing them to be solved via linear classifiers (or other low-complexity classifiers) using small amounts of labeled data.

Researchers are most interested in unsupervised representation learning using unlabeled data. A popular approach is to use objectives similar to the word2vec algorithm for word embeddings, which work well for diverse data types such as molecules, social networks, images, text etc. See the wikipage of word2vec for references. Why do such objectives succeed in such diverse settings? This post is about an explanation for these methods by using our new theoretical framework with coauthor Misha Khodak. The framework makes minimalistic assumptions, which is a good thing, since word2vec-like algorithms apply to vastly different data types and it is unlikely that they can share a common Bayesian generative model for the data. (An example of generative models in this space is described in an earlier blog post on the RAND-WALK model.) As a bonus this framework also yields principled ways to design new variants of the training objectives.

Semantic representations learning

Do good, broadly useful representations even exist in the first place? In domains such as computer vision, we know the answer is “yes” because deep convolutional neural networks (CNNs), when trained to high accuracy on large multiclass labeled datasets such as ImageNet, end up learning very powerful and succinct representations along the way. The penultimate layer of the net — the input into the final softmax layer — serves as a good semantic embedding of the image in new unrelated visual tasks. (Other layers from the trained net can also serve as good embeddings.) In fact the availability of such embeddings from pre-trained (on large multiclass datasets) nets has led to a revolution in computer vision, allowing a host of new classification tasks to be solved with low-complexity classifiers (e.g. linear classifiers) using very little labeled data. Thus they are the gold standard to compare to if we try to learn embeddings via unlabeled data.

word2vec-like methods: CURL

Since the success of word2vec, similar approaches were used to learn embeddings for sentences and paragraphs, images and biological sequences. All of these methods share a key idea: they leverage access to pairs of similar data points $x, x^+$, and learn an embedding function $f$ such that the inner product of $f(x)$ and $f(x^+)$ is on average higher than the inner product of $f(x)$ and $f(x^-)$, where $x^{-}$ is a random data point (and thus presumably dissimilar to $x$). In practice similar data points are usually found heuristically, often using co-occurrences, e.g. consecutive sentences in a large text corpus, nearby frames in a video clip, different patches in the same image, etc.

A good example of such methods is Quick Thoughts (QT) from Logeswaran and Lee, which is the state-of-the-art unsupervised text embedding on many tasks. To learn a representation function $f$, QT minimizes the following loss function on a large text corpus

where $(x, x^+)$ are consecutive sentences and presumably “semantically similar” and $x^-$ is a random negative sample. For images $x$ and $x^+$ could be nearby frames from a video. For text, two successive sentences serve as good candidates for a similar pair: for example, the following are two successive sentences in the Wikipedia page on word2vec: “High frequency words often provide little information.” and “Words with frequency above a certain threshold may be subsampled to increase training speed.” Clearly they are much more similar than a random pair of sentences, and the learner exploits this. From now on we use Contrastive Unsupervised Representation Learning (CURL) to refer to methods that leverage similar pairs of data points and our goal is to analyze these methods.

Need for a new framework

The standard framework for machine learning involves minimizing some loss function, and learning is said to succeed (or generalize) if the loss is roughly the same on the average training data point and the average test data point. In contrastive learning, however, the objective used at test time is very different from the training objective: generalization error is not the right way to think about this.

Main Hurdle for Theory: We have to show that doing well on task A (minimizing the word2vec-like objective) allows the representation to do well on task B (i.e., classification tasks revealed later).

Earlier methods along such lines include kernel learning and semi-supervised learning, but there training typically requires at least a few labeled examples from the classification tasks of future interest. Bayesian approaches using generative models are also well-established in simpler settings, but have proved difficult for complicated data such as images and text. Furthermore, the simple word2vec-like learners described above do not appear to operate like Bayesian optimizers in any obvious way, and also work for very different data types.

We tackle this problem by proposing a framework that formalizes the notion of semantic similarity that is implicitly used by these algorithms and use the framework to show why contrastive learning gives good representations, while defining what good representations mean in this context.

Our framework

Clearly, the implicit/heuristic notion of similarity used in contrastive learning is connected to the downstream tasks in some way — e.g., similarity carries a strong hint that on average the “similar pairs” tend to be assigned the same labels in many downstream tasks (though there is no hard guarantee per se). We present a simple and minimalistic framework to formalize such a notion of similarity. For purposes of exposition we’ll refer to data points as “images”.

Semantic similarity

We assume nature has many classes of images, and has a measure $\rho$ on a set of classes $\mathcal{C}$, so that if asked to pick a class it selects $c$ with probability $\rho(c)$. Each class $c$ also has an associated distribution $D_c$ on images i.e. if nature is asked to furnish examples of class $c$ (e.g., the class “dogs”) then it picks image $x$ with probability $D_c(x)$. Note that classes can have arbitrary overlap, including no overlap. To formalize a notion of semantic similarity we assume that when asked to provide “similar” images, nature picks a class $c^+$ from $\mathcal{C}$ using measure $\rho$ and then picks two i.i.d. samples $x, x^{+}$ from the distribution $D_{c^+}$. The dissimilar example $x^{-}$ is picked by selecting another class $c^-$ from measure $\rho$ and picking a random sample $x^{-}$ from $D_{c^-}$.

The training objective for learning the representation is exactly the QT objective from earlier, but now inherits the following interpretation from the framework

Note that the function class $\mathcal{F}$ is an arbitrary deep net architecture mapping images to embeddings (neural net sans the final layer), and one would learn $f$ via gradient descent/back-propagation as usual. Of course, no theory currently exists for explaining when optimization succeeds for complicated deep nets, so our framework will simply assume that gradient descent has already resulted in some representation $f$ that achieves low loss, and studies how well this does in downstream classification tasks.

Testing representations

What defines a good representation? We assume that the quality of the representation is tested by using it to solve a binary (i.e., two-way) classification task using a linear classifier. (The paper also studies extensions to $k$-way classification in the downstream task.) How is this binary classification task selected? Nature picks two classes $c_1, c_2$ randomly according to measure $\rho$ and picks data points for each class according to the associated probability distributions $D_{c_1}$ and $D_{c_2}$. The representation is then used to solve this binary task via logistic regression: namely, find two vectors $w_1, w_2$ so as to minimize the following loss

The quality of the representation is estimated as the average loss over nature’s choices of binary classification tasks.

It is important to note that the latent classes present in the unlabeled data are the same classes present in the classification tasks. This allows us to formalize a sense of ‘semantic similarity’ as alluded to above: the classes from which data points appear together more frequently are the classes that make up relevant classification tasks. Note that if the number of classes is large, then typically the data used in unsupervised training may involve no samples from the classes used at test time. Indeed, we are hoping to show that the learned representations are useful for classification on potentially unseen classes.

Provable guarantees for unsupervised learning

What would be a dream result for theory? Suppose we fix a class of representation functions ${\mathcal F}$, say those computable by a ResNet 50 architecture with some choices of layer sizes etc.

Dream Theorem: Minimizing the unsupervised loss (using modest amount of unlabeled data) yields a representation function $f \in {\mathcal F}$ that is competitive with the best representation from ${\mathcal F}$ on downstream classification tasks, even with very few labeled examples per task.

While the number of unlabeled data pairs needed to learn an approximate minimizer can be controlled using Rademacher complexity arguments (see paper), we show that the dream theorem is impossible as phrased: we can exhibit a simple class ${\mathcal F}$ where the contrastive objective does not yield representations even remotely competitive with the best in the class. This should not be surprising and only suggests that further progress towards such a dream result would require making more assumptions than the above minimalistic ones.

Instead, our paper makes progress by showing that under the above framework, if the unsupervised loss happens to be small at the end of contrastive learning then the resulting representations perform well on downstream classification.

Simple Lemma: The average classification loss on downstream binary tasks is upper bounded by the unsupervised loss. where $\alpha$ depends on $\rho$. ($\alpha\rightarrow 1$ when $|\mathcal{C}|\rightarrow\infty$, for uniform $\rho$)

This says that the unsupervised loss function can be treated as a surrogate for the performance on downstream supervised tasks solved using linear classification, so minimizing it makes sense. Furthermore, just a few labeled examples are needed to learn the linear classifiers in future downstream tasks. Thus our minimalistic framework lets us show guarantees for contrastive learning and also highlights the labeled sample complexity benefits provided by it. For details as well as more finegrained analysis see the paper.

Extensions of the theoretical analysis

This conceptual framework not only allows us to reason about empirically successful variants of (1), but also leads to the design of new, theoretically grounded unsupervised objective functions. Here we give a high level view; details are in our paper.

A priori, one might imagine that the log and exponentials in (1) have some information-theoretic interpretation; here we relate the functional form to the fact that logistic regression is going to be used in the downstream classification tasks. Analogously, if the classification is done via hinge loss, then (2) is true for a different unsupervised loss that uses a hinge-like loss instead. This objective, for instance, was used to learn image representations from videos by Wang and Gupta. Also, usually in practice $k>1$ negative samples are contrasted with each positive sample $(x,x^+)$ and the unsupervised objective looks like the $k$-class cross-entropy loss. We prove a statement similar to (2) for this setting, where the supervised loss now is the average $(k+1)$-way classification loss.

Finally, the framework provides guidelines for designing new unsupervised objectives when blocks of similar data are available (e.g., sentences in a paragraph). Replacing $f(x^+)$ and $f(x^-)$ in (1) with the average of the representations from the positive and the negative block respectively, we get a new objective which comes with stronger guarantees and better performance in practice. We experimentally verify the effectiveness of this variant in our paper.


We report some controlled experiments to verify the theory. Lacking a canonical multiclass problem for text, we constructed a new 3029-class labeled dataset where a class is one of 3029 articles from Wikipedia, and datapoints are one of $200$ sentences in these articles. Representations will be tested on a random binary classification task that involves two articles, where the labels of the data point is which of the two articles it belongs to. (A 10-way classification task is similarly defined.) Datapoints for the test tasks will be held out while training representations. The class of sentence representation ${\mathcal F}$ is a simple multilayer architecture one based on Gated Recurrent Unit (GRU).

The supervised method for learning representations trains a multiclass classifier on the 3029-way task and the representation is taken from the layer before the final softmax output. This was the gold standard in above discussions.

The unsupervised method is fed pairs of similar data points generated according to our theory: similar data points are just pairs of sentences sampled from the same article. Representations are learnt by minimizing the above unsupervised loss objectives.

The highlighted parts in the table show that the unsupervised representations compete well with the supervised representations on the average $k$-way classification task ($k=2, 10$).

Additionally, even though not covered by our theory, the representation also performs respectably on the full multiclass problem. We find some support for a suggestion of our theory that the mean (centroid) of the unsupervised representations in each class should be good classifiers for average $k$-way supervised tasks. We find this to be true for unsupervised representations, and surprisingly for supervised representations as well.

The paper also has other experiments studying the effect of number of negative samples and larger blocks of similar data points, including experiments on the CIFAR-100 image dataset.


While contrastive learning is a well-known intuitive algorithm, its practical success has been a mystery for theory. Our conceptual framework lets us formally show guarantees for representations learnt using such algorithms. While shedding light on such algorithms, the framework also lets us come up with and analyze variants of it. It also provides insights into what guarantees are provable and shapes the search for new assumptions that would allow stronger guarantees. While this is a first cut, possible extensions include imposing a metric structure among the latent classes. Connections to meta-learning and transfer learning may also arise. We hope that this framework influences and guides practical implementations in the future.

at March 19, 2019 07:00 PM UTC

The determinant polynomial $Det_n(\mathbf{x})$ of degree $n$ is the determinant of a $n \times n$ matrix of formal variables. A polynomial $f$ is equivalent to $Det_n$ over a field $\mathbf{F}$ if there exists a $A \in GL(n^2,\mathbf{F})$ such that $f = Det_n(A \cdot \mathbf{x})$. Determinant equivalence test over $\mathbf{F}$ is the following algorithmic task: Given black-box access to a $f \in \mathbf{F}[\mathbf{x}]$, check if $f$ is equivalent to $Det_n$ over $\mathbf{F}$, and if so then output a transformation matrix $A \in GL(n^2,\mathbf{F})$. In Kayal (2012), a randomized polynomial time determinant equivalence test was given over $\mathbf{C}$. But, to our knowledge, the complexity of the problem over finite fields and over rationals was not well understood. In this work, we give a randomized $poly(n,\log |\mathbf{F}|)$ time determinant equivalence test over finite fields $\mathbf{F}$ (under mild restrictions on the characteristic and size of $\mathbf{F}$). Over rationals, we give an efficient randomized reduction from factoring square-free integers to determinant equivalence test for quadratic forms (i.e. the $n=2$ case), assuming GRH. This shows that designing a polynomial-time determinant equivalence test over rationals is a challenging task. Nevertheless, we show that determinant equivalence test over rationals is decidable: For bounded $n$, there is a randomized polynomial-time determinant equivalence test over rationals with access to an oracle for integer factoring. Moreover, for any $n$, there is a randomized polynomial-time algorithm that takes input black-box access to a rational polynomial $f$ and if $f$ is equivalent to $Det_n$ over rationals then it returns a $A \in GL(n^2,\mathbf{L})$ such that $f = Det_n(A \cdot \mathbf{x})$, where $\mathbf{L}$ is an extension field of $\mathbf{Q}$ of degree at most $n$. The above algorithms over finite fields and over rationals are obtained by giving a polynomial-time randomized reduction from determinant equivalence test to another problem, namely the full matrix algebra isomorphism problem. We also show a reduction in the converse direction which is efficient if $n$ is bounded. These reductions, which hold over any $\mathbf{F}$ (under mild restrictions on the characteristic and size of $\mathbf{F}$), establish a close connection between the complexity of the two problems. This then lead to our results via applications of known results on the full algebra isomorphism problem over finite fields and over $\mathbf{Q}$.

at March 18, 2019 12:28 PM UTC

I took a poll of the theory community (and others) about P vs NP and related issues in 2002, 2012, and 2019 (sorry its not an arithmetic  sequence --- read the third poll article to see why). The 2002 and 2012 polls are on the web (see here and here ), and now the third poll is out and its here or here . All three appear in Lane's Complexity Column in SIGACT News.

I'll give some highlights and thought about the third poll, but for more read the article. Or read all three and see how opinions have changed over time.

0) How many respondents: 100 the first time, and 152 the second time, 124 the third time.

1) P≠NP is at about 80%. When restricted to people who have thought about the problem a lot (my judgement) then it goes to 99%. The strongest P≠NP is by Lance who says

                                People who think P=NP are like people who believe Elvis is alive.

I disagree. I think Elvis is alive and is trying to prove P=NP.

2) When will it be solved? 66% thought it would be solved before 2100, a bit more than in prior years. But also more thought it would never be solved. Some commented that a computer might solve it.  I doubt that, but I do think this poll will be done by a computer in 10 years.

3) Because I used Survey monkey (1) each question got more answers, and (2) most  questions got a forced YES or NO  so less funny comments or room for creative answers. People could still leave comments, and some did.

4) Related to point (3): The last time I did the poll P=BPP was thought by everyone who answered the question. This year far more people answered it and quite a few thought P≠BPP. This may be because Survey monkey had a psychological affect of making people vote even if they didn't really know (people who have thought a lot about P vs NP  all thought P=BPP). Has  there been evidence that P≠BPP that I am unaware of? Or since there has not been that much progress on it maybe they are unequal. 10 years ago I would have thought we would have L=RL by now.

5) The last time I did the poll 10 people thought factoring IS in P and the NSA or the KGB knows that. This time around nobody thought that. Why? Those 10 people have vanished mysteriously.

6) Five people thought P vs NP will be resolved by Harambe. That is more than any person got.

7)  Is this poll worth doing? I would say yes (gee, I have to having done his poll three times) because it is good to have an objective record of subjective opinion.

8) I'll give some of my answers: P≠NP, Elvis is dead, and both will be proven in the year 2525. For more about the future see this.

by GASARCH ( at March 17, 2019 10:55 PM UTC

In this paper we study the quantum learnability of constant-depth classical circuits under the uniform distribution and in the distribution-independent framework of PAC learning. In order to attain our results, we establish connections between quantum learning and quantum-secure cryptosystems. We then achieve the following results. 1) Hardness of learning $AC^0$ and $TC^0$ under the uniform distribution. Our first result concerns the concept class $TC^0$ (resp. $AC^0$), the class of constant depth and polynomial-sized circuits with unbounded fan-in majority gates (resp. AND, OR, NOT gates). We show that if there exists no quantum polynomial time (resp. sub-exponential time) algorithm to solve the Learning with Errors (LWE) problem, then there exists no polynomial time quantum learning algorithm for $TC^0$ (resp. $AC^0$) under the uniform distribution (even with access to quantum membership queries). The main technique in this result uses explicit pseudo-random generators that are believed to be quantum-secure to construct concept classes that are hard to learn quantumly under the uniform distribution. 2) Hardness of learning $TC^0_2$ in the PAC setting. Our second result shows that if there exists no quantum polynomial time algorithm for the LWE problem, then there exists no polynomial time quantum PAC learning algorithm for the class $TC^0_2$, i.e., depth-$2$ $TC^0$ circuits. The main technique in this result is to establish a connection between the quantum security of public-key cryptosystems and the learnability of a concept class that consists of decryption functions of the cryptosystem. This gives a strong conditional negative answer to one of the "Ten Semi-Grand Challenges for Quantum Computing Theory" raised by Aaronson, who asked if $AC^0$ and $TC^0$ can be PAC-learned in quantum polynomial time.

at March 17, 2019 07:55 AM UTC

Facing nonexistential realities

Neil L. is a Leprechaun. He has graced these pages before.

Today, the day before St. Patrick’s Day, we ponder universal riddles of existence. Ken, who will visit me this coming week, insisted on reporting what happened this morning.

I woke up early, walked alone into our living room, and was amazed to see Neil already here. He was looking out the window at dawn sunlight flooding over St. Patrick’s Cathedral and the many other landmarks we can see from our apartment in midtown Manhattan.

Spring has at last given an earnest of coming. The scene was transfixing, exhilirating, all the glory of nature except for the green-clad little man puffing his pipe with his back to me.


He turned and tipped his hat to me but then turned back to the window. I wondered if I had caught him in meditation. I stood back until he turned a second time. He had always visited me late on the evening before St. Patrick’s Day, or the morning of it.

“You’re here early.”

“Aye. Top o’ the morning to ye.”


He gave me a long stare but warmly, not hostile.

“By the living daylights in ye…”

Then I understood. The Romans have their Ides, the Irish have the 17th, but I got the day in between—when last year I was not here but in an operating room for long hours.

“Thanks” was all I could say. I had actually come out to look up a mathematical idea for a “part 2” post I’ve been struggling with ever since posting the “part 1” years ago. This pulled me back to math, then to how Neil has always tricked me and I’ve never been able to pin him down. I realized this might be my best opportunity.

“Neil, may I ask you a Big Question? Not directly personal.”

He simply said, “Aye.” No tricks yet.

The Question

I had my chance—something I’ve wanted to ask him for years but not had an opening for.

“Neil, what can you tell me about female Leprechauns?”

Neil had mentioned family, cousins, even siblings—I figured they had to come from somewhere. But nothing I’d read mentioned female leprechauns. I braced for silence again, but Neil’s reply was immediate and forthright:

“They comprise the most heralded subset of our people. They are included in everything we do.”

“How so?”

“Our society agreed early on that every female leprechaun should have the highest station. All female leprechauns are sent to the best schools, with personal tutoring reserved in advanced subjects. They are our superpartners in every way.”

“How do you treat them?”

“If a lady leprechaun applies for a position, she is given full consideration. Whenever a lady leprechaun gives advice, it is harkened to. None of us has ever ‘malesplained’ or demeaned a lady leprechaun in any way.”

“Are they as short—uh, tall—as you?”

“Every adult female leprechaun is between one cubit and one-and-a-half in stature, like us menfolk. We don’t have quite the variation of your human species.”

“I would surely like to meet one.”

Neil heaved a sigh.

“Ye know the trepidation with us and your womenfolk.” Indeed, I recalled Neil’s story of what was evidently his own harrowing encounter with the wife of William Hamilton. “It is symmetrical, I’m afraid. No female leprechaun has ventured into your world.”

I had figured that.

“But can you give me an example of a female leprechaun? Someone who has done something notable.”

The Reality

Neil puffed on his pipe. I knew I had him cornered and moved in.

“Neil, is everything you just said about female leprechauns true?”

To my surprise, he answered quickly.

Aye. As in mathematics, each of my statements was perfectly true.”

Oh. Then I realized how empty domains are treated in mathematical logic. I looked at him sharply.

“Neil, there are no female leprechauns. All of your statements have been vacuous, bull-.”

“That’s not right or fair. That which I told ye have been important values of our society from the beginning. Many studies of female leprechauns have established their natures precisely. We have devoted vast resources to the quest for female leprechauns. Readiness—how we would integrate them—this is enshrined in all of our laws. Even your late Senator, Birch Bayh, was informed by us at Notre Dame.”

Neil’s sincerity was evident. I still felt bamboozled. He carried on.

“Ye have whole brilliant careers studying physical objects that may not exist. The analysis of them is still important in mathematics and other scientific applications. And in mathematics itself—”

I was actually relieved to turn the subject back toward mathematics, as Neil and I usually talked about.

“—your most storied advance of the past quarter century was achieved by ten years’ concerted study of the Frey-Hellegouarch curve. Which by Fermat’s Last Theorem does not exist.”


I could not argue with that. This set me pondering. Whether mathematical objects exist in reality defines the debate over Platonism. But what, then, is the reality of nonexistent mathematical objects?

Nonexistent objects still follow rules and aid deductions about objects that do exist. So, they too exist? Moreover, we often cannot prove that those objects do not exist—for all we know they may be just as real. Polynomial-time algorithms for satisfiability and many other problems—those are bread and butter in theory. I wanted Neil to tell me more about some of them.

“Neil, have your folk worked on the projective plane of order 10?”

Aye. Not only did we long ago have all the results of this paper on them, our engineers recommended it as the thoroughfare pattern for new towns.”

“You have built them?”

“Indeed—in two new districts around Carlingford back home.”

“But they don’t exist.”

“By our laws and treaties they exist until your people prove they don’t. Then they have a 50-year sunset. Which means the towns have 20 years left.

“That’s sad.”

“It’s actually a much better system than the Scots have. Their villages of this type such as Brigadoon come back every 100 years just for one day. But Brexit may put paid to our two towns sooner. We are protected under E.U. law, but those towns are just over the Northern Ireland border.

Cropped from src1, src2

“How about odd perfect numbers?”

Nay. Ye know the rules. We can nay tell that which your folk do not know, unless there is nothing else ye learn that ye could not learn without us.”

I pondered: what could I ask about computational complexity that Neil could respond to in this kind of zero-knowledge way? But Neil cut in.

“It will be much more fruitful to discuss objects that ye know do not exist—and yet ye use them anyway.”

Mathematical Leprechauns

I thought, which could he mean? I recalled the story we told about a student getting his degree for a thesis with many new results on a strong kind of Lipschitz/Hölder function even though an examiner observed that non-constant ones do not exist. Ken heard the same story as an undergraduate, with the detail that it was an undergraduate senior thesis, not PhD. The Dirac delta function? Well, that definitely exists as a measure. I forgot an obvious example until Neil said it.

“There is the field with one element: {\mathbb{F}_1}.”

Of course. We posted about it.

“It is an active research subject, especially in the past dozen years. Your Fields medalist Alain Connes has co-written a whole series of papers on it. Yuri Manin also. Your blog friend Henry Cohn got a paper on it into the Monthly—think of the youngsters… Edifices are even being built upon it.”

“Can you give some less-obvious ones for our readers?”

“There is the uniform distribution on the natural numbers. This paper gives variations that do exist, but it makes the need and use of the original concept clear.”

Ken had in fact once finished a lecture on Kolmogorov complexity when short on time by appealing to it.

“There are fields F whose algebraic closures have degrees 3 and 5 over F. Those are highly useful.”

I knew by a theorem of Emil Artin and Otto Schreier that they cannot exist. But it struck me as natural that they should exist. I wondered how much one would need to tweak the rules of group theory to make them possible.

“Then there is the free complete lattice on 3 generators.”

Wait—I thought that could exist if one loosened up set theory. So I asked:

“Neil, do some of these objects exist in alternative worlds known to leprechauns in which the rules of mathematics are crazy, not like the real world?”

“What makes you think ye don’t live in one of those other worlds?”

Before I could react, Neil picked up a solid glass sphere art object that belonged to Kathryn. He pulled out a diamond-encrusted knife and slashed it as green light flashed in every direction until he stopped. Then from his hands he gave me two glass spheres of the same size.

“For the happy couple—la le Padraig suna ye-uv!

And he was gone.

Open Problems

What is your view on the nature of mathematical reality? Do nonexistent mathematical objects exist? What are your favorite examples?

by RJLipton+KWRegan at March 17, 2019 04:42 AM UTC

It’s the last day of classes for the winter quarter here at UCI, and a good time for some spring cleaning of old bookmarked links. Probably also a good time for a reminder that Google+ is shutting down in two weeks so if, like me, you still have links to it then you don’t have long to replace them with archived copies before it gets significantly more difficult.

by David Eppstein at March 15, 2019 10:53 PM UTC

The purpose of this post is mostly just to signal-boost Konstantin Kakaes’s article in MIT Technology Review, entitled “No, scientists didn’t just ‘reverse time’ with a quantum computer.” The title pretty much says it all—but if you want more, you should read the piece, which includes the following droll quote from some guy calling himself “Director of the Quantum Information Center at the University of Texas at Austin”:

If you’re simulating a time-reversible process on your computer, then you can ‘reverse the direction of time’ by simply reversing the direction of your simulation. From a quick look at the paper, I confess that I didn’t understand how this becomes more profound if the simulation is being done on IBM’s quantum computer.

Incredibly, the time-reversal claim has now gotten uncritical attention in Newsweek, Discover, Cosmopolitan, my Facebook feed, and elsewhere—hence this blog post, which has basically no content except “the claim to have ‘reversed time,’ by running a simulation backwards, is exactly as true and as earth-shattering as a layperson might think it is.”

If there’s anything interesting here, I suppose it’s just that “scientists use a quantum computer to reverse time” is one of the purest examples I’ve ever seen of a scientific claim that basically amounts to a mind-virus or meme optimized for sharing on social media—discarding all nontrivial “science payload” as irrelevant to its propagation.

by Scott at March 15, 2019 06:49 PM UTC

My 1983 Ph D thesis was on Helly-type theorems which is an exciting part of discrete geometry and, in the last two decades, I have had an ongoing research project with Roy Meshulam on topological Helly-type theorems. The subject found new connections with structural graph theory and Gyárfás-type problems, and also with high-dimensional expanders.  We already devoted quite a few posts directly and indirectly to Helly-type theorems.

In this post I would like to report on two recent related breakthroughs, one by Andreas Holmsen and Dong-Gyu Lee, and the other by Andreas Holmsen. These developments are related to problems that are described in Problems for Imre Bárány’s birthday, and to my earlier 2004 Oberwolfach report “Expectations from commutative algebra“. While at it I will mention some earlier important papers by Pauline Sarrabezolles, by Shay Moran and Amir Yehudayoff, and by Boris Bukh.

Helly theorem, colorful Helly, and the Fractional Helly Theorem.

You can find the statements of Helly theorem, Radon theorem, and the colorful Caratheodory theorem in this post.  The colorful Helly theorem was first observed by Lovasz. The fractional Helly theorem was first proved by Katchalski and Liu.

The colorful Helly theorem: Let {\cal K}_1, {\cal K}_2, \dots, {\cal K}_{d+1} be d+1 nonempty families of convex sets in \mathbb R^d. If for every choice of a transversal – one set from every family – there is a point in common to all the chosen sets then for one of the families, there is a point in common to all sets in the family.

[Sorry, I got it wrong first, thanks to Shakhar Smorodinsky for alerting me.]

The fractional Helly theorem: For every \alpha >0 there is \beta >0 such that if K_1,K_2,\dots, K_n are n convex sets in \mathbb R^d and at least \alpha fraction of (d+1)-tuples of the sets have a point in common then a fraction of at least \beta of the sets have a point in common.

I should mention that these theorems do not follow combinatorially from Helly’s theorem itself. They manifest deep properties of hypergraphs coming from geometry, and it is an important question to give general abstract properties, combinatorial or topological leading to such properties.

The result of Holmsen and Lee asserts roughly that the fractional Helly property follows combinatorially from Radon’s theorem! Holmsen’s result asserts that  the fractional Helly property follows combinatorially from the colorful Helly theorem!

Andreas Holmsen Dong-Gyu Lee: Fractional Helly from Radon

Radon numbers and the fractional Helly theorem

Authors: Andreas Holmsen Dong-Gyu Lee

Abstract: A basic measure of the combinatorial complexity of a convexity space is its Radon number. In this paper we show a fractional Helly theorem for convexity spaces with a bounded Radon number, answering a question of Kalai. As a consequence we also get a weak epsilon-net theorem for convexity spaces with a bounded Radon number. This answers a question of Bukh and extends a recent result of Moran and Yehudayoff.

Andreas Holmsen: Fractional Helly follows from colorful Helly

Large cliques in hypergraphs with forbidden substructures

Author: Andreas F. Holmsen

Abstract: A result due to Gyárfás, Hubenko, and Solymosi (answering a question of Erdös) states that if a graph G on n vertices does not contain K_{2,2} as an induced subgraph yet has at least c{{n} \choose {2}} edges, then G has a complete subgraph on at least c^2/10n vertices. In this paper we suggest a “higher-dimensional” analogue of the notion of an induced K_{2,2} which allows us to generalize their result to k-uniform hypergraphs. Our result also has an interesting consequence in discrete geometry. In particular, it implies that the fractional Helly theorem can be derived as a purely combinatorial consequence of the colorful Helly theorem.


Let me mention that Gyárfás’ type problems in graph theory and a little on the connection with fractional Helly were mentioned in the post When a few colors suffice. On this front major advances has happened in a series of recent papers and I will discuss them at a later post. (But, for now see, Alex Scott and Paul Seymour’s A survey of χ-boundedness, and Proof of the Kalai-Meshulam conjecture by Maria Chudnovsky, Alex Scott, Paul Seymour, Sophie Spirkl.)

Pauline Sarrabezolles: The number of tranversals in the colorful Helly theorem

How many tranversals can we guarantee in colorful Carathéodory? This is a very interesting problem on its own, and may be also related to Sierksma’s conjecture on the number of Tverberg’s partitions that we mentioned in this recent post (and in this old one).

The colourful simplicial depth conjecture (2014)

Pauline Sarrabezolles

Given d+1 sets of points, or colours, S_1,\dots ,S_{d+1} in \mathbb R ^d, a colourful simplex is a set T\subset \cup _{i=1}^{d+1}S_i such that |T∩Si|≤1, for all i∈{1,…,d+1}. The colourful Carathéodory theorem states that, if 0 is in the convex hull of each S_i, then there exists a colourful simplex T containing 0 in its convex hull. Deza, Huang, Stephen, and Terlaky (Colourful simplicial depth, Discrete Comput. Geom., 35, 597–604 (2006)) conjectured that, when |S_i|=d+1 for all i∈{1,…,d+1}, there are always at least d^2+1 colourful simplices containing 0 in their convex hulls. We prove this conjecture via a combinatorial approach.

On weak ε-nets and the Radon number (2017)

Shay Moran, Amir Yehudayoff

Abstract: We show that the Radon number characterizes the existence of weak nets in separable convexity spaces (an abstraction of the euclidean notion of convexity). The construction of weak nets when the Radon number is finite is based on Helly’s property and on metric properties of VC classes. The lower bound on the size of weak nets when the Radon number is large relies on the chromatic number of the Kneser graph. As an application, we prove a boosting-type result for weak ϵ-nets.

Boris Bukh: Beyond nerves, the abstract setting for the disproof of the partition conjecture

We already mentioned Boris Bukh’s beautiful counter example to Eckhoff’s partition conjecture in the 2010 paper Radon partitions in convexity spaces.  Let me mention a general object used by Boris which is of much independent interest.

Let A=\{a_1,a_2,\dots, a_n\} be a set of points in \mathbb R^d. For S \subset [n] let A_S=conv \{a_i: i \in S\}, and let N be the nerve of all the sets A_S: S \subset [n]. N is a simplicial complex whose vertices are indexed by all non empty subsets of [n]. The abstract setting we have here is that of a simplicial complex, whose vertices are indexed by all subsets of [n] and such that a collection of vertices which represent a family of subsets with non empty intersection must be a face of the complex.

by Gil Kalai at March 15, 2019 01:29 PM UTC

Here's an old trick that we found useful for proving some tight complexity lower bounds. You are given m coins, each of weight either a or b, and a modern scale that can tell you the total weight of any chosen subset of coins. How many weighings do you need to identify which coin is which? Checking each coin individually uses m weighings, but can you do less?

In any weighing, we try some unknown number of weight-a coins between 0 and m, so this results in one of m + 1 possible values, giving us at most log(m + 1) bits of information. In total we need m bits of information to identify each coin, so clearly we will need at least Ω(m / log m) weighings.

It turns out that this many is in fact enough, and this generalizes to various other settings with less restricted weights. This is the basis for two of our recent results: a tight complexity lower bound for Integer Linear Programming with few constraints and for multicoloring (a.k.a. b-fold coloring), assuming the Exponential Time Hypothesis. The trick allows us to use constraints that check the value of some number between 0 and m to indeed extract about log(m) bits of new information from each, in a way that is general enough to check m clauses of a 3-CNF-SAT instance using only O(m / log m) constraints.

Detecting matrices

One way to rephrase this coin-weighing problem is as follows: the unknown coin weights are described by a (column) vector x in {a,b}m. A subset of coins is described by a characteristic (row) vector v in {0,1}m and the resulting value of the weighing is the dot product v·x.

For a number of weighings k, the row vectors form a 0-1 matrix M with k rows and m columns, while the obtained values are simply the k-element vector M x. The condition that these values are enough to identify each entry of x is formalized as follows:

For all m, there is a 0-1 matrix M with k=O(m / log m) rows and m columns
such that for any x, y in {a,b}m, x ≠ y implies M x ≠ M y.

This is called a detecting matrix. In other words, instead of checking x = y entry by entry, it suffices to compare the k entries of M x and M y. (An equivalent definition is to require that the kernel of M is trivial on {-1,0,1}m, since we could instead compare (x - y)/(a - b) with zero). A generalization for x, y in {0, 1, …, d - 1}m was first proved by (Lindström 1965) (in fact he proves the constant hidden behind O is 2 log2(d) and cannot be improved, for any d), giving a deterministic, poly-time construction. (Bshouty 2009) gives a more direct and general construction using Fourier analysis. However, a random matrix can be shown to be detecting with fairly elementary methods.

Integer Linear Programming

In ILP we are given a matrix A with n columns and m rows, a vector b with m entries, and we ask for a vector x with n non-negative integer entries such that A x = b. So each of the m rows is a linear equality to be satisfied. What is the complexity for small m?

We start with an instance A x = b encoding a suitable 3-CNF-SAT instance of size O(m) in a straightforward way, using only {0,1,2,3} entries in A and b. Instead of checking each row individually, we use detecting matrices to check them in bunches, reducing our problem to an equivalent instance M A x = M b. The matrix M A of the new instance has only O(m / log m) rows (at the cost of having larger entries in M b). With some tinkering, this simple reduction shows that, assuming ETH, ILP with m constraints cannot be solved in time 2o(m log m), even when the matrix has only 0-1 entries and b has poly(m) entries. This matches a recent (surprisingly simple and clever!) 2O(m log m) algorithm by Eisenbrand and Weismantel (SODA 2018).

One caveat when applying detecting matrices is that we do not know any bound on one side, A x, except that it is a non-negative integer vector. Fortunately, (Grebinsky & Kucherov 2000) showed that a random 0-1 matrix satisfies the following:

For all d,m, there is a 0-1 matrix M with k=O(mlogd / log m) rows and m columns
such that for any x, y in ℕm with ‖y1 ≤ dm, x ≠ y implies M x ≠ M y.

Here no upper bound on x is needed, because a single row in M full of ones is enough to guarantee that the sum of entries is the same: ‖x1 = ‖y1. However, it remains an open problem to find a deterministic construction (we work around this by giving a somewhat weaker construction). See the full paper (STACS 2019) for details (with Dušan Knop and Michał Pilipczuk). The reduction for multicoloring is quite different and more problem-specific, see the full paper (ESA 2017) for definitions (with Marthe Bonamy, Łukasz Kowalik, Michał Pilipczuk and Arkadiusz Socała).


Marcin Wrochna

by Renata Czarniecka at March 15, 2019 08:59 AM UTC

July 4-7, 2019 Moscow, Russia RAAI Summer School is a 4-day school with tutorials and practical workshops on: Machine Learning and Data Analysis Simulation of Human Cognitive Functions Natural Language Processing Multi-Agent Systems and Coalitions Together with the school, there will be a hackathon in reinforcement learning using computer vision tasks.

by shacharlovett at March 14, 2019 06:44 PM UTC

The next TCS+ talk will take place this coming Wednesday, March 20th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 18:00 Central European Time, 17:00 UTC). Aleksandar Nikolov from University of Toronto will speak about “Sticky Brownian Rounding and its Applications to Constraint Satisfaction Problems” (abstract below).

Please make sure you reserve a spot for your group to join us live by signing up on the online form. As usual, for more information about the TCS+ online seminar series and the upcoming talks, or to suggest a possible topic or speaker, please see the website.

Abstract: Semidefinite programming is a powerful tool in the design and analysis of approximation algorithms for combinatorial optimization problems. In particular, the random hyperplane rounding method of Goemans and Williamson has been extensively studied for more than two decades, resulting in various extensions to the original technique and beautiful algorithms for a wide range of applications. Despite the fact that this approach yields tight approximation guarantees for some problems, like Max Cut, for many others, like Max Sat, Max DiCut, and constraint satisfaction problems with global constraints, the tight approximation ratio is still unknown. One of the main reasons for this is the fact that very few techniques for rounding semi-definite relaxations are known.

In this work, we present a new general and simple method for rounding semi-definite programs, based on Brownian motion. Our approach is inspired by recent results in algorithmic discrepancy theory. We develop and present tools for analyzing our new rounding algorithms, utilizing mathematical machinery from the theory of Brownian motion, complex analysis, and partial differential equations. Focusing on constraint satisfaction problems, we apply our method to several classical problems, including Max Cut, Max 2-Sat, and Max DiCut, and derive new algorithms that are competitive with the best known results. We further show that our algorithms can be used, together with the Sum of Squares hierarchy, to approximate constraint satisfaction problems subject to multiple global cardinality constraints.

Join work with Sepehr Abbasi-Zadeh, Nikhil Bansal, Guru Guruganesh, Roy Schwartz, and Mohit Singh


by plustcs at March 14, 2019 04:30 PM UTC

If the president of the United States uses "complexity" in a tweet, I can't leave it alone.

I got my MIT degree in Applied Mathematics so I don't count, but I know plenty of MIT computer scientists and the one undeniable truth is that I don't want any of them flying my plane.

I also agree with the Donald that a complex environment can often lead to confusion, especially in an emergency. HCI matters.

But that's where it stops. Better technology in the plane and on the ground, combined with better procedures have all but eliminated deadly major commercial airplane crashes in the US and quite rare in the world at large. I'll call that more than a "little gain". I remember the "old and simpler" days where we had deadly airline crashes every few months. No thanks.

I've had great discussions with some of the aerospace engineering professors at Georgia Tech. The amount of effort to get the "front-of-the-plane" software right via formal methods and rigorous testings has become harder to engineer than the plane itself. The "back-of-the-plane" software that controls the video screens and WiFi gets less attention. I can live with safety over movies.

When technology makes things much safer, whether it be planes or self-driving cars, the rare accidents become far more noticeable. while we should use these opportunities to learn and make things safer, the rare accidents help us realize just how much more safer technology can make us.

by Lance Fortnow ( at March 14, 2019 02:08 PM UTC

This call for PhD positions might be of interest to some of your students or you. Please distribute it as you see fit. 

The GSSI - Gran Sasso Science Institute offers seven PhD fellowships in Computer Science for the academic year 2019/20. One of those positions is earmarked for the development of efficient techniques and tools for the analysis of large genomic data and is supported by the genomic-research start-up Dante Labs srl. The official language for all PhD courses is English.

The fellowships are awarded for 4 years and their yearly amount is € 16.159,91 gross. All PhD students have free accommodation at the GSSI facilities and use of the canteen.

The application must be submitted through the online form available here by 18 June 2019 at 6 pm (Italian time zone). 

 The GSSI-Gran Sasso Science Institute is an international PhD school and a center for research and higher education in the areas of Physics, Mathematics, Computer Science and Social Sciences. Founded in 2012 in L’Aquila (Italy) as Center for Advanced Studies of the National Institute for Nuclear Physics (INFN) and then established in March 2016 as a School of Advanced Studies providing post-graduate education. Through a day-to-day collaboration and interaction, researchers and students have the opportunity to build a sound knowledge of the research methods and to experiment contamination of interests, innovative approaches and multicultural exchanges in all the GSSI activities. In addressing the complexity of today’s world, GSSI is committed to removing all barriers between its areas of study and research. The dissemination of scientific results towards society and the promotion of cultural events for generic public, citizens and schools are among GSSI goals.

See here for further details.

by Luca Aceto ( at March 14, 2019 11:50 AM UTC

IMT School for Advanced Studies Lucca invites applications for PhD
positions in the Systems Science program, and specifically for its Computer
Science and Systems Engineering

We carry out foundational, applied, and interdisciplinary research on the
modeling and analysis of systems, broadly construed. The SYSMA
<> research unit welcomes applications from
candidates in Computer Science with an interest in any of the following

- cloud computing;

- computational methods for the analysis of cyber-physical systems;

- cybersecurity;

- modeling and verification of concurrent, distributed, and self-adaptive

- program analysis;

- smart contracts and blockchain technology;

- software performance evaluation.

A non-exhaustive list of suggested PhD topics is available here
<>. Prospective candidates are
warmly encouraged to get in touch with members
<> of the SYSMA unit for informal

The scholarship is for 4 years and consists of grant amounting to € 15,300
gross/year, in addition to free accommodation and board at the IMT Campus
<>. PhD candidates have the
possibility to defend their thesis from the beginning of the fourth year of
the program, but no earlier as per Italian legislation.

The initial start date of the PhD program is 1 November 2019. The working
language at IMT is English.

Application deadline is 23 April 2019. Note that candidates who have not
obtained their undergraduate degree by the deadline can still apply, and
can be admitted if they graduate no later than 31 October 2019.

Applications must be submitted through the online form at:

by Luca Aceto ( at March 14, 2019 11:44 AM UTC

We offer a postdoctoral position at Bar-Ilan University, Israel, in the field of coding theory and distributed computations. The appointment is for 1 or 2 years, depending on funding and progress, with starting date in October 2019 (flexible). The university resides in Tel-Aviv area.

Apply by sending CV and a short research statement by 15.4.2019.


by shacharlovett at March 14, 2019 10:00 AM UTC

[Posted at the request of Craig Gentry. I think this is actually a great idea that should be imitated by other sub-areas of TCS as well. –Boaz]

Update: The Conference for Failed Approaches and Insightful Losses in cryptology is celebrating our first failure early: the failure to keep a deadline! We are extending the submission deadline to 11:59 pm EST on April 7 to give everyone more time to work on submissions. In these last weeks leading up to the deadline, you can follow the @cfail2019 twitter account for inspiring jokes about failure to help you reach the finish line!

The CFAIL 2019 program chairs would like to remind you that some failures are good. Failure to attack a cryptosystem? Good! It might be a strong cryptosystem. Failure to publish in Eurocrypt? No problem! You’ll get ’em next time. Failure to cut back on caffeine like you promised in your New Year’s resolution? Good! You must be full of energy!

But failure to submit to CFAIL… lame! 
Work on those submissions, cryptologists! We want to see your failures in all their glory! 
Just a friendly reminder that the CFAIL submission deadline is April 1 April 7th.

by windowsontheory at March 14, 2019 01:30 AM UTC

As I said in my previous post my two SoCG papers both involved trying and failing to prove something else, and writing down what I could prove instead. For the one that appears today, “Cubic planar graphs that cannot be drawn on few lines” (arXiv:1903.05256), that something else was Open Problem 16.14 of my book, which can be rephrased as: does there exist a family of planar graphs that cannot be drawn planarly with all vertices on a constant number of convex curves?

Instead I expand on something that was already known, the existence of planar graphs that cannot be drawn with all vertices on a constant number of lines.

There were two previously known methods for constructing such graphs:

  • Use a planar 3-tree. This is a planar graph formed by recursively subdividing triangles into three smaller triangles. If the vertices are drawn on some family of lines, the line through the subdivision point must miss one of the three smaller triangles, so by induction you can be forced to use a non-constant number of lines. But the bound you get is only logarithmic.1

Planar 3-tree

  • Use a maximal planar graph whose dual graph has no long paths. In a maximal planar graph, any line through many vertices can be translated into a path through many dual vertices. So if there is no long dual path there must be many lines. This gives a polynomial lower bound on the number of lines, but with a tiny exponent.2

The new paper finds a third method:

  • Use a graph that has many disjoint collections of many nested cycles. Drawing a big set of nested cycles on a small number of lines necessarily uses up one of the crossing points of the lines. So if there are more nests than crossing points, one of them will have to use many lines.

Graph with many disjoint collections of many nested cycles

This leads to stronger polynomial bounds, but also (more importantly I think) wider families of planar graphs that need non-constant numbers of lines. For instance, it works for series-parallel graphs of maximum degree three, which are far from being maximal planar. And even though every tree can be drawn with its vertices on two lines, adding a single vertex to a tree can produce a graph that requires a non-constant number of lines.

Tree plus one vertex requiring a non-constant number of lines

  1. Forbidden Configurations in Discrete Geometry, Theorem 16.13. 

  2. Ravsky, Alexander, and Verbitsky, Oleg (2011), “On collinear sets in straight-line drawings”, 37th Int. Worksh. Graph-Theoretic Concepts in Computer Science (WG 2011), LNCS 6986, Springer, pp. 295–306; Chaplick, Steven, Fleszar, Krzysztof, Lipp, Fabian, Verbitsky, Oleg, and Wolff, Alexander (2016), “Drawing graphs on few lines and few planes”, 24th Int. Symp. Graph Drawing and Network Visualization (GD 2016), LNCS 9801, Springer, pp. 166–180. 

(Discuss on Mastodon)

by David Eppstein at March 13, 2019 06:37 PM UTC

Manolis Kellis is a computational biologist at MIT, known as one of the leaders in applying big data to genomics and gene regulatory networks. Throughout my 9 years at MIT, Manolis was one of my best friends there, even though our research styles and interests might seem distant. He and I were in the same PECASE class; see if you can spot us both in this photo (in the rows behind America’s last sentient president). My and Manolis’s families also became close after we both got married and had kids. We still keep in touch.

Today Manolis will be celebrating his 42nd birthday, with a symposium on the meaning of life (!). He asked his friends and colleagues to contribute talks and videos reflecting on that weighty topic.

Here’s a 15-minute video interview that Manolis and I recorded last night, where he asks me to pontificate about the implications of quantum mechanics for consciousness and free will and whether the universe is a computer simulation—and also about, uh, how to balance blogging with work and family.

Also, here’s a 2-minute birthday video that I made for Manolis before I really understood what he wanted. Unlike the first video, this one has no academic content, but it does involve me wearing a cowboy hat and swinging a makeshift “lasso.”

Happy birthday Manolis!

by Scott at March 13, 2019 04:45 PM UTC

In some sense both of my accepted papers at SoCG are about a situation where I really wanted to prove something else, wasn’t able to, and wrote up what I could prove instead. The one whose preprint appears today, “Counting polygon triangulations is hard” (arXiv:1903.04737), proves that it’s -complete to count the triangulations of a polygon with holes.

It’s easy to prove that it’s -hard to count maximum non-crossing subsets of line segment arrangements: the intersections graphs of line segments include all the planar graphs, so this follows immediately from the -completeness of maximum independent sets in planar graphs. The gadgets in my paper (drawn below) turn a single line segment into a part of a polygon for which many triangulations use diagonals in approximately the same position as the line segment, and there are few other triangulations. By using these gadgets, one could easily construct a polygon in which most of the triangulations correspond to maximum non-crossing subsets, showing that it’s also -hard to count triangulations. But this is unsatisfactory because it’s the wrong kind of completeness. For counting problems, we want -completeness, not just -hardness.

Gadgets for proving hardness of polygon triangulation

The reason the outline above isn’t already a -completeness proof is that we need to control the rest of the triangulation better. It’s not enough for most triangulations to come from maximum non-crossing subsets. We need each maximum non-crossing subset of line segments to contribute exactly the same number of triangulations (or at least a predictable and controlled number) so that we can recover the number of maximum non-crossing subsets from the number of triangulations. The gadgets control the parts of the triangulation near each segment of the non-crossing subsets, but we also need to control the parts of the triangulation that are not near those segments. Achieving this took much of the work in my new paper. There are two main ideas: First, we use special arrangements with the property that, when you close off the gadgets of a maximum non-crossing subset, the remaining parts of the polygon form pieces of predictable shapes for which we can count the triangulations. In these arrangements, there are two colors of segments, red and blue; each blue segment is crossed by two red segments, and each red segment is crossed in the order blue-red-blue. So we need a new counting reduction from graph independent sets that produces arrangements of this type. And second, we decompose the total number of maximum non-crossing subsets into a sum of smaller numbers, of subsets that have a given number of red segments. Subsets with the same number of red segments leads to the same numbers of triangulations, allowing the terms in the sum to be recovered from the number of triangulations.

A secondary goal was to do this all using a polynomial-time counting reduction rather than the Turing reductions that are more commonly used for -completeness. One cannot use an even stronger notion of reduction, parsimonious reductions, to reduce problems that might have zero solutions to a problem like triangulation that always has at least one. But to me, counting reductions are more closely analogous to the many-one reductions used for -completeness. We can use them, so we should use them. So the reductions in the paper are polynomial-time counting reductions, but more work would be needed to prove that the starting graph problem I used (counting independent sets in regular planar graphs) is complete for those reductions.

Surprisingly to me, this appears to be the first published proof of a -completeness result in two-dimensional computational geometry. And although some 2d geometric structures have been proven -hard to construct (and therefore -hard to count) it seems to be the first hardness proof of any kind for counting 2d structures that can be constructed rather than counted in polynomial time. But as I said at the top of this post, it’s not what I really wanted to prove. I was hoping to prove hardness for counting non-crossing structures on point sets, such as simple polygons, triangulations, non-crossing matchings, or pointed pseudotriangulations. The polygon question was suggested to me by Sara Billey when I visited her last April, in connection with a different problem we were working on (and still haven’t published). And counting the triangulations of a point set would fit the theme of my recent book, because this count is a numerical parameter that depends only on the order type of the points and mostly behaves monotonically under point deletion (exception: delete the center point of a quincunx). Instead I had to settle for polygon triangulations. But maybe another paper can take this line of research one step farther, to counting problems on points.

(Discuss on Mastodon)

by David Eppstein at March 12, 2019 06:48 PM UTC

A graph $G$ has an \emph{$S$-factor} if there exists a spanning subgraph $F$ of $G$ such that for all $v \in V: \deg_F(v) \in S$. The simplest example of such factor is a $1$-factor, which corresponds to a perfect matching in a graph. In this paper we study the computational complexity of finding $S$-factors in regular graphs. Our techniques combine some classical as well as recent tools from graph theory.

at March 12, 2019 05:07 PM UTC

We provide new upper bounds on the complexity of the s-t-connectivity problem in planar graphs, thereby providing additional evidence that this problem is not complete for NL. This also yields a new upper bound on the complexity of computing edit distance. Building on these techniques, we provide new upper bounds on the complexity of several other computational problems on planar graphs. All of these problems are shown to be solvable in logarithmic time on a concurrent-read exclusive-write (CREW) PRAM. The new upper bounds are provided by making use of a known characterization of CREW algorithms in terms of "unambiguous" AC$^1$ circuits. This seems to be the first occasion where this characterization has been used in order to provide new upper bounds on natural problems.

at March 12, 2019 04:11 AM UTC

I think we have an exciting program that awaits us during HALG 2019: Please remember that the call for short contributed talks ends this week - on 15th of March. 

by sank at March 11, 2019 09:17 PM UTC

For an exciting project funded by the European Commission for Research (ERC), I currently have several openings for post-docs working in the intersection of theoretical computer science and learning theory.


by shacharlovett at March 11, 2019 03:10 PM UTC

The past decade brought us several breakthroughs in the study of geometric incidences. Most importantly, Guth and Katz introduced polynomial partitioning, and this has become the main technique for studying incidences. The recent breakthroughs sometimes leads to a wrong impression about the state of this field. Actually, most of the main incidence problems are still […]

by Adam Sheffer at March 11, 2019 01:06 AM UTC

When I first saw the definition of NP-Complete I first thought if there are NP-complete problems I suspect they are contrived. When I saw the proof that  SAT is NP-complete I thought okay, so there is one natural problem  NP-complete by a trick of encoding, but I suspect there aren't any more.  I then read Karp's article that had 22 NP-complete problems. I thought okay, there are 22, but I suspect there are no more. No I didn't think that. But the point is that Cook and Karp together birthed modern complexity theory by introduction NP-completeness and showing that there are MANY natural problems that are NP-complete.

How many people has Richard Karp inspired? I don't know but I suspect it's a  cardinal between countable and the reals so it may depend on your model of set theory. Later in this post I will present Samir Khuller's story of how Richard Karp inspired him.

How to honor him?  The Simons inst. is currently welcoming contributions to the Richard M Karp Fund, which honors the scientific contributions of Founding Director Dick Karp. The fund will provide vital support for the mission and activities of the Simons institute. They have a deadline of Pi-Day. You can go here to contribute and/or here for a letter from Shafi Goldwasser, Prabhakar Raghavan, and Umesh Vazirani about the fund.

OKAY, Samir's story:

Mentoring and Influence via a Train Journey...

Almost to the day, 33 years ago I was running from my dorm room to catch an unreliable bus to the train station as I headed home for a mid-semester break.  On the way,I  stopped by the mailroom, and not knowing where to put the CACM that had just arrived,  stuffed it into the side of my bag and in the process, dropping my   notebook in the process. On the train, when I looked for my  notebook to go over  some material I realized it was missing! All I had was a CACM and a very long train ride. Skimming through the journal, I found Karp’s Turing Award article “Combinatorics, Complexity, and Randomness“. I  started reading this article, and was riveted! It was one of those life changing moments for me, when my immediate future became clear - I wanted to study Theoretical Computer Science and specifically the complexity of combinatorial problems. Without ever having met Dick Karp, he changed my life forever.

I met Dick long after he influenced me via his Turing Award Lecture and it was a real privilege to have had the opportunity to interview him via a fireside chat at Simons. See here.

I am delighted about the fund Bill mentions above as the money that goes to it will help inspire others as Karp inspired me.

by GASARCH ( at March 10, 2019 09:26 PM UTC

The Distributed Algorithms Group at IST Austria (near Vienna) is looking for strong postdoc candidates in CS Theory, with a focus on (distributed) optimization. The starting date is (roughly) in Fall 2019.


by shacharlovett at March 10, 2019 08:48 PM UTC

Bill and Clyde’s new book

Bill Gasarch and Clyde Kruskal are colleagues in Computer Science at the University of Maryland. They have just seen the publication of their book Problems With a Point.

Today Dick and I congratulate them on the book and give a brief overview of it.

The tandem of Bill and Clyde were featured before on this blog in connection with a series of papers with James Glenn on sets of numbers that have no arithmetical progressions of length 3.

Can a “series” have only two members? We could add a third paper to the series, but it is by Glenn and Bill with Richard Beigel rather than Clyde and is on the different topic of multi-party communication complexity. But Clyde gets into the act since it is the topic of Chapter 19 of the book. That chapter analyzes a game form of problems that go back to a 1983 paper by Dick with Ashok Chandra and Merrick Furst. Does this mean Dick should get some royalties? Note that this last link is to Bill’s website. And ‘vdw’ in this link seems to refer to van der Waerden’s theorem, which is a pillar of both Ramsey theory and number theory, which in turn receive much attention in the book.

My last paragraph flits through divergent questions but they are representative of things that we working mathematical computing theorists think about every day—and of various kinds of content in the book. Of course, Bill partners on Lance Fortnow’s prominent blog and so has shared all kinds of these musings. The book, which incorporates numerous posts by Bill but is not a “blog book” per-se, furnishes contexts for them. Let’s look at the book.

The Book

The book has three main parts, titled:

  • Stories With a Point.

  • Problems With a Point.

  • Theorems With a Point.

All the chapters except three are prefaced with a section titled “Point”—or “The Point” or “Points”—after one sentence on prior knowledge needed, which often reads, “None.”

The first part begins with a chapter titled, “When Terms From Math are Used By Civilians,” one of several with social discussion. The second is about how human aspects of sports upset statistical regularities one might expect to see. For example, many more baseball hitters have had season batting averages of .300 than .299 or .301, evidently because .300 is a “goal number” everyone strives to meet. There are numerous things in the book I didn’t know.

The next chapter largely reproduces this record-long post on the blog but adds much extra commentary. Then there are chapters on what makes a mathematical function worth defining, on how mathematical objects are named, on Bill’s visit to a “Gathering for [Martin] Gardner” conference, on coloring the plane, two on imagined applications of Ramsey Theory to medieval history, and one on the methodological and social impact of having theorems that represent barriers to ways of trying to prove other theorems.

Chapter Ten

This last chapter of Part I draws on several of Bill’s posts. It goes most to the heart of what we all in computational complexity grapple with daily and benefits from having more context: As we just mused in this blog’s 10-year anniversary post, the major questions in complexity are not only as open as when we started, but as open as when the field started 50-plus years ago. There has barely been any discernible direct progress, and this is not only a concern for entering students but a retardant on everyone—about which we’ve given exhortation.

There are instead barrier theorems saying why the main questions will not be resolvable by the very techniques that were used to build the field. Of three broad categories of barriers, “oracle results” de-fang all the methods typically taught in intro graduate complexity courses. Those methods meter computations at inputs and outputs but not how they “progress” inside toward a solution. Oracle languages supply blasts of external information that either make progress trivially quick or allow conditioning the terms of the complexity-class relation under analysis to information in the oracle that is tailored to obfuscate. Oracles of the former kind usually collapse complexity classes together, such as {A =} any polynomial-space complete language making {\mathsf{P}^A = \mathsf{NP}^A}, whereas oracles {B} of the latter kind drive them apart (so {\mathsf{P}^B \neq \mathsf{NP}^B}) but often for reasons that frustrate as much as enlighten. When contemplating a new complexity class relation it is incumbent to frame a “relativized” version of it and see if some oracles can make it true and others false. Such oracle constructions were initially easy to come by—“fast food” for research—but their ultimate besideness-of-the-point is reflected in this early post by Dick.

Next after oracles is the “Natural Proofs” barrier, which basically cuts against all the techniques taught in advanced graduate complexity courses unless they can somehow be weirdly narrow or intensely complex in their conceptual framing. The attitude on those is captured by the title of another early post by Dick titled, “Who’s Afraid of Natural Proofs?” The third barrier is algebrization, which cuts against efforts to escape from oracles.

Amid all this environment of obstruction, what of consequence can we prove? Well, the chapter on this has by far the highest concentration of Bill’s trademark all-capital letters and exclamation points. The initial “Point” section is titled, “Rant.” (The chapter also includes a page on the attempts to prove Euclid’s parallel postulate, which might be called the original “problem with a point.”)

Part II

Much of Part II involves the kind of problems that were used or considered for high-school mathematics competitions. By definition those are not research problems but they often point toward frontiers by representing accessible cases of them. The appealing ones blend equal parts research relevance, amusement value, and reward of technical command. Often they require mathematical “street smarts” and a repertoire of useful tricks and proof patterns. Practicing researchers will find this aspect of the book most valuable. The problems are mainly on the number-theory side of recreational mathematics, plus geometric and coloring problems. Egyptian fractions, prime-factor puzzles, sums over digits, summation formulas, and of course Ramsey theory all make their appearance. Here is one problem I mention partly to correct a small content error, where the correction seems to make the problem more interesting. Say that {C(n,m)} holds if there are integers {n_1,\dots,n_m} such that

\displaystyle  n_1 + \cdots + n_m = n \qquad\text{and}\qquad \frac{1}{n_1} + \cdots + \frac{1}{n_m} = 1.

A general principle discovered by Joseph Sylvester gives cases such as {C(1861,5)} being witnessed by

\displaystyle  1861 = 2 + 3 + 7 + 43 + 1806 \qquad\text{and}\qquad 1 = \frac{1}{2} + \frac{1}{3} + \frac{1}{7} + \frac{1}{43} + \frac{1}{1806}.

How can it and related ideas be extended to other cases? The particular problem is given {n} to find {m_n =} the least {m} such that {C(n,m)} holds—and to find a witnessing sum. The error is an assertion that {m_n \leq n} “by writing {n} as a sum of {n} {1}‘s,” but then the reciprocals add up to {n} not {1}. So is there any bound on {m_n}? That is rather fun to think about.

This problem’s chapter is made from a short post with an enormous comments section. There is also a chapter on the boundary between a fair problem and a “trick question” that much extends an April Fool’s post on the blog.

Part III

Part III continues the nature of the problem content with emphasis on the worths of proofs of theorems. One illustration involving perfect numbers and sums of cubes notes that the proof loses value because it also applies to certain non-perfect numbers. Two things en-passant: First, Lemma 21.1 in this chapter needs qualification that {a,b} are relatively prime and {x+1} is prime. Second, this prompts me to note via Gil Kalai the discovery by Tim Browning that the number 33 is a sum of three cubes:

\displaystyle  33 = 8,\!866,\!128,\!975,\!287,\!528^3 + (-8,\!778,\!405,\!442,\!862,\!239)^3 + (-2,\!736,\!111,\!468,\!807,\!040)^3.

Gil points to a Quora post by Alon Amit. There is only the bare equation on Browning’s homepage, and nothing about his methods or how much he used computer search is known. The theme of proofs by computer runs through several chapters of Bill and Clyde’s book. It is also current on Scott Aaronson’s blog—we would be remiss if we did not mention Scott in this post—and more widely.

The penultimate chapter covers the beautiful theory of rectangle-free colorings of grids. We mentioned Bill’s involvement with this Ramsey-style problem in a post on what happened to be this blog’s third anniversary. We connected it to Kolmogorov complexity, which together with uncomputability is the subject of the last chapter. This and other chapters I’ve mentioned exemplify how the research ambit connects traditional mathematics with the terms and concerns of computing theory. How to work in this ambit may be the greatest value of the book for students. In all I found the book light, lighthearted, and enlightening.

Open Problems

The big open problem with the book is how it might help gear us up to tackle the big open problems.

[sourced chapter near end of Part II to blog]

by KWRegan at March 10, 2019 07:43 PM UTC

The polarization lemma for statistical distance ($\mathrm{SD}$), due to Sahai and Vadhan (JACM, 2003), is an efficient transformation taking as input a pair of circuits $(C_0,C_1)$ and an integer $k$ and outputting a new pair of circuits $(D_0,D_1)$ such that if $\mathrm{SD}(C_0,C_1)\geq\alpha$ then $\mathrm{SD}(D_0,D_1) \geq 1-2^{-k}$ and if $\mathrm{SD}(C_0,C_1) \leq \beta$ then $\mathrm{SD}(D_0,D_1) \leq 2^{-k}$. The polarization lemma is known to hold for any constant values $\beta < \alpha^2$, but extending the lemma to the regime in which $\alpha^2 \leq \beta < \alpha$ has remained elusive. The focus of this work is in studying the latter regime of parameters. Our main results are: 1. Polarization lemmas for different notions of distance, such as Triangular Discrimination ($\mathrm{TD}$) and Jensen-Shannon Divergence ($\mathrm{JS}$), which enable polarization for some problems where the statistical distance satisfies $\alpha^2 < \beta < \alpha$. We also derive a polarization lemma for statistical distance with any inverse-polynomially small gap between $\alpha^2$ and $\beta$ (rather than a constant). 2. The average-case hardness of the statistical difference problem (i.e., determining whether the statistical distance between two given circuits is at least $\alpha$ or at most $\beta$), for any values of $\beta < \alpha$, implies the existence of one-way functions. Such a result was previously only known for $\beta < \alpha^2$. 3. A (direct) constant-round interactive proof for estimating the statistical distance between any two distributions (up to any inverse polynomial error) given circuits that generate them.

at March 10, 2019 08:50 AM UTC

I want to draw your attention to the call for workshop and tutorial proposals for STOC 2019:

Important Dates

Submission deadline:March 24, 2019
Notification:April 5, 2019
Workshop Day:Sunday June 23, 2019

STOC 2019 will hold a Workshop and Tutorial Day on Sunday June 23. We invite groups of interested researchers to submit workshop or tutorial proposals.

The STOC 2019 Workshop and Tutorial Day provides an informal forum for researchers to discuss important research questions, directions, and challenges of the field. We also encourage workshops that focus on connections between theoretical computer science and other areas, topics that are not well represented at STOC, new directions, and open problems. The program may also include tutorials, each consisting of a few survey talks on a particular area.

Format: We have room for three workshops/tutorials running in parallel for the course of the day with a break for lunch. In addition, workshops or tutorials may be full-day (6 hrs) or half-day (3 hrs).

Proposal Submission

Workshop and tutorial proposals should, ideally, fit one page. Please include a list of names and email addresses of the organizers, a brief description of the topic and the goals of the workshop or tutorial, the proposed workshop format (invited talks, contributed talks, contributed posters, panel, etc.), and proposed or tentatively confirmed speakers if known. Please also indicate the preferred length of time (3 hrs or 6 hrs) for your workshop/tutorial. If your proposal is accepted and you wish to solicit contributed talks (which we strongly encourage), we can link to your call-for-contributions from the STOC 2019 page. Feel free to contact the Chair of the Workshop and Tutorials Committee directly or at the email address below if you have any questions.

Submission Deadline

Proposals should be submitted by March 24, 2019 via email to Proposers will be notified by April 5, 2019 whether their proposals have been accepted.

Workshop and Tutorials Committee: Moses Charikar (Chair)

by Moses Charikar at March 08, 2019 04:25 PM UTC

Via Scott, John Horgan wrote a blog post following on his 1993 Scientific American article The Death of Proof. The article talked about computer-generated proofs, experimental mathematics and even mentioned some of our work on transparent proofs (basically time-efficient probabilistically checkable proofs). I got (sarcastically I think) accused of killing mathematics after that article but of course we had traditional proofs about probabilistically check proofs. Andrew Wiles got a sidebar titled "A Splendid Anachronism?" for merely solving the most famous open problem in all of mathematics. Somehow traditional mathematical proofs survived the article.

Meanwhile in CACM, Moshe Vardi writes Lost in Math? 
TCS is surely blessed with mathematical beauty. As a graduate student a long time ago, it was mathematical beauty that attracted me to TCS, and continued to lead my research for many years. I find computational complexity theory (or complexity theory, for short), with its theorems (for example, the time-hierarchy and space-hierarchy theorems) and its open questions (for example, P vs NP), to be hauntingly beautiful. Beauty, yes; but what about truth?
Here here on the beauty of complexity. So what about truth? I cover much of this in my P v NP survey. In short the P v NP captures the most critical aspects of what is efficiently computable but it is not the endpoint. Nevertheless P v NP captures both truth and beauty and that's why the problem has so captivated and guided me throughout my research career.

by Lance Fortnow ( at March 08, 2019 03:05 PM UTC