Theory of Computing Blog Aggregator

Authors: Maxime Soler, Martin Petitfrere, Gilles Darche, Melanie Plainchault, Bruno Conche, Julien Tierny
Download: PDF
Abstract: This application paper presents a novel framework based on topological data analysis for the automatic evaluation and ranking of viscous finger simulation runs in an ensemble with respect to a reference acquisition. Individual fingers in a given time-step are associated with critical point pairs in the distance field to the injection point, forming persistence diagrams. Different metrics, based on optimal transport, for comparing time-varying persistence diagrams in this specific applicative case are introduced. We evaluate the relevance of the rankings obtained with these metrics, both qualitatively thanks to a lightweight web visual interface, and quantitatively by studying the deviation from a reference ranking suggested by experts. Extensive experiments show the quantitative superiority of our approach compared to traditional alternatives. Our web interface allows experts to conveniently explore the produced rankings. We show a complete viscous fingering case study demonstrating the utility of our approach in the context of porous media fluid flow, where our framework can be used to automatically discard physically-irrelevant simulation runs from the ensemble and rank the most plausible ones. We document an in-situ implementation to lighten I/O and performance constraints arising in the context of parametric studies.

at August 23, 2019 12:03 AM UTC

Authors: Hugo A. Akitaya, Esther M. Arkin, Mirela Damian, Erik D. Demaine, Vida Dujmović, Robin Flatland, Matias Korman, Belén Palop, Irene Parada, André van Renssen, Vera Sacristán
Download: PDF
Abstract: We present the first universal reconfiguration algorithm for transforming a modular robot between any two facet-connected square-grid configurations using pivot moves. More precisely, we show that five extra "helper" modules ("musketeers") suffice to reconfigure the remaining $n$ modules between any two given configurations. Our algorithm uses $O(n^2)$ pivot moves, which is worst-case optimal. Previous reconfiguration algorithms either require less restrictive "sliding" moves, do not preserve facet-connectivity, or for the setting we consider, could only handle a small subset of configurations defined by a local forbidden pattern. Configurations with the forbidden pattern do have disconnected reconfiguration graphs (discrete configuration spaces), and indeed we show that they can have an exponential number of connected components. But forbidding the local pattern throughout the configuration is far from necessary, as we show that just a constant number of added modules (placed to be freely reconfigurable) suffice for universal reconfigurability. We also classify three different models of natural pivot moves that preserve facet-connectivity, and show separations between these models.

at August 22, 2019 11:49 PM UTC

Authors: Amyra Meidiana, Seok-Hee Hong, Peter Eades, Daniel Keim
Download: PDF
Abstract: Traditionally, graph quality metrics focus on readability, but recent studies show the need for metrics which are more specific to the discovery of patterns in graphs. Cluster analysis is a popular task within graph analysis, yet no existing metric explicitly quantifies how well a drawing of a graph represents its cluster structure. We define a clustering quality metric measuring how well a node-link drawing of a graph represents the clusters contained in the graph. Experiments with deforming graph drawings verify that our metric effectively captures variations in the visual cluster quality of graph drawings. We then use our metric to examine how well different graph drawing algorithms visualize cluster structures in various graphs; the results confirm that some algorithms which have been specifically designed to show cluster structures perform better than other algorithms.

at August 22, 2019 11:20 PM UTC

Authors: Molly Baird, Sara C. Billey, Erik D. Demaine, Martin L. Demaine, David Eppstein, Sándor Fekete, Graham Gordon, Sean Griffin, Joseph S. B. Mitchell, Joshua P. Swanson
Download: PDF
Abstract: An open problem of Manuel Abellanas asks whether every set of disjoint closed unit disks in the plane can be connected by a conveyor belt, which means a tight simple closed curve that touches the boundary of each disk, possibly multiple times. We prove three main results. First, for unit disks whose centers are both $x$-monotone and $y$-monotone, or whose centers have $x$-coordinates that differ by at least two units, a conveyor belt always exists and can be found efficiently. Second, it is NP-complete to determine whether disks of varying radii have a conveyor belt, and it remains NP-complete when we constrain the belt to touch disks exactly once. Third, any disjoint set of $n$ disks of arbitrary radii can be augmented by $O(n)$ "guide" disks so that the augmented system has a conveyor belt touching each disk exactly once, answering a conjecture of Demaine, Demaine, and Palop.

at August 22, 2019 11:23 PM UTC

Authors: Therese Biedl, Stefan Felsner, Henk Meijer, Alexander Wolff
Download: PDF
Abstract: A measure for the visual complexity of a straight-line crossing-free drawing of a graph is the minimum number of lines needed to cover all vertices. For a given graph $G$, the minimum such number (over all drawings in dimension $d \in \{2,3\}$) is called the \emph{$d$-dimensional weak line cover number} and denoted by $\pi^1_d(G)$. In 3D, the minimum number of \emph{planes} needed to cover all vertices of~$G$ is denoted by $\pi^2_3(G)$. When edges are also required to be covered, the corresponding numbers $\rho^1_d(G)$ and $\rho^2_3(G)$ are called the \emph{(strong) line cover number} and the \emph{(strong) plane cover number}.

Computing any of these cover numbers -- except $\pi^1_2(G)$ -- is known to be NP-hard. The complexity of computing $\pi^1_2(G)$ was posed as an open problem by Chaplick et al. [WADS 2017]. We show that it is NP-hard to decide, for a given planar graph~$G$, whether $\pi^1_2(G)=2$. We further show that the universal stacked triangulation of depth~$d$, $G_d$, has $\pi^1_2(G_d)=d+1$. Concerning~3D, we show that any $n$-vertex graph~$G$ with $\rho^2_3(G)=2$ has at most $5n-19$ edges, which is tight.

at August 22, 2019 11:47 PM UTC

Authors: J. G. Benade, J. N. Hooker
Download: PDF
Abstract: We present a general method for obtaining strong bounds for discrete optimization problems that is based on a concept of branching duality. It can be applied when no useful integer programming model is available, and we illustrate this with the minimum bandwidth problem. The method strengthens a known bound for a given problem by formulating a dual problem whose feasible solutions are partial branching trees. It solves the dual problem with a "worst-bound" local search heuristic that explores neighboring partial trees. After proving some optimality properties of the heuristic, we show that it substantially improves known combinatorial bounds for the minimum bandwidth problem with a modest amount of computation. It also obtains significantly tighter bounds than depth-first and breadth-first branching, demonstrating that the dual perspective can lead to better branching strategies when the object is to find valid bounds.

at August 22, 2019 11:22 PM UTC

The schedule for PapaFest (celebrating Christos Papadimitriou) has been posted at

When/where: September 6-8, 2019, at Columbia University in Davis Auditorium. (Register at the URL above, for free.)

Confirmed speakers: Sanjeev Arora, Michael Collins, Michael Jordan, Anna Karlin, Jon Kleinberg, Elias Koutsoupias, Adi Livnat, Noam Nisan, Prabhakar Raghavan, Scott Shenker, Eva Tardos, John Tsitsiklis, Leslie Valiant, Umesh Vazirani, and Santosh Vempala.

There will also be two panels led by Richard Karp and Jitendra Malik, and a rock concert by Errors in Bars on Saturday night!

by timroughgarden at August 21, 2019 03:12 PM UTC

We invite applications in all areas of computer science for several open positions. Female researchers are strongly encouraged to apply.


by shacharlovett at August 21, 2019 11:55 AM UTC

Let me announce my CERN colloquium this Thursday, August 22, 2019, 16:30-17:30, entitled “The argument against quantum computers.” If you are at CERN or in the neighborhood, please come to the lecture. (Tea and coffee will be served at 16:00.) If you are further away, there is a live broadcast.

A few weeks ago I uploaded to the arXiv a new paper with the same title, “The argument against quantum computers.” The paper will appear in the volume Quantum, Probability, Logic: Itamar Pitowsky’s Work and Influence, Springer Nature (2019), edited by Meir Hemmo and Orly Shenker. A short abstract for the lecture and the paper is:

We give a computational complexity argument against the feasibility of quantum computers. We identify a very low complexity class of probability distributions described by noisy intermediate-scale quantum computers, and explain why it will allow neither good-quality quantum error-correction nor a demonstration of “quantum supremacy.”  Some general principles governing the behavior of noisy quantum systems are derived.

The new paper and lecture have the same title as my 2018 interview with Katia Moskvitch at Quanta Magazine (see also this post).  Note that Christopher Monroe has recently contributed a very interesting comment to the Quanta article. My paper is dedicated to the memory of Itamar Pitowsky, and for more on Itamar see the post Itamar Pitowsky: Probability in Physics, Where does it Come From? See also this previous post for two other quantum events in Jerusalem: a seminar in the first semester and a winter school on The Mathematics of Quantum Computation  on December 15 – December 19, 2019.

A slide from a lecture by Scott Aaronson where he explains why soap bubble computers cannot solve the NP-complete Steiner-tree problem. Noisy intermediate-scale quantum (NISQ) circuits are computationally much more primitive than Scott’s soap bubble computers, and this will prevent them from achieving either “quantum supremacy” or good-quality quantum error-correcting codes. (source for the picture)


Low-entropy quantum states give probability distributions described by low degree polynomials, and very low-entropy quantum states give chaotic behavior. Higher entropy enables classical information. 



by Gil Kalai at August 21, 2019 01:26 AM UTC

Why does randomness help?

Kathryn Farley is my dear wife. She and I are currently on a cruise through the Mediterranean. Our trip started in Barcelona and is stopping daily at various cities as we journey to Rome. “Tough duty,” but we are trying to enjoy it.

Today I wish to talk about our visit to Monte Carlo.

Our ship, the Encore, just docked there Sunday. The day was warm and clear, and we spent some time exploring the city. We did manage to avoid losing any money at the famous casino. Our secret was simple: do not play, do not gamble, do not lose.

Over lunch I started to explain to Kathryn why Monte Carlo is an important city for complexity theorists. I felt a bit like we were at a theory shrine.

Why Randomness Helps

Indeed. I realized that it is not so simple to explain why randomness helps. Kathryn has a Ph.D. in theatre. She is smart, is a member of Mensa, but is not a complexity theorist. How do I explain that randomness is powerful? Indeed.

I started to explain, but my examples were lame. I think she got the main idea, but I also think that I did not do a great job. Russell Impagliazzo has a nice explanation on the role of randomness—I wish Russell had been there to help explain randomness to Kathryn.

After lunch I started to think more about the role of randomness. I looked at our friends over at Wikipedia and discovered they had a pretty good page. Some reasons are:

{\bullet } Games

Randomness was first investigated in the context of gambling. Dice, playing cards, roulette wheels, all have been studied by those interested in gambling. Clearly, betting on the roll of dice, deal of cards, or spin of the wheel, only makes sense when these actions are unpredictable. Random.

{\bullet } Political

Randomness is often used to create “fairness”. For example, in the US and UK, juror selection is done by a lottery.

{\bullet } Science

Monte Carlo methods in physics and computer science require random numbers.

{\bullet } Cryptography

Random keys for encryption algorithms should be unpredictable. Random. Otherwise, they can be guessed by others. The password “password” is usually not allowed.

{\bullet } Arts

Kathryn is interested in the arts: in plays and in painting and other fine arts. Some theories of art claim that all art is random. One thinks of artists like Jackson Pollock with his famous drip paintings. He was a major player in the abstract expressionist movement.

Pseudo or True?

Ken has been paying intensive devotions at the same shrine. As he wrote in the previous post, he has been conducting millions of randomized tests of his new chess model.

Why random? What he needs to do is show that his model will not tend to “cry wolf” by giving a too-high {z}-score to a set of games in a tournament by an honest player. He wants to show that his model is equally tempered no matter the rating of the player. So he runs trials at different rating levels ranging from Elo 1000 for novice players to Elo 2800 which is championship level. To show that the {z}-scores given by his model conform to a normal bell curve, he needs to do 10,000s or 100,000s of tests at each level.

The problem is there just don’t exist enough games. Most large tournaments give only the games played on their “top boards” which use special auto-recording equipment, and the losers on those boards in one round may play on lower boards in the next round. Thus out of about 60,000 player-tournament pairs Ken can track each year, most are only partial samples. So what Ken does is generate “synthetic players” by randomly taking subsets of (say) 9 games—from his data set of 1,000 or so games for each level—and randomly choosing white or black for each game. This is a common resampling technique, and it uses Monte Carlo.
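This sort of resampling is easy to write down. Below is a minimal Python sketch under stated assumptions: the game records, the function name `synthetic_players`, and the cohort sizes are hypothetical stand-ins for illustration, not Ken's actual pipeline.

```python
import random

def synthetic_players(games, num_players, games_per_player=9, seed=0):
    """Resample 'synthetic players': each one is a random subset of games,
    with a random color (white/black) assigned to each game.
    All names and sizes here are illustrative."""
    rng = random.Random(seed)
    players = []
    for _ in range(num_players):
        sample = rng.sample(games, games_per_player)  # subset without replacement
        players.append([(g, rng.choice(("white", "black"))) for g in sample])
    return players

pool = [f"game{i}" for i in range(1000)]   # stand-in for ~1,000 games per level
cohort = synthetic_players(pool, num_players=5)
```

Each synthetic player then gets scored as if they were a real tournament participant, which is what lets the Monte Carlo trials reach the tens of thousands of samples per rating level.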

Ken uses pseudo-random generators (PRGs). He starts a C++ library PRG on a seed {s} based on the current time. The fact that the choices are deterministic once {s} is given might allow him to reproduce an entire run exactly (after a model tweak) by preserving the {s} it used. This is a paradox: we might want our “random” bits to be deterministic. Monte Carlo with predestined loaded dice.
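The reproducibility point can be made concrete in a few lines. This is a hedged illustration using Python's `random.Random` rather than a C++ library PRG, but the principle is the same: preserve the seed and the entire "random" run can be replayed exactly.

```python
import random
import time

def run_trial(seed):
    """One 'randomized test': every draw is fully determined by the seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(5)]

seed = int(time.time())   # seed taken from the clock, as in the post
first = run_trial(seed)   # original run
replay = run_trial(seed)  # later replay from the preserved seed
assert first == replay    # identical draws: the run is reproducible
```

The same seed yields the same sequence every time, which is exactly the "predestined loaded dice" paradox: deterministic bits standing in for randomness.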

From time to time on this blog we have mused about what a world without randomness or with reduced entropy would be like. We were struck a few weeks ago when the noted physics blogger Sabine Hossenfelder wrote about “superdeterminism.” That post provoked a few hundred comments in her blog, as did her post last week on the quantum measurement problem—including long exchanges with Peter Shor. Ken and I don’t know which side to take, but I can say that the side of a ship is a great place to think about possible real effects of these differences.

Open Problems

What is your take on randomness? Do you employ it? How “true” do you need it to be?

by RJLipton+KWRegan at August 20, 2019 04:24 PM UTC

Multiple Postdoctoral positions available to work on Fine-grained Complexity, Approximation Algorithms and Additive Combinatorics. Please email directly to Barna with resume and contact information of three references. The start date is flexible.


by shacharlovett at August 20, 2019 04:00 AM UTC

Today I would like to report about some breakthroughs in the area of combinatorics, and about blog posts in Yufei Zhao’s blog that describe these remarkable results (better than I can). Many of the coauthors in these breakthroughs are undergraduate students.

Equiangular lines with a fixed angle

by Zilin Jiang, Jonathan Tidor, Yuan Yao, Shengtong Zhang and Yufei Zhao.

Paper ; blog post (end of July, 2019)

A collection of equiangular lines is a collection of lines such that the angle between every pair of lines is the same.

Here are two classical questions:

  1. What is the maximum number of equiangular lines in \mathbb R^d?
  2. Given an angle \alpha, what is the maximum number of equiangular lines in \mathbb R^d such that the angle between every two lines is \alpha?

In 2000 Dom de Caen found for the first time an equiangular set of lines in d-space of size cd^2. (The crucial observation was that one of the graphs in the Cameron–Seidel association scheme has a certain eigenvalue of large multiplicity.) Prior to this construction, the largest known sets had sizes of order d^{3/2}. In de Caen’s example the lines have angles approaching 90 degrees, and question 2 for a fixed value of \alpha leads to very different bounds. (For another result by Dom, see this post.)

There was some important progress on this problem by Igor Balla, Felix Dräxler, Peter Keevash, and Benny Sudakov, as featured in this Quanta Magazine article written by Kevin Hartnett. Zilin, Jonathan, Yuan, Shengtong, and Yufei finished off the problem in a clean and crisp manner, in a 10-page paper with a self-contained proof. On the way they proved the following very interesting theorem.

Theorem: A bounded degree graph must have sublinear second eigenvalue multiplicity.

Impartial digraphs

by Yufei Zhao and Yunkun Zhou.

Paper, blog post (end of June 2019)

Abstract: We prove a conjecture of Fox, Huang, and Lee that characterizes directed graphs H that have constant density in all tournaments: they are disjoint unions of trees that are each constructed in a certain recursive way.

“Constant density in all tournaments” means that for some n_0 (and hence for all n \ge n_0), every tournament with n vertices has the same number of copies of H. (On the nose!)

This result is related to the famous Sidorenko’s conjecture. Let me copy its description from the paper:

For undirected graphs, conjectures of Sidorenko and Erdős–Simonovits (commonly referred to as Sidorenko’s conjecture) say that for every bipartite graph H, the H-density in a graph of fixed density is minimized asymptotically by a random graph. Lately the conjecture has been proved for many families of H though the conjecture remains open in general. In particular, the case H = K_{5,5}\backslash C_{10} is open.

Yufei Zhao


Let me describe briefly three additional posts on Yufei’s blog with two results from 2018 and one result from 2014.

Ashwin Sah, Mehtaab Sawhney, David Stoner, and Yufei Zhao: A reverse Sidorenko inequality (blog post); The number of independent sets in an irregular graph (blog post).

These are two remarkable papers by the same team of researchers. The paper on the number of independent sets in an irregular graph settled a famous 2001 conjecture by Jeff Kahn. An earlier breakthrough on the problem, for regular graphs, was made by Zhao in 2009 when he was an undergraduate student.

Eyal Lubetzky and Yufei Zhao,  On the variational problem for upper tails of triangle counts in sparse random graphs. Blog post.

Recently I wrote a post, Matan Harel, Frank Mousset, and Wojciech Samotij and the “infamous upper tail” problem, describing a major breakthrough on the infamous upper tail problem. Over the years I heard a lot of lectures and private explanations on upper tails and the remarkable related mathematics. I remember lectures by Kim and Vu from the early 2000’s and by Varadhan from ICM 2010 describing (among other things) the fundamental paper by Chatterjee and Varadhan, and later works by DeMarco and Kahn, by Chatterjee and Dembo, and by Lubetzky and Zhao, among others. But when I wrote the post I realized my knowledge was too sparse to give a thorough description, and I decided not to wait and wrote a short post. This post by Yufei describes some of the history very nicely, as well as the major papers by Eyal and Yufei from 2012 and 2014.

by Gil Kalai at August 19, 2019 06:33 PM UTC

The two-way quantum/classical finite automaton (2QCFA), defined by Ambainis and Watrous, is a model of quantum computation whose quantum part is extremely limited; however, as they showed, 2QCFA are surprisingly powerful: a 2QCFA, with a single qubit, can recognize, with one-sided bounded-error, the language $L_{eq}=\{a^m b^m |m \in \mathbb{N}\}$ in expected polynomial time and the language $L_{pal}=\{w \in \{a,b\}^*|w \text{ is a palindrome}\}$ in expected exponential time. We further demonstrate the power of 2QCFA by showing that they can recognize the word problems of a broad class of groups. In particular, we first restrict our attention to 2QCFA that: $(1)$ have a single qubit, $(2)$ recognize their language with one-sided bounded-error, and $(3)$ have transition amplitudes which are algebraic numbers. We show that such 2QCFA can recognize the word problem of any finitely-generated virtually abelian group in expected polynomial time, as well as the word problem of a large class of linear groups in expected exponential time. This latter class includes all groups whose word problem is a context-free language as well as all groups whose word problem is known to be the intersection of finitely many context-free languages. As a corollary, we obtain a direct improvement on the original Ambainis and Watrous result by showing that $L_{eq}$ can be recognized by a 2QCFA with better parameters. We also consider those word problems which a 2QCFA can recognize with one-sided unbounded-error, and show that this class includes the word problem of more exotic groups such as the free product of any finite collection of finitely-generated free abelian groups. As a corollary of this result, we demonstrate that a new class of group word problems are co-stochastic languages. Lastly, we exhibit analogous results for 2QCFA with any finite number of qubits or with more general transition amplitudes, as well as results for other classic QFA models.

at August 18, 2019 01:27 PM UTC

Over the last twenty years, an exciting interplay has emerged between proof systems and algorithms. Some natural families of algorithms can be viewed as a generic translation from a proof that a solution exists into an algorithm for finding the solution itself. This connection has perhaps been the most consequential in the context of semi-algebraic proof systems and basic primitives in algorithm design such as linear and semidefinite programming. The proof system perspective, in this context, has provided fundamentally new tools for both algorithm design and analysis. These new tools have helped both in designing better algorithms for well-studied problems and in proving tight lower bounds on such techniques. This monograph is aimed at expositing this interplay between proof systems and efficient algorithm design and surveying the state-of-the-art for two of the most important semi-algebraic proof systems: Sherali-Adams and Sum-of-Squares. We rigorously develop and survey the state-of-the-art for Sherali-Adams and Sum-of-Squares both as proof systems and as a general family of optimization algorithms, stressing that these perspectives are formal duals to one another. Our treatment relies on interpreting the outputs of the Sum-of-Squares and Sherali-Adams algorithms as generalized expectation functions -- a viewpoint that has been essential in obtaining both algorithmic results and lower bounds. The emphasis is on illustrating the main ideas by presenting a small fraction of representative results with detailed intuition and commentary. The monograph is self-contained and includes a review of the necessary mathematical background, including basic theory of linear and semidefinite programming.

at August 18, 2019 06:14 AM UTC

A few weeks ago I read Malcolm Gladwell’s book The Tipping Point. The book doesn’t really deal with mathematics, but it does contain one math-related anecdote. This anecdote demonstrates an interesting principle of learning mathematics, so I wanted to share it. Problem 1. We have a deck of cards, such that each card contains a […]

by Adam Sheffer at August 18, 2019 02:03 AM UTC

Given an abstract optimization problem with multiple solutions, how much partial information about a solution do you have to know in order to uniquely identify that solution? That has been the topic of some of my earlier research, on how many creases of an origami folding pattern you have to force to be mountain or valley folds in order to cause the remaining folds to go the way you want. And it’s the topic of my new preprint “Tracking paths in planar graphs” (arXiv:1908.05445, with Mike Goodrich, James Liu, and Pedro Matias).

There’s an old story about how to design the footpaths on a college campus: wait for it to snow, see where the heaviest sets of footprints cross the snow from building to building, and then once the snow melts place paths in those same places. But what if you live somewhere like Irvine where it never snows? Or what if you want to perform some other type of data analysis on a data set of the paths that people take? For instance, in order to design improvements to the road networks used by commuter traffic, it would be helpful to figure out where all the traffic actually goes each day. How can you collect that data?

Our paper takes the point of view that you can attach sensors to the network that record the times and identities of people passing by them, but that these sensors are expensive. The goal is (for a given network with designated start and destination vertices s and t) to place as few of these sensors as possible at graph vertices, in such a way that every simple s-t path is uniquely identified by the sequence of sensors that it passes through.

The problem turns out to be NP-complete, even on planar networks. But there’s a simple approximation ratio based on the idea that the optimal number of sensors is always going to be proportional to the number of faces in the network. Each face (in the sequence of biconnected components between s and t) has to have at least one sensor, to distinguish paths that go one way around the face from paths that go the other way around. It turns out that placing one sensor at a vertex shared by many faces doesn’t work — those faces still need a proportional number of additional sensors. And our approximation algorithm ensures that the number of sensors is at most proportional to the number of faces.
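The face count that drives this bound is easy to compute from the vertex and edge counts alone, via Euler's formula for connected planar graphs, V - E + F = 2. The snippet below only illustrates that formula as a stand-in for the lower-bound quantity; it is not the paper's algorithm.

```python
def planar_face_count(num_vertices, num_edges):
    """Euler's formula for a connected planar graph embedded in the plane:
    V - E + F = 2, so F = E - V + 2. The count includes the outer face."""
    return num_edges - num_vertices + 2

# A 3x3 grid of vertices (a 2x2 arrangement of square cells):
# 9 vertices, 12 edges, so 4 inner faces plus the outer face.
assert planar_face_count(9, 12) == 5
```

Since the number of sensors needed is proportional to this face count, such a computation gives a quick sanity check on how many sensors a given planar network will require.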

We also use Courcelle’s theorem to prove that the exact solution is fixed-parameter tractable in the clique-width of the graph. Like most or all uses of Courcelle’s theorem, the resulting algorithm is impractical, so it would be of interest to find a more direct algorithm, perhaps for a weaker parameter.

(Discuss on Mastodon)

by David Eppstein at August 17, 2019 02:42 PM UTC

Given an unpredictable Boolean function $f: \{0, 1\}^n \rightarrow \{0, 1\}$, the standard Yao's XOR lemma is a statement about the unpredictability of computing $\oplus_{i \in [k]}f(x_i)$ given $x_1, ..., x_k \in \{0, 1\}^n$, whereas the Selective XOR lemma is a statement about the unpredictability of computing $\oplus_{i \in S}f(x_i)$ given $x_1, ..., x_k \in \{0, 1\}^n$ and $S \subseteq \{1, ..., k\}$. We give a reduction from the Selective XOR lemma to the standard XOR lemma. Our reduction gives better quantitative bounds for certain choice of parameters and does not require the assumption of being able to sample $(x, f(x))$ pairs.

at August 16, 2019 08:16 AM UTC

Using predictivity both to sharpen and cross-check models

Cropped from article source

Patrice Miller and Jeff Seder look under the hide of horses. Their company EQB does predictive modeling for horse racing based on biometric data. They are famous for having advised the owner of American Pharoah not to sell because the horse had a powerful heart. In 2015, American Pharoah became the first Triple Crown winner since the also-hearty Secretariat in 1978.

Today I am happy to announce an extended version of my predictive model for chess and discuss how it gains accuracy by looking under the hood of chess positions.

I had thought to credit Charles Babbage and Ada Lovelace for being the first to envision computational predictive modeling, but the evidence connected to her design of betting schemes for horse racing is scant and secondhand accounts differ. It is known that Babbage compiled voluminous data on the medical fitness and diet of animals, including heart function by taking their pulse. We have discussed their computing work here and here.

I will use horse racing as a device for explaining the main new ingredient of my model. It sharpens the prediction of moves—and the results of cheating tests—by using deeper information to “beat the bookie” as Lovelace tried to do. I have described the basic form of my model—and previous efforts to extend it—in several previous posts on this blog. Last month, besides being involved in several media stories involving a grandmaster caught in the act of cheating in France, I was invited to discuss this work by Ben Johnson for his “Perpetual Chess” podcast.

Betting the Favorite

My chess model does the same thing to a chess position—given information about the skill set of the player deciding on a move—that a bookie does to a horse race. It sets odds on each legal move {m_i} to “win” by being played in the game. The probabilities {p_i} need to be accurate for the same reason bookmakers need their “initial betting lines” to be close to how bets will ultimately balance, so they can preserve their margin. A horse with highest {p_i}—perhaps a tie—is the bookie’s favorite. The favorite might be “odds-on,” meaning {p_i \gg 0.5}, or might be a “narrow favorite” among several horses with near-equal chances.

Suppose you don’t care how much money you might win but just want to maximize your chance of being right—of winning something. Unless you have reason to doubt the bookie, you should bet on the favorite. That is what my basic chess model does. Whichever move is given the highest value by the computer at the end of its search is declared the favorite, regardless of the player’s Elo rating or other skill factors.

That the best move—we’ll label it {m_1}—should always be most likely even for the weakest players runs counter to sense. Aren’t weaker players weaker because they prefer weaker moves? When the right move is obvious, say a forced recapture or a checkmate in one, of course we expect any player to find it. But when {m_1} is subtle, what then?

My basic model still makes it the favorite. This doesn’t mean its probability {p_1} is greater than half. My model might make {p_1 \gg 0.5} for world champion level players but only, say, {p_1 = 0.25} for beginning players. Thus it will still say the beginner is 75% likely to play an inferior move. What my base model shies away from is saying any other particular move—any other horse—is more likely to win than {m_1}. As the rating gets lower it bunches up the probabilities so that while {p_1} is lower, no other probability passes it.
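One simple way to picture this bunching-up is a softmax-style share model, sketched below. This is a toy illustration, not the actual parameterization of my model: the function `move_probs` and the sensitivity parameter `s` are hypothetical. It shows how a larger `s` (a weaker player) flattens the distribution while leaving the best move the favorite.

```python
import math

def move_probs(values, s):
    """Toy share model: a move with engine value v (in pawns, best move
    largest) gets weight exp(v / s); weights are normalized to probabilities.
    Larger s flattens the distribution, but the ordering of moves -- and
    hence the favorite -- never changes. 's' is an illustrative sensitivity
    parameter, not the actual model's."""
    weights = [math.exp(v / s) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

vals = [0.0, -0.1, -0.3, -0.6]        # best move first
strong = move_probs(vals, s=0.1)      # sharp: best move heavily favored
weak = move_probs(vals, s=1.0)        # flat: best move only a narrow favorite
assert strong[0] > weak[0] > max(weak[1:])  # still the favorite at both levels
```

At s = 0.1 the best move takes roughly 70% of the probability mass; at s = 1.0 it takes barely more than the second-best move, yet it remains the favorite.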

This is borne out in practice. The lowest Elo rating used by the World Chess Federation (FIDE) is 1000. Let’s take ratings between that and 1200 (which used to be the lowest rating) as denoting the novice class. Consider only those positions that have many reasonable choices—say at least ten moves valued within 0.25 (figuratively, a quarter of a pawn) of optimal. My main training set has 6,082 such positions in games between evenly-matched players of this level. Here are the frequencies of their playing the best through the tenth-best move in such many-choice positions:

Rank   Pct.
  1   17.76%
  2   13.22%
  3    9.95%
  4    7.66%
  5    6.25%
  6    5.18%
  7    4.41%
  8    4.55%
  9    3.50%
 10    3.03%
11+   24.49%

Both my basic model and the new one, when fitted over the entire training set for this class but then restricted to the many-choices subset, give projections close to these actual values. The basic model, always betting the favorite, is right on under 18% of its projections. Can we do better? That is, can we “beat the bookie” at chess?

Predicting Inferior Moves

It is almost four years since the idea for improving predictions was described on this blog. In place of “weaker players prefer weaker moves,” it advanced a hypothesis that we can state as follows:

Weaker players are more likely to be diverted by shiny objects.

In particular, they will fall for moves that look attractive early on, but which are revealed (by the computer) to be inferior after deeper consideration. The computer programs output values at each depth of search, and when these moves’ values are graphed against depth, they start high but “swing down” at higher depths. Weaker players are more likely to be satisfied by the early flash and not think deeper. The old post has a great example of such a move from the pivotal game in the 2008 world championship match, where Viswanathan Anand set a deep trap that caught the previous world champion, Vladimir Kramnik.

The flip side is moves that look poor at low depths but whose high value emerges at high depths. My basic model, which uses only the final values of moves, gives too high a projection on these cases, and too low a likelihood of falling into traps. I have figured that these two kinds of ‘misses’ offset over a few dozen positions. Moreover, in both kinds of misses, the player is given express benefit of the doubt by the projections. It is edgier, after all, to project that a player is more likely to fall into a trap than to find the safest and best move.
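A crude way to quantify the "shiny object" effect from the engine's depth-by-depth output (an illustrative feature only, not the exact one in my model):

```python
def swing(depth_values, split=6):
    """Crude 'swing' score for one move: the average value at shallow
    depths minus the final (deepest) value. Positive = looks good early
    but is revealed inferior (a trap); negative = merit emerges late.

    depth_values: engine evaluations of the move at depths 1, 2, ..., D.
    """
    early = sum(depth_values[:split]) / split
    return early - depth_values[-1]

# Hypothetical depth curves, in pawns:
trap = [0.30, 0.35, 0.32, 0.28, 0.25, 0.20, -0.40, -0.90]     # swings down
sleeper = [-0.50, -0.45, -0.30, -0.10, 0.05, 0.25, 0.40, 0.45]  # swings up

assert swing(trap) > 0      # flagged as a shiny object
assert swing(sleeper) < 0   # value emerges only at depth
```

A model that weights such a feature can raise the projected probability of trap moves for weaker players, which is exactly where the basic model gives too much benefit of the doubt.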

The effect of lower-depth values is still too powerful to ignore in identifiable cases. Including them, however, makes the whole model edgier, as I have described here before. Simply put, the lower-depth values are subject to more noise, from which we are trying to extract greater information. It has been like trying to catch lightning—or fusion—in a bottle.

Modest But Effective Success

My new model implements the “swing” feature without adding any more free parameters for fitting. It has new parameters but those are set “by hand” after an initial fitting of the free parameters under the “sliding-scale” regime I described last September, which is followed by a second round of re-fitting. It required heavy clamps on the weighting of lower-depth values and more-intense conditioning of inputs overall. It required a solution to the “firewall at zero” phenomenon that was the exact opposite of what I’d envisioned.

After all this, here is what it delivers in the above case—and quite generally:

It improves the prediction success rate—for the weakest players in the most difficult kind of positions to forecast—from 17.76% to 20.04%.

For the elite class—2600 to 2800—in the same kind of many-choice positions, the new model does even better. Much more data on elite players is available, so I have 49,793 such positions faced by them:

Whereas elite players found the best move in 30.85% of these difficult positions, my new model finds their move in 34.64% of them.

Over all positions, the average prediction gain ranges from about 1 percentage point for the lowest players to over 2 points for masters. These gains may not sound like much, but for cheating tests they give prime value. The reasons are twofold:

  • The gained predictions all go against the player finding the computer’s move, so the act of finding the best move stands out more.

  • The standard deviation is typically 3–5 percentage points depending on the volume of moves. Thus the prediction gain can enhance the measured z-score by upwards of 0.5 or more.
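The arithmetic behind the second point can be sketched with hypothetical numbers:

```python
def z_score(observed_pct, projected_pct, sd_pct):
    """Normal-approximation z-score of an observed engine-matching
    percentage against the model's projection (all in percentage points)."""
    return (observed_pct - projected_pct) / sd_pct

# Hypothetical test: a player matches the engine 40% of the time.
# If sharper modeling lowers the honest-play projection from 33% to 31%
# (a 2-point prediction gain), then with a 4-point standard deviation
# the measured z-score rises by 2/4 = 0.5.
sd = 4.0
z_before = z_score(40.0, 33.0, sd)
z_after = z_score(40.0, 31.0, sd)
assert abs((z_after - z_before) - 0.5) < 1e-9
```

All figures here are made up for illustration; the real tests use per-move projections and fitted variances rather than a single flat rate.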

Initial applications in recent cases seem to prove this out more often than not. Of course, the larger purpose is to have a better model of human chess play overall.

Cross-Checks and Caveats

In recent years, several new dimensions of quality and safety with predictive models have emerged. They supplement the two classic ones:

  • Avoiding false positives. In particular, over a natural population (such as the mass of honest players), the rate of outlier scores generated by the model must stay within the bounds of natural frequency for the population. Call this “natural safety.”

  • Avoiding false negatives. The model should provide enough statistical power to flag unnatural outliers without becoming unsafe.

I vetted my model’s natural safety by processing tens of millions of generated z-scores under resampling after my final design tweaks earlier this month. This was over a link between departmental machines and UB’s Center for Computational Research (CCR) where my data is generated. The previous discussion has all been about greater power. The first new factor updates the idea of calibration:

  • Non-bias and fairness. More general than the societal context, which we discussed here, this means avoiding bias along functional lines that the model is not intended to distinguish.

I have a suite of cross-checking measures besides those tests that are expressly fitted to be unbiased estimators. They include checking how my model performs on various different types of positions, such as those with many choices as above, or the opposite: those having one standout move. For example, the model’s best-move projection in the many-choice positions by elite players, using the general settings for 2700 rating, is 31.07%. That’s within 0.28%. Another check is figuratively how much “probability money” my model wagered to get its 34.64% hit rate. The sum of {p_i} it projected on its own most-likely moves, in the 68.4% of the many-choice positions where it agreed with the computer’s favorite plus 31.6% where it did not, was 35.00%. If I fit all 282,060 positions by these players, rather than use “2700,” and then re-select the subset, the model comes within 0.01% on the first-move projection and 0.11% on its own betting forecasts. I will say more about the cross-checks, use of prediction-scoring metrics, and conformance to normal distribution at a later time. The relevant point is to ask:

How well does your model perform on pertinent tests besides those it was expressly trained for?

Beyond fairness, good wide-range calibration alleviates dangers of “mission creep.”

The second newer factor is:

  • Adversarial performance. This is apart from “natural safety.” Simply put: how resistant is the model to being deceived? Can it be gamed?

My impression over the long haul of this work is that the new model’s more-powerful heart inevitably brings greater “springiness.” By dint of its being more sensitive to moves whose high value emerges only after deep search, it is possible to create shorter sequences of such moves that make it jump to conclusions. The z-score vetting turned up a few games that were agreed drawn after some “book” moves—openings known by heart to many professional players—whose entirety the model would flag, except for the standard procedure of identifying and removing book moves from cheating tests. These outliers came from over a hundred thousand games between evenly-matched players, so they still conformed to the natural rate, but one can be concerned about the “unnatural rate.” On the flip side, I believe my more-intensive conditioning of values has made the model more robust against being gamed by cheaters playing a few inferior moves to cover their tracks.

In general, and especially with nonlinear models, the caveat is that amplifying statistical power brings greater susceptibility to adversarial conditions. Trying to “beat the bookie” requires more model introspection. My model retains its explainability and ability to provide audits for its determinations.

Open Problems

What lessons from similar situations with other predictive models can be identified and brought to bear on this one?

by KWRegan at August 16, 2019 12:52 AM UTC

by David Eppstein at August 15, 2019 04:07 PM UTC

Solving the runtime selection problem


Brendan Lucier and Csaba Szepesvári were consecutive speakers at this week’s workshop at the Toyota Technological Institute in Chicago on “Automated Algorithm Design.”

Today I will discuss their talks, which I enjoyed greatly at the workshop.

The workshop’s general theme was the use of machine-learning techniques to improve the tuning of algorithms. Indeed, the goal is to have the algorithm tune itself by selecting strategies from a collection of possible ones. Besides better performance with less human work, this promises better logical reliability than with hand-tuning programs and debugging.

A common component of this work used in both talks is an interesting oracle model. Since oracles are familiar to many of us theorists this will give us an avenue into the talks and the two recent papers they were based on.

Runtime Oracles

The oracle has a nonnegative matrix {R} of shape {n \times M}. Think of the entry {R(i,j)} as the runtime of the {i^{th}} program on the {j^{th}} input. You know {n} and {M}, but must ask the oracle for information about {R(i,j)}. You may ask the oracle a question {(i,j,t)}:

Is the entry {R(i,j)} less than or equal to {t}?

Here {t} is a time bound. Further they assume that the oracle charges you

\displaystyle  \min(R(i,j),t).

Thus an algorithm is charged in total

\displaystyle  \sum_{(i,j,t)} \min(R(i,j),t),

where the sum is over all questions {(i,j,t)} you asked. The rationale is that the oracle can run a program {i} on a task {j} and stop after at most {t} steps. Thus returning the minimum is realistic. Since their research is motivated by practical problems, this is a useful model for studying questions about program performance.
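The charging rule is easy to mimic in code (a toy sketch with an illustrative 2×3 matrix):

```python
class RuntimeOracle:
    """Oracle over a nonnegative runtime matrix R (n programs x M inputs).

    A query (i, j, t) asks "is R[i][j] <= t?" and is charged min(R[i][j], t),
    modeling running program i on input j for at most t steps.
    """

    def __init__(self, R):
        self.R = R
        self.charge = 0.0  # total cost of all questions asked so far

    def ask(self, i, j, t):
        self.charge += min(self.R[i][j], t)
        return self.R[i][j] <= t

# Illustrative matrix: two programs, three inputs.
R = [[1, 7, 1],
     [7, 1, 7]]
oracle = RuntimeOracle(R)

assert oracle.ask(0, 0, 3) is True    # R[0][0] = 1 <= 3, charged 1
assert oracle.ask(1, 0, 3) is False   # R[1][0] = 7 >  3, charged 3 (capped)
assert oracle.charge == 4.0
```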

In passing, I believe the reason this oracle model is new is that theorists do not often think about running arbitrary programs. Well, those in recursion theory did, but those designing classic algorithms do not.

A Selection Problem

This model is used to study a selection problem. Assume {R} is as above. The task is to find the row with the smallest row sum {s_{i}}:

\displaystyle  s_{i} = \sum_{j=1}^{M} R(i,j).

That is, to find the program that takes the least total time on the inputs—hence the least average time.

The trouble is that in the worst case this can require examining all the entries. So they introduce approximations to make the problem doable, but still useful in practice:

  1. The returned row’s sum need only be within a factor of {(1+\epsilon)} of the smallest row sum.

  2. The entries can be assumed to be truncated at a value {\tau}. That is, you can assume that

    \displaystyle  R(i,j) \le \tau.

  3. The governing algorithm can be randomized and need only meet the performance goal with high probability.

Now we can state the problem formally. It is parameterized by {(\epsilon, \tau)}:

Runtime Selection Problem (RSP): Given a matrix {R} with all entries bounded by {\tau}, find a row {i} so that

\displaystyle  s_{i} \le (1 + \epsilon)s_{k},

for all {k}, while minimizing the oracle charges for all questions “is {R(i,j) \leq t}?” that are asked.

The talks gave a variety of randomized algorithms that solve such problems for various parameter assumptions.

Fast and Slow

See their papers for details on the results. The algorithms they have are interesting not only from a theory viewpoint but also in practice. Indeed, they not only prove theorems but also give experimental timings.

In order to establish some intuition, let’s look at the following simple case. Assume that all entries {R(i,j)} are either {1} or some {\alpha \gg 1}. This is the “fast-or-slow” running time stipulation. As theorists we often are drawn to binary values, so this might be a good case to look at initially.

The first observation is that we will always ask questions of the form

\displaystyle  (i, j, 1.0001).

This gives us the value of the entry {R(i,j)} for almost unit cost: if it is the {R(i,j) \gg 1} case, then we are only charged {1.0001}. Thus, we have reduced the original problem to a classic oracle problem. There is no complicated oracle cost measure: the cost is just the number of entries we read. Well, okay, the cost is slightly higher, but we can make it as close as we wish.

The selection problem then comes down to this:

  • We have an {n \times M} nonnegative matrix {R} whose entries are all either {1} or {\alpha}.

  • We must find a row whose sum is within a factor of {1+\epsilon} of the smallest row sum.

I believe that the analysis of this problem should be generally known. In any event it seems clear that randomly sampling each row is a good start.
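Here is what that sampling start might look like under the fast-or-slow assumption (an illustration only, not the algorithms from the talks; reading an entry stands in for a near-unit-cost query {(i,j,1.0001)}):

```python
import random

def pick_fast_row(R, samples, seed=0):
    """Sample `samples` random entries per row (with replacement) and
    return the index of the row with the smallest estimated row sum.
    Each entry read stands in for one near-unit-cost oracle query.
    """
    rng = random.Random(seed)
    n, M = len(R), len(R[0])
    best_row, best_est = 0, float("inf")
    for i in range(n):
        est = M * sum(R[i][rng.randrange(M)] for _ in range(samples)) / samples
        if est < best_est:
            best_row, best_est = i, est
    return best_row

alpha = 100
# Row 0 is mostly fast; row 1 is mostly slow.
R = [[1] * 90 + [alpha] * 10,
     [1] * 10 + [alpha] * 90]
assert pick_fast_row(R, samples=200) == 0
```

With a large gap between the rows, a modest number of samples per row suffices; making the sample count depend on {\epsilon}, {\tau}, and the desired failure probability is where the real analysis lives.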

Open Problems

Are there other natural problems where the runtime cost oracle is useful?

[Edited typo]

by RJLipton+KWRegan at August 12, 2019 02:34 AM UTC

We present a deterministic algorithm for reconstructing multilinear $\Sigma\Pi\Sigma\Pi(k)$ circuits, i.e. multilinear depth-$4$ circuits with fan-in $k$ at the top $+$ gate. For any fixed $k$, given black-box access to a polynomial $f \in \mathbb{F}[x_{1},x_{2},\ldots ,x_{n}]$ computable by a multilinear $\Sigma\Pi\Sigma\Pi(k)$ circuit of size $s$, the algorithm runs in time quasi-poly($n,s,{|\mathbb{F}|}$) and outputs a multilinear $\Sigma\Pi\Sigma\Pi(k)$ circuit of size quasi-poly($n,s$) that computes $f$. Our result solves an open problem posed in \cite{GKL12} (STOC, 2012). Indeed, prior to our work, efficient reconstruction algorithms for multilinear $\Sigma\Pi\Sigma\Pi(k)$ circuits were known only for the case of $k=2$ \cite{GKL12, Volkovich17}.

at August 11, 2019 04:14 AM UTC

Lifting theorems are theorems that relate the query complexity of a function $f:\left\{ 0,1 \right\}^n\to \left\{ 0,1 \right\}$ to the communication complexity of the composed function $f\circ g^n$, for some “gadget” $g:\left\{ 0,1 \right\}^b\times \left\{ 0,1 \right\}^b\to \left\{ 0,1 \right\}$. Such theorems allow transferring lower bounds from query complexity to the communication complexity, and have seen numerous applications in the recent years. In addition, such theorems can be viewed as a strong generalization of a direct-sum theorem for the gadget $g$. We prove a new lifting theorem that works for all gadgets $g$ that have logarithmic length and exponentially-small discrepancy, for both deterministic and randomized communication complexity. Thus, we significantly increase the range of gadgets for which such lifting theorems hold. Our result has two main motivations: First, allowing a larger variety of gadgets may support more applications. In particular, our work is the first to prove a randomized lifting theorem for logarithmic-size gadgets, thus improving some applications of the theorem. Second, our result can be seen as a strong generalization of a direct-sum theorem for functions with low discrepancy.

at August 11, 2019 04:04 AM UTC


Below I show a neat application of perfect hashing, which is one of my favorite (cluster of) algorithms. Amazingly, we use it to obtain a purely information-theoretic (rather than algorithmic) statement.

Suppose we have a finite universe $U$ of size $n$ and a $k$-element subset of it $S \subseteq U$ with $k \ll n$. How many bits do we need to encode it? The obvious answer is $\log_2 \binom{n}{k} = \Theta(k \cdot \log(n / k))$.
Can we, however, improve this bound if we allow some approximation?

Even if $n = 2k$, it is not difficult to show the lower bound of $k \cdot \log_2(1 / \delta)$ bits if we are allowed to be wrong, when answering queries “does $x$ belong to $S$?”, with probability at most $\delta$ (hint: $\varepsilon$-nets). Can we match this lower bound?

One approach that does not quite work is to hash each element of $S$ to an $l$-bit string using a sufficiently good hash function $h \colon U \to \{0, 1\}^l$, and, when checking if $x$ lies in $S$, compute $h(x)$ and check if this value is among the hashes of $S$. To see why it does not work, let us analyze it: if $x \notin S$, then the probability that $h(x)$ coincides with at least one hash of an element of $S$ is around $k \cdot 2^{-l}$. To make the latter less than $\delta$, we need to take $l = \log_2(k / \delta)$ yielding the overall bound of $k \cdot \log_2(k / \delta)$ falling short of the desired size.

To get the optimal size, we need to avoid using the union bound in the above argument. In order to accomplish this, let us use perfect hashing on top of the above hashing scheme! It is convenient to use a particular approach to perfect hashing called Cuckoo hashing. In short, there is a way to generate two simple hash functions $h_1, h_2 \colon U \to [m]$ for $m = O(k)$ and place the elements of our set $S$ into $m$ bins without collisions so that for every $x \in S$, the element $x$ is placed either in bin $h_1(x)$ or in bin $h_2(x)$. Now, to encode our set $S$, we build a Cuckoo hash table for it, and then for each of the $m$ bins, we either store one bit indicating that it’s empty, or store an $l$-bit hash of an element that is placed into it. Now we can set $l = \log_2(2 / \delta)$, since we compare the hash of a query to merely two hashes, instead of $k$. This gives the overall size $m + k \cdot \log_2 (2 / \delta) = k \cdot (\log_2(1 / \delta) + O(1))$, which is optimal up to a low-order term. Of course, the encoding should include $h_1$, $h_2$ and $h$, but it turns out they can be taken to be sufficiently simple so that their size does not really matter.
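Here is a toy rendering of the scheme (Python's built-in `hash` stands in for the "sufficiently simple" hash functions, which is an oversimplification):

```python
import random

class ApproxSet:
    """Approximate membership: cuckoo placement + l-bit fingerprints.

    Only an l-bit fingerprint is kept per occupied bin. A query is compared
    to at most two stored fingerprints, so false positives occur with
    probability about 2 / 2**l instead of the union-bound's k / 2**l.
    """

    def __init__(self, items, l, seed=0):
        items = list(items)
        self.l = l
        self.m = 4 * max(1, len(items))        # O(k) bins; generous for the demo
        for attempt in range(20):              # retry unlucky hash choices
            rng = random.Random(seed + attempt)
            self.salts = tuple(rng.getrandbits(64) for _ in range(3))
            elems = self._cuckoo_place(items)
            if elems is not None:
                break
        else:
            raise RuntimeError("cuckoo placement kept failing")
        # Keep only the fingerprints -- this is the actual encoding.
        self.bins = [None if e is None else self._fp(e) for e in elems]

    def _cuckoo_place(self, items):
        elems = [None] * self.m
        for x in items:
            b = self._h(x, 0)
            for _ in range(100 * self.m):
                if elems[b] is None:
                    elems[b] = x
                    break
                x, elems[b] = elems[b], x      # evict the current occupant
                b1, b2 = self._h(x, 0), self._h(x, 1)
                b = b2 if b == b1 else b1      # move the evictee to its other bin
            else:
                return None                    # give up; caller re-salts
        return elems

    def _h(self, x, which):
        return hash((self.salts[which], x)) % self.m

    def _fp(self, x):
        return hash((self.salts[2], x)) % (2 ** self.l)

    def __contains__(self, x):
        f = self._fp(x)
        return self.bins[self._h(x, 0)] == f or self.bins[self._h(x, 1)] == f

S = ApproxSet(range(100), l=16, seed=1)
assert all(x in S for x in range(100))               # no false negatives
assert sum(x in S for x in range(100, 10100)) < 100  # fp rate roughly 2/2**16
```

A real construction would draw $h_1$, $h_2$, $h$ from explicit small hash families so that describing them takes negligible space, as noted above.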

Two remarks are in order. First, in this context people usually bring up Bloom filters. However, they require space, which is $1.44$ times bigger, and, arguably, they are more mysterious (if technically simple). Second, one may naturally wonder why anyone would care about distinguishing bounds like $k \cdot \log_2 (1 / \delta)$ and $k \cdot \log_2(k / \delta)$. In my opinion, there are two answers to this. First, it is just a cool application of perfect hashing (an obligatory link to one of my favorite comic strips). Second, compressing sets is actually important in practice and constant factors do matter, for instance when we are aiming to transfer the set over the network.

Update: Kasper Green Larsen observed that we can combine the naive and not-quite-working solutions to obtain the optimal bound. Namely, by hashing everything to $\log_2(k / \delta)$ bits, we effectively reduce the universe size to $n' = k / \delta$. Then, the naive encoding takes $\log_2 \binom{n'}{k} \approx H(\delta) \cdot n' = H(\delta) \cdot k / \delta \approx k \cdot \log_2 (1 / \delta)$ bits.
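The bound can be spot-checked numerically with illustrative parameters (the remaining gap is the low-order term):

```python
from math import comb, log2

k, delta = 100, 0.001
n_reduced = round(k / delta)            # hashing shrinks the universe to k/delta
naive_bits = log2(comb(n_reduced, k))   # exactly encode a k-subset of it
target_bits = k * log2(1 / delta)       # the information-theoretic goal

# The naive encoding of the reduced universe lands within a small
# constant factor of the target, approaching it as delta shrinks.
assert 1.0 < naive_bits / target_bits < 1.2
```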

at August 11, 2019 01:21 AM UTC

We consider two versions of the problem of testing graph isomorphism in the bounded-degree graph model: A version in which one graph is fixed, and a version in which the input consists of two graphs. We essentially determine the query complexity of these testing problems in the special case of $n$-vertex graphs with connected components of size at most $\mathrm{poly}(\log n)$. This is done by showing that these problems are computationally equivalent (up to polylogarithmic factors) to corresponding problems regarding isomorphism between sequences (over a large alphabet). Ignoring the dependence on the proximity parameter, our main results are: \begin{enumerate} \item The query complexity of testing isomorphism to a fixed object (i.e., an $n$-vertex graph or an $n$-long sequence) is ${\widetilde{\Theta}}(n^{1/2})$. \item The query complexity of testing isomorphism between two input objects is ${\widetilde{\Theta}}(n^{2/3})$. \end{enumerate} Testing isomorphism between two sequences is shown to be related to testing that two distributions are equivalent, and this relation yields reductions in three of the four relevant cases. Failing to reduce the problem of testing the equivalence of two distributions to the problem of testing isomorphism between two sequences, we adapt the proof of the lower bound on the complexity of the first problem to the second problem. This adaptation constitutes the main technical contribution of the current work. Determining the complexity of testing graph isomorphism (in the bounded-degree graph model), in the general case (i.e., for arbitrary bounded-degree graphs), is left open.

at August 10, 2019 04:22 PM UTC

After WADS, I stayed in Edmonton for CCCG. The two conferences have not always been in the same places, but this year they were co-located, and the plan is to continue that pattern in odd years (when WADS is held). As far as I know there are no plans to move CCCG to Scandinavia for SWAT in the even years.

Like WADS, CCCG had three invited speakers. In past years, two were named the Paul Erdős Memorial Lecture and the Ferran Hurtado Memorial Lecture. This year, sadly, the third one has also been named, as the Godfried Toussaint Memorial Lecture.

  • The Erdős Lecture was by Vida Dujmović, who spoke on her breakthrough work with several other Barbados workshop participants showing that every planar graph is a subgraph of a strong product of a path graph and a bounded-pathwidth graph, from which it follows that these graphs have bounded queue number, that they can be embedded into 3d grids of linear volume, and many other nice properties. The timing of the lecture invitation to Vida was good, as the breakthrough happened after she had already agreed to speak!

  • The Toussaint Lecture was by Joseph O’Rourke. Joe spoke on polyhedral unfolding, the problem of cutting the boundary of a polyhedron into a surface that can unfold into a simple polygon in the plane. One of the points of his talk was to rationalize some of the terminology in this area. The standard version of the problem asks for (in his new terminology) an edge-unfolding, a set of cuts along edges of the polyhedron, forming a spanning tree for its vertices, such that the resulting cut surface unfolds to a flat polygon. But one can also ask for an anycut-unfolding, using cuts that do not have to follow the edges. Or one can ask for an edge-unzipping or anycut-unzipping, in which the cuts must form a single (Hamiltonian) path through the vertices of the polyhedron. In this terminology Dürer’s conjecture becomes the statement that every convex polyhedron has an edge-unfolding, and the example I recently posted of a zipless polycube shows that not every polycube has an edge-unzipping. Another well-known open question in this area asks whether every polycube whose boundary forms a topological sphere has an edge-unfolding. Joe conjectured that with high probability the convex hull of many random points on a sphere does not have an anycut-unzipping.

  • Mark de Berg presented the Hurtado Lecture. His topic involved subexponential algorithms for disk intersection graphs and unit disk graphs. At STOC 2018 he had a paper on finding the maximum independent set in disk graphs in time {2^{O(\sqrt{n})}}, matching the time for planar graphs. In planar graphs, you can use the planar separator theorem: for each of the {2^{O(\sqrt{n})}} independent subsets of the separator, recurse on both sides. This turns out to work in disk graphs by replacing the usual size bound on the separator (it should have {O(\sqrt{n})} vertices) with a decomposition into a union of {O(\sqrt{n})} cliques. The separators can be found analogously to classical circle-packing methods for planar separators. Each clique can contribute one vertex to any independent set, from which it follows that the separator again has {2^{O(\sqrt{n})}} independent subsets. The same idea works for other problems like dominating sets in unit disk graphs (where the unit assumption is used to get a bounded contribution from each clique), and generalizes to fat objects in higher dimensions. The time bound is optimal assuming the exponential time hypothesis. And in FOCS 2018 de Berg obtained similar ETH-tight time bounds for the Euclidean traveling salesperson problem by using separators of point sets with the property that few points are very close to the separator boundary.

I can’t find links for all the contributed papers, but you can find them in the complete proceedings. Among them:

  • In “Three-Coloring Three-Dimensional Uniform Hypergraphs”, Biniaz, Bose, Cardinal, and Payne prove that, for points in the plane and a fixed triangle shape, one can 3-color the points so that every scaled and translated copy of the triangle containing six or more points has more than one color. It was already known that if you change “six or more” to “two or more” you need four colors, and if you change it to “nine or more” you need only two colors.

  • Audrey St. John’s talk on “Redundant Persistent Acyclic Formations for Vision-Based Control of Distributed Multi-Agent Formations” (with Burns, Klemperer, and Solyst) was beset by technical difficulties, but from it I learned that there is a theory of directed bar-and-joint frameworks, analogous to undirected rigidity theory, called “persistence theory”, and that the pebble game for testing rigidity of an undirected framework produces an orientation of the network that is persistent. She used the analogy of a flock of geese, walking in formation: each goose pays attention only to the other geese in front, but the whole formation can keep its shape as the leading goose moves arbitrarily. Her goal is to get robots to do the same thing.

  • In “Chirotopes of Random Points in Space are Realizable on a Small Integer Grid”, Cardinal, Fabila-Monroy and Hidalgo-Toscano prove that, with high probability, random point sets in space can be rounded to a grid of polynomial size without changing their order type.

  • We had enough folding and unfolding papers to spill out over more than one section. Among them, I particularly liked “Efficient Segment Folding is Hard” by Klute, Parada, Horiyama, Korman, Uehara and Yamanaka. The question they asked is: given disjoint line segments on a piece of paper, when can you make a sequence of simple folds (that is, for a given fold line, folding all the layers of the paper that are crossed by the line), with each fold on a line through one of the segments that misses all the other segments? It turns out to be NP-complete. If you do allow fold lines to pass through other segments, folding sequences can be infinite, and it’s unknown whether every set of segments has a finite sequence.

  • Pilar Cano spoke on generating triangulations of point sets in an affine-invariant way (“Affine invariant triangulations” with Bose and Silveira). The main trick is to use covariance to choose a canonical affine transformation for the points, after which you can use Delaunay, minimum weight, or your favorite other triangulation algorithm. But there are necessarily some general position assumptions (as there already are for using Delaunay triangulation without the affine invariance): for points in a parallelogram, there is no affine-invariant way of choosing which diagonal to use.

The excursion was to the Royal Alberta Museum, where I skipped the special exhibit on Vikings (having gone to museum exhibits on them in Copenhagen a year earlier) and instead learned much about Great Plains geology and the historical mistreatment of the Métis, local people descending both from First Nations and Europeans. (The First Nations themselves were of course also badly mistreated, but I had more of an idea of that already.)

From the business meeting, we heard that the acceptance ratio was a little higher than last year, but still approximately . Two papers were withdrawn because the authors had visa issues, double the number from last year, and several others were presented by non-authors after their authors were unable to attend. One possible improvement would be to move the submission and acceptance dates earlier to provide authors more time to obtain visas. The main topic of discussion was the conference’s status as a conference: should papers at CCCG continue to count as publications (in which case why are they still limited to only six pages) or should they be considered as preliminary announcements of papers that can still be sent to other more prestigious symposia? One possible compromise involves giving authors a choice: either publish your paper in the proceedings or give up the proceedings slot but still present your work in some other way (possibly as a poster, as GD does).

(Discuss on Mastodon)

by David Eppstein at August 10, 2019 03:34 PM UTC

The WADS excursion was to the University of Alberta Botanic Gardens. Here are a few photos I took there:

  • University of Alberta Botanic Gardens, Aga Khan Garden, Source Fountain
  • University of Alberta Botanic Gardens, Aga Khan Garden, Jilau Khana
  • University of Alberta Botanic Gardens, Aga Khan Garden, Mahtabi
  • University of Alberta Botanic Gardens, Wetland Walk, May’s Dock

(Discuss on Mastodon)

by David Eppstein at August 09, 2019 11:46 AM UTC

I am very happy to announce two quantum events. First, I would like to announce a course “Computation, quantization, symplectic geometry, and information” in the first 2019/2020 semester at the Hebrew University of Jerusalem (HUJI). The course will be on Sundays 14:00-16:00. Second, I would also like to announce The 4th Advanced School in Computer Science and Engineering on The Mathematics of Quantum Computation on December 15 – December 19, 2019, at IIAS HUJI.

Emmy Noether (left) Grete Hermann (right)

A quantum “Kazhdan’s seminar” at HUJI: Computation, quantization, symplectic geometry, and information.

“In the fall of 2019 Dorit Aharonov, Gil Kalai, Guy Kindler and Leonid Polterovich intend to run a new one-semester course (as a Kazhdan seminar) attempting to connect questions about noise and complexity in quantum computation with ideas and methods about “classical-quantum correspondence” that are well studied in symplectic geometry. The course will be highly research-oriented, and will attempt to teach the basics in both areas, and to define and clarify the interesting research questions related to the connections between these areas, with the hope that this will lead to interesting insights. The course is oriented to grad students (and faculty), with reasonable background in mathematics, and with interest in the connections between mathematical and computational aspects of quantum mechanics. (See below for a full description.)”

The course will be on Sundays, 14:00-16:00, in the Ross Building.

See also the post Symplectic Geometry, Quantization, and Quantum Noise from January 2013. (The seminar was initially planned for 2014, but some bumps in the road delayed it to 2019.)

A winter school at IIAS: The Mathematics of Quantum Computation

The Mathematics of Quantum Computation
The 4th Advanced School in Computer Science and Engineering
Event date: December 15 – December 19, 2019

“We will be organizing a math-oriented quantum computation school in the IIAS at the Hebrew university. No prior knowledge on quantum will be assumed.  The school will introduce TCS and math students and faculty, who are interested in the more mathematical side of the area, to the beautiful and fascinating mathematical open questions in quantum computation, starting from scratch. We hope to reach a point where participants gain initial tools and basic perspective to start working in this area. (See below for a full description.)

Organizers: Dorit Aharonov, Zvika Brakerski, Or Sattath, Amnon Ta-Shma,

Main (confirmed) Speakers: Sergey Bravyi, Matthias Christandl, Sandy Irani, Avishay Tal, Thomas Vidick, (1-2 additional speakers may be added later).

Additional (confirmed) lectures will be given by: Dorit Aharonov,  Zvika Brakerski,  and/ Or Sattath. (1-2 additional speakers may be added later).”

The Israel Institute for Advanced Studies already hosted a 2014 school about quantum information as part of its legendary physics series of schools, and also hosted QSTART in 2013.

More details

Kazhdan’s seminar: Computation, quantization, symplectic geometry, and information.

In the fall of 2019 Aharonov, Kalai, Kindler and Polterovich intend to run a new one-semester course (as a Kazhdan seminar) attempting to connect questions about noise and complexity in quantum computation with ideas and methods about “classical-quantum correspondence” that are well studied in symplectic geometry. The course will be highly research-oriented, and will attempt to teach the basics in both areas, and to define and clarify the interesting research questions related to the connections between these areas, with the hope that this will lead to interesting insights.

The course is oriented toward grad students (and faculty) with a reasonable background in mathematics and an interest in the connections between mathematical and computational aspects of quantum mechanics. Students who attend will be awarded two credit points (N”Z) after passing an exam. The goal of the course is to initiate and lead to new connections between the seemingly unrelated areas of quantum computation and symplectic geometry.

The topics will include:

– Introduction to quantum computation, quantum universality, quantum algorithms
and quantum computational complexity classes such as BQP and Quantum NP (QMA)

– quantum measurement and quantum noise explained using the standard quantum computational model.

– questions about quantum error correction and quantum noise – fault tolerance,
quantum error correcting codes, and the breakdown of robustness when the locality of
the noise does not hold.

– quantum measurement/quantum information (noise and speed limit) having classical counterparts, studied from the symplectic geometry perspective.

– Kontsevich’s theorem and quantization,

– towards the semiclassical approximation of quantum computers.

Examples of questions we would like to initiate research on are:

1) What would be a semiclassical model of quantum computation, and what would be
its computational power?

2) What is a good notion of complexity in a symplectic-geometry computational model?

3) What can we learn from basic symplectic geometry results (such as non squeezing)
about the limitations on quantum computation in the semi classical limit?

4) Can noise in quantum computation be related in any way with the semi classical limit
of quantum computing systems?

5) Can we learn anything about the possible noise models in quantum computers,
using our knowledge from symplectic geometry?

Hope to see you in the course!

The Mathematics of Quantum Computation -The 4th Advanced School in Computer Science and Engineering

On 15-19 December 2019, we will be organizing a math-oriented quantum computation school in the IIAS at the Hebrew university. No prior knowledge on quantum will be assumed.  The school will introduce TCS and math students and faculty, who are interested in the more mathematical side of the area, to the beautiful and fascinating mathematical open questions in quantum computation, starting from scratch. We hope to reach a point where participants gain initial tools and basic perspective to start working in this area.

To achieve this, we will have several mini-courses, each of two or three hours, about central topics in the area.   These will include quantum algorithms, quantum error correction, quantum supremacy, delegation and verification, interactive proofs, cryptography, and Hamiltonian complexity. We will emphasize concepts, open questions, and links to mathematics. We will have daily TA sessions with hands-on exercises, to allow for a serious process of learning.

There will be two rounds of registration. The first deadline is 23rd of August. If there is room, there will be another deadline sometime in October; please follow this page for further announcements.

Hope to see you this coming December!

by Gil Kalai at August 09, 2019 08:58 AM UTC

The SFI Complexity Postdoctoral Fellowships offer early-career scholars the opportunity to join a collaborative research community that nurtures creative, transdisciplinary thought in pursuit of key insights about the complex systems that matter most for science & society. SFI offers a competitive salary, discretionary research/travel funds, paid family leave, & professional development.


by shacharlovett at August 08, 2019 07:38 PM UTC

We are looking for faculty in the areas of Optimization, Algorithms, Machine learning, Data Sciences, Cryptography, Quantum computing/ information, Complexity theory, Formal Verification, Logic and Automata.


by shacharlovett at August 08, 2019 03:48 AM UTC

In Samuel Wagstaff's excellent book The Joy of Factoring (see here for a review) there is a discussion towards the end about why factoring algorithms have not made much progress recently. I paraphrase it:


The time complexities of the fastest known algorithms can be expressed as a formula of the following form (where N is the number to be factored):

(*) exp(c(ln N)^t (ln(ln N))^{1-t})

for some constants c and for 0 < t < 1. For the Quadratic Sieve (QS) and Elliptic Curve Method (ECM) t=1/2. For the Number Field Sieve (NFS) t=1/3. The reason for this shape for the time complexity is the requirement of finding one or more smooth numbers (numbers that have only small primes as factors).
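
To get a quantitative feel for how much the exponent t matters, here is a small illustrative script (mine, not from the book) comparing the shape (*) for t=1/2 and t=1/3, taking c=1 in both cases for simplicity (the true constants c differ between the algorithms):

```python
import math

def subexp(N, t, c=1.0):
    """Evaluate (*): exp(c * (ln N)^t * (ln ln N)^(1-t)).
    t=1/2 matches the QS/ECM shape, t=1/3 the NFS shape.
    Taking c=1 for both is an oversimplification, for illustration only."""
    lnN = math.log(N)
    return math.exp(c * lnN ** t * math.log(lnN) ** (1 - t))

# For a 1024-bit number, the t=1/3 shape is smaller by many orders of magnitude.
N = 2 ** 1024
ratio = subexp(N, 1 / 2) / subexp(N, 1 / 3)
assert ratio > 1
```

Lowering t from 1/2 to 1/3 shrinks the exponent from roughly (ln N)^{1/2} to (ln N)^{1/3}, which is why the NFS overtakes the QS for large enough N.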


This got me thinking: okay, there may not be a drastic improvement anytime soon, but what about just improving t? Is there a mathematical reason why an algorithm with (say) t=1/4 has not been discovered? In an earlier era I would have had to write a letter to Dr. Wagstaff to ask him. Buy an envelope, buy a stamp, find his address, the whole nine yards (my younger readers should ask their grandparents what envelopes and stamps were). In the current era I emailed him. And got a response.

Samuel Wagstaff:

The fastest known factoring algorithms find smooth numbers subject to parameter choice(s). In all these algorithms, one parameter choice is the smoothness bound B: a number is smooth if all its prime factors are < B. The NFS has the degree of a polynomial as an additional parameter.

One analyzes the complexity of these algorithms by estimating the total work required (to find enough smooth numbers) for an arbitrary parameter choice using Dickman's function to predict the density of smooth numbers. Then one uses calculus to find the parameter choice(s) that minimize the total work function. Calculus also yields the optimal values for the parameter(s).

If you have k parameters to choose, you will get the time complexity (*) with t = 1/(1+k). If you have no parameters (k = 0), you get (*) with t = 1, basically exponential time N^c. With one parameter to optimize, as in CFRAC (continued fractions algorithm) and QS, you get t = 1/2. NFS has two parameters, so t = 1/3. ECM also has t = 1/2 because it uses only one parameter, the smoothness bound B. If you want to get t = 1/4, you have to find a third parameter to optimize. No one has found one yet. That is the answer to your question.

Note that some choices made in some factoring algorithms don't count as parameters. For example, the number of polynomials used in the multiple-polynomial quadratic sieve, and the upper bound on large primes kept, don't affect t. They affect the running time only in making c smaller. So you have to find a third parameter that matters in order to get (*) with t = 1/4. Or find three completely different new parameters.

by GASARCH ( at August 07, 2019 07:27 PM UTC

I’m in Edmonton, Canada for WADS, which just finished, and CCCG, which is just about to begin.

The three invited talks at WADS were by Rasmus Pagh, Bob Tarjan, and me. Pagh spoke on methods for representing sets of elements by concise sketches so that the size of intersections or unions of the sets could be rapidly and accurately estimated. A famous method for this is MinHash, in which one represents a set by the elements with the smallest values of some hash function; the size of the overlap in representations is then an accurate estimator for the Jaccard similarity of pairs of sets. New to me were 1-bit variations of MinHash, in which you can get almost as accurate a representation in much less space by mapping each element of the MinHash set to a single bit by another hash function. This works well when the Jaccard similarity is bounded away from both 0 and 1, and Pagh spoke about some recent research he and others had done on finding even more accurate methods when it is near 0 or near 1.
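
For readers who have not seen it, classic MinHash is only a few lines of code; this is my own minimal sketch (with a simplified hash construction), not Pagh's new variants:

```python
import random

def minhash_signature(items, num_hashes=200, seed=42):
    """Classic MinHash: one seeded hash function per coordinate,
    keeping the minimum hash value over the set."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(64) for _ in range(num_hashes)]
    return [min(hash((salt, x)) for x in items) for salt in salts]

def estimated_jaccard(sig_a, sig_b):
    """The fraction of agreeing coordinates estimates the
    Jaccard similarity |A intersect B| / |A union B|."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

A = set(range(0, 80))     # |A ∩ B| = 40, |A ∪ B| = 120, true Jaccard = 1/3
B = set(range(40, 120))
est = estimated_jaccard(minhash_signature(A), minhash_signature(B))
```

The 1-bit variant mentioned above would further map each minimum to a single bit, trading a little accuracy for a large space saving.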

Tarjan spoke about parallel algorithms for connected components in graphs, an old area but one in which apparently there have been frequent published mistakes. He presented a modular analysis of the algorithms in this area according to some basic operations they perform (hooking together roots of trees on components, propagating that root information downwards through the trees, flattening the trees to make the information propagate more quickly, and the like) and showed that simple combinations of these operations lead to new, simple, efficient and more importantly provably-correct algorithms.

My talk, “Graphs in Nature”, was about finding graph-theoretic characterizations of surface-embedded graphs arising in natural processes, and using those characterizations to find algorithms to reconstruct synthetic geometric structures of the same type from their graphs. I also gave roughly the same talk a month earlier, at the Symposium on Geometry Processing in Milan. I’ve put my talk slides online in case anyone else is interested.

The best paper award went to Hüseyin Acan, Sankardeep Chakraborty, Seungbum Jo and Srinivasa Rao Satti for their paper “Succinct Data Structures for Families of Interval Graphs”. I can’t tell you much about the talk because, unfortunately, I missed it. I didn’t know it was the best paper until the business meeting that evening, so I went to the other parallel session instead.

I think the contributed talk from Tuesday that most stood out to me was Bryce Sandlund’s, on offline dynamic graph algorithms. This is a type of problem I worked on long ago for minimum spanning trees in which you get as input a whole sequence of edge insertions and deletions in a graph, and must produce as output the sequence of changes to the solution to whatever you’re trying to solve. Bryce’s new paper with Peng and Sleator solves similar problems for higher-order graph connectivity. The main idea is to hierarchically decompose the update sequence into intervals, and then represent the non-dynamic part of the graph within each interval by a smaller equivalent replacement graph whose size is proportional to the interval length. At the end of his talk, Bryce hinted that he could also solve incremental problems (where the updates are given one at a time rather than all in advance, but are only insertions) using similar methods in a forthcoming paper.

I was inspired by Caleb Levy’s talk on splay trees (in which he showed that inserting elements in either the preorder or postorder of another binary search tree takes linear time) to ask the following question: we know either by time-reversing the tree rotation operations or from the geometric model of dynamic search trees that any given access sequence should have the same optimal cost as its reverse. So from the dynamic optimality conjecture it should also be true that (up to constant factors) splay trees have the same performance on the reverse of any access sequence as they do on the unreversed sequence. Can this be proven?

From the business meeting, we learned that attendance and paper submissions were down by around 15% from the previous WADS. The acceptance rate is roughly the same, just under 50%. I suspect the smaller size is because the location is not as appealing, but it turns out to be a perfectly pleasant place to have a conference: the weather in Edmonton is pleasant this time of year (except for the thunderstorm), and there are abundant restaurants, good coffee shops, and lodging within walking distance of the conference center. WADS alternates with SWAT, which next year will be in the Faroe Islands. And then WADS 2021 (and CCCG 2021) will be in Halifax, Nova Scotia, which is both more touristy than Edmonton and easier to reach from the east coast and Europe. So I suspect the numbers will improve again.

WADS is moving towards a more democratically elected steering committee formed from some combination of past PC chairs and at-large elections. They have already started implementing the SafeTOC recommendations for combatting harassment and discrimination in theory conferences. And in a show of hands at the business meeting, the attendees were strongly in favor of moving towards double blind peer review for submissions. The conference is not really open access, though (its proceedings are published by Springer LNCS with special issues in Springer’s Algorithmica and Elsevier’s Computational Geometry) and there seems to be little pressure for that to change any time soon.

On to CCCG!

(Discuss on Mastodon)

by David Eppstein at August 07, 2019 05:27 PM UTC

So you think you have a proof that P=NP

Randi 2014 documentary source

James Randi is a magician who has challenged paranormal claims of all kinds.

Today Ken and I want to make a suggestion to those who claim they have proved P=NP.

No, the claim to have a proof that P=NP is not a paranormal claim. But such claims are related to Randi—or the Amazing Randi as he is called. We talked about him before here.

Randi once helped run a contest to see who could find gold with their dowsing rod. He explained why he doubted one contestant:

If they really could find gold, why were they dressed so poorly, and why were they so interested in winning the prize?

I have the same question about those who claim that they have a proof that P=NP. Usually the proof is constructive and I agree with Randi:

If they really could solve P=NP, why {\dots}

You get the idea.

Ken adds the obvious remark that if a foreign power or intelligence agency discovered P=NP, or factoring in P, they would still keep the lean-and-hungry look. But they are not the kind we are addressing here.

Coding Helps

Let’s look at claims that P=NP is resolved. Yes, such a result is unlikely—many would say impossible. But we do get claims like this:

The following {\cal H} is known to be an NP-complete problem; the following {\cal E} is known to be a polynomial-time problem. I can reduce {\cal H} to {\cal E} in polynomial time.

Usually the reduction is the reason their proof fails. Their claims about {\cal H} and {\cal E} are usually correct, since they are in the literature.

The reduction is often complicated, often poorly defined, often defined by example. Giving a precise definition for the reduction is critical. This is the reason we suggest the following:

Write the reduction down in code.

Even better, write it as a program in a real language such as Python.

There are two advantages in doing this.

  • Writing it as a program in a real language will likely force one to define it precisely.

  • Writing it down will also allow one to run the program on examples.
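
To make the suggestion concrete, here is the style of artifact being asked for, illustrated with a legitimate textbook reduction (Independent Set to Vertex Cover) rather than any P=NP claim, together with a brute-force sanity check on a tiny instance:

```python
from itertools import combinations

def independent_set_to_vertex_cover(graph, k):
    """Classic polynomial-time reduction: G has an independent set of size k
    iff G has a vertex cover of size |V| - k.  'graph' is (vertices, edges)."""
    vertices, edges = graph
    return (vertices, edges), len(vertices) - k

# Brute-force solvers, used only to sanity-check the reduction on small inputs.
def has_independent_set(graph, k):
    vertices, edges = graph
    return any(all((u, v) not in edges and (v, u) not in edges
                   for u, v in combinations(S, 2))
               for S in combinations(vertices, k))

def has_vertex_cover(graph, k):
    vertices, edges = graph
    return any(all(u in S or v in S for (u, v) in edges)
               for S in combinations(vertices, k))

G = ({1, 2, 3, 4}, {(1, 2), (2, 3), (3, 4)})  # a path on 4 vertices
for k in range(1, 5):
    G2, k2 = independent_set_to_vertex_cover(G, k)
    assert has_independent_set(G, k) == has_vertex_cover(G2, k2)
```

Even a check this small would force a claimant to pin down exactly what their reduction does.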

The latter point is the key one. Even trying your method on tiny examples is useful. Even better: if you can say the following, we might read the proof:

I have tried my code on the following public set of difficult SAT problems. The code solved all in less than three minutes each.

This claim would greatly improve the likelihood that people might take your claims seriously. That your code worked correctly, forgetting the running time, would improve confidence. Greatly.

The Animal Farm View{\dots}

Ken worries that some NP-complete problems are more equal than others. That is, some problems, even though they are NP-complete, may require reductions that blow up when encoding SAT.

We wrote about this before regarding the “Power Index” idea of Richard Stearns and Harry Hunt III. In their paper they gave evidence that the reductions from SAT to many familiar NP-complete problems must expand the size of instances quadratically, insofar as those problems have power index {0.5}. This was based on their “SAT Hypothesis” which anticipated current forms of the Exponential Time Hypothesis, which we have discussed.

Ken ponders a related issue. Even problems with power index {1} run into the success of practical solvers. This means:

Anyone citing algorithmic success as evidence toward a claim of P=NP must compete with the real-world success of algorithms that do not represent claims of P=NP.

We have several times discussed the practical success of SAT-solvers on myriad real-world instances.

This situation has become real in the argument over achieving quantum supremacy. One who claims that quantum is superior to classical must worry that classical algorithms can improve without making P=NP. A headline example from last year was when Ewin Tang—as a high-school senior—found a classical way to remove a plausible quantum advantage in a matrix-completion problem that underlies recommender systems. There are many “industrial strength” examples in this argument—see this May 2019 story for a start.


Ken’s insightful comments aside, the key point is still:

Coding up your claimed algorithm for that NP-complete problem will still enhance belief.

This will happen even if the algorithm only succeeds on tiny examples. Indeed, if you cannot do this then I suggest that you will have an impossible time getting anyone to listen.

Open Problems

How useful is this advice for the vast majority of us who are not claiming P=NP or the opposite?

by RJLipton+KWRegan at August 06, 2019 08:45 PM UTC

September 2-4, 2019 Trinity College Dublin, Dublin, Ireland The Summer School will include a technical track with lectures and workshops on various aspects of Machine Learning, as well as associated training in areas such as research ethics, career development, communicating research, innovation, and public engagement. It will also include social and networking events. Speakers … Continue reading Machine Learning for Communication Systems and Networks

by shacharlovett at August 06, 2019 06:49 PM UTC

We give new algorithms in the annotated data streaming setting---also known as verifiable data stream computation---for certain graph problems. This setting is meant to model outsourced computation, where a space-bounded verifier limited to sequential data access seeks to overcome its computational limitations by engaging a powerful prover, without needing to trust the prover. As is well established, several problems that admit no sublinear-space algorithms under traditional streaming do allow protocols using a sublinear amount of prover/verifier communication and sublinear-space verification. We give algorithms for many well-studied graph problems including triangle counting, its generalization to subgraph counting, maximum matching, problems about the existence (or not) of short paths, finding the shortest path between two vertices, and testing for an independent set. While some of these problems have been studied before, our results achieve new tradeoffs between space and communication costs that were hitherto unknown. In particular, two of our results disprove explicit conjectures of Thaler (ICALP, 2016) by giving triangle counting and maximum matching algorithms for $n$-vertex graphs, using $o(n)$ space and $o(n^2)$ communication.

at August 05, 2019 04:44 PM UTC

The Department of Computer Science at the University of Warwick and the Warwick Mathematics Institute invite applications for a full Professor post in the areas on the interface of Computer Science and Mathematics, including Discrete Mathematics, Algorithms and Complexity, Theoretical Computer Science.


by shacharlovett at August 05, 2019 10:31 AM UTC

The Department of Mathematics and Computer Science at the Open University of Israel invites applications in all areas of computer science for several senior faculty positions at all ranks (assistant professor, associate professor, professor)


by shacharlovett at August 05, 2019 05:39 AM UTC

Perhaps conferences made sense fifty years ago. We did not have the internet, and the pollution was not as bad. Today, we can have effective virtual meetings, while the pollution has reached a level of crisis; see this moving talk by Greta Thunberg. Moving to a system of virtual conferences is, I believe, a duty of every scientist. Doing so will cut the significant air travel emissions that come from shipping scientists across the world. To attend a climate summit in the USofA, Greta will sail across the Atlantic ocean on a zero-emission boat.

We can keep everything the way it is, but simply give the talks online. This change doesn’t involve anybody higher up in the political ladder. It only involves us, the program chairs, the steering committees.

While we wait for that, we can begin by virtualizing the physical STOC/FOCS PC meetings, whose added value over a virtual meeting, if any, does not justify the cost; and by holding conferences where the center of mass is, instead of exotic places where one can combine the trip with a vacation at the expense of tax payers’ money and everybody’s health. And that is also why I put a bid to hold the 2021 Conference on Computational Complexity in Boston.

NSF panels, making decisions worth millions, routinely have virtual panelists (I have been one the last few times). So why do we insist on shipping scientists across the globe multiple times a year to give 15-minute talks which to most people are less useful than spending 20 minutes reading the paper on the arXiv?

by Manu at August 04, 2019 10:58 PM UTC


The cover of Avi Wigderson’s book “Mathematics and computation” as was first exposed to the public in Avi’s Knuth Prize videotaped lecture. (I had trouble with 3 of the words: What is EGDE L WONK 0?  what is GCAAG?GTAACTC ?  TACGTTC ?  I only figured out the first.)

Avi Wigderson’s book  “Mathematics and computation, a theory revolutionizing technology and science” is about to be published by Princeton University Press. The link is to a free copy of the book which will always be available on Avi’s homepage. (See also this re-blogged post here of Boaz Barak.)

One main theme of the book is the rich connection between the theory of computing and other areas (in fact, most areas) of mathematics. See also this self contained survey (based on Chapter 13 of the book) by Avi Interactions of Computational Complexity Theory and Mathematics, which in 27 pages overviews relations to number theory, Geometry, Operator Theory, Metric Geometry, Group Theory, Statistical Physics, Analysis and Probability, Lattice Theory and Invariant Theory. Of course, Avi himself is among the main heroes in finding many paths between mathematics and the theory of computing over the last four decades.

Another theme of the book and of several talks by Avi is that the theory of computing has revolutionary connections with many fields of science and technology. Again, this theme is present in the entire book and is emphasized in Chapter 20, which also appeared as a self contained paper  “On the nature of the Theory of Computation (ToC).”  Let me quote one sentence from Avi’s book that I propose for discussion. (For the complete quote see the end of the post.)


The intrinsic study of computation transcends human-made artifacts, and its expanding connections and interactions with all sciences, integrating computational modeling, algorithms,  and complexity into theories of nature and society, marks a new scientific revolution!

Of course, similar ideas were also expressed by several other prominent scientists, and let me mention Bernard Chazelle’s essay: The Algorithm: Idiom of Modern Science. (Feel free to give further examples and links in the comment section.)

Let’s discuss:  Integrating computational modeling, algorithms,  and complexity into theories of nature, marks a new scientific revolution!

I propose to discuss in the comment section the idea that the theory of computing offers a scientific revolution. Very nice cases to examine are the computational study of randomness and connections to statistics, connections with economy and connections with biology. Comments on the relations between the theory of computation and other areas of mathematics are also very welcome.

Avi’s concluding slide compared these three great theories of human understanding.

 (Previous attempts of open discussions were made in the following posts on this blog: 10 Milestones in the History of Mathematics according to Nati and Me; Why is mathematics possible? (and a follow-up post); When it rains it pours; Is it Legitimate/Ethical for Google to close Google+?; An Open Discussion and Polls: Around Roth’s Theorem; Is Mathematics a Science?)

Avi promotes the idea of the central place of the theory of computing in his talks and writings

And at the same time he is also humorously skeptical about it. (And mainly emphasizes that his far reaching claim requires careful discussion and ample evidence.)   


The full quote of Avi:

The Theory of Computation is as revolutionary, fundamental and beautiful as major theories of mathematics, physics, biology, economics… that are regularly hailed as such. Its impact has been similarly staggering. The mysteries still baffling ToC are as challenging as those left open in other fields. And quite uniquely, the theory of computation is central to most other sciences. In creating the theoretical foundations of computing systems ToC has already played, and continues to play a major part in one of the greatest scientific and technological revolutions in human history. But the intrinsic study of computation transcends man-made artifacts. ToC has already established itself as an important mathematical discipline, with growing connections to nearly all mathematical areas. And its expanding connections and interactions with all sciences, naturally integrating computational modeling, algorithms and complexity into theories of nature and society, marks the beginning of another scientific revolution!

More related material:

  1. Avi’s talk Scientific revolutions, ToC and PCP  at the Tel Aviv PCP meeting and an interview of Avi by Alon Rosen.
  2.  A talk by Avi on the Stepanov method
  3. The recent works on polytopes arising from moment maps and related optimization problems and algorithmic aspects.  Avi’s Knuth prize videotaped lecture; Avi’s lecture  Complexity, Optimization and Math (or, Can we prove that P != NP by gradient descent?) in the recent conference honoring Bill Cook. (I plan to come back to this fascinating topic.)
  4. An essay by Oded Goldreich and Avi Wigderson (essentially from 1996) “The theory of computing –  a scientific perspective.”

The volume of comments in the first decade of this blog was modest. I recently read, however, wordpress’s advice on how to reply to blog comments like a pro. 

And finally, EGDE L WONK 0 is 0-knowledge.

by Gil Kalai at August 04, 2019 08:14 PM UTC

We saw quite an activity last month, with a total of 9 papers on property testing or related areas.

Let’s start with a paper on PCPs:
Revisiting Alphabet Reduction in Dinur’s PCP, by Venkatesan Guruswami, Jakub Opršal, and Sai Sandeep (ECCC). As mentioned in the title, this work provides an alternate proof of one of the two parts of the Dinur’s PCP theorem proof. In that proof, the argument goes by alternating two main steps: (i) amplifying the soundness gap, at the price of increasing the alphabet size (and the instance size); (ii) reducing the alphabet size, at the price of increasing the instance size. From the observation that step (i) creates some structure in the instances that can be leveraged, the authors provide an alternative construction for (ii), leading to an arguably simpler analysis of the soundness of the underlying test. A nice throwback to some of the roots of property testing.

Then, four papers on distribution testing:
Testing Mixtures of Discrete Distributions, by Maryam Aliakbarpour, Ravi Kumar, Ronitt Rubinfeld (arXiv). One of the most studied problems in distribution testing is identity testing (a.k.a., goodness-of-fit): given a reference distribution \(q\) over a domain of size \(n\) and i.i.d. samples from an unknown \(p\), decide between \(p=q\) and \(d_\textrm{TV}(p,q) > \varepsilon\). While this is now well-understood and can be done with a strongly sublinear number of samples (namely, \(O(\sqrt{n})\)), the case of tolerant testing is much sadder: Valiant and Valiant proved that \(\Theta(n/\log n)\) samples were required for distinguishing \(d_\textrm{TV}(p,q) < \varepsilon/2\) from \(d_\textrm{TV}(p,q) > \varepsilon\). So, noise-tolerance is almost as hard as it gets… except, as this work suggests and studies, if one has some guarantees on the noise! What if the noise in the completeness case was just that \(q\) is perturbed by some (known, or available via samples as well) noise coming from a second distribution \(\eta\)? I.e., the question is now to test whether \(p\) is of the form \(\alpha q + (1-\alpha) \eta\) for some \(\alpha\in [0,1]\), or if it is \(\varepsilon\)-far from all mixtures of \(q\) and \(\eta\). The authors have various results on this question and some extensions, but the upshot is: “with structured noise, strongly sublinear sample complexity is once again possible.”

Can Distributed Uniformity Testing Be Local?, by Uri Meir, Dor Mintzer, and Rotem Oshman (ACM PODC Proceedings). More on identity testing, specifically uniformity testing. No noise here, but another constraint: the samples are distributed. There are \(k\) players, each holding \(q\) i.i.d. samples from a distribution \(p\) over \([n]\); each can send \(\ell=1\) bit to a central referee, in a non-interactive fashion. Is \(p\) the uniform distribution, or is it \(\varepsilon\)-far from it? The authors here consider the case where \(k,n,\varepsilon\) are fixed, and prove lower bounds on the number \(q=q(k,n,\varepsilon)\) of samples each player must hold in order to perform this uniformity testing task. (Note: their lower bounds hold even in the public-randomness setting.) Motivated by the LOCAL model, they focus on 3 cases of interest, for which they obtain tight or near-tight bounds on \(q\) (in view of upper bounds from previous work of a subset of the authors): (1) when the referee’s decision has to be the \(\textsf{AND}\) of all \(k\) bits she receives; (2) when the referee can do something more involved, and use a threshold function; and (3) when the referee can use an arbitrary function of those \(k\) messages. Underlying those lower bounds is a neat application of Boolean Fourier analysis, recasting the analysis of the standard “Paninski” lower bound instance and the players’ decision functions as a question on Boolean functions.
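
None of this reproduces the paper’s distributed protocols, but for readers new to the area, the centralized statistic at the heart of many uniformity testers is the pairwise collision count; a minimal sketch (with an illustrative, not optimal, threshold):

```python
from itertools import combinations

def collision_count(samples):
    """Number of colliding (equal) pairs among the samples."""
    return sum(x == y for x, y in combinations(samples, 2))

def looks_uniform(samples, n, slack=2.0):
    """Under the uniform distribution over [n], the expected number of
    colliding pairs among m samples is C(m, 2) / n; distributions far
    from uniform collide noticeably more often.  'slack' is illustrative."""
    m = len(samples)
    return collision_count(samples) <= slack * m * (m - 1) / (2 * n)

assert looks_uniform(list(range(50)), n=100)   # all distinct: zero collisions
assert not looks_uniform([0] * 50, n=100)      # one heavy atom: 1225 collisions
```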

Domain Compression and its Application to Randomness-Optimal Distributed Goodness-of-Fit, by Jayadev Acharya, Clément Canonne, Yanjun Han, Ziteng Sun, and Himanshu Tyagi (arXiv, ECCC). Remember the paper just above? Now, focus on the case (3), where the referee can use any function of the messages, allow each player to send \(\ell\) bits instead of one, but give only one sample (instead of \(q\)) to each player. This becomes a setting we have talked about before on this blog (specifically, with three papers on these three months). Those three papers established the optimal number of players \(k\), in terms of the domain size \(n\), the distance parameter \(\varepsilon\), and the number of bits per player \(\ell\), in both the private-coin and public-coin settings. This new work interpolates between those two extremes, and gives the tight answer to the general question: if there are \( 0\leq s \leq \log n\) bits of public randomness available, what is the optimal number of players to perform identity testing? Behind the upper bounds is a new notion of (randomness-efficient) domain compression they introduce: how to reduce the domain size of the probability distributions while preserving the \(\ell_1\) distances between them as much as possible?

Towards Testing Monotonicity of Distributions Over General Posets, by Maryam Aliakbarpour, Themis Gouleakis, John Peebles, Ronitt Rubinfeld, and Anak Yodpinyanee (arXiv). Away from identity testing, let’s now consider another central question, testing monotonicity of distributions. Specifically, a distribution \(p\) over a given poset \((\mathcal{X}, \prec)\) is said to be monotone if \(p(x) \leq p(y)\) whenever \(x\prec y\). While the question is fully understood for the familiar case of the line \(\mathcal{X} = \{1,2,\dots,n\}\), the case of general posets is much murkier, and more or less uncharted. This paper initiates a general study of the question, improving on previously known bounds for the matching poset and Boolean hypercube, and providing several new results and techniques for the general case, to establish both upper and lower bounds on the sample complexity.

And now… graphs!
Expansion Testing using Quantum Fast-Forwarding and Seed Sets, by Simon Apers (arXiv). This paper is concerned with (bicriteria) testing of vertex expansion of a given graph \(G=(V,E)\) in the bounded-degree graph model. Loosely speaking, given a value \(\Phi\) and query access (i.e., sampling a node u.a.r., querying the degree of a node, and querying the \(i\)-th neighbor of a node) to a graph \(G\) with maximum degree \(d=\Theta(1)\), the goal is to distinguish between (i) \(G\) has vertex expansion at least \(\Phi\) and (ii) \(G\) is \(\varepsilon\)-far from any \(G’\) with vertex expansion at least \(\Phi^2\) (simplifying and dropping some technicalities). The previous state of the art was a \(\tilde{O}_{\varepsilon}(n^{1/2}/\Phi^2)\) query complexity for classical algorithms, and a \(\tilde{O}_{\varepsilon}(\min(n^{1/3}/\Phi^2, n^{1/2}/\Phi^2))\) query complexity upper bound for quantum testers. The current work improves on this by providing a \(\tilde{O}_{\varepsilon}(n^{1/3}/\Phi)\) quantum tester. Graphs expand—and so does the gap between classical and quantum testing.
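
For intuition on the quantity being tested, here is a brute-force (exponential-time) reference computation of vertex expansion for tiny graphs; the whole point of the sublinear testers above is, of course, to avoid anything like this.

```python
from itertools import combinations

def vertex_expansion(adj):
    """min over nonempty S with |S| <= n/2 of |N(S) \\ S| / |S| (outer boundary)."""
    n = len(adj)
    best = float("inf")
    for k in range(1, n // 2 + 1):
        for S in combinations(range(n), k):
            s = set(S)
            boundary = {v for u in s for v in adj[u]} - s
            best = min(best, len(boundary) / k)
    return best

cycle6 = [[(i - 1) % 6, (i + 1) % 6] for i in range(6)]
print(vertex_expansion(cycle6))  # 2/3: minimized by 3 consecutive vertices
```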

Walking Randomly, Massively, and Efficiently, by Jakub Łącki, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski (arXiv). One very useful primitive, in many graph property testing results and of course well beyond property testing, is the ability to perform random walks on graphs. In the case of large, distributed (or merely impractically massive) graphs, however, it is not clear how to implement this primitive efficiently. This work addresses this question in the MPC (Massively Parallel Computation) model, and shows how to perform, independently from each of the \(n\) nodes of a graph, an \(\ell\)-length random walk with only \(O(\log \ell)\) rounds of communication and \(O(n^\alpha)\) space per machine, for any constant \(\alpha \in(0,1)\). This round complexity matches a known (conditional) lower bound, and thus the result is as good as it gets — further, the authors obtain such MPC algorithms for both undirected and directed graphs, leveraging the undirected case to handle the (significantly more complicated) directed case. As an application of this efficient random-walk primitive, they provide MPC property testing algorithms for both bipartiteness and vertex expansion (same definition as in the paper above: vertex expansion \(\Phi\) vs. \(\varepsilon\)-far from any \(G’\) with vertex expansion \(\Phi^2\)) in the general graph model. (Beyond property testing, another application—the main one in the paper—is computing PageRank in both undirected and directed graphs.)
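
A toy sequential sketch of the doubling idea behind the \(O(\log \ell)\) round count (an illustration only, not the paper's algorithm): start from many independent one-step tables and compose them pairwise, doubling the walk length each "round".

```python
import random

def walks_by_doubling(adj, length, rng):
    """Endpoint of a length-`length` random walk from every node, stitched in
    log2(length) rounds (length must be a power of two). Round 0 samples
    `length` independent one-step tables; each round composes tables in pairs,
    doubling the walk length while halving the table count. Caveat: walks from
    different start nodes are correlated here, one of the issues the actual
    MPC algorithm must handle."""
    assert length >= 1 and length & (length - 1) == 0
    tables = [{v: rng.choice(adj[v]) for v in adj} for _ in range(length)]
    while len(tables) > 1:  # one stitching "round" per iteration
        tables = [{v: b[a[v]] for v in adj}
                  for a, b in zip(tables[::2], tables[1::2])]
    return tables[0]

# 4-cycle: an even-length walk must end on a node of the same parity as its start.
cycle4 = {i: [(i - 1) % 4, (i + 1) % 4] for i in range(4)}
end = walks_by_doubling(cycle4, 4, random.Random(1))
print(end)
```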

A Lower Bound on Cycle-Finding in Sparse Digraphs, by Xi Chen, Tim Randolph, Rocco Servedio, and Timothy Sun (arXiv). In this paper, the authors tackle the problem of one-sided testing of acyclicity of directed graphs, in the bounded-degree graph model (equivalently, the task of finding a cycle in a digraph promised to be far from acyclic). Previously, an \(\Omega(\sqrt{n})\) lower bound for this task had been shown by Bender and Ron, based on a birthday-paradox-type argument; this work improves on this hardness result, showing that \(\tilde{\Omega}(n^{5/9})\) queries are necessary. Interestingly, whether achieving any \(o(n)\) query complexity is possible remains open.

Constant-Time Dynamic \((\Delta+1)\)-Coloring and Weight Approximation for Minimum Spanning Forest: Dynamic Algorithms Meet Property Testing, by Monika Henzinger and Pan Peng (arXiv). Finally, the last paper covered this month is concerned with dynamic graph algorithms: updates to a graph (both edge additions and deletions) arrive sequentially, and the objective is to maintain at all times, e.g., a coloring, or an approximation of some function of the graph, using as little time per update as possible. This paper advocates the use of techniques from property testing (loosely speaking, as their locality and query-efficiency translate to fast update times) to design better dynamic graph algorithms. It illustrates this idea by obtaining several new results, in particular an (amortized) \(O(1)\)-update-time randomized algorithm to maintain a \((\Delta+1)\)-coloring in a graph with degree bound \(\Delta\). The algorithm relies on a random ranking of the nodes which, combined with a suitable randomized update rule for recoloring a vertex when needed, ensures that the recolorings will not (with high probability) “propagate” too much. This is the first time that this idea, previously used in several testing and sublinear-time papers, is used in the context of dynamic graph algorithms—adding an edge between the two areas, hopefully only the first of many.
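
A minimal sketch of the flavor of such an update rule (an illustration only, not the paper's actual algorithm or analysis): give every node a random rank, and on a conflicting edge insertion recolor the lower-ranked endpoint with a color free among its neighbors.

```python
import random

class DynamicColoring:
    """(Delta+1)-coloring under edge insertions -- illustration only."""

    def __init__(self, n, delta, rng):
        self.delta = delta
        self.rng = rng
        self.adj = {v: set() for v in range(n)}
        self.rank = {v: rng.random() for v in range(n)}  # random ranking
        self.color = {v: 0 for v in range(n)}            # proper: no edges yet

    def insert_edge(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)
        if self.color[u] == self.color[v]:               # conflict
            w = u if self.rank[u] < self.rank[v] else v  # recolor lower rank
            used = {self.color[x] for x in self.adj[w]}
            free = [c for c in range(self.delta + 1) if c not in used]
            self.color[w] = self.rng.choice(free)        # nonempty: deg(w) <= delta

g = DynamicColoring(6, delta=3, rng=random.Random(0))
for e in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (4, 5)]:
    g.insert_edge(*e)
print(g.color)
```

The invariant is easy to see: a recolored vertex avoids all its neighbors' colors, so the coloring stays proper after every insertion; the hard part, which this sketch does not capture, is bounding the amortized work.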

Update: we did miss a paper! Make it ten in July…
Nearly optimal edge estimation with independent set queries, by Xi Chen, Amit Levi, and Erik Waingarten (arXiv). Consider the following type of access to an unknown graph \(G=(V,E)\) on \(n=|V|\) nodes: on query \(S\subseteq V\), the oracle returns 1 if \(S\) is an independent set, and 0 otherwise. (This sounds quite a bit like what is allowed in the group testing literature.) How many such queries are necessary to get a multiplicative estimate of the number of edges, \(m=|E|\)? In this paper, the authors improve on the previous bound of \(\min(\sqrt{m}, n^2/m)\) (ignoring logarithmic factors), obtaining a \(\min(\sqrt{m}, n/m)\) upper bound — complemented by a (nearly, up to those pesky log factors) matching lower bound.
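
The query model is easy to emulate: a single IS query on a pair \(\{u,v\}\) reveals whether it is an edge, so roughly \(n^2/2\) queries trivially count all edges exactly. (The paper's point is that *estimation* needs far fewer.) A small sketch:

```python
def make_is_oracle(edges):
    """IS oracle for a hidden graph: returns 1 iff S contains no edge."""
    edge_set = {frozenset(e) for e in edges}
    def oracle(S):
        return int(all(frozenset((u, v)) not in edge_set
                       for u in S for v in S if u < v))
    return oracle

def count_edges_naively(oracle, n):
    """Exact count via ~n^2/2 pair queries -- the baseline to beat."""
    return sum(1 - oracle({u, v}) for u in range(n) for v in range(u + 1, n))

oracle = make_is_oracle([(0, 1), (1, 2), (3, 4)])
print(count_edges_naively(oracle, 5))  # 3
```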

If we missed a paper this month, or misrepresented some result above, please mention it in the comments.

by Clement Canonne at August 04, 2019 12:56 AM UTC

An old unpublished result, some new published results

[ Playbill ]

Alexander Hamilton was a framer of the U.S. Constitution. He wrote the bulk of the Federalist Papers (FP) defending the Constitution. Today he is best known for the playbill—the musical on his life—and the bill, the US ten dollar bill.

Today I thought we would discuss the U.S. electoral college (EC).

We are in the midst of the run-up to next year’s Presidential election. An ongoing discussion is the issue of the EC. Should it be modified? Should it be replaced? Is it a good idea?

So let’s recall how the EC works. Then we will look at it from a theory viewpoint.

The College

The electoral college is how every four years we elect the President of the United States. It is not a direct popular vote. The Constitution created it as a compromise between a direct popular vote and a vote by the members of Congress. Back then, the framers of the Constitution, including Hamilton, did not trust the electorate. Hence, the rationale for the EC.

Today the EC consists of 538 electors. Voters in each state pick electors, who then vote in the EC for the President. Thus by high math, 270 electors are required to win. A state gets one electoral vote for each of its members in the House of Representatives, plus two. The latter rule ensures that no state gets too few votes. It is sometimes called the “two-plus rule”.
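
In code, the apportionment rule is a one-liner. The states below are a hypothetical three-state mini-country, but the same formula (plus DC's 3 votes, which come from the 23rd Amendment) gives the real 435 + 2·50 + 3 = 538.

```python
def electoral_votes(house_seats):
    """Each state's electors = House seats + 2 senators (the 'two-plus rule')."""
    return {state: seats + 2 for state, seats in house_seats.items()}

# Hypothetical mini-country:
ev = electoral_votes({"A": 10, "B": 4, "C": 1})
total = sum(ev.values())
needed = total // 2 + 1   # a strict majority of electors wins
print(ev, total, needed)  # {'A': 12, 'B': 6, 'C': 3} 21 11
```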

The arguments for the EC are distilled in FP No. 68. Although the collaboration/authorship status of numerous FP remains unclear, Hamilton’s claim in his last testament to sole authorship of FP 68 is not seriously disputed. Quoting Wikipedia:

Entitled “The Mode of Electing the President”, No. 68 describes a perspective on the process of selecting the Chief Executive of the United States. In writing this essay, the author sought to convince the people of New York of the merits of the proposed Constitution. Number 68 is the second in a series of 11 essays discussing the powers and limitations of the Executive branch and the only one to describe the method of selecting the president.

Opponents today argue against the EC. They point out that it allows one to win without getting the most votes. This has happened in two of the last five elections, in 2000 and 2016. The EC rewards uneven allocations of campaigning to the few “swing-states”. It also gives voters in less populated states more voting power. A vote from Wyoming has over three times the influence on the EC tally as a vote from California. The battle over FP 68 has even been internationalized.

My College

Years ago. Decades ago. Eons ago. When I was in college, I almost flunked a required one-credit course in my senior year. The course was on issues in that year’s Presidential election. No, it did not involve Hamilton.

The grade of the course was based on a term paper. Mine, which got a {\cal D}, was based on an argument for the EC. Thankfully, the grade was just enough to get me a pass in the course, and allow me to graduate. I did not take the course seriously—my attendance was spotty, at best.

My idea was that there was an argument for the EC based on a connection with the ability to manage elections. My central thesis was:

The ability to accurately predict the outcome of a Presidential election is inherently undesirable.

Let’s agree to call this the Prediction Assumption (PA). Predicting the outcome of elections may not be a good idea: if predictions were accurate, candidates could use them to manipulate the election, managing their positions to optimize their chances of winning. I think you could make the case that this would be a problem.

In any event I then proved a result that showed that given PA, one could argue that the EC was better than a popular election. Note, the usual math arguments against the EC are based on the power of individual voters. See here and here for some of their insights.

My College Paper

The central point of my paper was informally this:

Theorem: Predicting an election run under the EC is more difficult than predicting one run under the popular vote.

A simple example should help. Imagine an election with three states: Northeast, West, and South. Let them each have one electoral vote. Clearly {2} are needed to win. Suppose the states are arranged like this:

  • Northeast: Almost all for A;

  • West: Almost all for B;

  • South: Close between A and B.

Then prediction requires the polling to be able to tell the outcome of the South vote. The point is:

The smaller the number of voters in the ensemble being predicted, the more uncertain the prediction.
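
One can check this numerically with a toy Monte Carlo (all numbers made up): Northeast heavily for A, West heavily for B, South nearly tied, one electoral vote each. In this toy setup a pollster sampling a few hundred voters per state calls the popular winner far more reliably than the EC winner, because the latter hinges on the small, nearly-tied South.

```python
import random

def prediction_accuracy(trials=400, voters=2001, poll=200, seed=0):
    """Fraction of simulated elections in which a poll correctly predicts
    (a) the popular winner and (b) the EC winner (2 of 3 one-vote states)."""
    rng = random.Random(seed)
    lean = {"NE": 0.95, "W": 0.15, "S": 0.502}  # P(a voter supports A)
    pop_ok = ec_ok = 0
    for _ in range(trials):
        # realize the election, then poll it with fresh Bernoulli draws
        real = {s: sum(rng.random() < p for _ in range(voters))
                for s, p in lean.items()}
        poll_a = {s: sum(rng.random() < p for _ in range(poll))
                  for s, p in lean.items()}
        true_pop = 2 * sum(real.values()) > 3 * voters
        pred_pop = 2 * sum(poll_a.values()) > 3 * poll
        true_ec = sum(2 * v > voters for v in real.values()) >= 2
        pred_ec = sum(2 * v > poll for v in poll_a.values()) >= 2
        pop_ok += pred_pop == true_pop
        ec_ok += pred_ec == true_ec
    return pop_ok / trials, ec_ok / trials

pop_acc, ec_acc = prediction_accuracy()
print(pop_acc, ec_acc)  # popular-vote prediction is markedly more accurate
```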

Ken argues that simply having a multiplicity of component elections—one in each state plus DC—also increases the uncertainty. This may happen technically just because the result is a kind of average over unequal-sized averages.

Their Papers

Modern results in Boolean function theory have actually studied the noise sensitivity of the EC. They have studied how errors in voting can flip an election. Look at Gil Kalai’s 2010 paper, “Noise Sensitivity And Chaos In Social Choice Theory.” He shows that simple majority is more stable in the presence of noise than the EC. Look at Ryan O’Donnell’s paper, “Some Topics in Analysis of Boolean Functions.” He makes the related point that errors—in a simple model—can increase the chance of flipping an EC election by a factor of about {5.7}.
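
A crude simulation of that comparison (parameters made up; this is not either paper's exact model): take uniformly random votes, flip each independently with probability \(\varepsilon\), and compare how often the outcome flips under simple majority versus an EC-like majority of per-state majorities.

```python
import random

def majority(bits):
    return 2 * sum(bits) > len(bits)

def ec_winner(bits, states, per_state):
    """Majority of per-state majorities (odd sizes, so no ties)."""
    return majority([majority(bits[i * per_state:(i + 1) * per_state])
                     for i in range(states)])

def flip_rates(trials=4000, states=21, per_state=21, eps=0.1, seed=0):
    """Estimated P(noise flips the outcome): simple majority vs EC-like rule."""
    rng = random.Random(seed)
    n = states * per_state
    simple = two_level = 0
    for _ in range(trials):
        votes = [rng.random() < 0.5 for _ in range(n)]
        noisy = [v != (rng.random() < eps) for v in votes]  # eps-noise (XOR)
        simple += majority(votes) != majority(noisy)
        two_level += (ec_winner(votes, states, per_state)
                      != ec_winner(noisy, states, per_state))
    return simple / trials, two_level / trials

s, t = flip_rates()
print(s, t)  # the two-level (EC-like) rule flips more often
```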

Neither paper strikes me as studying whether predictions are easier with the simple majority rule than with the EC.
I believe that their new results can be used to prove the same type of theorems on prediction.

Open Problems

Did I deserve a better grade than a {\cal D}? Or should I have flunked? Should I have published something?

For comparison, the college term paper which eventually became the 27th Amendment to the Constitution received a better grade: {\cal C}. Oh well.

by rjlipton at August 02, 2019 02:32 PM UTC