Theory of Computing Blog Aggregator

My vision is blurry right now, because yesterday I had a procedure called corneal cross-linking, intended to prevent further deterioration of my eyes as I get older. But I can see clearly enough to tap out a post with random thoughts about the world.

I’m happy that the Netanyahu era might finally be ending in Israel, after which Netanyahu will hopefully face some long-delayed justice for his eye-popping corruption. If only there were a realistic prospect of Trump facing similar justice. I wish Benny Gantz success in putting together a coalition.

I’m happy that my two least favorite candidates, Bill de Blasio and Kirsten Gillibrand, have now both dropped out of the Democratic primary. Biden, Warren, Yang: I could enthusiastically support pretty much any of them, if they looked like they had a good chance to defeat the madman. Let’s hope.

Most importantly, I wish to register my full-throated support for the climate strikes taking place today all over the world, including here in Austin. My daughter Lily, age 6, is old enough to understand the basics of what’s happening and to worry about her future. I urge the climate strikers to keep their eyes on things that will actually make a difference (building new nuclear plants, carbon taxes, geoengineering) and ignore what won’t (banning plastic straws).

As for Greta Thunberg: she is, or is trying to be, the real-life version of the Comet King from Unsong. You can make fun of her, ask what standing or expertise she has as some random 16-year-old to lead a worldwide movement. But I suspect that this is always what it looks like when someone takes something that’s known to (almost) all, and then makes it common knowledge. If civilization makes it to the 22nd century at all, then in whatever form it still exists, I can easily imagine that it will have more statues of Greta than of MLK or Gandhi.

On a completely unrelated and much less important note, John Horgan has a post about “pluralism in math” that includes some comments by me.

Oh, and on the quantum supremacy front—I foresee some big news very soon. You know which blog to watch for more.

by Scott at September 20, 2019 07:30 PM UTC

Authors: Eryk Lipka
Download: PDF
Abstract: We consider some extensions of the polygonal art gallery problem. In a recent paper, Morrison exhibited the smallest (9-sided) example of an art gallery that cannot be observed by guards placed in every third corner. The author also mentioned two related problems for which the minimal examples are not known. We show that a polygonal fortress whose exterior cannot be guarded by sentries placed in every second vertex has at least 12 sides. We also give an example of a three-dimensional polyhedron whose inside cannot be covered by placing a guard in every vertex, and which has both fewer vertices and fewer faces than previously known examples.

at September 20, 2019 01:23 AM UTC

Authors: Daniele Di Tullio, Ankan Pal
Download: PDF
Abstract: In this paper, we study the geometric meaning of the discrete logarithm problem defined over an elliptic curve. The key idea is to reduce the Elliptic Curve Discrete Logarithm Problem (EC-DLP) to a system of equations. These equations arise from the intersection of quadric hypersurfaces in an affine space of lower dimension. In cryptography, this interpretation can be used to design attacks on EC-DLP. Presently, the best known attack algorithm with sub-exponential time complexity works through summation polynomials and Weil descent. We expect that the proposed geometric interpretation can yield a faster reduction of the problem to a system of equations. These overdetermined systems of equations are hard to solve. We have used the F4 (Faugère) algorithm and obtained results for primes less than 500,000. Quantum algorithms could expedite the process of solving these overdetermined systems of equations. In the absence of fast algorithms for computing summation polynomials, we expect that this could be an alternative. We do not claim that the proposed algorithm would be faster than Shor's algorithm for breaking EC-DLP, but this interpretation could be a candidate alternative to the 'summation polynomial attack' in the post-quantum era.

at September 20, 2019 01:23 AM UTC

Authors: Sevag Gharibian, Ojas Parekh
Download: PDF
Abstract: Approximation algorithms for constraint satisfaction problems (CSPs) are a central direction of study in theoretical computer science. In this work, we study classical product state approximation algorithms for a physically motivated quantum generalization of Max-Cut, known as the quantum Heisenberg model. This model is notoriously difficult to solve exactly, even on bipartite graphs, in stark contrast to the classical setting of Max-Cut. Here we show, for any interaction graph, how to classically and efficiently obtain approximation ratios 0.649 (anti-ferromagnetic XY model) and 0.498 (anti-ferromagnetic Heisenberg XYZ model). These are almost optimal; we show that the best possible ratios achievable by product states for these models are 2/3 and 1/2, respectively.

at September 20, 2019 12:00 AM UTC

Authors: Samuel M. Fischer
Download: PDF
Abstract: Route choice is often modelled as a two-step procedure in which travellers choose their routes from small sets of promising candidates. Many methods developed to identify such choice sets rely on assumptions about the mechanisms behind the route choice and require corresponding data sets. Furthermore, existing approaches often involve considerable complexity or perform many repeated shortest path queries. This makes it difficult to apply these methods in comprehensive models with numerous origin-destination pairs. In this paper, we address these issues by developing an algorithm that efficiently identifies locally optimal routes. Such paths arise from travellers acting rationally on local scales, whereas unknown factors may affect the routes on larger scales. Though methods identifying locally optimal routes are available already, these algorithms rely on approximations and return only a few, heuristically chosen paths for specific origin-destination pairs. This conflicts with the demands of route choice models, where an exhaustive search for many origins and destinations would be necessary. We therefore extend existing algorithms to return (almost) all admissible paths between a large number of origin-destination pairs. We test our algorithm on a road network modelling the Canadian province of British Columbia and analyze the distribution of locally optimal paths in the province.

at September 20, 2019 01:22 AM UTC

Authors: Dibyayan Chakraborty, Florent Foucaud, Harmender Gahlawat, Subir Kumar Ghosh, Bodhayan Roy
Download: PDF
Abstract: In this paper, we study the computational complexity of finding the \emph{geodetic number} of graphs. A set of vertices $S$ of a graph $G$ is a \emph{geodetic set} if any vertex of $G$ lies on some shortest path between some pair of vertices from $S$. The \textsc{Minimum Geodetic Set (MGS)} problem is to find a geodetic set of minimum cardinality. In this paper, we prove that the \textsc{MGS} problem is NP-hard on planar graphs with maximum degree six and on line graphs. We also show that unless $P=NP$, there is no polynomial-time algorithm to solve the \textsc{MGS} problem with a sublogarithmic approximation factor (in terms of the number of vertices), even on graphs of diameter $2$. On the positive side, we give an $O\left(\sqrt[3]{n}\log n\right)$-approximation algorithm for the \textsc{MGS} problem on general graphs of order $n$. We also give a $3$-approximation algorithm for the \textsc{MGS} problem on the family of solid grid graphs, which is a subclass of planar graphs.
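To make the definition concrete, here is a small brute-force sketch (my own illustration, not from the paper) that checks whether a vertex set S is geodetic in a connected graph, using BFS distances; a vertex v lies on a shortest a-b path exactly when d(a,v) + d(v,b) = d(a,b).

```python
from collections import deque
from itertools import combinations

def bfs_dist(adj, src):
    """Single-source shortest-path distances by BFS."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def is_geodetic_set(adj, S):
    """True iff every vertex of the (connected) graph lies on some
    shortest path between a pair of vertices from S."""
    dist = {v: bfs_dist(adj, v) for v in adj}
    covered = set(S)                       # vertices of S cover themselves
    for a, b in combinations(S, 2):
        for v in adj:
            if dist[a][v] + dist[v][b] == dist[a][b]:
                covered.add(v)
    return covered == set(adj)

# On the path 0-1-2-3, the two endpoints form a geodetic set ...
P4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(is_geodetic_set(P4, {0, 3}))   # True
# ... but {0, 1} leaves vertices 2 and 3 uncovered.
print(is_geodetic_set(P4, {0, 1}))   # False
```

This exhaustive check is of course exponential if used to search for a minimum geodetic set, which is exactly the hardness the paper establishes.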

at September 20, 2019 01:22 AM UTC

Authors: Artem Lutov, Mourad Khayati, Philippe Cudré-Mauroux
Download: PDF
Abstract: Clustering is a crucial component of many data mining systems involving the analysis and exploration of various data. Data diversity calls for clustering algorithms to be accurate while providing stable (i.e., deterministic and robust) results on arbitrary input networks. Moreover, modern systems often operate with large datasets, which implicitly constrains the complexity of the clustering algorithm. Existing clustering techniques are only partially stable, however, as they guarantee either determinism or robustness. To address this issue, we introduce DAOC, a Deterministic and Agglomerative Overlapping Clustering algorithm. DAOC leverages a new technique called Overlap Decomposition to identify fine-grained clusters in a deterministic way capturing multiple optima. In addition, it leverages a novel consensus approach, Mutual Maximal Gain, to ensure robustness and further improve the stability of the results while still being capable of identifying micro-scale clusters. Our empirical results on both synthetic and real-world networks show that DAOC yields stable clusters while being on average 25% more accurate than state-of-the-art deterministic algorithms without requiring any tuning. Our approach aims to greatly simplify and speed up data analysis tasks involving iterative processing (which requires determinism) as well as data fluctuations (which require robustness), and to provide accurate and reproducible results.

at September 20, 2019 01:20 AM UTC

Authors: Thomas Kesselheim, Alexandros Psomas, Shai Vardi
Download: PDF
Abstract: We study a generalization of the secretary problem, where decisions do not have to be made immediately upon candidates' arrivals. After arriving, each candidate stays in the system for some (random) amount of time and then leaves, whereupon the algorithm has to decide irrevocably whether to select this candidate or not. The goal is to maximize the probability of selecting the best candidate overall. We assume that the arrival and waiting times are drawn from known distributions.

Our first main result is a characterization of the optimal policy for this setting. We show that when deciding whether to select a candidate, it suffices to know only the time and the number of candidates that have arrived so far. Furthermore, the policy is monotone non-decreasing in the number of candidates seen so far and, under certain natural conditions, monotone non-increasing in the time. Our second main result shows that when the number of candidates is large, a single threshold policy is almost optimal.
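As a point of comparison, the classical secretary problem (immediate decisions, no waiting times) already illustrates why a single threshold policy can perform well. The sketch below is only that simplified classical illustration, not the paper's model: skip an initial prefix of candidates, then accept the first one better than everything seen so far.

```python
import random

def run_once(n, threshold, rng):
    """One secretary trial: skip the first `threshold` candidates, then
    accept the first candidate better than everyone seen so far.
    Returns True iff the best candidate overall (rank 0) is accepted."""
    ranks = list(range(n))              # rank 0 = best candidate
    rng.shuffle(ranks)
    best_seen = min(ranks[:threshold], default=n)
    for r in ranks[threshold:]:
        if r < best_seen:
            return r == 0
    return False                        # never accepted anyone

def success_rate(n, threshold, trials=20000, seed=0):
    """Monte Carlo estimate of the probability of picking the best candidate."""
    rng = random.Random(seed)
    return sum(run_once(n, threshold, rng) for _ in range(trials)) / trials

# With the threshold near n/e, the success probability approaches 1/e.
print(success_rate(50, 18))
```

The single-threshold policies in the paper are threshold rules in time rather than in candidate count, but the flavor is the same.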

at September 20, 2019 01:22 AM UTC

We study the stability of covers of simplicial complexes. Given a map f: Y → X that satisfies almost all of the local conditions of being a cover, is it close to being a genuine cover of X? Complexes X for which this holds are called cover-stable. We show that this is equivalent to X being a cosystolic expander with respect to non-abelian coefficients. This gives a new combinatorial-topological interpretation to cosystolic expansion, which is a well studied notion of high dimensional expansion. As an example, we show that the 2-dimensional spherical building A_3(F_q) is cover-stable. We view this work as a possibly first example of "topological property testing", where one is interested in studying stability of a topological notion that is naturally defined by local conditions.

at September 19, 2019 07:04 PM UTC

The first position is in theoretical computer science, with a special focus on cryptography and algorithms.

The second position is open to all subfields of computer science including but not limited to systems, networking, graphics, artificial intelligence, machine learning, human-computer interaction, and visualization.


by shacharlovett at September 19, 2019 02:59 PM UTC

Authors: Renos Karamanis, Eleftherios Anastasiadis, Panagiotis Angeloudis, Marc Stettler
Download: PDF
Abstract: Transportation Network Companies employ dynamic pricing methods at periods of peak travel to incentivise driver participation and balance supply and demand for rides. Surge pricing multipliers are commonly used and are applied following demand and estimates of customer and driver trip valuations. Combinatorial double auctions have been identified as a suitable alternative, as they can achieve maximum social welfare in the allocation by relying on customers and drivers stating their valuations. A shortcoming of current models, however, is that they fail to account for the effects of trip detours that take place in shared trips and their impact on the accuracy of pricing estimates. To resolve this, we formulate a new shared-ride assignment and pricing algorithm using combinatorial double auctions. We demonstrate that this model is reduced to a maximum weighted independent set model, which is known to be APX-hard. A fast local search heuristic is also presented, which is capable of producing results that lie within 1% of the exact approach. Our proposed algorithm could be used as a fast and reliable assignment and pricing mechanism of ride-sharing requests to vehicles during peak travel times.

at September 19, 2019 12:00 AM UTC

Authors: Leo Liberti
Download: PDF
Abstract: Data are often represented as graphs. Many common tasks in data science are based on distances between entities. While some data science methodologies natively take graphs as their input, there are many more that take their input in vectorial form. In this survey we discuss the fundamental problem of mapping graphs to vectors, and its relation with mathematical programming. We discuss applications, solution methods, dimensional reduction techniques and some of their limits. We then present an application of some of these ideas to neural networks, showing that distance geometry techniques can give competitive performance with respect to more traditional graph-to-vector mappings.
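One classical graph-to-vector mapping in the spirit the survey discusses is to run classical multidimensional scaling (MDS) on a matrix of pairwise distances, e.g. shortest-path distances in the graph. The sketch below illustrates that standard technique only; it is not the survey's specific method.

```python
import numpy as np

def classical_mds(D, dim=2):
    """Embed n points into R^dim so that Euclidean distances approximate
    the given symmetric distance matrix D (classical MDS)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:dim]            # largest eigenvalues first
    scale = np.sqrt(np.clip(w[top], 0.0, None))
    return V[:, top] * scale                   # n x dim coordinate matrix

# Shortest-path distances of the path graph 0-1-2 embed exactly on a line.
D = np.array([[0, 1, 2], [1, 0, 1], [2, 1, 0]], dtype=float)
X = classical_mds(D, dim=1)
```

When the distance matrix is exactly Euclidean, as for this path metric, classical MDS recovers the coordinates exactly (up to rotation and sign); for general graphs the embedding is only an approximation, which is where the survey's mathematical-programming formulations come in.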

at September 19, 2019 11:30 PM UTC

Authors: Jonas Sauer, Dorothea Wagner, Tobias Zündorf
Download: PDF
Abstract: We study the problem of computing public transit traffic assignments in a multi-modal setting: Given a public transit timetable, an additional unrestricted transfer mode (in our case walking), and a set of origin-destination pairs, we aim to compute the utilization of every vehicle in the network. While it has been shown that considering unrestricted transfers can significantly improve journeys, computing such journeys efficiently remains algorithmically challenging. Since traffic assignments require the computation of millions of shortest paths, using a multi-modal network has previously not been feasible. A recently proposed approach (ULTRA) enables efficient algorithms with UnLimited TRAnsfers at the cost of a short preprocessing phase. In this work we combine the ULTRA approach with a state-of-the-art assignment algorithm, making multi-modal assignments practical. Careful algorithm engineering results in a new public transit traffic assignment algorithm that even outperforms the algorithm it builds upon, while enabling unlimited walking which has not been considered previously. We conclude our work with an extensive evaluation of the algorithm, showing its versatility and efficiency. On our real world instance, the algorithm computes over 15 million unique journeys in less than 17 seconds.

at September 19, 2019 11:27 PM UTC

Authors: Vangelis Th. Paschos
Download: PDF
Abstract: The paper presents a polynomial time approximation scheme for the edge-weighted version of the maximum k-vertex cover problem in bipartite graphs.

at September 19, 2019 11:20 PM UTC

Authors: Édouard Bonnet, Nicolas Bousquet, Stéphan Thomassé, Rémi Watrigant
Download: PDF
Abstract: In general graphs, Maximum Independent Set (MIS for short) is the paradigmatic $W[1]$-hard problem. In stark contrast, polynomial-time algorithms are known when the inputs are restricted to structured graph classes such as, for instance, perfect graphs (which include bipartite graphs, chordal graphs, co-graphs, etc.) or claw-free graphs. In this paper, we introduce some variants of co-graphs with parameterized noise, that is, graphs that can be made into disjoint unions or complete sums by the removal of a certain number of vertices and the addition/deletion of a certain number of edges per incident vertex, both controlled by the parameter. We give a series of FPT Turing-reductions on these classes and use them to make some progress on the parameterized complexity of MIS in $H$-free graphs. We show that for every fixed $t \geqslant 1$, MIS is FPT in $P(1,t,t,t)$-free graphs, where $P(1,t,t,t)$ is the graph obtained by substituting all the vertices of a four-vertex path but one end of the path by cliques of size $t$. We also provide randomized FPT algorithms in dart-free graphs and in cricket-free graphs. This settles the FPT/W[1]-hard dichotomy for five-vertex graphs $H$.

at September 19, 2019 12:00 AM UTC

Authors: Zhetong Dong, Hongwei Lin, Chi Zhou
Download: PDF
Abstract: Over the last decades, many attempts have been made to optimally integrate machine learning (ML) and topological data analysis. A prominent problem in applying persistent homology to ML tasks is finding a vector representation of a persistence diagram (PD), which is a summary diagram for representing topological features. From the perspective of data fitting, a stable vector representation, persistence B-spline grid (PB), is proposed based on the efficient technique of progressive-iterative approximation for least-squares B-spline surface fitting. Meanwhile, we theoretically prove that the PB method is stable with respect to the metrics defined on the PD space, i.e., the $p$-Wasserstein distance and the bottleneck distance. The proposed method was tested on a synthetic dataset, datasets of randomly generated PDs, data of a dynamical system, and 3D CAD models.

at September 19, 2019 11:31 PM UTC

Authors: Shimon Bitton, Yuval Emek, Taisuke Izumi, Shay Kutten
Download: PDF
Abstract: A new \emph{spanner} construction algorithm is presented, working under the \emph{LOCAL} model with unique edge IDs. Given an $n$-node communication graph, a spanner with a constant stretch and $O (n^{1 + \varepsilon})$ edges (for an arbitrarily small constant $\varepsilon > 0$) is constructed in a constant number of rounds sending $O (n^{1 + \varepsilon})$ messages whp. Consequently, we conclude that every $t$-round LOCAL algorithm can be transformed into an $O (t)$-round LOCAL algorithm that sends $O (t \cdot n^{1 + \varepsilon})$ messages whp. This improves upon all previous message-reduction schemes for LOCAL algorithms that incur a $\log^{\Omega (1)} n$ blow-up of the round complexity.

at September 19, 2019 11:25 PM UTC

Authors: Alexander Mishchenko, Vladimir Manuilov, Chao You, Han Yang
Download: PDF
Abstract: We suggest a reduction of the combinatorial problem of hypergraph partitioning to a continuous optimization problem.

at September 19, 2019 11:23 PM UTC

Authors: Marco De Bortoli
Download: PDF
Abstract: Answer Set Programming (ASP) is a well-known logic language for knowledge representation, which has been very successful in recent years, as witnessed by the great interest in the development of efficient solvers for ASP. Yet, the large resource requirements of certain types of problems, such as planning problems, still constitute a big limitation for problem solving. In particular, when the program is grounded before the solving phase, an exponential blow-up of the grounding can generate a huge ground file, infeasible for single machines with limited resources, thus preventing even the discovery of a single non-optimal solution. To address this problem, in this paper we present a distributed approach to ASP solving, exploiting the benefits of distributed computation in order to overcome these limitations. The tool presented here, called Distributed Answer Set Coloring (DASC), is a pure solver based on the well-known Graph Coloring algorithm. DASC is part of a bigger project aiming to bring logic programming into a distributed system, started in 2017 by Federico Igne with mASPreduce and continued in 2018 by Pietro Totis with a distributed grounder. In this paper we present a low-level implementation of the Graph Coloring algorithm, via the Boost and MPI libraries for C++. Finally, we provide a few results from the very first working version of our tool, at the moment without any strong optimizations or heuristics.

at September 19, 2019 11:23 PM UTC

Authors: Michael Morak
Download: PDF
Abstract: Epistemic Logic Programs (ELPs), an extension of Answer Set Programming (ASP) with epistemic operators, have received renewed attention from the research community in recent years. Classically, evaluating an ELP yields a set of world views, with each being a set of answer sets. In this paper, we propose an alternative definition of world views that represents them as three-valued assignments, where each atom can be either always true, always false, or neither. Based on several examples, we show that this definition is natural and intuitive. We also investigate relevant computational properties of these new semantics, and explore how other notions, like strong equivalence, are affected.

at September 19, 2019 11:21 PM UTC

Authors: David G. Harris
Download: PDF
Abstract: The Lovasz Local Lemma (LLL) is a keystone principle in probability theory, guaranteeing the existence of configurations which avoid a collection $\mathcal B$ of "bad" events which are mostly independent and have low probability. In its simplest "symmetric" form, it asserts that whenever a bad-event has probability $p$ and affects at most $d$ bad-events, and $e p d < 1$, then a configuration avoiding all $\mathcal B$ exists.

A seminal algorithm of Moser & Tardos (2010) gives nearly-automatic randomized algorithms for most constructions based on the LLL. However, deterministic algorithms have lagged behind. We address three specific shortcomings of the prior deterministic algorithms. First, our algorithm applies to the LLL criterion of Shearer (1985); this is more powerful than alternate LLL criteria, removes a number of nuisance parameters, and leads to cleaner and more legible bounds. Second, we provide parallel algorithms with much greater flexibility in the functional form of the bad-events. Third, we provide a derandomized version of the MT-distribution, that is, the distribution of the variables at the termination of the MT algorithm.
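The Moser-Tardos algorithm being derandomized here is strikingly simple. The following is my own minimal sketch for SAT, not the paper's algorithm: while some clause is falsified, resample the variables of one falsified clause. Under the symmetric LLL condition ($e p d < 1$, with $p = 2^{-k}$ for $k$-clauses and $d$ the clause-dependency degree), the expected number of resamplings is linear.

```python
import random

def moser_tardos_sat(n_vars, clauses, seed=0):
    """Moser-Tardos resampling for SAT. `clauses` is a list of clauses,
    each a list of nonzero ints: literal v means variable v must be True,
    -v means variable v must be False. Returns a satisfying assignment."""
    rng = random.Random(seed)
    assign = [rng.random() < 0.5 for _ in range(n_vars)]

    def falsified(cl):
        # A clause is falsified iff every one of its literals is false.
        return all(assign[abs(l) - 1] != (l > 0) for l in cl)

    while True:
        bad = next((cl for cl in clauses if falsified(cl)), None)
        if bad is None:
            return assign
        for l in bad:                   # resample the falsified clause
            assign[abs(l) - 1] = rng.random() < 0.5

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(moser_tardos_sat(3, [[1, 2], [-1, 3], [-2, -3]]))
```

The derandomization challenge the abstract addresses is precisely to replace these random resamplings with deterministic (and parallel) choices.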

We show applications to non-repetitive vertex coloring, independent transversals, strong coloring, and other problems. These give deterministic algorithms which essentially match the best previous randomized sequential and parallel algorithms.

at September 19, 2019 11:23 PM UTC

Authors: Yuanjing Shi, Zhaoxing Li
Download: PDF
Abstract: Sorting is one of the fundamental tasks of modern data management systems. With disk I/O commonly blamed as the main performance bottleneck, and with increasingly computation-intensive workloads, it has come to our attention that in heterogeneous environments the performance bottleneck may vary across different infrastructures. As a result, sort kernels need to be adaptive to changing hardware conditions. In this paper, we propose Leyenda, a hybrid, parallel and efficient Radix Most-Significant-Bit (MSB) MergeSort algorithm that makes use of thread-local CPU caches and efficient disk/memory I/O. Leyenda is capable of performing either internal or external sorts efficiently, based on different I/O and processing conditions. We benchmarked Leyenda with three different workloads from Sort Benchmark, targeting three unique use cases, including internal, partially in-memory and external sorts, and found Leyenda to outperform GNU's parallel in-memory quick/merge sort implementations by up to three times. Leyenda was also ranked the second-best external sort algorithm in the ACM 2019 SIGMOD programming contest, and fourth overall.

at September 19, 2019 11:25 PM UTC

The next TCS+ talk will take place this coming Wednesday, September 25th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 17:00 UTC). Mark Sellke from Stanford will speak about “Chasing Convex Bodies” (abstract below).

Please make sure you reserve a spot for your group to join us live by signing up on the online form. As usual, for more information about the TCS+ online seminar series and the upcoming talks, or to suggest a possible topic or speaker, please see the website.

Abstract: I will explain our recent understanding of the chasing convex bodies problem posed by Friedman and Linial in 1991. In this problem, an online player receives a request sequence K_1,\dots,K_T of convex sets in d-dimensional space and moves his position online into each requested set. The player’s movement cost is the length of the resulting path. Chasing convex bodies asks whether there is an online algorithm with cost competitive against the offline optimal path. This is both a challenging metrical task system and (equivalent to) a competitive analysis view on online convex optimization.

This problem was open for d>2 until last year but has recently been solved twice. The first solution gives a 2^{O(d)}-competitive algorithm, while the second gives a nearly optimal \min(d,\sqrt{d\log T})-competitive algorithm for T requests. The latter result is based on the Steiner point, which is the exact optimal solution to a related geometric problem called Lipschitz selection and dates from 1840. In the talk, I will briefly outline the first solution and fully explain the second.

Partially based on joint works with Sébastien Bubeck, Bo’az Klartag, Yin Tat Lee, and Yuanzhi Li.


by plustcs at September 18, 2019 07:57 PM UTC

[Guest post by David Steurer – seems like a great opportunity! –Boaz]

The Swiss Winter School on Lower Bounds and Communication Complexity (10-14 February 2020) is the first in a series of annual winter schools in Theoretical Computer Science jointly organized by EPFL and ETH Zurich. The goal of the school is to educate top international theory PhD students about exciting recent developments in the field. The winter school will be held in Zinal, a mountain village in the Swiss Alps that has a long tradition of hosting academic workshops and that allows for nice excursions and stimulating discussions in a relaxed atmosphere.

This year’s installment features an exciting trinity of speakers: Kasper Green Larsen (Data Structure Lower Bounds), Raghu Meka (Communication Complexity) and Amir Shpilka (Algebraic complexity).

The application deadline is November 15th 2019, and acceptance notifications will be sent by December 1st 2019. The application form is available online. Attendance of the winter school is free of charge and includes room and board (shared rooms).

Organizers: Michael Kapralov (EPFL), David Steurer (ETH), Ola Svensson (EPFL)

by Boaz Barak at September 18, 2019 12:36 PM UTC

In this paper, we prove a general hardness amplification scheme for optimization problems based on the technique of direct products. We say that an optimization problem $\Pi$ is direct product feasible if it is possible to efficiently aggregate any $k$ instances of $\Pi$ and form one large instance of $\Pi$ such that given an optimal feasible solution to the larger instance, we can efficiently find optimal feasible solutions to all the $k$ smaller instances. Given a direct product feasible optimization problem $\Pi$, our hardness amplification theorem may be informally stated as follows: If there is a distribution $\mathcal{D}$ over instances of $\Pi$ of size $n$ such that every randomized algorithm running in time $t(n)$ fails to solve $\Pi$ on $\frac{1}{\alpha(n)}$ fraction of inputs sampled from $\mathcal{D}$, then, assuming some relationships on $\alpha(n)$ and $t(n)$, there is a distribution $\mathcal{D}'$ over instances of $\Pi$ of size $O(n\cdot \alpha(n))$ such that every randomized algorithm running in time $\frac{t(n)}{poly(\alpha(n))}$ fails to solve $\Pi$ on $\frac{99}{100}$ fraction of inputs sampled from $\mathcal{D}'$. As a consequence of the above theorem, we show hardness amplification of problems in various classes such as NP-hard problems like Max-Clique, Knapsack, and Max-SAT, problems in P such as Longest Common Subsequence, Edit Distance, Matrix Multiplication, and even problems in TFNP such as Factoring and computing Nash equilibrium.

at September 17, 2019 11:59 PM UTC

In this work, using methods from high dimensional expansion, we show that the property of $k$-direct-sum is testable for odd values of $k$. Previous work of Kaufman and Lubotzky could inherently deal only with the case that $k$ is even, using a reduction to linearity testing. Interestingly, our work is the first to combine the topological notion of high dimensional expansion (called co-systolic expansion) with the combinatorial/spectral notion of high dimensional expansion (called colorful expansion) to obtain the result. The classical $k$-direct-sum problem applies to the complete complex; namely, it considers a function defined over all $k$-subsets of some $n$-sized universe. Our result here applies to any collection of $k$-subsets of an $n$-universe, assuming this collection of subsets forms a high dimensional expander.

at September 17, 2019 12:04 AM UTC

It is becoming increasingly important to understand the vulnerability of machine learning models to adversarial attacks. In this paper we study the feasibility of robust learning from the perspective of computational learning theory, considering both sample and computational complexity. In particular, our definition of robust learnability requires polynomial sample complexity. We start with two negative results. We show that no non-trivial concept class can be robustly learned in the distribution-free setting against an adversary who can perturb just a single input bit. We show moreover that the class of monotone conjunctions cannot be robustly learned under the uniform distribution against an adversary who can perturb $\omega(\log n)$ input bits. However if the adversary is restricted to perturbing $O(\log n)$ bits, then the class of monotone conjunctions can be robustly learned with respect to a general class of distributions (that includes the uniform distribution). Finally, we provide a simple proof of the computational hardness of robust learning on the boolean hypercube. Unlike previous results of this nature, our result does not rely on another computational model (e.g. the statistical query model) nor on any hardness assumption other than the existence of a hard learning problem in the PAC framework.

at September 17, 2019 12:00 AM UTC

We show that Gallager's ensemble of Low-Density Parity Check (LDPC) codes achieves list-decoding capacity. These are the first graph-based codes shown to have this property. Previously, the only codes known to achieve list-decoding capacity were completely random codes, random linear codes, and codes constructed by algebraic (rather than combinatorial) techniques. This result opens up a potential avenue towards truly linear-time list-decodable codes which achieve list-decoding capacity. Our result on list decoding follows from a much more general result: any local property satisfied with high probability by a random linear code is also satisfied with high probability by a random LDPC code from Gallager's distribution. Local properties are properties characterized by the exclusion of small sets of codewords, and include list-decoding, list-recovery and average-radius list-decoding. Along the way, we give a characterization of sets of codewords that are likely to appear in a random linear code, which may be of independent interest.

at September 16, 2019 11:58 PM UTC

The $\epsilon$-approximate degree of a function $f\colon X \to \{0, 1\}$ is the least degree of a multivariate real polynomial $p$ such that $|p(x)-f(x)| \leq \epsilon$ for all $x \in X$. We determine the $\epsilon$-approximate degree of the element distinctness function, the surjectivity function, and the permutation testing problem, showing they are $\Theta(n^{2/3} \log^{1/3}(1/\epsilon))$, $\tilde\Theta(n^{3/4} \log^{1/4}(1/\epsilon))$, and $\Theta(n^{1/3} \log^{2/3}(1/\epsilon))$, respectively. Previously, these bounds were known only for constant $\epsilon.$ We also derive a connection between vanishing-error approximate degree and quantum Merlin--Arthur (QMA) query complexity. We use this connection to show that the QMA complexity of permutation testing is $\Omega(n^{1/4})$. This improves on the previous best lower bound of $\Omega(n^{1/6})$ due to Aaronson (Quantum Information & Computation, 2012), and comes somewhat close to matching a known upper bound of $O(n^{1/3})$.

at September 16, 2019 11:55 PM UTC

A clever trick on combining automata

John Robson has worked on various problems including what is still the best result on separating words—the topic we discussed the other day. Ken first knew him for his proof that {N \times N} checkers is {\mathsf{EXPTIME}}-complete and for similar hardness results for chess and Go.

Today I want to talk about his theorem that any two words can be separated by an automaton with relatively few states.

In his famous paper from 1989, he proved an upper bound on the Separating Word Problem. This is the question: Given two strings {S} and {T}, how many states does a deterministic automaton need to be able to accept {S} and reject {T}? His theorem is:

Theorem 1 (Robson’s Theorem) Suppose that {S} and {T} are distinct strings of length {n}. Then there is an automaton with at most {O(n^{0.4}\log^{0.6} n)} states that accepts {S} and rejects {T}.

The story of his result is involved. For starters, it is still the best upper bound after three decades. Impressive. Another issue is that a web search does not quickly, at least for me, find a PDF of the original paper. More recent papers on the separating word problem reference his 1989 paper, but they do not explain how he proves it.

Recall the problem of separating words: Given two distinct words of length {n}, how few states can a deterministic finite automaton have while accepting one word and rejecting the other? Thus his theorem shows that the number of states grows at most roughly like the square root of {n}.
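To make the problem concrete, here is a minimal Python sketch of what it means for an automaton to separate two words; the encoding of a machine as a transition dictionary and the parity example are mine, not from Robson's paper:

```python
def run_dfa(delta, start, accepting, word):
    """Simulate a DFA given as a dict mapping (state, symbol) -> state."""
    state = start
    for c in word:
        state = delta[(state, c)]
    return state in accepting

# A 2-state machine tracking the parity of 1s separates these two words.
delta = {(0, '0'): 0, (0, '1'): 1, (1, '0'): 1, (1, '1'): 0}
S, T = "0110", "0111"                  # distinct words of the same length
assert not run_dfa(delta, 0, {1}, S)   # S has an even number of 1s
assert run_dfa(delta, 0, {1}, T)       # T has an odd number of 1s
```

Of course, a single counting trick like parity will not work for all pairs; the theorem's content is that a few states always suffice.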

I did finally track the paper down. The trouble for me is that the paper is encrypted. Well, not exactly, but the version I did find is a poor copy of the original. Here is an example to show what I mean:


[ An Example ]

So the task of decoding the proof is a challenge. A challenge, but a rewarding one.

A Cool Trick

Robson’s proof uses two insights. The first is some basic string-ology, that is, basic facts about strings: for example, that a non-periodic string cannot overlap itself too much.

He also uses a clever trick on how to simulate two deterministic machines for the price of one. This in general is not possible, and is related to deep questions about automata that we have discussed before here. Robson shows that it can be done in a special but important case.

Let me explain. Suppose that {\alpha} is a string. We can easily design an automaton that accepts {x} if and only if {x} is the string {\alpha}. The machine will have on the order of the length of {\alpha} states. So far quite simple.

Now suppose that we have a string {S} of length {n} and wish to find a particular occurrence of the pattern {\alpha} in {S}. We assume that there are {\#(S,\alpha)} occurrences of {\alpha} in {S}. The task is to construct an automaton that enters an accepting state at the end of the {k^{th}} copy of {\alpha}. Robson shows that this can be done by an automaton with order

\displaystyle  \#(S,\alpha) + |\alpha|

Here {|\alpha|} is the length of the string {\alpha}.

This is a simple, clever, and quite useful observation. Clever indeed. The obvious automaton for this task would seem to require a Cartesian product of two machines. This would imply that it would require

\displaystyle  \#(S,\alpha) \times |\alpha|

number of states; note the times operator {\times} rather than addition. Thus Robson’s trick is a huge improvement.

Here is how he does this.

His Trick

Robson uses a clever trick in the proof of his main lemma. Let’s work through an example with the string {100}. The goal is to see if there is a copy of this string starting at a position that is a multiple of {3}.

The machine starts in state {(0,Y)} and tries to find the correct string {100} as input. If it does, then it reaches the accepting state {(3,Y)}. If while doing this it gets a wrong input, then it switches to states that have stopped looking for the input {100}. After seeing three inputs the machine reaches {(3,N)} and then moves back to the start state.

[ The automaton ]
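As a sanity check, the example machine can be simulated directly. The sketch below implements the Y/N two-track idea for the pattern {100} with modulus 3; the function name and encoding are mine:

```python
PATTERN = "100"
P = 3          # modulus; in this example it equals the pattern length

def aligned_hits(word):
    """End positions where PATTERN occurs starting at a multiple of P."""
    hits = []
    i, track_y = 0, True           # current state (i, Y) or (i, N)
    for pos, c in enumerate(word):
        if track_y and c != PATTERN[i]:
            track_y = False        # wrong input: fall to the N track
        i += 1
        if track_y and i == len(PATTERN):
            hits.append(pos + 1)   # reached the accepting state (3, Y)
        if i == P:
            i, track_y = 0, True   # (3, N) resets to the start state (0, Y)
    return hits

assert aligned_hits("100100") == [3, 6]
assert aligned_hits("010100100") == [6, 9]
```

Note the state count is additive: one position counter shared by both tracks, rather than a product of two machines.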

The Lemmas

We will now outline the proof in some detail.


The first lemma is a simple fact about hashing.

Lemma 2 Suppose {1 \le r \le m} and

\displaystyle  1 \le k_{1} < \cdots < k_{m} \le n.

Then all but {{O}(m \log n)} primes satisfy

\displaystyle  k_{i} \equiv k_{r} \bmod p \text{ if and only if } i =r.

Proof: Consider the quantities {|k_{r} - k_{i}|} for {i} not equal to {r}. Call a prime bad if it divides one of these quantities. Each quantity is at most {n}, so it is divisible by at most {\log n} primes. Since there are fewer than {m} quantities, there are at most {{O}(m\log n)} bad primes in total. \Box
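The counting in the proof is easy to check empirically. Here is a small sketch (helper names are mine) that lists the bad primes for a concrete set of positions:

```python
def primes_upto(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, ok in enumerate(sieve) if ok]

def bad_primes(ks, r, limit):
    """Primes p <= limit with k_i = k_r (mod p) for some i != r."""
    return [p for p in primes_upto(limit)
            if any((k - ks[r]) % p == 0 for i, k in enumerate(ks) if i != r)]

# Positions 3, 17, 42, 100 with r = 1: a bad prime must divide one of
# |17-3| = 14, |42-17| = 25, |100-17| = 83.
assert bad_primes([3, 17, 42, 100], 1, 50) == [2, 5, 7]
```

Every prime outside this short list separates {k_r} from the other positions modulo {p}, which is all the construction needs.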


We need some definitions about strings. Let {| \alpha |} be the length of the string {\alpha}. Also let {\#(S,\alpha)} be the number of occurrences of {\alpha} in {S}.

A string {\alpha} has the period {p} provided

\displaystyle  \alpha_{i} = \alpha_{i+p},

for all {i} such that {i+p} is defined. A string {\alpha} is periodic provided it has a period {p>0} that is at most half its length. Note that the shorter the period, the more “periodic” the string really is: for example, the string

\displaystyle  10101010101010

is more “periodic” than

\displaystyle  10000001000000.

Lemma 3 For any string {u} either {u0} or {u1} is not periodic.

Proof: Suppose that {\beta=u\sigma} is periodic with period {p} where {\sigma} is a single character. Let the length of {\beta} equal {l}. So by definition, {1 \le p \le l/2}. Then

\displaystyle  \beta_{i} = \beta_{i+p},

for {1 \le i \le l-p}. So it follows that

\displaystyle  \beta_{l-p} = \beta_{l} = \sigma.

This index is legitimate since

\displaystyle  1 \le l-p \le l/2 < l,

so the position {l-p} lies inside {u} and the last character {\sigma} is determined by {u}. Hence {u1} and {u0} cannot both be periodic with the same period {p}, as {u_{l-p}} would have to equal both {1} and {0}; when the two periods differ, applying the Fine and Wilf periodicity theorem to {u} yields the same contradiction. \Box
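Since the lemma quantifies over all strings {u}, it can also be checked exhaustively for short binary strings. A brute-force sketch (the helper is mine):

```python
from itertools import product

def periodic(s):
    """True if s has a period p with 0 < p <= len(s) / 2."""
    return any(all(s[i] == s[i + p] for i in range(len(s) - p))
               for p in range(1, len(s) // 2 + 1))

# Lemma 3, verified exhaustively for all binary u of length up to 12.
for n in range(1, 13):
    for bits in product("01", repeat=n):
        u = "".join(bits)
        assert not (periodic(u + "0") and periodic(u + "1"))
```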


Lemma 4 Suppose that {\alpha} is not a periodic string. Then the number of copies of {\alpha} in a string {S} is upper bounded by {{O}(M)} where

\displaystyle  M = \frac{|S|}{|\alpha|}.

Proof: The claim follows once we prove that no two copies of {\alpha} in {S} can overlap in more than {l/2} positions, where {l} is the length of {\alpha}.

If {\alpha} has two copies in {S} that overlap then clearly

\displaystyle  \alpha_{i} = \alpha_{i+d},

for some {d>0} and all {i} in the range {1,\dots,l-d}. This says that {\alpha} has the period {d}. Since {\alpha} is not periodic it follows that {d > l/2}. This implies that the overlap of the two copies of {\alpha} has length {l-d < l/2}. Thus we have shown that they cannot overlap too much. \Box
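The overlap bound can be illustrated numerically: a non-periodic pattern packs few copies into a string, while a periodic one does not. A small sketch, with hypothetical example strings:

```python
def occurrences(S, a):
    """Start positions of (possibly overlapping) copies of a in S."""
    return [i for i in range(len(S) - len(a) + 1) if S[i:i + len(a)] == a]

# "1000" has no period <= 2, so its copies overlap in fewer than
# |a|/2 positions and the count stays O(|S|/|a|) ...
S = "100010001000" * 3
assert len(occurrences(S, "1000")) <= 2 * len(S) // len("1000")

# ... while a periodic pattern like "0000" can pack far more copies.
assert len(occurrences("0" * 12, "0000")) > 2 * 12 // 4
```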

Main Lemma

Say an automaton finds the {k^{th}} occurrence of {\alpha} in {S} provided it enters a special state after scanning the last bit of this occurrence.

Lemma 5 Let {S} be a string of length {n} and let {\alpha} be a non-periodic string. Then there is an automaton with at most {\widetilde{O}(M)} states that can find the {k^{th}} occurrence of {\alpha} in {S}, where

\displaystyle  M = \#(S,\alpha) + |\alpha|.

Here {\widetilde{O}(M)} allows factors that are fixed powers of {\log n}. This lemma is the main insight of Robson and will be proved later.

The Main Theorem

The following is a slightly weaker version of Robson’s theorem. I am still confused a bit about his stronger theorem, to be honest.

Theorem 6 (Robson’s Theorem) Suppose that {S} and {T} are distinct strings of length {n}. Then there is an automaton with at most {\widetilde{O}(\sqrt {n})} states that accepts {S} and rejects {T}.

Proof: Since {S} and {T} are distinct we can assume that {S} starts with the prefix {u1} and {T} starts with the prefix {u0} for some string {u}. If the length of {u} is at most order {\sqrt{n}}, the theorem is trivial: just construct an automaton that accepts {u1} and rejects {u0}.

So we can assume that {u = w\alpha} for some strings {w} and {\alpha}, where the latter has length of order {\sqrt{n}}. By Lemma 3 we can assume that {\alpha 1} is not periodic (otherwise swap the roles of {S} and {T}). So by Lemma 4 we get that

\displaystyle  \#(S,\alpha1) = \widetilde{O}(\sqrt{n}).

Then by Lemma 5 we are done. \Box

Proof of Main Lemma

Proof: Let {S} have length {n} and let {\alpha} be a non-periodic string of length {l}. Also let {\#(S,\alpha) = m}. By Lemma 4 it follows that {m} is bounded by {{O}(|S|/|\alpha|)}.

Let {\alpha} occur at locations

\displaystyle  1 \le k_{1} < \cdots < k_{m} \le n.

Suppose that we want to construct a machine that finds the {r^{th}} copy of {\alpha}. By Lemma 2 there is a prime {p=\widetilde{O}(m)} so that

\displaystyle  k_{i} \equiv k_{r} \bmod p

if and only if {i=r}. Note we can also assume that {p > l}.

Let’s argue the special case where {k_{r}} is {0} modulo {p}. If it is congruent to another value, the same argument applies: the machine initially skips a fixed amount of the input and then behaves as in the congruent-to-{0} case.

The automaton has states {(i,Y)} and {(i,N)} for {i=0,\dots,p}. The machine starts in state {(0,Y)} and tries to get to the accepting state {(l,Y)}. The transitions include:

\displaystyle  (0,Y) \underset{\alpha_{1}}{\rightarrow} (1,Y) \underset{\alpha_{2}}{\rightarrow} (2,Y) \underset{\alpha_{3}}{\rightarrow} \cdots \underset{\alpha_{l}}{\rightarrow} (l,Y).

This means that the machine keeps checking the input to see if it is scanning a copy of {\alpha}. If it gets all the way to the accepting state {(l,Y)}, then it stops.

Further transitions are:

\displaystyle  (1,N) \rightarrow (2,N) \rightarrow \cdots \rightarrow (p,N),


\displaystyle  (0,Y) \underset{\neg \alpha_{1}}{\rightarrow} (1,N), (1,Y) \underset{\neg \alpha_{2}}{\rightarrow} (2,N), \dots, (l-1,Y) \underset{\neg \alpha_{l}}{\rightarrow} (l,N).

The second group means that if a wrong input happens, then {(i,Y)} moves to {(i+1,N)}. Finally, the state {(p,N)} resets and starts the search again by returning to the start state {(0,Y)}; equivalently, {(p,N)} can simply be identified with {(0,Y)}.

Clearly this has the required number of states and it operates correctly. \Box
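Putting the pieces together, the whole construction can be simulated: pick a good prime as in Lemma 2, then run the two-track automaton. This is a sketch with names of my own choosing, not Robson's code:

```python
def find_rth(S, alpha, r):
    """Sketch of the construction: flag the r-th copy of alpha in S
    using the (i,Y)/(i,N) automaton with a prime modulus from Lemma 2."""
    l = len(alpha)
    ks = [i for i in range(len(S) - l + 1) if S[i:i + l] == alpha]

    def is_prime(n):
        return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

    # A good prime: at least l, with k_i = k_r (mod p) only when i = r.
    # (Lemma 2 guarantees one of size O(m log n); the range here is generous.)
    p = next(q for q in range(max(l, 2), 10 * len(S) + 11)
             if is_prime(q)
             and all((k - ks[r]) % q for i, k in enumerate(ks) if i != r))

    i, ok, hit = 0, True, None
    for pos in range(ks[r] % p, len(S)):    # skip the fixed offset k_r mod p
        if ok and i < l and S[pos] != alpha[i]:
            ok = False                      # fall to the N track
        i += 1
        if ok and i == l:
            hit = pos + 1 - l               # start of the flagged copy
        if i == p:
            i, ok = 0, True                 # (p, N) resets to (0, Y)
    return hit

assert find_rth("0101001", "01", 1) == 2    # the machine flags the 2nd copy
```

Only the {r^{th}} copy can be flagged: any flagged position is an occurrence of {\alpha} congruent to {k_r} modulo {p}, and by the choice of {p} that occurrence must be {k_r} itself.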

Open Problems

The open problem is: Can the separating word problem be solved with a better bound? The lower bound is still order {\log n}. So the gap is exponential.

by rjlipton at September 16, 2019 09:26 PM UTC

The artist behind Alef’s corner has a few mathematical designs, and here are two new ones. (See Alef’s website offering over 100 T-shirt designs.)


One of them was used for the official T-shirt for Jean-François Le Gall’s birthday conference.

See also this Quanta Magazine article by Kevin Hartnett.

by Gil Kalai at September 16, 2019 09:23 PM UTC

I am teaching cryptography this semester for the second time (I taught it in Fall 2019) and will soon tell the students about the paper from 2015:
Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice. There are 14 authors.

The upshot is that, as Diffie-Hellman was implemented in 2015, many cases were crackable. In summary (and probably oversimplified):

DH in a 512-bit group can be cracked by the authors

DH in a 1024-bit group, they speculate, can be cracked with nation-state resources.
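For readers who have not seen it, here is a minimal textbook sketch of the Diffie-Hellman exchange itself, with deliberately toy parameters; the paper's attack computes discrete logarithms in groups like this when the prime is only 512 or 1024 bits:

```python
import secrets

# Textbook Diffie-Hellman with toy parameters (illustration only).
# The paper's warning is about group size: 512-bit primes were broken
# outright and 1024-bit ones were within nation-state reach, so real
# deployments need well-chosen groups of 2048 bits or more.
p = 2 ** 61 - 1                    # a tiny Mersenne prime; NOT secure
g = 3

a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent
b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent
A = pow(g, a, p)                   # Alice transmits A = g^a mod p
B = pow(g, b, p)                   # Bob transmits B = g^b mod p

# Both sides derive the same shared secret g^(ab) mod p.
assert pow(B, a, p) == pow(A, b, p)
```

An eavesdropper sees {p}, {g}, {A}, {B}; recovering the shared secret is exactly the discrete-log-type problem the number field sieve attacks.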

Is this a big deal? If YES then what is being done, and if NOT then why not?

I have come up with some statements that I DO NOT KNOW if they are true, but I am ASKING you, to shed some light on the BIG DEAL or NO BIG DEAL question. (Note- Idea for a game show: BIG DEAL or NO BIG DEAL where contestants are asked if a news story is a BIG DEAL or not.)

So, please comment on the following question:

1) Since 2015 the people who use DH have upped their game and are now using bigger parameters. (I doubt this is true)

2) DH is mostly used on things that hackers are not interested in, so this is not a big deal.

3) The expertise required to crack DH via this paper is rather difficult, so hackers don't have the skills.

4) This paper is not a problem for a bad reason: Hackers don't need to use the number field sieve DL algorithm when all they need to do is (1) guess that the PIN number is 1234 or the year the user was born (or close to it), (2) put on a uniform from Geek Squad or some such organization and claim they are there to help, (3) exploit a known security flaw that the company has not bothered fixing.

5) The 14 authors have mysteriously disappeared. (I doubt this is true.)

(Misc: My spell checker thinks that Diffie and crackable are not words, but Hellman is.)

by GASARCH at September 16, 2019 01:10 PM UTC

by David Eppstein at September 15, 2019 05:51 PM UTC

An opening is available for an Assistant or Associate Professor in Theoretical Computer Science. Exceptional candidates at other ranks may be considered. As a campus with a continually growing diverse student body, we encourage applications from women, people of color, members of the LGBTQ community, and individuals with a commitment to mentoring individuals from underrepresented groups in CS.


by shacharlovett at September 15, 2019 04:28 PM UTC

One of the major open problems in complexity theory is proving super-logarithmic lower bounds on the depth of circuits (i.e., $\mathbf{P}\not\subseteq\mathbf{NC}^1$). Karchmer, Raz, and Wigderson (Computational Complexity 5, 3/4) suggested to approach this problem by proving that depth complexity behaves "as expected" with respect to the composition of functions $f \diamond g$. They showed that the validity of this conjecture would imply that $\mathbf{P}\not\subseteq\mathbf{NC}^1$. As a way to realize this program, Edmonds et al. (Computational Complexity 10, 3) suggested to study the "multiplexor relation" $MUX$, which is a simplification of functions. In this note, we present two results regarding this relation:

- The multiplexor relation is "complete" for the approach of Karchmer et al. in the following sense: if we could prove (a variant of) their conjecture for the composition $f \diamond MUX$ for every function $f$, then this would imply $\mathbf{P}\not\subseteq\mathbf{NC}^1$.
- A simpler proof of a lower bound for the multiplexor relation due to Edmonds et al. Our proof has the additional benefit of fitting better with the machinery used in previous works on the subject.

at September 15, 2019 02:52 PM UTC

We study the following question: Is it easier to construct a hitting-set generator for polynomials $p:\mathbb{F}^n\rightarrow\mathbb{F}$ of degree $d$ if we are guaranteed that the polynomial vanishes on at most an $\epsilon>0$ fraction of its inputs? We will specifically be interested in tiny values of $\epsilon\ll d/|\mathbb{F}|$. This question was first asked by Goldreich and Wigderson (STOC 2014), who studied a specific setting geared for an application, and another specific setting was later studied by the third author (CCC 2017). In this work our main interest is a *systematic study of the problem itself*, in its general form, and we prove results that significantly extend and improve the two previously-known results. Our contributions are of two types:

1. Over fields of size $2\le|\mathbb{F}|\le poly(n)$, we show that the seed length of any hitting-set generator for polynomials of degree $d\le n^{.49}$ that vanish on at most $\epsilon=|\mathbb{F}|^{-t}$ of their inputs is at least $\Omega\left((d/t)\cdot\log(n)\right)$.
2. Over $\mathbb{F}_2$, we show that there exists a (non-explicit) hitting-set generator for polynomials of degree $d\le n^{.99}$ that vanish on at most $\epsilon=|\mathbb{F}|^{-t}$ of their inputs with seed length $O\left((d-t)\cdot\log(n)\right)$. We also show a polynomial-time computable hitting-set generator with seed length $O\left( (d-t)\cdot\left(2^{d-t}+\log(n)\right) \right)$.

In addition, we prove that the problem we study is closely related to the following question: ``Does there exist a small set $S\subseteq\mathbb{F}^n$ whose degree-$d$ closure is very large?'', where the degree-$d$ closure of $S$ is the variety induced by the set of degree-$d$ polynomials that vanish on $S$. We then use our lower bounds on hitting-sets for polynomials that vanish rarely to deduce lower bounds for this question.

at September 15, 2019 02:51 PM UTC

In the Minimum Circuit Size Problem (MCSP[s(m)]), we ask if there is a circuit of size s(m) computing a given truth-table of length n = 2^m. Recently, a surprising phenomenon termed as hardness magnification by [Oliveira and Santhanam, FOCS 2018] was discovered for MCSP[s(m)] and the related problem MKtP of computing time-bounded Kolmogorov complexity. In [Oliveira and Santhanam, FOCS 2018], [Oliveira, Pich, and Santhanam, CCC 2019], and [McKay, Murray, and Williams, STOC 2019], it was shown that minor (n^{1+eps}-style) lower bounds for MCSP[2^o(m)] or MKtP[2^o(m)] would imply breakthrough circuit lower bounds such as NP is not in P/poly, NP is not in NC^1, or EXP is not in P/poly. We consider the question: What is so special about MCSP and MKtP? Why do they admit this striking phenomenon? One simple property is that all variants of MCSP (and MKtP) considered in prior work are sparse languages. For example, MCSP[s(m)] has 2^{O(s(m))} yes-instances of length n=2^m, so MCSP[2^o(m)] is 2^{n^o(1)}-sparse. We show that there is a hardness magnification phenomenon for all equally-sparse NP languages. Formally, suppose there is an eps > 0 and a language L in NP which is 2^{n^o(1)}-sparse, and L is not in Circuit[n^{1+eps}]. Then NP does not have n^k-size circuits for all k. We prove analogous theorems for De Morgan formulas, B_2-formulas, branching programs, AC^0[6] and TC^0 circuits, and more: improving the state of the art in NP lower bounds against any of these models by an eps factor in the exponent would already imply NP lower bounds for all fixed polynomials. In fact, in our proofs it is not necessary to prove a (say) n^{1+eps} circuit size lower bound for L: one only has to prove a lower bound against n^{1+eps}-time n^eps-space deterministic algorithms with n^eps advice bits. Such lower bounds are well-known for non-sparse problems. 
Building on our techniques, we also show interesting new hardness magnifications for search-MCSP and search-MKtP (where one must output small circuits or short representations of strings), showing consequences such as Parity-P (or PP, PSPACE, and EXP) is not contained in P/poly (or NC^1, AC^0[6], or branching programs of polynomial size). For instance, if there is an eps > 0 such that search-MCSP[2^{beta m}] does not have De Morgan formulas of size n^{3+eps} for all constants beta > 0, then Parity-P is not in NC^1.

at September 15, 2019 02:51 PM UTC

Below is a statement from the same administration which rigged elections to sell their town to the marijuana industry.  The latter spent more than $300,000 to win the rigged election, see expense report, a figure that does not include money spent on attorneys to lobby the administration.  In less than a month Newton residents will have their chance to hold the administration accountable.
A multi-state outbreak of severe pulmonary disease associated with e-cigarette and marijuana vaping devices has struck.  […] Some contained nicotine and others contained marijuana or related substances.
Newton Health and Human Services strongly urges residents to consider not using any e-cigarette or vaping products at this time.

by Manu at September 12, 2019 01:57 PM UTC

With the summer winding down, the Fall ’19 season of TCS+ is coming fast! We’re excited to bring to you, over the next few months, great speakers and amazing results, right into the comfort of your home institution (or home home, for that matter).

As a teaser, here are the first three talks of the season:

    • Mark Sellke (Stanford) will tell us, on September 25th, about his recent work on Chasing Convex Bodies.
    • Shachar Lovett (UCSD) will on October 9th present his new results on the Sunflower Lemma.
    • Hao Huang (Emory University) will talk on October 22nd (note: a Tuesday) about his recent proof of the Sensitivity Conjecture.

Stay tuned for those, and the following!

by plustcs at September 11, 2019 07:44 PM UTC