Theory of Computing Blog Aggregator

Authors: Daniel Padé, Stephen Fenner, Daniel Grier, Thomas Thierauf
Download: PDF
Abstract: We show that the quantum parity gate on $n > 3$ qubits cannot be cleanly simulated by a quantum circuit with two layers of arbitrary C-SIGN gates of any arity and arbitrary 1-qubit unitary gates, regardless of the number of allowed ancilla qubits. This is the best known and first nontrivial separation between the parity gate and circuits of this form. The same bounds also apply to the quantum fanout gate. Our results are incomparable with those of Fang et al. [3], which apply to any constant depth but require a sublinear number of ancilla qubits on the simulating circuit.

at May 26, 2020 12:00 AM UTC

Authors: Thomas Dybdahl Ahle
Download: PDF
Abstract: A Locality-Sensitive Hash (LSH) function is called $(r,cr,p_1,p_2)$-sensitive, if two data-points with a distance less than $r$ collide with probability at least $p_1$ while data points with a distance greater than $cr$ collide with probability at most $p_2$. These functions form the basis of the successful Indyk-Motwani algorithm (STOC 1998) for nearest neighbour problems. In particular, one may build a $c$-approximate nearest neighbour data structure with query time $\tilde O(n^\rho/p_1)$ where $\rho=\frac{\log1/p_1}{\log1/p_2}\in(0,1)$. That is, sub-linear time, as long as $p_1$ is not too small. This is significant since most high dimensional nearest neighbour problems suffer from the curse of dimensionality, and can't be solved exactly faster than a brute-force linear-time scan of the database.

Unfortunately, the best LSH functions tend to have very low collision probabilities, $p_1$ and $p_2$. This includes the best functions for Cosine and Jaccard Similarity. This means that the $n^\rho/p_1$ query time of LSH is often not sub-linear after all, even for approximate nearest neighbours!

In this paper, we improve the general Indyk-Motwani algorithm to reduce the query time of LSH to $\tilde O(n^\rho/p_1^{1-\rho})$ (and the space usage correspondingly). Since $n^\rho p_1^{\rho-1} < n \Leftrightarrow p_1 > n^{-1}$, our algorithm always obtains sublinear query time, for any collision probabilities at least $1/n$. For $p_1$ and $p_2$ small enough, our improvement over all previous methods can be \emph{up to a factor $n$} in both query time and space.

The improvement comes from a simple change to the Indyk-Motwani algorithm, which can easily be implemented in existing software packages.
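For context, here is a minimal Python sketch of the classical Indyk-Motwani construction that the paper improves (not the paper's modified algorithm). The `hash_family` argument is an assumed callable returning fresh $(r,cr,p_1,p_2)$-sensitive hash functions, and the table and concatenation parameters are the usual textbook choices.

```python
import math
from collections import defaultdict

def build_lsh_index(points, hash_family, p1, p2):
    """Classical Indyk-Motwani index: L hash tables of K concatenated hashes.

    With K = ceil(log n / log(1/p2)) and L ~ n^rho / p1 tables, where
    rho = log(1/p1) / log(1/p2), a point within distance r of a query collides
    with it in at least one table with constant probability; the extra 1/p1
    here is the classical overhead that the paper's modification improves.
    """
    n = len(points)
    rho = math.log(1 / p1) / math.log(1 / p2)
    K = max(1, math.ceil(math.log(n) / math.log(1 / p2)))
    L = max(1, math.ceil(n ** rho / p1))
    tables = []
    for _ in range(L):
        hs = [hash_family() for _ in range(K)]
        table = defaultdict(list)
        for idx, x in enumerate(points):
            table[tuple(h(x) for h in hs)].append(idx)
        tables.append((hs, dict(table)))
    return tables

def query_candidates(tables, q):
    """Indices of stored points that collide with q in at least one table."""
    candidates = set()
    for hs, table in tables:
        candidates.update(table.get(tuple(h(q) for h in hs), []))
    return candidates
```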

at May 26, 2020 12:00 AM UTC

Authors: Felix Wiebe, Shivesh Kumar, Daniel Harnack, Malte Langosz, Hendrik Wöhrle, Frank Kirchner
Download: PDF
Abstract: Motion planning is a difficult problem in robot control. The complexity of the problem is directly related to the dimension of the robot's configuration space. While in many theoretical calculations and practical applications the configuration space is modeled as a continuous space, we present a discrete robot model based on the fundamental hardware specifications of a robot. Using lattice path methods, we provide estimates for the complexity of motion planning by counting the number of possible trajectories in a discrete robot configuration space.

at May 26, 2020 12:00 AM UTC

Authors: Markus Banagl, Tim Mäder, Filip Sadlo
Download: PDF
Abstract: Intersection homology is a topological invariant which detects finer information in a space than ordinary homology. Using ideas from classical simple homotopy theory, we construct local combinatorial transformations on simplicial complexes under which intersection homology remains invariant. In particular, we obtain the notions of stratified formal deformations and stratified spines of a complex, leading to reductions of complexes prior to computation of intersection homology. We implemented the algorithmic execution of such transformations, as well as the calculation of intersection homology, and apply these algorithms to investigate the intersection homology of stratified spines in Vietoris-Rips type complexes associated to point sets sampled near given, possibly singular, spaces.

at May 26, 2020 12:00 AM UTC

Authors: Robert D. Carr, Jennifer Iglesias, Giuseppe Lanciac, Benjamin Moseley
Download: PDF
Abstract: We introduce multiple symmetric LP relaxations for minimum cut problems. The relaxations give optimal and approximate solutions when the input is a Hamiltonian cycle. We show that this leads to one of two interesting results. In one case, these LPs always give optimal and near optimal solutions, and then they would be the smallest known symmetric LPs for the problems considered. Otherwise, these LP formulations give strictly better LP relaxations for the traveling salesperson problem than the subtour relaxation. We have the smallest known LP formulation that is a 9/8-approximation or better for min-cut. In addition, the LP relaxation of min-cut investigated in this paper has interesting constraints; the LP contains only a single typical min-cut constraint and all other constraints are typically only used for max-cut relaxations.

at May 26, 2020 12:00 AM UTC

Authors: Olivier Bousquet, Steve Hanneke, Shay Moran, Nikita Zhivotovskiy
Download: PDF
Abstract: The classical PAC sample complexity bounds are stated for any Empirical Risk Minimizer (ERM) and contain an extra logarithmic factor $\log(1/{\epsilon})$ which is known to be necessary for ERM in general. It has been recently shown by Hanneke (2016) that the optimal sample complexity of PAC learning for any VC class C is achieved by a particular improper learning algorithm, which outputs a specific majority-vote of hypotheses in C. This leaves the question of when this bound can be achieved by proper learning algorithms, which are restricted to always output a hypothesis from C.

In this paper we aim to characterize the classes for which the optimal sample complexity can be achieved by a proper learning algorithm. We identify that these classes can be characterized by the dual Helly number, which is a combinatorial parameter that arises in discrete geometry and abstract convexity. In particular, under general conditions on C, we show that the dual Helly number is bounded if and only if there is a proper learner that obtains the optimal joint dependence on $\epsilon$ and $\delta$.

As further implications of our techniques we resolve a long-standing open problem posed by Vapnik and Chervonenkis (1974) on the performance of the Support Vector Machine by proving that the sample complexity of SVM in the realizable case is $\Theta((n/{\epsilon})+(1/{\epsilon})\log(1/{\delta}))$, where $n$ is the dimension. This gives the first optimal PAC bound for Halfspaces achieved by a proper learning algorithm, and moreover is computationally efficient.

at May 26, 2020 12:00 AM UTC

Authors: Argyrios Deligkas, Themistoklis Melissourgos, Paul G. Spirakis
Download: PDF
Abstract: We study the complexity of finding a Walrasian equilibrium in markets where the agents have $k$-demand valuations, where $k$ is a constant. This means that the maximum value of every agent comes from a bundle of size at most $k$. Our results are threefold. For unit-demand agents, where the existence of a Walrasian equilibrium is guaranteed, we show that the problem is in quasi-NC. Put differently, we give the first parallel algorithm that finds a Walrasian equilibrium in polylogarithmic time. This comes in striking contrast to all existing algorithms that are highly sequential. For $k=2$, we show that it is NP-hard to decide if a Walrasian equilibrium exists even if the valuations are submodular, while for $k=3$ the hardness carries over to budget-additive valuations. In addition, we give a polynomial-time algorithm for markets with 2-demand single-minded valuations, or unit-demand valuations. Our last set of results consists of polynomial-time algorithms for $k$-demand valuations in unbalanced markets: markets where the number of items is significantly larger than the number of agents, or vice versa.

at May 26, 2020 01:27 AM UTC

Authors: Alexander L. Gavrilyuk, Roman Nedela, Ilia Ponomarenko
Download: PDF
Abstract: A graph is said to be distance-hereditary if the distance function in every connected induced subgraph is the same as in the graph itself. We prove that the ordinary Weisfeiler-Leman algorithm correctly tests the isomorphism of any two graphs if one of them is distance-hereditary; more precisely, the Weisfeiler-Leman dimension of the class of finite distance-hereditary graphs is equal to $2$. The previously best known upper bound for the dimension was $7$.
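The "ordinary" Weisfeiler-Leman algorithm referred to here is the classical 2-dimensional procedure, which iteratively refines colours of vertex pairs. Purely for orientation, the sketch below shows its simplest relative, 1-dimensional colour refinement on single vertices; it illustrates the refinement idea but is strictly weaker than the 2-dimensional version the result is about.

```python
def color_refinement(adj):
    """Naive 1-dimensional Weisfeiler-Leman (colour refinement).

    adj: dict mapping each vertex to an iterable of its neighbours.
    Returns a stable colouring; graphs whose stable colour multisets differ
    are certainly non-isomorphic (the converse fails in general).
    """
    colors = {v: 0 for v in adj}  # start from the uniform colouring
    while True:
        # new colour of v = (old colour of v, multiset of neighbour colours)
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        relabel = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: relabel[signatures[v]] for v in adj}
        if new_colors == colors:  # partition (and labels) stable: stop
            return colors
        colors = new_colors
```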

at May 26, 2020 12:00 AM UTC

Authors: Eric Goles, Pedro Montealegre, Martín Ríos-Wilson, Guillaume Theyssier
Download: PDF
Abstract: An automata network is a network of entities, each holding a state from a finite set and evolving according to a local update rule which depends only on its neighbors in the network's graph. It is freezing if there is an order on states such that the state evolution of any node is non-decreasing in any orbit. They are commonly used to model epidemic propagation, diffusion phenomena like bootstrap percolation, or crystal growth. In this paper we establish how treewidth and maximum degree of the underlying graph are key parameters which influence the overall computational complexity of finite freezing automata networks. First, we define a general model checking formalism that captures many classical decision problems: prediction, nilpotency, predecessor, asynchronous reachability. Then, on one hand, we present an efficient parallel algorithm that solves the general model checking problem in NC for any graph with bounded degree and bounded treewidth. On the other hand, we show that these problems are hard in their respective classes when restricted to families of graphs with polynomially growing treewidth. For prediction, predecessor and asynchronous reachability, we establish the hardness result with a fixed set-defined update rule that is universally hard on any input graph of such families.

at May 26, 2020 12:00 AM UTC

Authors: Amandeep Singh Bhatia, Ajay Kumar
Download: PDF
Abstract: Quantum computing is a winsome field that is concerned with the behaviour and nature of energy at the quantum level to improve the efficiency of computations. In recent years, quantum computation has been receiving much attention for its capability to solve difficult problems efficiently in contrast to classical computers. Specifically, some well-known public-key cryptosystems depend on the difficulty of factoring large numbers, which takes a very long time. It is expected that the emergence of a quantum computer has the potential to break such cryptosystems by 2020 due to the discovery of powerful quantum algorithms (Shor's factoring, Grover's search algorithm and many more). In this paper, we have designed a quantum variant of the second fastest classical factorization algorithm named "Quadratic Sieve". We have constructed a simulation framework for the quantized quadratic sieve algorithm using the high-level programming language Mathematica. Further, the simulation results are performed on a classical computer to get a feel of the quantum system and prove that it is more efficient than its classical variants from a computational complexity point of view.

at May 26, 2020 12:00 AM UTC

Authors: Divesh Aggarwal, Eldon Chung
Download: PDF
Abstract: Bl\"omer and Seifert showed that $\mathsf{SIVP}_2$ is NP-hard to approximate by giving a reduction from $\mathsf{CVP}_2$ to $\mathsf{SIVP}_2$ for constant approximation factors as long as the $\mathsf{CVP}$ instance has a certain property. In order to formally define this requirement on the $\mathsf{CVP}$ instance, we introduce a new computational problem called the Gap Closest Vector Problem with Bounded Minima. We adapt the proof of Bl\"omer and Seifert to show a reduction from the Gap Closest Vector Problem with Bounded Minima to $\mathsf{SIVP}$ for any $\ell_p$ norm for some constant approximation factor greater than $1$.

In a recent result, Bennett, Golovnev and Stephens-Davidowitz showed that under Gap-ETH, there is no $2^{o(n)}$-time algorithm for approximating $\mathsf{CVP}_p$ up to some constant factor $\gamma \geq 1$ for any $1 \leq p \leq \infty$. We observe that the reduction in their paper can be viewed as a reduction from $\mathsf{Gap3SAT}$ to the Gap Closest Vector Problem with Bounded Minima. This, together with the above-mentioned reduction, implies that, under Gap-ETH, there is no $2^{o(n)}$-time algorithm for approximating $\mathsf{SIVP}_p$ up to some constant factor $\gamma \geq 1$ for any $1 \leq p \leq \infty$.

at May 26, 2020 12:00 AM UTC

Authors: N. Joseph Tatro, Stefan C. Schonsheck, Rongjie Lai
Download: PDF
Abstract: For non-Euclidean data such as meshes of humans, a prominent task for generative models is geometric disentanglement; the separation of latent codes for intrinsic (i.e. identity) and extrinsic (i.e. pose) geometry. This work introduces a novel mesh feature, the conformal factor and normal feature (CFAN), for use in mesh convolutional autoencoders. We further propose CFAN-VAE, a novel architecture that disentangles identity and pose using the CFAN feature and parallel transport convolution. CFAN-VAE achieves this geometric disentanglement in an unsupervised way, as it does not require label information on the identity or pose during training. Our comprehensive experiments, including reconstruction, interpolation, generation, and canonical correlation analysis, validate the effectiveness of the unsupervised geometric disentanglement. We also successfully detect and recover geometric disentanglement in mesh convolutional autoencoders that encode xyz-coordinates directly by registering its latent space to that of CFAN-VAE.

at May 26, 2020 12:00 AM UTC

Authors: Jingwei Huang, Yichao Zhou, Leonidas Guibas
Download: PDF
Abstract: We present ManifoldPlus, a method for robust and scalable conversion of triangle soups to watertight manifolds. While many algorithms in computer graphics require the input mesh to be a watertight manifold, in practice many meshes designed by artists are often for visualization purposes, and thus have non-manifold structures such as incorrect connectivity, ambiguous face orientation, double surfaces, open boundaries, self-intersections, etc. Existing methods suffer from problems with face orientation and zero-volume structures in the inputs. Additionally, most methods do not scale to meshes of high complexity. In this paper, we propose a method that extracts exterior faces between occupied voxels and empty voxels, and uses a projection-based optimization method to accurately recover a watertight manifold that resembles the reference mesh. Compared to previous methods, our methodology is simpler. It does not rely on face normals of the input triangle soups and can accurately recover zero-volume structures. Our algorithm is scalable, because it employs an adaptive Gauss-Seidel method for shape optimization, in which each step is an easy-to-solve convex problem. We test ManifoldPlus on the ModelNet10 and AccuCity datasets to verify that our methods can generate watertight meshes ranging from object-level shapes to city-level models. Furthermore, through our experimental evaluations, we show that our method is more robust, efficient and accurate than the state-of-the-art. Our implementation is publicly available.

at May 26, 2020 12:00 AM UTC

Authors: Jingwei Huang, Chiyu Max Jiang, Baiqiang Leng, Bin Wang, Leonidas Guibas
Download: PDF
Abstract: We present MeshODE, a scalable and robust framework for pairwise CAD model deformation without prespecified correspondences. Given a pair of shapes, our framework provides a novel shape feature-preserving mapping function that continuously deforms one model to the other by minimizing fitting and rigidity losses based on the non-rigid iterative-closest-point (ICP) algorithm. We address two challenges in this problem, namely the design of a powerful deformation function and obtaining a feature-preserving CAD deformation. While traditional deformation directly optimizes for the coordinates of the mesh vertices or the vertices of a control cage, we introduce a deep bijective mapping that utilizes a flow model parameterized as a neural network. Our function has the capacity to handle complex deformations, produces deformations that are guaranteed free of self-intersections, and requires low rigidity constraining for geometry preservation, which leads to a better fitting quality compared with existing methods. It additionally enables continuous deformation between two arbitrary shapes without supervision for intermediate shapes. Furthermore, we propose a robust preprocessing pipeline for raw CAD meshes using feature-aware subdivision and a uniform graph template representation to address artifacts in raw CAD models including self-intersections, irregular triangles, topologically disconnected components, non-manifold edges, and nonuniformly distributed vertices. This facilitates a fast deformation optimization process that preserves global and local details. Our code is publicly available.

at May 26, 2020 01:46 AM UTC

Authors: Omid Daei, Keivan Navi, Mariam Zomorodi-Moghadam
Download: PDF
Abstract: The main objective of this paper is to improve the communication costs in distributed quantum circuits. To this end, we present a method for generating distributed quantum circuits from monolithic quantum circuits in such a way that communication between partitions of a distributed quantum circuit is minimized. Thus, the communication between distributed components is performed at a lower cost. Compared to existing works, our approach can effectively map a quantum circuit into an appropriate number of distributed components. Since teleportation is usually the protocol used to connect components in a distributed quantum circuit, our approach ultimately reduces the number of teleportations. The results of applying our approach to benchmark quantum circuits demonstrate its effectiveness and show that partitioning is a necessary step in constructing distributed quantum circuits.

at May 26, 2020 12:00 AM UTC

Authors: Yuval Filmus, Meena Mahajan, Gaurav Sood, Marc Vinyals
Download: PDF
Abstract: We study the MaxRes rule in the context of certifying unsatisfiability. We show that it can be exponentially more powerful than tree-like resolution, and when augmented with weakening (the system MaxResW), p-simulates tree-like resolution. In devising a lower bound technique specific to MaxRes (and not merely inheriting lower bounds from Res), we define a new semialgebraic proof system called the SubCubeSums proof system. This system, which p-simulates MaxResW, is a special case of the Sherali-Adams proof system. In expressivity, it is the integral restriction of conical juntas studied in the contexts of communication complexity and extension complexity. We show that it is not simulated by Res. Using a proof technique qualitatively different from the lower bounds that MaxResW inherits from Res, we show that Tseitin contradictions on expander graphs are hard to refute in SubCubeSums. We also establish a lower bound technique via lifting: for formulas requiring large degree in SubCubeSums, their XOR-ification requires large size in SubCubeSums.

at May 26, 2020 12:00 AM UTC

Authors: Tobias Christiani
Download: PDF
Abstract: Weighted minwise hashing is a standard dimensionality reduction technique with applications to similarity search and large-scale kernel machines. We introduce a simple algorithm that takes a weighted set $x \in \mathbb{R}_{\geq 0}^{d}$ and computes $k$ independent minhashes in expected time $O(k \log k + \Vert x \Vert_{0}\log( \Vert x \Vert_1 + 1/\Vert x \Vert_1))$, improving upon the state-of-the-art BagMinHash algorithm (KDD '18) and representing the fastest weighted minhash algorithm for sparse data. Our experiments show running times that scale better with $k$ and $\Vert x \Vert_0$ compared to ICWS (ICDM '10) and BagMinhash, obtaining $10$x speedups in common use cases. Our approach also gives rise to a technique for computing fully independent locality-sensitive hash values for $(L, K)$-parameterized approximate near neighbor search under weighted Jaccard similarity in optimal expected time $O(LK + \Vert x \Vert_0)$, improving on prior work even in the case of unweighted sets.
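The abstract does not spell out the new algorithm, so purely for orientation, here is a sketch of the classical unweighted minhash signature (the special case mentioned at the end of the abstract); `blake2b` keyed by the repetition index stands in for an idealized hash family.

```python
import hashlib

def minhash_signature(elements, k, seed=0):
    """k independent minhashes of an unweighted set, in O(k * |set|) time.

    Under idealized random hashing, each coordinate of two signatures collides
    with probability equal to the Jaccard similarity of the two sets.
    """
    def h(j, element):
        digest = hashlib.blake2b(f"{seed}:{j}:{element}".encode(),
                                 digest_size=8).digest()
        return int.from_bytes(digest, "big")

    return [min(h(j, e) for e in elements) for j in range(k)]

# Two overlapping sets agree on roughly a Jaccard-similarity fraction of coordinates.
sig_a = minhash_signature({"cat", "dog", "fish"}, k=256)
sig_b = minhash_signature({"cat", "dog", "bird"}, k=256)
agreement = sum(x == y for x, y in zip(sig_a, sig_b)) / 256  # close to 2/4
```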

at May 26, 2020 12:00 AM UTC

Authors: Ran Ben Basat, Xiaoqi Chen, Gil Einziger, Shir Landau Feibish, Danny Raz, Minlan Yu
Download: PDF
Abstract: Network-wide traffic analytics are often needed for various network monitoring tasks. These measurements are often performed by collecting samples at network switches, which are then sent to the controller for aggregation. However, performing such analytics without ``overcounting'' flows or packets that traverse multiple measurement switches is challenging. Therefore, existing solutions often simplify the problem by making assumptions on the routing or measurement switch placement.

We introduce AROMA, a measurement infrastructure that generates a uniform sample of packets and flows regardless of the topology, workload and routing. Therefore, AROMA can be deployed in many settings, and can also work in the data plane using programmable PISA switches. The AROMA infrastructure includes controller algorithms that approximate a variety of essential measurement tasks while providing formal accuracy guarantees. Using extensive simulations on real-world network traces, we show that our algorithms are competitively accurate compared to the best existing solutions despite the fact that they make no assumptions on the underlying network or the placement of measurement switches.

at May 26, 2020 12:00 AM UTC

Authors: Marvin Künnemann, Dániel Marx
Download: PDF
Abstract: To study the question under which circumstances small solutions can be found faster than by exhaustive search (and by how much), we study the fine-grained complexity of Boolean constraint satisfaction with size constraint exactly $k$. More precisely, we aim to determine, for any finite constraint family, the optimal running time $f(k)n^{g(k)}$ required to find satisfying assignments that set precisely $k$ of the $n$ variables to $1$.

Under central hardness assumptions on detecting cliques in graphs and 3-uniform hypergraphs, we give an almost tight characterization of $g(k)$ into four regimes: (1) Brute force is essentially best-possible, i.e., $g(k) = (1\pm o(1))k$, (2) the best algorithms are as fast as current $k$-clique algorithms, i.e., $g(k)=(\omega/3\pm o(1))k$, (3) the exponent has sublinear dependence on $k$ with $g(k) \in [\Omega(\sqrt[3]{k}), O(\sqrt{k})]$, or (4) the problem is fixed-parameter tractable, i.e., $g(k) = O(1)$.

This yields a more fine-grained perspective than a previous FPT/W[1]-hardness dichotomy (Marx, Computational Complexity 2005). Our most interesting technical contribution is an $f(k)n^{4\sqrt{k}}$-time algorithm for SubsetSum with precedence constraints parameterized by the target $k$ -- in particular, the approach, based on generalizing a bound on the Frobenius coin problem to a setting with precedence constraints, might be of independent interest.

at May 26, 2020 12:00 AM UTC

Authors: Md. Helal Ahmed, Dr. Jagmohan Tanti, Sumant Pushp
Download: PDF
Abstract: In this paper, we present fast computational algorithms for the Jacobi sums of orders $l^2$ and $2l^{2}$ with odd prime $l$ by formulating them in terms of the minimum number of cyclotomic numbers of the corresponding orders. We also implement two additional algorithms to validate these formulae, which are also useful for demonstrating the minimality of the number of cyclotomic numbers required.

at May 26, 2020 12:00 AM UTC

From the Virtual Transition Team for EC 2020: Due to concerns regarding the novel coronavirus COVID-19, the 2020 ACM Conference on Economics and Computation (EC 2020) will be held virtually.  This change of format offers exciting new opportunities. This guide can also be found on the EC 2020 webpage.

Overview

June 15 – 19: Mentoring Workshop and Live Tutorial Pre-recording Sessions.
June 22 – July 3: Live EC Paper Pre-recording Plenary Sessions.
July 13: Tutorial Watch Parties, Business Meeting and Poster Session
July 14 – 16: EC Conference (Paper Watch Parties, Paper Poster Sessions, and Plenaries).
July 17 – 22: Workshops.

Philosophy

The planning committee has been hard at work re-envisioning EC 2020 as a virtual event.  The event aims to emphasize opportunities afforded by the virtual format with activities that are intractable in the physical format, but not to recreate aspects of the physical conference experience that are difficult virtually.  Some aspects of the event will look similar to a physical conference, while some will be quite different.  The desiderata for the selected format are:

  • Maximize exposure for EC papers.
  • Maximize interaction between community members.
  • Allow broad global accessibility.

The organizational team studied the results of many virtual conferences that have already taken place (AAMAS, ASPLOS, EuroSys, ICCP, ICLR, WAGON, among others, as well as the ACM guidelines).  Some general rules of thumb are (a) prerecording talks, (b) emphasizing posters, (c) expecting about four hours a day of participation, and (d) treating 11am-3pm ET as prime time.

The planning committee adopted a strategy that allows the minimal commitment level for authors and attendees of presenting and viewing 18 minute talks and attending activities only during the main EC week of July 13-17.  However, official EC events will run June 15 to July 24 and the planning committee highly encourages members of the community to enjoy a more relaxed pace where all programming is plenary.

Registration

Registration will be mandatory but complimentary with ACM SIGecom membership ($10 Professional / $5 Student) which can easily be completed online.  Registration will open on June 1.

Mentoring Workshop and Live Tutorial Pre-recording Sessions

June 15-19: To give junior researchers plenty of time to prepare for the EC conference, the mentoring workshops and tutorial live pre-recording sessions will be held the week of June 15.  In addition to the usual activities, junior students will be paired with senior students who will share their EC itinerary and be available for discussions of papers and events in online chat throughout the duration of the EC events.  There will be three tutorials, each broken into four 45-minute segments and recorded over several days with live audiences of EC participants.  Students will be able to work together on exercises in between sessions.

Live Paper Pre-recording Plenary Sessions

June 22 – July 3:  EC papers will keep the usual 18-minute format.  The planning committee highly encourages authors to choose to pre-record their EC talks in live pre-recording sessions.  Each session comprises three papers and is followed by a virtual coffee break for discussion between the speakers, coauthors, and attendees.  Speakers and attendees are advised to set aside 2 hours.  These sessions are all plenary and will be scheduled for synergies in topic and preferred timing of the speakers.  These sessions begin at regular times:

  • US Eastern/China 9:00, 13:00, 17:00, 21:00, 1:00, 5:00.
  • US Central/Israel  8:00, 12:00, 16:00, 20:00, 24:00, 4:00.
  • US Mountain/Europe 7:00, 11:00, 15:00, 19:00, 23:00, 3:00.
  • US Pacific/London  6:00, 10:00, 14:00, 18:00, 22:00, 2:00.

We expect about three sessions a day over the ten weekdays between June 22 and July 3.  (Pre-recording sessions are optional but highly encouraged for authors of EC papers.)

EC Conference

July 13-16: July 13 is tutorial day.  It will include the EC business meeting and a contributed poster session for breaking results and results from other venues.  The main conference runs July 14-July 16 with programming from 9am-5pm Eastern.  This programming includes pre-recorded talk watch parties, poster discussion sessions for the papers, live plenary talks, and live highlights beyond EC.  The paper and poster discussion sessions will run in two-hour blocks as follows:

Hour 1: 3-5 parallel tracks of watch parties (3 papers each), with realtime chat with the authors.

Hour 2: plenary 1-minute lightning talks for all papers, breakout room for discussion for each paper with poster (a poster is 1-4 slides in PDF). (The lightning talks and poster discussion rooms are optional but highly encouraged for authors of the session’s papers.)

As these watch parties will be in parallel sessions and some may be at times that are hard for some members of the community to attend, the planning committee highly encourages participation in the live pre-recording sessions (June 22-July 3) and all talks will be available for individual viewing in advance of the conference. 

EC Workshops

July 17-22: Three workshops will take place on the Friday of EC and the following week.  Details for these events are still being worked out by the organizers.

Best Presentation by a Student or Postdoctoral Researcher Award

The Best Presentation by a Student or Postdoctoral Researcher Award is designed to encourage and acknowledge excellence in oral presentations by students and recent graduates. In addition to an honorarium, this year’s winner will be invited to present in a special session at WINE 2020 with expenses covered by SIGecom. To be considered for the award, the presenter must participate in a live pre-recording session.  This year will feature several “fun” categories for video presentations. Audience nominations are requested for creative and fun videos in the categories of: Best Video, Best Cameo by a Child or Pet, Best Special Effects, Best Soundtrack, or Best [Insert Your Nomination Here]. Submit nominations here.

Virtual Transition Team

Virtual aspects of the conference are being coordinated by a Virtual Transition Team appointed by the SIGecom Executive Committee, which, in addition to the Executive Committee and the EC 2020 organizing committee, includes the following officers:

  • Virtual General Chair: Jason Hartline
  • Virtual Local Chair: Yannai Gonczarowski
  • Virtual Global Outreach Chairs: Rediet Abebe and Eric Sodomka

The team is looking forward to interacting with everyone in the community at an outstanding conference!

by Jason Hartline at May 25, 2020 09:37 PM UTC

(The Baseball season is delayed or cancelled, so I post about baseball instead.)

This post is going to ask a question that you could look up on the web. But what fun would that be?

The following statements are true

1) Don Larsen, a professional baseball player who played from 1953 to 1967, is still alive. He is 90 years old (or perhaps 90 years young---I don't know the state of his health).  He was born Aug 7, 1929. He is best known for pitching a perfect game in the World Series in 1956, pitching for the Yankees. He played for several other teams as well, via trades (this was before free agency).
(CORRECTION- I wrote this post a while back, and Don Larsen has died since then.)


2) Whitey Ford, a professional baseball player who played from 1950 to 1967, is still alive. He is 91 years old (or perhaps 91 years young---I don't know the state of his health).  He was born Oct 21, 1928. He had many great seasons and is in the hall of fame. He played for the New York Yankees and no other team.

3) From 1900 (or so) until 1962 there were 16 professional baseball teams which had 25 people each. From 1962 until 1969 there were 20 teams which had 25 people each. There were also many minor league teams.

4) The youngest ballplayers are usually around 20, the oldest around 35. These are not exact numbers.

SO here is my question: Try to estimate

1) How many LIVING  retired major league baseball players are there now who are older than Don Larsen?

2) How many LIVING retired major league baseball players are of an age between Don and Whitey?

3) How many LIVING retired major league baseball players are older than Whitey Ford?

Give your REASONING for your answer.

by GASARCH (noreply@blogger.com) at May 25, 2020 04:34 PM UTC


Why is the proof so short yet so difficult?

Saeed Salehi is a logician at the University of Tabriz in Iran. Three years ago he gave a presentation at a Moscow workshop on proofs of the diagonal lemma.

Today I thought I would discuss the famous diagonal lemma.

The lemma is related to Georg Cantor’s famous diagonal argument yet is different. The logical version imposes requirements on when the argument applies, and requires that it be expressible within a formal system.

The lemma underpins Kurt Gödel’s famous 1931 proof that arithmetic is incomplete. However, Gödel did not state it as a lemma or proposition or theorem or anything else. Instead, he focused his attention on what we now call Gödel numbering. We consider this today as “obvious” but his paper’s title ended with “Part I”. And he had readied a “Part II” with over 100 pages of calculations should people question that his numbering scheme was expressible within the logic.

Only after his proof was understood did people realize that one part, perhaps the trickiest part, could be abstracted into a powerful lemma. The tricky part is not the Gödel numbering. People granted that it can be brought within the logic once they saw enough of Gödel’s evidence, and so we may write {\ulcorner \phi \urcorner} for the function giving the Gödel number of any formula {\phi} and use that in other formulas. The hard part is what one does with such expressions.

This is what we will try to motivate.

Tracing the Lemma

Rudolf Carnap is often credited with the first formal statement, in 1934, for instance by Elliott Mendelson in his famous textbook on logic. Carnap was a member of the Vienna Circle, which Gödel frequented, and Carnap is considered a giant among twentieth-century philosophers. He worked on sweeping grand problems of philosophy, including logical positivism and analysis of human language via syntax before semantics. Yet it strikes us with irony that his work on the lemma may be the best remembered.

Who did the lemma first? Let’s leave that for others and move on to the mystery of how to prove the lemma once it is stated. I must say the lemma is easy to state, easy to remember, and has a short proof. But I believe that the proof is not easy to remember or even follow.

Salehi’s presentation quotes others’ opinions about the proof:

{\bullet } Sam Buss: “Its proof [is] quite simple but rather tricky and difficult to conceptualize.”

{\bullet} György Serény (we jump to Serény’s paper): “The proof of the lemma as it is presented in textbooks on logic is not self-evident to say the least.”

{\bullet } Wayne Wasserman: “It is `Pulling a Rabbit Out of the Hat’—Typical Diagonal Lemma Proofs Beg the Question.”

So I am not alone, and I thought it might be useful to try and unravel its proof. This exercise helped me and maybe it will help you.

Here goes.

Stating the Lemma

Let {S(w)} be a formula in Peano Arithmetic ({PA}). We claim that there is some sentence {\phi} so that

\displaystyle  PA \vdash \phi \iff S(\ulcorner \phi \urcorner).

Formally,

Lemma 1 Suppose that {S(x)} is some formula in {PA}. Then there is a sentence {\phi} so that

\displaystyle  PA \vdash \phi \iff S(\ulcorner \phi \urcorner).

The beauty of this lemma is that it was used by Gödel and others to prove various powerful theorems. For example, the lemma quickly proves this result of Alfred Tarski:

Theorem 2 Suppose that {PA} is consistent. Then truth cannot be defined in {PA}. That is, there is no formula {Tr(x)} so that for all sentences {\phi}, {PA} proves

\displaystyle  \phi \iff Tr(\ulcorner \phi \urcorner).

The proof is this. Assume there is such a formula {Tr(x)}. Then apply the diagonal lemma to the formula {\neg Tr(x)} to get a sentence {\phi} with

\displaystyle  \phi \iff \neg Tr(\ulcorner \phi \urcorner).

This shows that

\displaystyle  \phi \iff \neg Tr(\ulcorner \phi \urcorner) \iff Tr(\ulcorner \phi \urcorner).

This is a contradiction. A short proof.

The Proof

The key is to define the function {F(n)} as follows: Suppose that {n} is the Gödel number of a formula of the form {A(x)} for some variable {x}; then

\displaystyle  F(n) = \ulcorner A(\ulcorner A(x) \urcorner) \urcorner.

If {n} is not of this form then define {F(n)=0}. This is a strange function, a clever function, but a perfectly fine function. It certainly maps numbers to numbers. It is certainly recursive; actually, it is clearly computable in polynomial time for any reasonable Gödel numbering. Note: the function {F} does depend on the choice of the variable {x}. Thus,

\displaystyle  F(\ulcorner y=0 \urcorner) = \ulcorner (\ulcorner y=0 \urcorner)=0 \urcorner,

and

\displaystyle  F(\ulcorner x=0 \urcorner) = \ulcorner (\ulcorner x=0 \urcorner)=0 \urcorner.

Now we make two definitions:

\displaystyle  \begin{array}{rcl}        g(w) &\equiv& S(F(w)) \\        \phi &\equiv& g(\ulcorner g(x) \urcorner). \end{array}

Now we compute just using the definitions of {F, g, \phi}:

\displaystyle  \begin{array}{rcl}        \phi &=& g(\ulcorner g(x) \urcorner) \\                 &=& S(F(\ulcorner g(x) \urcorner)) \\           &=& S(\ulcorner g(\ulcorner g(x) \urcorner) \urcorner) \\               &=& S(\ulcorner \phi \urcorner). \end{array}

We are done.
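As a purely illustrative analogue (not a formalization of {PA}), the same self-referential pattern can be played out on strings: a text with the hole {x} stands in for a formula {A(x)}, literal quotation stands in for Gödel numbering, and the toy function below plays the role of {F}.

```python
def quote(s: str) -> str:
    # stand-in for the Goedel number of a formula: a quoted copy of its text
    return repr(s)

def F(formula: str) -> str:
    # analogue of F: plug a quoted copy of the formula into its own hole x
    # (a crude textual substitution, good enough for this toy example)
    return formula.replace("x", quote(formula))

# g(w) := S(F(w)); written as a text with hole x, g is the string "S(F(x))".
g = "S(F(x))"

# phi := g applied to the quotation of g, i.e. F(g) in this toy model.
phi = F(g)
print(phi)  # S(F('S(F(x))'))

# phi reads "S applied to F of the quoted g"; but F of the quoted g is exactly
# how phi itself was built, so the argument inside phi describes phi:
assert phi == "S(F(" + quote(g) + "))"
assert F(g) == phi   # mirrors  phi <-> S(<quotation of phi>)
```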

But …

Where did this proof come from? Suppose that you forgot the proof but remember the statement of the lemma. I claim that we can then reconstruct the proof.

First let’s ask: Where did the definition of the function {F} come from? Let’s see. Imagine we defined

\displaystyle  \begin{array}{rcl}        g(w) &\equiv& S(F(w)) \\        \phi &\equiv& g(\ulcorner g(x) \urcorner). \end{array}

But left {F} undefined for now. Then

\displaystyle  \begin{array}{rcl}        \phi &=& g(\ulcorner g(x) \urcorner) \\                 &=& S(F(\ulcorner g(x) \urcorner)). \end{array}

But we want {\phi = S(\ulcorner \phi \urcorner)}; that happens provided:

\displaystyle  \ulcorner g(\ulcorner g(x) \urcorner) \urcorner = F(\ulcorner g(x) \urcorner).

This essentially gives the definition of the function {F}. Pretty neat.

But but …

Okay where did the definition of {g} and {\phi} come from? It is reasonable to define

\displaystyle  g(w) \equiv S(F(w)),

for some {F}. We cannot change {S} but we can control the input to the formula {S}, so let’s put a function there. Hence the definition for {g} is not unreasonable.

Okay how about the definition of {\phi}? Well we could argue that this is the magic step. If we are given this definition then {F} follows, by the above. I would argue that {\phi} is not completely surprising. The name of the lemma is after all the “diagonal” lemma. So defining {\phi} as the application of {g} to itself is plausible.

Taking an Exam

Another way to think about the diagonal lemma is imagine you are taking an exam in logic. The first question is:

Prove in {PA} that for any {S(x)} there is a sentence {\phi} so that

\displaystyle  \phi \iff S(\ulcorner \phi \urcorner).

You read the question again and think: “I wish I had studied harder; I should not have checked Facebook last night. And then went out and {\dots}” But you think: let’s not panic, let’s think.

Here is what you do. You say let me define

\displaystyle  g(x) = S(F(x)),

for some {F}. You recall there was a function that depends on {S}, and changing the input from {x} to {F(x)} seems to be safe. Okay you say, now what? I need the definition of {F}. Hmmm let me wait on that. I recall vaguely that {F} had a strange definition. I cannot recall it, so let me leave it for now.

But you think: I need a sentence {\phi}. A sentence cannot have an unbound variable. So {\phi} cannot be {g(x)}. It could be {g(m)} for some {m}. But what could {m} be? How about {\ulcorner \phi \urcorner}. This makes

\displaystyle  \phi = g(\ulcorner g \urcorner).

It is after all the diagonal lemma. Hmmm, does this work? Let's see. Wait, as above, I get that {F} is now forced to satisfy

\displaystyle  F(\ulcorner g(x) \urcorner) = \ulcorner g(\ulcorner g(x) \urcorner) \urcorner.

Great, this works. I think this is the proof. Wonderful. Got the first question.

Let’s look at the next exam question. Oh no {\dots}

Open Problems

Does this help? Does this unravel the mystery of the proof? Or is it still magic?


[Fixed equation formatting]

by rjlipton at May 25, 2020 03:55 PM UTC


In our new paper, we explore how closely the ImageNet benchmark aligns with the object recognition task it serves as a proxy for. We find pervasive and systematic deviations of ImageNet annotations from the ground truth, which can often be attributed to specific design choices in the data collection pipeline. These issues indicate that ImageNet accuracy alone might be insufficient to effectively gauge real model performance.

Contextualizing Progress on Benchmarks

Large-scale benchmarks are central to machine learning—they serve both as concrete targets for model development, and as proxies for assessing model performance on real-world tasks we actually care about. However, few benchmarks are perfect, and so as our models get increasingly better at them, we must also ask ourselves: to what extent is performance on existing benchmarks indicative of progress on the real-world tasks that motivate them?

In this post, we will explore this question of benchmark-task alignment in the context of the popular ImageNet object recognition dataset. Specifically, our goal is to understand how well the underlying ground truth is captured by the dataset itself—this dataset is, after all, what we consider to be the gold standard during model training and evaluation.

A sneak peek into ImageNet

The ImageNet dataset contains over a million images of objects from a thousand, quite diverse classes. Like many other benchmarks of that scale, ImageNet was not carefully curated by experts, but instead created via crowd-sourcing, without perfect quality control. So what does ImageNet data look like? Here are a few image-label pairs from the dataset:

[Example images shown with the labels: missile, projectile, stage, acoustic guitar, monastery, church, Norwich terrier, Norfolk terrier]


These samples appear pretty reasonable…but are they? Actually, while these are indeed images from the dataset, the labels shown above are not their actual ImageNet labels! Still, even though not “correct” from the point of view of the ImageNet dataset, these labels do correspond to actual ImageNet classes, and appear plausible when you see them in isolation. This shows that for ImageNet images, which capture objects in diverse real-world conditions, the ImageNet label may not properly reflect the ground truth.

In our work, we dive into examining how this label misalignment actually impacts ImageNet: how often do ImageNet labels deviate from the ground truth? And how do shortcomings in these labels impact ImageNet-trained models?

Revisiting the ImageNet collection pipeline

Before going further, let’s take a look at how ImageNet was created. To build such a large dataset, the creators of ImageNet had to leverage scalable methods like automated data collection and crowd-sourcing. That is, they first selected a set of object classes (using the WordNet hierarchy), and queried various search engines to obtain a pool of candidate images. These candidate images were then verified by annotators on Mechanical Turk (MTurk) using (what we will refer to as) the Contains task: annotators were shown images retrieved for a specific class label, and were subsequently asked to select the ones that actually contain an object of this class. Only images that multiple annotators validated ended up in the final dataset.

ImageNet collection pipeline: class selection, image retrieval, and label validation.

While this is a natural approach to scalably annotate data (and, in fact, is commonly used to create large-scale benchmarks—e.g., PASCAL VOC, COCO, places), it has an important caveat. Namely, this process has an inherent bias: the annotation task itself is phrased as a leading question. ImageNet annotators were not asked to provide an image label, but instead only to verify if a specific label (predetermined by the image retrieval process) was contained in an image. Annotators had no knowledge of what the other classes in the dataset even were, or the granularity at which they were required to make distinctions. In fact, they were explicitly instructed to ignore clutter and obstructions.

Looking back at the ImageNet samples shown above, one can see how this setup could lead to imperfect annotations. For instance, it is unclear if the average annotator knows the differences between a “Norwich terrier” and a “Norfolk terrier”, especially if they don’t even know that both of these (as well as 22 other terrier breeds) are valid ImageNet classes. Also, the Contains task itself might be ill-suited for annotating multi-object images—the answer to the Contains question would be yes for any object in the image that corresponds to an ImageNet class. It is not unthinkable that the same images could have made it into ImageNet under the labels “stage” and “Norwich terrier” had they come up in the search results for those classes instead.

Overall, this suggests that the labeling issues in ImageNet may go beyond just occasional annotator mistakes—the design of the data collection pipeline itself could have caused these labels to systematically deviate from the ground truth.

Diagnosing benchmark-task misalignment

To characterize how widespread these deviations are, we first need to get a better grasp of the ground truth for ImageNet data. In order to do this at scale, we still need to rely on crowd-sourcing. However, in contrast to the original label validation setup, we design a new annotation task based directly on image classification. Namely, we present annotators with a set of possible labels for a single image simultaneously. We then ask them to assign one label to every object in the image, and identify what they believe to be the main object. (Note that we intentionally ask for such fine-grained image annotations since, as we saw before, a single label might be inherently insufficient to capture the ground truth.)

Of course, we need to ensure that annotators can meaningfully perform this task. To this end we devise a way to narrow down the label choices they are presented with (all thousand ImageNet classes would be nearly impossible for a worker to choose between!). Specifically, for each image, we identify the most relevant labels by pooling together the top-5 predictions of a diverse set of ImageNet models and filtering them via the Contains task. Note that, by doing so, we are effectively bootstrapping the existing ImageNet labels by first using them to train models and then using model predictions to get better annotation candidates.
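A minimal sketch of this candidate-pooling step is shown below. The particular torchvision models, the assumption that `images` is an already preprocessed batch, and the function name are illustrative placeholders rather than the paper's exact setup, and the subsequent Contains-task filtering is omitted.

```python
import torch
import torchvision.models as models

# Illustrative model pool; the paper uses its own, more diverse set of models.
model_pool = [models.resnet50(pretrained=True),
              models.densenet121(pretrained=True),
              models.vgg16(pretrained=True)]

def candidate_labels(images, top_k=5):
    """Union of each model's top-k predicted classes, per image.

    images: preprocessed batch of shape (N, 3, 224, 224).
    Returns one set of candidate label indices per image; these candidates
    would then be filtered with the Contains task before annotation.
    """
    candidates = [set() for _ in range(images.shape[0])]
    with torch.no_grad():
        for model in model_pool:
            model.eval()
            topk = model(images).topk(top_k, dim=1).indices  # shape (N, top_k)
            for i, row in enumerate(topk):
                candidates[i].update(row.tolist())
    return candidates
```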

This is what our resulting annotation task looks like:

We aggregate the responses from multiple annotators to get per-image estimates of the number of objects in the image (along with their corresponding labels), as well as which object humans tend to view as the main one.

We collect such annotations for 10k images from the ImageNet validation set. With these more fine-grained and accurate annotations in hand, we now examine where the original ImageNet labels may fall short.

Multi-object images

The simplest way in which ImageNet labels could deviate from the ground truth is if the image contains multiple objects. So, the first thing we want to understand is: how many ImageNet images contain objects from more than one valid class?

It turns out: quite a few! Indeed, more than 20% of the images contain more than one ImageNet object. Examples:

Multi-object images: example images together with the annotations we obtain for them.

Looking at some of these images, it is clear that the problem is not just natural image clutter but also the fact that certain objects are quite likely to co-occur in the real-world—e.g., “table lamp” and “lamp shade”. This means that choosing classes which in principle correspond to distinct objects (e.g., using WordNet) is not enough to guarantee that the corresponding images have unambiguous labels. For example, see if you can guess the ImageNet label for the samples below:

Co-occurring objects: example images from pairs of frequently co-occurring classes, with their ImageNet labels hidden.

Model performance on multi-object images

So, how do models deal with images that contain multiple objects? To understand this, we evaluate a number of models (from AlexNet to EfficientNet-B7), and measure their accuracy (w.r.t. the ImageNet labels) on such images. We plot these accuracies below (as a function of their full test accuracy):

Performance on multi-object images: Model accuracy variation as a function of the number of objects in the image.

Across the board, in comparison to their performance on single-object images, models suffer around a 10% accuracy drop on multi-object ones. At the same time, this drop more-or-less disappears if we consider a model prediction to be correct if it matches the label of any valid object in the image (see the paper for specifics).
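In code, the two accuracy notions being compared amount to the following sketch (the per-image annotation structures are hypothetical):

```python
def top1_accuracy(preds, imagenet_labels):
    # standard accuracy: the prediction must match the single ImageNet label
    return sum(p == y for p, y in zip(preds, imagenet_labels)) / len(preds)

def multi_label_accuracy(preds, objects_per_image):
    # relaxed accuracy: a prediction counts as correct if it matches the label
    # of *any* annotated object in the image (a set of labels per image)
    return sum(p in objs for p, objs in zip(preds, objects_per_image)) / len(preds)
```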

Aside: The original motivation of top-5 accuracy was exactly to accommodate such multi-label images. However, we find that, while it does largely account for these multi-object confusions, it also seems to overestimate accuracy on single-object images.

Still, even though models seem to struggle with multi-object images, they perform much better than chance (i.e., better than what one would get if they were picking the label of an object in the image at random). This makes sense when the image has a single prominent object that also matches the ImageNet label. However, for a third of all multi-object images the ImageNet label does not even match what annotators deem to be the main object in the image. Yet, even in these cases, models still successfully predict the ImageNet label (instead of what humans consider to be the right label for the image)!

Performance on images with annotator-label disagreement: Model accuracy on images where annotators do not consider the ImageNet label to be the main object. Baseline: accuracy of picking the label of a random prominent object in the image.

Here, models seem to base their predictions on biases in the dataset which humans do not find salient. For instance, models get high accuracy on the class “pickelhaube”, even though pickelhaubes are usually present in images with other, more salient objects, such as “military uniforms”, suggesting that ImageNet models may be overly sensitive to the presence of distinctive objects in the image. While exploiting such biases would improve ImageNet accuracy, this strategy might not translate to improved performance on object recognition in the wild. Here are a few examples that seem to exhibit a similar mismatch:

Annotator-label disagreement: example images with their annotations; in each case, the ImageNet label differs from the (annotator-selected) "main label".

Biases in label validation

Let us now turn our attention to the ImageNet data filtering process. Recall that each class in ImageNet was constructed by automatically retrieving many images and filtering them (via the Contains task described above). How likely were annotators to filter out mislabeled images under this setup?

To understand this, we replicate the original filtering process on the existing ImageNet images. But this time, instead of only asking annotators to check if the image is valid with respect to its ImageNet label (i.e., the search query), we also try several other labels (each in isolation, with different sets of annotators).

We find that annotators frequently deem an image to be valid for many different labels—even when only one object is present. Typically, this occurs when the image is ambiguous and lacks enough context (e.g. “seashore” or “lakeshore”), or annotators are likely confused between different semantically similar labels (e.g., “assault rifle” vs. “rifle”, dog breeds). It turns out that this confusion, at least partly, stems from the one-sidedness of the Contains task—i.e., asking annotators to ascertain the validity of a specific label without them knowing about any other options. If instead we present annotators with all the relevant labels simultaneously and ask them to choose one (as we did in our annotation setup), this kind of label confusion is alleviated: annotators select significantly fewer labels in total (see our paper for details). So, even putting annotators’ expertise aside, the specific annotation task setup itself drastically affects the quality of the resulting dataset labels.
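The per-image aggregation behind the figure below can be sketched as follows (the input format is hypothetical):

```python
def labels_deemed_valid(votes_by_label, agreement=0.5):
    """Labels that at least an `agreement` fraction of annotators marked valid.

    votes_by_label: {label: [True/False per annotator]} for a single image,
    collected with the Contains task (each label shown in isolation).
    """
    return [label
            for label, votes in votes_by_label.items()
            if sum(votes) / len(votes) >= agreement]
```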

Number of valid labels: Distribution of labels annotators deem valid (based on an agreement threshold) for single-object images in the Contains task.

Going back to ImageNet, our findings give us reason to believe that annotators may have had a rather limited ability to correct errors in labeling. Thus, in certain cases, ImageNet labels were largely determined by the automated image retrieval process—propagating any biases or mixups this process might introduce to the final dataset.

In fact, we can actually see direct evidence of that in the ImageNet dataset—there are pairs of classes that appear to be inherently ambiguous (e.g., “laptop computer” and “notebook computer”) and neither human annotators, nor models, can tell the corresponding images apart (see below). If such class pairs actually overlap in terms of their ImageNet images, it is unclear how models can learn to separate them without memorizing specific validation examples.

Ambiguous class pairs: example images from inherently ambiguous class pairs, with their ImageNet labels hidden.

Beyond test accuracy: human-centric model evaluation

Performance of ImageNet-trained models is typically judged based on their ability to predict the dataset labels—yet, as we saw above, these labels may not fully capture the ground truth. Hence, ImageNet accuracy may not properly reflect model performance—for instance, measuring accuracy alone could unfairly penalize models for certain correct predictions on multi-object images. So, how can we better assess model performance?

One approach is to measure model-human alignment directly—we present model predictions to annotators and ask them to gauge their validity:

Human-based model evaluation: Fraction of annotators that select a label as valid in the Contains task (i.e., selection frequency) for ImageNet labels and model predictions. Baseline: (number of correct predictions) × (average selection frequency of the ImageNet label).

Surprisingly, we find that for state-of-the-art models, annotators actually deem the prediction that models make to be valid about as often as the ImageNet label (even when the two do not match). Thus, recent models may be better at predicting the ground truth than their top-1 accuracy (w.r.t. the ImageNet label) would indicate.

However, this does not imply that improving ImageNet accuracy is meaningless. For instance, non-expert annotators may not be able to tell apart certain fine-grained class differences (e.g., dog breeds) and for some of these images the ImageNet label may actually match the ground truth. What it does indicate, though, is that we are at a point where it may be hard to gauge if better performance on ImageNet corresponds to actual progress or merely to exploiting idiosyncrasies of the dataset.

For further experimental details and additional results (e.g., human confusion matrices), take a look at our paper!

Conclusions

We took a closer look at how well ImageNet aligns with the real-world object recognition task—even though ImageNet is used extensively, we rarely question whether its labels actually reflect the ground truth. We saw that oftentimes ImageNet labels do not fully capture image content—e.g., many images have multiple (ImageNet) objects and there are classes that are inherently ambiguous. As a result, models trained using these labels as ground truth end up learning unintended biases and confusions.

Our analysis indicates that when creating datasets we must be aware of (and try to mitigate) ways in which scalable data collection practices can skew the corresponding annotations (see our previous post for another example of such a skew). Finally, given that such imperfections in our datasets could be inevitable, we also need to think about how to reliably assess model performance in their presence.

at May 25, 2020 12:00 AM UTC

Authors: Hubie Chen
Download: PDF
Abstract: The constraint satisfaction problem (CSP) on a finite relational structure B is to decide, given a set of constraints on variables where the relations come from B, whether or not there is an assignment to the variables satisfying all of the constraints; the surjective CSP is the variant where one decides the existence of a surjective satisfying assignment onto the universe of B.

We present an algebraic framework for proving hardness results on surjective CSPs; essentially, this framework computes global gadgetry that permits one to present a reduction from a classical CSP to a surjective CSP. We show how to derive a number of hardness results for surjective CSP in this framework, including the hardness of surjective CSP on the reflexive 4-cycle and on the no-rainbow 3-coloring relation. Our framework thus allows us to unify these hardness results, and reveal common structure among them; we believe that our hardness proof for the reflexive 4-cycle is more succinct than the original. In our view, the framework also makes very transparent a way in which classical CSPs can be reduced to surjective CSPs.

at May 25, 2020 10:25 PM UTC

Authors: Yunzi Ding, Dmitriy Kunisky, Alexander S. Wein, Afonso S. Bandeira
Download: PDF
Abstract: In compressed sensing, the restricted isometry property (RIP) on $M \times N$ sensing matrices (where $M < N$) guarantees efficient reconstruction of sparse vectors. A matrix has the $(s,\delta)$-$\mathsf{RIP}$ property if it behaves as a $\delta$-approximate isometry on $s$-sparse vectors. It is well known that an $M\times N$ matrix with i.i.d. $\mathcal{N}(0,1/M)$ entries is $(s,\delta)$-$\mathsf{RIP}$ with high probability as long as $s\lesssim \delta^2 M/\log N$. On the other hand, most prior works aiming to deterministically construct $(s,\delta)$-$\mathsf{RIP}$ matrices have failed when $s \gg \sqrt{M}$. An alternative way to find an RIP matrix could be to draw a random Gaussian matrix and certify that it is indeed RIP. However, there is evidence that this certification task is computationally hard when $s \gg \sqrt{M}$, both in the worst case and the average case.

In this paper, we investigate the exact average-case time complexity of certifying the RIP property for $M\times N$ matrices with i.i.d. $\mathcal{N}(0,1/M)$ entries, in the "possible but hard" regime $\sqrt{M} \ll s\lesssim M/\log N$, assuming that $M$ scales proportional to $N$. Based on analysis of the low-degree likelihood ratio, we give rigorous evidence that subexponential runtime $N^{\tilde\Omega(s^2/N)}$ is required, demonstrating a smooth tradeoff between the maximum tolerated sparsity and the required computational power. The lower bound is essentially tight, matching the runtime of an existing algorithm due to Koiran and Zouzias. Our hardness result allows $\delta$ to take any constant value in $(0,1)$, which captures the relevant regime for compressed sensing. This improves upon the existing average-case hardness result of Wang, Berthet, and Plan, which is limited to $\delta = o(1)$.
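
As a point of reference, certifying RIP by brute force means checking every size-$s$ column submatrix, which is exactly what makes the task computationally interesting. The sketch below (the function is_rip and the toy sizes are illustrative; this is not one of the algorithms discussed in the abstract) uses the fact that $(s,\delta)$-RIP is equivalent to every $M \times s$ column submatrix having all squared singular values inside $[1-\delta, 1+\delta]$.

    import itertools
    import numpy as np

    def is_rip(A, s, delta):
        # Brute-force RIP check: exponential in s, only feasible for tiny instances.
        M, N = A.shape
        for cols in itertools.combinations(range(N), s):
            sv = np.linalg.svd(A[:, list(cols)], compute_uv=False)
            if sv.max() ** 2 > 1 + delta or sv.min() ** 2 < 1 - delta:
                return False
        return True

    rng = np.random.default_rng(0)
    M, N, s = 20, 40, 2
    A = rng.normal(0.0, 1.0 / np.sqrt(M), size=(M, N))   # i.i.d. N(0, 1/M) entries
    print(is_rip(A, s, delta=0.5))                        # checks C(N, s) submatrices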

at May 25, 2020 10:20 PM UTC

Authors: Alexander Barvinok
Download: PDF
Abstract: We consider the problem of computing $\sum_x e^{f(x)}$, where $f(x)=\sum_{ij} a_{ij} \xi_i \xi_j + \sum_i b_i \xi_i$ is a real-valued quadratic function and $x=\left(\xi_1, \ldots, \xi_n\right)$ ranges over the Boolean cube $\{-1, 1\}^n$. We prove that for any $\delta >0$, fixed in advance, the value of $\sum_x e^{f(x)}$ can be approximated within relative error $0 < \epsilon < 1$ in quasi-polynomial $n^{O(\ln n - \ln \epsilon)}$ time, as long as $\sum_j |a_{ij}| \leq 1-\delta$ for all $i$. We apply the method of polynomial interpolation, for which we prove that $\sum_x e^{f(x)} \ne 0$ for complex $a_{ij}$ and $b_i$ such that $\sum_j |\Re\thinspace a_{ij}| \leq 1-\delta$, $\sum_j |\Im\thinspace a_{ij}| \leq \delta^2/10$ and $|\Im\thinspace b_i| \leq \delta^2/10$ for all $i$, which is interpreted as the absence of a phase transition in the Lee-Yang sense in the corresponding Ising model. The bounds are asymptotically optimal. The novel feature of the bounds is that they control the total interaction of each vertex but not every pairwise interaction.
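
For very small $n$ the quantity in question can simply be enumerated, which may help fix notation. The sketch below (the function boolean_cube_sum and the coefficients are illustrative; this is brute force, not the interpolation method of the paper) uses coefficients satisfying $\sum_j |a_{ij}| \leq 1-\delta$.

    import itertools
    import math

    def boolean_cube_sum(a, b):
        # Evaluates sum_x exp(f(x)), f(x) = sum_{ij} a_ij x_i x_j + sum_i b_i x_i,
        # by enumerating all 2^n points of {-1, 1}^n (feasible only for tiny n).
        n = len(b)
        total = 0.0
        for x in itertools.product((-1, 1), repeat=n):
            f = sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
            f += sum(b[i] * x[i] for i in range(n))
            total += math.exp(f)
        return total

    a = [[0.0, 0.25], [0.25, 0.0]]   # row sums of |a_ij| are 0.25 <= 1 - delta
    b = [0.1, -0.2]
    print(boolean_cube_sum(a, b))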

at May 25, 2020 10:28 PM UTC

Authors: Rui Yao, Shlomo Bekhor
Download: PDF
Abstract: Innovative shared mobility services provide on-demand flexible mobility options and have the potential to alleviate traffic congestion. These attractive services are challenging from different perspectives. One major challenge in such systems is to find suitable ride-sharing matchings between drivers and passengers with respect to the system objective and constraints, and to provide optimal pickup and drop-off sequence to the drivers. In this paper, we develop an efficient dynamic tree algorithm to find the optimal pickup and drop-off sequence. The algorithm finds an initial solution to the problem, keeps track of previously explored feasible solutions, and reduces the solution search space when considering new requests. In addition, an efficient pre-processing procedure to select candidate passenger requests is proposed, which further improves the algorithm performance. Numerical experiments are conducted on a real size network to illustrate the efficiency of our algorithm. Sensitivity analysis suggests that small vehicle capacities and loose excess travel time constraints do not guarantee overall savings in vehicle kilometer traveled.

at May 25, 2020 10:27 PM UTC

Authors: Torben Hagerup
Download: PDF
Abstract: A level-ancestor or LA query about a rooted tree $T$ takes as arguments a node $v$ in $T$, of depth $d_v$, say, and an integer $d$ with $0\le d\le d_v$ and returns the ancestor of $v$ in $T$ of depth $d$. The static LA problem is to process a given rooted tree $T$ so as to support efficient subsequent processing of LA queries about $T$. All previous efficient solutions to the static LA problem work by reducing a given instance of the problem to a smaller instance of the same or a related problem, solved with a less efficient data structure, and a collection of small micro-instances for which a different solution is provided. We indicate the first efficient solution to the static LA problem that works directly, without resorting to reductions or micro-instances.
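
For readers unfamiliar with the query itself, the following naive baseline (the class NaiveLA and its parent-pointer input are assumptions for illustration, emphatically not the data structure of the paper) shows what an LA query computes, at $O(d_v)$ cost per query.

    class NaiveLA:
        # Naive level-ancestor queries: store parent pointers and depths,
        # answer a query by walking up until the requested depth is reached.
        def __init__(self, parent):
            # parent[v] is the parent of node v; the root is its own parent.
            self.parent = parent
            self.depth = [0] * len(parent)
            for v in range(len(parent)):
                u, d = v, 0
                while parent[u] != u:
                    u, d = parent[u], d + 1
                self.depth[v] = d

        def query(self, v, d):
            assert 0 <= d <= self.depth[v]
            while self.depth[v] > d:
                v = self.parent[v]
            return v

    la = NaiveLA([0, 0, 1, 2])   # the path 0 - 1 - 2 - 3 rooted at 0
    print(la.query(3, 1))        # ancestor of node 3 at depth 1, i.e. node 1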

at May 25, 2020 10:29 PM UTC

Authors: Jacques Dark, Christian Konrad
Download: PDF
Abstract: In this paper, we give simple optimal lower bounds on the one-way two-party communication complexity of approximate Maximum Matching and Minimum Vertex Cover with deletions. In our model, Alice holds a set of edges and sends a single message to Bob. Bob holds a set of edge deletions, which form a subset of Alice's edges, and needs to report a large matching or a small vertex cover in the graph spanned by the edges that are not deleted. Our results imply optimal space lower bounds for insertion-deletion streaming algorithms for Maximum Matching and Minimum Vertex Cover.

Previously, Assadi et al. [SODA 2016] gave an optimal space lower bound for insertion-deletion streaming algorithms for Maximum Matching via the simultaneous model of communication. Our lower bound is simpler and stronger in several aspects: The lower bound of Assadi et al. only holds for algorithms that (1) are able to process streams that contain a triple exponential number of deletions in $n$, the number of vertices of the input graph; (2) are able to process multi-graphs; and (3) never output edges that do not exist in the input graph when the randomized algorithm errs. In contrast, our lower bound even holds for algorithms that (1) rely on short ($O(n^2)$-length) input streams; (2) are only able to process simple graphs; and (3) may output non-existing edges when the algorithm errs.

at May 25, 2020 12:00 AM UTC

Authors: Xingwu Liu, Zhida Pan, Yuyi Wang
Download: PDF
Abstract: Motivated by scheduling in Geo-distributed data analysis, we propose a target location problem for multi-commodity flow (LoMuF for short). Given commodities to be sent from their resources, LoMuF aims at locating their targets so that the multi-commodity flow is optimized in some sense. LoMuF is a combination of two fundamental problems, namely, the facility location problem and the network flow problem. We study the hardness and algorithmic issues of the problem in various settings. The findings lie in three aspects. First, a series of NP-hardness and APX-hardness results are obtained, uncovering the inherent difficulty in solving this problem. Second, we propose an approximation algorithm for general undirected networks and an exact algorithm for undirected trees, which naturally induce efficient approximation algorithms on directed networks. Third, we observe separations between directed networks and undirected ones, indicating that imposing direction on edges makes the problem strictly harder. These results show the richness of the problem and pave the way to further studies.

at May 25, 2020 10:27 PM UTC

Authors: Shunsuke Kanda, Koh Takeuchi, Keisuke Fujii, Yasuo Tabei
Download: PDF
Abstract: Massive datasets of spatial trajectories representing the mobility of a diversity of moving objects are ubiquitous in research and industry. Similarity search of a large collection of trajectories is indispensable for turning these datasets into knowledge. Current methods for similarity search of trajectories are inefficient in terms of search time and memory when applied to massive datasets. In this paper, we address this problem by presenting a scalable similarity search for Fr\'echet distance on trajectories, which we call trajectory-indexing succinct trit-array trie (tSTAT). tSTAT achieves time and memory efficiency by leveraging locality sensitive hashing (LSH) for Fr\'echet distance and a trie data structure. We also present two novel techniques of node reduction and a space-efficient representation for tries, which enable us to dramatically enhance the memory efficiency of tries. We experimentally test tSTAT on its ability to retrieve similar trajectories for a query from large collections of trajectories and show that tSTAT performs superiorly with respect to search time and memory efficiency.

at May 25, 2020 10:26 PM UTC

Authors: Robert Andrews
Download: PDF
Abstract: We show that lower bounds for explicit constant-variate polynomials over fields of characteristic $p > 0$ are sufficient to derandomize polynomial identity testing over fields of characteristic $p$. In this setting, existing work on hardness-randomness tradeoffs for polynomial identity testing requires either the characteristic to be sufficiently large or the notion of hardness to be stronger than the standard syntactic notion of hardness used in algebraic complexity. Our results make no restriction on the characteristic of the field and use standard notions of hardness.

We do this by combining the Kabanets-Impagliazzo generator with a white-box procedure to take $p$-th roots of circuits computing a $p$-th power over fields of characteristic $p$. When the number of variables appearing in the circuit is bounded by some constant, this procedure turns out to be efficient, which allows us to bypass difficulties related to factoring circuits in characteristic $p$.

We also combine the Kabanets-Impagliazzo generator with recent "bootstrapping" results in polynomial identity testing to show that a sufficiently-hard family of explicit constant-variate polynomials yields a near-complete derandomization of polynomial identity testing. This result holds over fields of both zero and positive characteristic and complements a recent work of Guo, Kumar, Saptharishi, and Solomon, who obtained a slightly stronger statement over fields of characteristic zero.

at May 25, 2020 10:25 PM UTC

Authors: Yaqiao Li, Vishnu V. Narayan, Denis Pankratov
Download: PDF
Abstract: We introduce a new type of adversary for online graph problems. The new adversary is parameterized by a single integer $\kappa$, which upper bounds the number of connected components that the adversary can use at any time during the presentation of the online graph $G$. We call this adversary "$\kappa$ components bounded", or $\kappa$-CB for short. On one hand, this adversary is restricted compared to the classical adversary because of the $\kappa$-CB constraint. On the other hand, we seek competitive ratios parameterized only by $\kappa$ with no dependence on the input length $n$, thereby giving the new adversary power to use arbitrarily large inputs.

We study online coloring under the $\kappa$-CB adversary. We obtain a finer analysis of the existing algorithms $FirstFit$ and $CBIP$ by computing their competitive ratios on trees and bipartite graphs under the new adversary. Surprisingly, $FirstFit$ outperforms $CBIP$ on trees. When it comes to bipartite graphs, $FirstFit$ is no longer competitive under the new adversary, while $CBIP$ uses at most $2\kappa$ colors. We also study several well-known classes of graphs, such as $3$-colorable, $C_k$-free, $d$-inductive, planar, and bounded treewidth, with respect to online coloring under the $\kappa$-CB adversary. We demonstrate that the extra adversarial power of unbounded input length outweighs the restriction on the number of connected components, leading to the non-existence of competitive algorithms for these classes.
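
For context, FirstFit is the classical greedy rule: each arriving vertex takes the smallest color not used by its previously presented neighbors. A minimal sketch (the stream encoding below is an assumption for illustration, not taken from the paper):

    def first_fit(stream):
        # stream: iterable of (vertex, neighbors among previously presented vertices)
        color = {}
        for v, nbrs in stream:
            used = {color[u] for u in nbrs}
            c = 0
            while c in used:
                c += 1
            color[v] = c
        return color

    # A path a-b-c-d presented in the order a, d, c, b forces three colors
    # on a 2-colorable graph, a standard bad example for FirstFit on trees.
    print(first_fit([("a", []), ("d", []), ("c", ["d"]), ("b", ["a", "c"])]))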

at May 25, 2020 10:28 PM UTC

Authors: Lianna Hambardzumyan, Yaqiao Li
Download: PDF
Abstract: Extending the idea in [Impagliazzo, R., Moore, C. and Russell, A., An entropic proof of Chang's inequality. SIAM Journal on Discrete Mathematics, 28(1), pp.173-176.], we give a short information-theoretic proof of Chang's lemma based on Pinsker's inequality.

at May 25, 2020 10:24 PM UTC

The Vienna Graduate School on Computational Optimization (VGSCO) is a research and training program funded by the Austrian Science Fund (FWF). The VGSCO offers lecture series given by international experts in optimization-related fields, organizes research seminars, retreats, soft skills courses, scientific workshops, and social events, provides travel grants, and supports research stays abroad.

Website: http://vgsco.univie.ac.at/positions
Email: vgsco@univie.ac.at

by shacharlovett at May 24, 2020 08:48 AM UTC

We study the MaxRes rule in the context of certifying unsatisfiability. We show that it can be exponentially more powerful than tree-like resolution, and when augmented with weakening (the system MaxResW), p-simulates tree-like resolution. In devising a lower bound technique specific to MaxRes (and not merely inheriting lower bounds from Res), we define a new semialgebraic proof system called the SubCubeSums proof system. This system, which p-simulates MaxResW, is a special case of the Sherali-Adams proof system. In expressivity, it is the integral restriction of conical juntas studied in the contexts of communication complexity and extension complexity. We show that it is not simulated by Res. Using a proof technique qualitatively different from the lower bounds that MaxResW inherits from Res, we show that Tseitin contradictions on expander graphs are hard to refute in SubCubeSums. We also establish a lower bound technique via lifting: for formulas requiring large degree in SubCubeSums, their XOR-ification requires large size in SubCubeSums.

at May 23, 2020 07:58 PM UTC

The next TCS+ talk will take place this coming Wednesday, May 27th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 17:00 UTC). Rahul Ilango from MIT will speak about “Is it (NP) hard to distinguish order from chaos?” (abstract below).

You can reserve a spot as an individual or a group to join us live by signing up on the online form. Due to security concerns, registration is required to attend the interactive talk. (The link to the YouTube livestream will also be posted on our website on the day of the talk, so people who did not sign up will still be able to watch the talk live.) As usual, for more information about the TCS+ online seminar series and the upcoming talks, or to suggest a possible topic or speaker, please see the website.

Abstract: The Minimum Circuit Size Problem (MCSP) roughly asks what the “complexity” of a given string is. Informally, one can think of this as determining the degree of “computational order” a string has.

In the past several years, there has been a resurgence of interest in MCSP. A series of exciting results have begun unraveling what looks to be a fascinating story. This story already reveals deep connections between MCSP and a growing list of fields, including cryptography, learning theory, structural complexity theory, average-case complexity, and circuit complexity. As an example, Santhanam recently proved a conditional equivalence between the complexity of MCSP and the existence of one-way functions.

This talk is split into two parts. The first part is a broad introduction to MCSP, answering the following questions: What is this problem? Why is it interesting? What do we know so far, and where might the story go next? The second part discusses recent joint work with Bruno Loff and Igor Oliveira showing that the “multi-output version” of MCSP is NP-hard.

by plustcs at May 22, 2020 10:37 PM UTC


How to tell what part of math you are from

Gerolamo Cardano is often credited with introducing the notion of complex numbers. In 1545, he wrote a book titled Ars Magna. He introduced us to numbers like $\sqrt{-5}$ in his quest to understand solutions to equations. Cardano was often short of money and gambled and played a certain board game to make money—see the second paragraph here.

Today, for amusement, Ken and I thought we’d talk about tells.

What are tells? Wikipedia says:

A tell in poker is a change in a player’s behavior or demeanor that is claimed by some to give clues to that player’s assessment of their hand.

Other Tells

Ken and I have been thinking of tells in a wider sense—when and whether one can declare inferences amid uncertain information. Historians face this all the time. So do biographers, at least when their subjects are no longer living. We would also like to make inferences in our current world, such as about the pandemic. The stakes can be higher than in poker. In poker, if your “tell” inference is wrong and you lose, you can play another hand—unless you went all in. With science and other academic areas the attitude must be that you’re all-in all the time.

Cardano furnishes several instances. Wikipedia—which we regard as an equilibrium of opinions—says that Cardano

acknowledged the existence of imaginary numbers … [but] did not understand their properties, [which were] described for the first time by his Italian contemporary Rafael Bombelli.

This is a negative inference from how one of Cardano’s books stops short of treating imaginary numbers as objects that follow rules.

There are also questions about whether Cardano can be considered “the father of probability” ahead of Blaise Pascal and Pierre de Fermat. Part of the problem is that Cardano’s own writings late in life recounted his first erroneous reasonings as well as final understanding in a Hamlet-like fashion. Wikipedia doubts whether he really knew the rule of multiplying probabilities of independent events, whereas the essay by Prakash Gorroochurn cited there convinces us that he did. Similar doubt extends to how much Cardano knew about the natural sciences, as correct inferences (such as mountains with seashell fossils having once been underwater) are mixed in with what we today regard as howlers.

Every staging of Shakespeare’s Hamlet shows a book by Cardano—or does it? In Act II, scene 2, Polonius asks, “What do you read, my lord?”; to which Hamlet first replies “Words words words.” Pressed on the book’s topic, Hamlet perhaps references the section “Misery of Old Age” in Cardano’s 1543 book De Consolatione but what he says is so elliptical it is hard to tell. The book also includes particular allusions between sleep and death that go into Hamlet’s soliloquy opening Act III. The book had been published in England in 1573 as Cardan’s Comfort under the aegis of the Earl of Oxford so it was well-known. Yet the writer Italo Calvino held back from the inference:

To conclude from this that the book read by Hamlet is definitely Cardano, as is held by some scholars of Shakespeare’s sources, is perhaps unjustified.

To be sure, there are some who believe Shakespeare’s main source was Oxford, in manuscripts if not flesh and blood. One reason we do not go there is that we do not see the wider community as having been able to establish reliable principles for judging what kinds of inferences are probably valid. We wonder if one could do an experiment of taking resolved cases, removing most of the information to take them down to the level of unresolved cases, and seeing what kinds of inferences from partial information would have worked. That’s not our expertise, but within our expertise in math and CS, we wonder if a little experiment will be helpful.

To set the idea, note that imaginary numbers are also called complex numbers. Yet the term complex numbers can mean other things. Besides numbers like $2 + 3i$, it can also mean how hard it is to construct a number.

In number theory, the integer complexity of an integer is the smallest number of ones that can be used to represent it using ones and any number of additions, multiplications, and parentheses. It is always within a constant factor of the logarithm of the given integer.
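
As a concrete illustration of this notion, a short dynamic program computes it for small integers; the function name integer_complexity is ours, and this is a standard textbook-style computation rather than anything from the post.

    def integer_complexity(N):
        # f[n] = fewest 1s needed to write n using +, * and parentheses
        f = [0] * (N + 1)
        if N >= 1:
            f[1] = 1
        for n in range(2, N + 1):
            best = n                                  # writing n as 1 + 1 + ... + 1
            for a in range(1, n // 2 + 1):            # additive splits n = a + (n - a)
                best = min(best, f[a] + f[n - a])
            d = 2
            while d * d <= n:                         # multiplicative splits n = d * (n / d)
                if n % d == 0:
                    best = min(best, f[d] + f[n // d])
                d += 1
            f[n] = best
        return f

    f = integer_complexity(12)
    print(f[6])   # 5, since 6 = (1+1)*(1+1+1)
    print(f[11])  # 8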

How easy is it to tell what kind of “complex” is meant if you only have partial information? We don’t only mean scope-of-terminology issues; often a well-defined math object is used in multiple areas. Let’s try an experiment.

Math Tells

Suppose you walk in (or log in) to a talk without any idea of the topic. If the speaker uses one of these terms, can you tell what her talk might be about? Several have multiple meanings. What are some of them? A passing score is $\dots$

  1. She says let $S$ be a c.e. set.

  2. She says let $k$ be in $\omega$.

  3. She says by the König principle.

  4. She says $S$ is a prime.

  5. She says $n$ is a prime.

  6. She says $G$ is solvable.

  7. She says let its degree be $0$.

  8. She says there is a run.

  9. She says it is reducible.

  10. She says it is satisfiable.

Open Problems

What are your answers? Do you have some tells of your own?

by rjlipton at May 22, 2020 04:12 PM UTC
