J Scott Provan (site)
The following post was kindly contributed by Karim Adiprasito. (Here is the link to Karim’s paper.)
So, Gil asked me to say a little bit about my proof of the g-conjecture (and some other conjectures in discrete geometry) on his blog, and since he bought me many coffees to explain it to him (or if he is to be believed, the department paid), I am happy to oblige.
So, I want to explain a special, but critical, case of the proof. It contains some shadow of the core ideas necessary, but needs a few more tricks, which I will remark on afterwards.
Also, I wanted to take this opportunity to mention something marvelous that I learned from Leonid Gurvits recently that greatly increased my own understanding of one of the key tricks used. That trick is the following cool lemma.
Leonid Gurvits
PERTURBATION LEMMA: Consider two linear maps $A, B \colon U \to V$
of two real vector spaces $U$ and $V$. Assume that
$B(\ker A) \cap \operatorname{im} A = 0.$
Then a generic linear combination of $A$ and $B$ has kernel $\ker A \cap \ker B$.
Cool, no? Proof, then: Find a subspace $W$ of $\ker A$ such that
$\ker A = (\ker A \cap \ker B) \oplus W,$
so that in particular $B$ is injective on $W$. Then, for $\varepsilon$ small enough, the image of
$A + \varepsilon B$
contains $B(W) = B(\ker A)$ together with the image of a complement of $\ker A$. But if we norm $V$ in any way, then the latter image approximates $\operatorname{im} A$ as $\varepsilon$ tends to zero, which is linearly independent from $B(\ker A)$ by assumption. So the rank of $A + \varepsilon B$ is at least $\dim U - \dim(\ker A \cap \ker B)$, and its kernel is exactly $\ker A \cap \ker B$. WALLA
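To see the lemma in action, here is a tiny numerical sanity check. The example and all names are mine, not from the post; the hypothesis is taken in the form that the second map sends the kernel of the first into a complement of its image.

```python
import numpy as np

# Toy check of the perturbation lemma (example and names mine).
# Maps U = R^3 -> V = R^3:
#   ker A = span(e1, e2),  im A = span(f1),  ker B = span(e1),
# hence ker A ∩ ker B = span(e1), and B(ker A) = span(f2) meets im A
# only in 0, so the lemma's hypothesis holds.
A = np.array([[0., 0., 1.],
              [0., 0., 0.],
              [0., 0., 0.]])
B = np.diag([0., 1., 1.])

def kernel_dim(M, tol=1e-9):
    """Dimension of the null space of M, via its singular values."""
    s = np.linalg.svd(M, compute_uv=False)
    return M.shape[1] - int(np.sum(s > tol))

rng = np.random.default_rng(0)
for _ in range(5):
    t = rng.normal()                    # a "generic" coefficient
    assert kernel_dim(A + t * B) == 1   # = dim(ker A ∩ ker B)
```

Any nonzero coefficient works in this example; genericity only needs to avoid a proper algebraic subset of coefficients.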
Now, how is this used?
Let me set up some of the basic objects.
If $\Delta$ is an abstract simplicial complex on ground-set $[n]$, let $I_\Delta$ denote the nonface ideal in $\mathbb{R}[x_1,\dots,x_n]$, generated by the monomials $\mathbf{x}^S = \prod_{i \in S} x_i$ over the non-faces $S$ of $\Delta$.
Let $\mathbb{R}[\Delta] = \mathbb{R}[x_1,\dots,x_n]/I_\Delta$ denote the face ring of $\Delta$. A collection $\Theta = (\theta_1,\dots,\theta_k)$ of linear forms in the polynomial ring is a partial linear system of parameters if
$\dim \mathbb{R}[\Delta]/\Theta = \dim \mathbb{R}[\Delta] - k$
for $\dim$ the Krull dimension. If $k = \dim \mathbb{R}[\Delta]$, then $\Theta$ is simply a linear system of parameters, and the corresponding quotient $A(\Delta) = \mathbb{R}[\Delta]/\Theta$ is called an Artinian reduction of $\mathbb{R}[\Delta]$.
The g-conjecture (as described earlier in Gil’s blog) is implied by the following property:
(HL) For every sphere $\Sigma$ of even dimension $2d$, there is an Artinian reduction $A(\Sigma)$ and a degree one element $\ell \in A^1(\Sigma)$ such that the map
$\cdot\,\ell \colon A^{d}(\Sigma) \longrightarrow A^{d+1}(\Sigma)$
is an isomorphism.
This is quite a reasonable demand. Indeed, Graebe proved that $A^{2d+1}(\Sigma) \cong \mathbb{R}$ and that the resulting pairing
$A^{d}(\Sigma) \times A^{d+1}(\Sigma) \longrightarrow A^{2d+1}(\Sigma) \cong \mathbb{R}$
is perfect, so $A^{d}(\Sigma)$ and $A^{d+1}(\Sigma)$ are isomorphic as vector spaces. We shall call this property (PD), because it is a special case of Poincaré pairing.
(HL) is a special case of the Hard Lefschetz Theorem I prove in my paper, and we will prove it for a subset of all triangulated spheres here. Proving it for all spheres implies the $g$-conjecture (and other conjectures, such as the Grünbaum conjecture), and proving the hard Lefschetz theorem in full generality is not much harder.
Lou Billera
Let’s recall a cool notion due to Provan and Billera: A pure simplicial $d$-complex is vertex decomposable if it is a simplex, or there exists a vertex whose link is vertex decomposable of dimension $d-1$ and whose deletion is vertex decomposable of dimension $d$.
We restrict our attention to vertex decomposable spheres and disks, and assume in addition that at every step the boundary of the link is vertex decomposable as well.
THEOREM: Vertex decomposable spheres satisfy (HL).
We prove this theorem by induction on dimension, the base case of zero-dimensional spheres being clear.
Let’s label the vertices of $\Sigma$ in order of their vertex decomposition, from $1$ to $n$. Now, $\ell$ will be a linear combination of the indeterminates $x_1,\dots,x_n$, so let’s assume we have constructed an element $\ell_k$ that uses just the first $k$ of them, and such that $\ell_k$ itself is as close to a Lefschetz element as possible for its kind, that is, the kernel of
$\cdot\,\ell_k \colon A^{d}(\Sigma) \longrightarrow A^{d+1}(\Sigma)$
is the intersection of kernels of the maps
$\cdot\,x_i \colon A^{d}(\Sigma) \longrightarrow A^{d+1}(\Sigma),$
where $i$ ranges from $1$ to $k$.
We want to construct a map $\cdot\,\ell_{k+1}$ with this property (which I call the transversal prime property). To this effect, we want to apply the perturbation lemma to the maps $\cdot\,\ell_k$ and $\cdot\,x_{k+1}$ on $A^{d}(\Sigma)$. Let us denote by $B_k$ the ball given as the union of neighborhoods of the first $k$ vertices.
For this, we have to find out the kernel of $\cdot\,\ell_k$. But this is the ideal in $A(\Sigma)$ generated by the monomials which are not divisible by any of the first $k$ indeterminates; let’s call it $I_k$, and its restriction to the ball above $I_k'$. Let’s also look at the image of $\cdot\,\ell_k$, which by Graebe’s theorem is exactly the span of the images of the maps
$\cdot\,x_i \colon A^{d}(\Sigma) \longrightarrow A^{d+1}(\Sigma),$
where $i$ ranges from $1$ to $k$.
But then, the kernel of $\cdot\,\ell_k$ is trivial in degree $d$ if and only if its image is everything in degree $d+1$. Why is that? Because with respect to the Poincaré pairing, the kernel (in degree $d$) is the orthogonal complement of the image (in degree $d+1$).
The relevant ring is obtained by taking the face ring of the boundary of that ball, seen as a quotient of $\mathbb{R}[\Sigma]$, and modding out by the ideal generated by the linear system of parameters. But that system is longer than the dimension of the boundary sphere allows. We can remove the vertex for the price of removing one of the linear forms, but then we have the same issue: a sphere of one dimension lower together with a system that is still too long. Still, one too many! Taking a subsystem of the right length, we obtain an Artinian reduction for the boundary sphere, but what happens to the additional linear form not in the subsystem? It has to act as a Lefschetz element on that Artinian reduction if we want the kernel in question
to be trivial in degree $d$. But we may assume so by induction! Hence, by the perturbation lemma, we can choose the next element $\ell_{k+1}$ as a generic combination of the current element $\ell_k$ and the variable $x_{k+1}$ of the new vertex.
So, ultimately, we can construct a map $\cdot\,\ell_n$ with the transversal prime property. But then its kernel is the intersection of the kernels of
$\cdot\,x_i \colon A^{d}(\Sigma) \longrightarrow A^{d+1}(\Sigma),$
where $i$ ranges from $1$ to $n$. But that is $0$: an element of $A^{d}(\Sigma)$ annihilated by every $x_i$ pairs to zero with all of $A^{d+1}(\Sigma)$, and the pairing is perfect. SABABA.
Now, we have the Lefschetz theorem for a special class, but that is less than what we want in the end, since vertex decomposable spheres are few and far between (do you see a reason why? there are many). So, what do we do? For a long time, I tried to extend the perturbation lemma to combine more than two maps.
Recently (depending on when Gil puts this post on the blog), I met Leonid Gurvits for the first time, at a marvelous little conference at the Simons Institute. I knew that the problem is related to Hall’s Marriage Theorem for operators (I explain this connection a bit further in my paper), but Leonid illuminated it further by pointing me towards several nice papers, starting with his work on Quantum Matching Theory. Indeed, finding a good extension to three and more maps would essentially mean that we could also find Hall Marriage Type Theorems for 3-regular hypergraphs, which we know for complexity reasons to be unlikely.
So what can we do instead? Well, it turns out that I only really needed to look at a skeleton of $\Sigma$ above, and there is no need for it to be vertex decomposable. It is enough to find another nicely decomposable manifold that contains that skeleton of $\Sigma$, and then use some technical topological tricks to connect the local picture to global homological properties.
Finding a set of nearly independent objects
Giuseppe Vitali was the mathematician who famously used the Axiom of Choice, in 1905, to give the first example of a non-measurable subset of the real numbers.
Today I want to discuss another of his results that is a powerful tool.
The existence of a set that cannot properly be assigned a measure was a surprise at the time, and still is a surprise. It is a wonderful example of the power of the Axiom of Choice. See this for details.
We are interested in another of his results that is more a theorem about coverings. It is the Vitali covering theorem–see this. The theorem shows that a certain type of covering—ah, we will explain the theorem in a moment.
The power of this theorem is that it can be used to construct various objects in analysis. There are now many applications of this theorem. It is a powerful tool that can be used to prove many nice results. I do not know of any—many?—applications of the existence of a non-measurable set. Do you know any?
Let’s look at an application of the Vitali theorem that may be new. But in any case it may help explain what the Vitali theorem is all about.
Suppose that $f \colon X \to Y$. We can make the map surjective if we restrict $Y$ to be equal to $f(X)$. It is not so simple to make the map injective, but we can in general do that also.
Theorem 1 Let $f$ be a surjective function from $X$ to $Y$. Then there is a subset $S$ of $X$ so that $f$ is injective from $S$ onto $Y$.
Proof: For each $y$ in $Y$ select one $x$ from the set $f^{-1}(y)$ and place it into $S$. Recall $f^{-1}(y)$ is the set of $x$ so that $f(x) = y$. This of course uses the Axiom of Choice to make the choices of which $x$ to choose. Then clearly $S$ is the required set.
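For a finite set, the proof above needs no Axiom of Choice: iteration order does the choosing. A minimal sketch (all names are mine, for illustration only):

```python
# Finite illustration of Theorem 1 (names mine): choose one preimage per
# value of f, so that f restricted to the returned set S is injective
# and f(S) = f(X).
def injective_section(f, X):
    reps = {}
    for x in X:
        reps.setdefault(f(x), x)   # keep the first preimage seen for each value
    return sorted(reps.values())

# x -> x^2 is two-to-one away from 0; the section keeps one point per value.
S = injective_section(lambda x: x * x, [-2, -1, 0, 1, 2])
print(S)   # [-2, -1, 0]
```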
The difficulty with this trivial theorem is that $S$ cannot be controlled easily if it is constructed via the Axiom of Choice. It could be a very complicated set. Our goal is to see how well we can control $S$ if we assume that the mapping $f$ is smooth.
How can we do better? The answer is quite a bit better if we assume that $f$ is a “nice” function. We give up surjectivity onto $Y$ but only by a null set.
Theorem 2 Suppose that $f$ is a surjective smooth map from $X$ to $Y$ where $X$ and $Y$ are open subsets of $\mathbb{R}$. Also suppose that locally $f$ is invertible. Then there is a subset $S$ of $X$ so that
- The complement of $f(S)$ in $Y$ is a null set.
- The map $f$ is injective from $S$ to $Y$.
That is, for all distinct points $x$ and $x'$ in $S$, $f(x) \neq f(x')$. Moreover the map $f$ from $S$ to $f(S)$ is smooth.
How can we prove this theorem? An obvious idea is to do the following. Pick an open interval $I$ in $Y$ so that $I = f(J)$ for an open set $J$ in $X$ and so that $f$ is injective from $J$ to $I$. Setting $S$ to $J$ clearly works: the map is injective on $J$. This is far from the large set that we wish to have, but it is a start. The intuition is to select another open interval $I'$ that is disjoint from $I$ so that again $f$ is injective from some open set $J'$ to $I'$. We can then add $J'$ to our $S$.
We can continue in this way and collect many open sets that we add to $S$. Can we arrange that the union of these sets yields an $S$ so that $f(S)$ is most of $Y$? In general the answer is no. Suppose that the intervals are the following:
$I_k = [k/2,\ k/2+1]$ for integers $k$.
Roughly we can only get about half of the space that the intervals cover and keep the chosen intervals disjoint. If we select $I_0 = [0,1]$ then we cannot select $I_1 = [1/2, 3/2]$ since they overlap.
Vitali’s theorem comes to the rescue. It allows us to avoid this problem, by insisting that the intervals have an additional property.
The trick is to use a refinement of a set cover that allows a disjoint cover to exist for almost all of the target set. The next definition is critical to the Vitali covering theorem.
Definition 3 Let $E$ be a subset of $\mathbb{R}$. Let $I_\alpha$ be intervals, for $\alpha$ in some index set $A$. We say these intervals are a cover of $E$ provided $E$ is a subset of the union of all the intervals. Say the intervals also are a Vitali cover of $E$ provided for all points $x$ in $E$ and all $\varepsilon > 0$, there is an interval $I_\alpha$ that contains $x$ and has length less than $\varepsilon$.
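To make the definition concrete, here is a small sketch, entirely my own toy construction: dyadic intervals at all scales form a Vitali cover of the unit interval, and the check below tests the defining property at a single point (the family is truncated at scale 1/2048, which caps how small the length bound may go).

```python
from fractions import Fraction

def has_small_interval(x, intervals, eps):
    """The defining check of a Vitali cover at one point x:
    some interval of length less than eps contains x."""
    return any(a <= x <= b and b - a < eps for a, b in intervals)

# Dyadic intervals [j/2^m, (j+1)/2^m] for m = 0, ..., 11.  The full family,
# over all m, is a Vitali cover of [0, 1]; truncating at m = 11 caps eps.
family = [(Fraction(j, 2**m), Fraction(j + 1, 2**m))
          for m in range(12) for j in range(2**m)]

assert has_small_interval(Fraction(1, 3), family, Fraction(1, 1000))
assert not has_small_interval(Fraction(1, 3), family, Fraction(1, 4096))
```

The second assertion fails only because of the truncation: the genuine (infinite) dyadic family contains arbitrarily short intervals around every point.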
The Vitali theorem is the following:
Theorem 4 Let $E$ be a subset of $\mathbb{R}$. Let $I_\alpha$ be intervals for $\alpha$ in some index set $A$. Assume that the family is a Vitali cover of $E$. Then there is a countable subfamily of disjoint intervals in the family so that they cover all of $E$ except for possibly a null set.
The Vitali theorem can be extended to any finite dimensional space $\mathbb{R}^n$. Then intervals become disks, and so on.
Do you see how to prove Theorem 2 from Vitali’s theorem? The insight is that one can now set up a Vitali covering of the space $Y$.
Authors: Holden Lee, Oren Mangoubi, Nisheeth K. Vishnoi
Download: PDF
Abstract: Given a sequence of convex functions $f_0, f_1, \ldots, f_T$, we study the
problem of sampling from the Gibbs distribution $\pi_t \propto e^{-\sum_{k=0}^t
f_k}$ for each epoch $t$ in an online manner. This problem occurs in
applications to machine learning, Bayesian statistics, and optimization where
one constantly acquires new data, and must continuously update the
distribution. Our main result is an algorithm that generates independent
samples from a distribution that is a fixed $\varepsilon$ TV-distance from
$\pi_t$ for every $t$ and, under mild assumptions on the functions, makes
poly$\log(T)$ gradient evaluations per epoch. All previous results for this
problem imply a bound on the number of gradient or function evaluations which
is at least linear in $T$. While we assume the functions have bounded second
moment, we do not assume strong convexity. In particular, we show that our
assumptions hold for online Bayesian logistic regression, when the data satisfy
natural regularity properties. In simulations, our algorithm achieves accuracy
comparable to that of a Markov chain specialized to logistic regression. Our
main result also implies the first algorithm to sample from a $d$-dimensional
log-concave distribution $\pi_T \propto e^{-\sum_{k=0}^T f_k}$ where the
$f_k$'s are not assumed to be strongly convex and the total number of gradient
evaluations is roughly $T\log(T)+\mathrm{poly}(d),$ as opposed to $T\cdot
\mathrm{poly}(d)$ implied by prior works. Key to our algorithm is a novel
stochastic gradient Langevin dynamics Markov chain that has a carefully
designed variance reduction step built-in with fixed constant batch size.
Technically, lack of strong convexity is a significant barrier to the analysis,
and, here, our main contribution is a martingale exit time argument showing the
chain is constrained to a ball of radius roughly poly$\log(T)$ for the duration
of the algorithm.
Authors: Alexander Cowtan, Silas Dilkes, Ross Duncan, Alexandre Krajenbrink, Will Simmons, Seyon Sivarajah
Download: PDF
Abstract: We introduce a new architecture-agnostic methodology for mapping abstract
quantum circuits to realistic quantum computing devices with restricted qubit
connectivity, as implemented by Cambridge Quantum Computing's tket compiler. We
present empirical results showing the effectiveness of this method in terms of
reducing two-qubit gate depth and two-qubit gate count, compared to other
implementations.
Authors: Talya Eden, Dana Ron, Will Rosenbaum
Download: PDF
Abstract: In this paper, we revisit the problem of sampling edges in an unknown graph
$G = (V, E)$ from a distribution that is (pointwise) almost uniform over $E$.
We consider the case where there is some a priori upper bound on the arboricity
We consider the case where there is some a priori upper bound on the arboriciy
of $G$. Given query access to a graph $G$ over $n$ vertices and of average
degree $d$ and arboricity at most $\alpha$, we design an algorithm that
performs $O\!\left(\frac{\alpha}{d} \cdot \frac{\log^3 n}{\varepsilon}\right)$
queries in expectation and returns an edge in the graph such that every edge $e
\in E$ is sampled with probability $(1 \pm \varepsilon)/m$. The algorithm
performs two types of queries: degree queries and neighbor queries. We show
that the upper bound is tight (up to poly-logarithmic factors and the
dependence in $\varepsilon$), as $\Omega\!\left(\frac{\alpha}{d} \right)$
queries are necessary for the easier task of sampling edges from any
distribution over $E$ that is close to uniform in total variational distance.
We also prove that even if $G$ is a tree (i.e., $\alpha = 1$ so that
$\frac{\alpha}{d}=\Theta(1)$), $\Omega\left(\frac{\log n}{\log\log n}\right)$
queries are necessary to sample an edge from any distribution that is pointwise
close to uniform, thus establishing that a $\mathrm{poly}(\log n)$ factor is
necessary for constant $\alpha$. Finally we show how our algorithm can be
applied to obtain a new result on approximately counting subgraphs, based on
the recent work of Assadi, Kapralov, and Khanna (ITCS, 2019).
Authors: Avery Miller, Boaz Patt-Shamir, Will Rosenbaum
Download: PDF
Abstract: We consider the Adversarial Queuing Theory (AQT) model, where packet arrivals
are subject to a maximum average rate $0\le\rho\le1$ and burstiness
$\sigma\ge0$. In this model, we analyze the size of buffers required to avoid
overflows in the basic case of a path. Our main results characterize the space
required by the average rate and the number of distinct destinations: we show
that $O(k d^{1/k})$ space suffices, where $d$ is the number of distinct
destinations and $k=\lfloor 1/\rho \rfloor$; and we show that $\Omega(\frac 1 k
d^{1/k})$ space is necessary. For directed trees, we describe an algorithm
whose buffer space requirement is at most $1 + d' + \sigma$ where $d'$ is the
maximum number of destinations on any root-leaf path.
Authors: Kevin Buchin, Anne Driemel, Martijn Struijs
Download: PDF
Abstract: We study the complexity of clustering curves under $k$-median and $k$-center
objectives in the metric space of the Fr\'echet distance and related distance
measures. The $k$-center problem has recently been shown to be NP-hard, even in
the case where $k=1$, i.e. the minimum enclosing ball under the Fr\'echet
distance. We extend these results by showing that also the $k$-median problem
is NP-hard for $k=1$. Furthermore, we show that the $1$-median problem is
W[1]-hard with the number of curves as parameter. We show this under the
discrete and continuous Fr\'echet and Dynamic Time Warping (DTW) distance. Our
result generalizes an earlier result by Bulteau et al. from 2018 for a variant
of DTW that uses squared distances. Moreover, closing some gaps in the
literature, we show positive results for a variant where the center curve may
have complexity at most $\ell$ under the discrete Fr\'echet distance. In
particular, for fixed $k,\ell$ and $\varepsilon$, we give
$(1+\varepsilon)$-approximation algorithms for the $(k,\ell)$-median and
$(k,\ell)$-center objectives and a polynomial-time exact algorithm for the
$(k,\ell)$-center objective.
Authors: Johannes Bund, Christoph Lenzen, Will Rosenbaum
Download: PDF
Abstract: Synchronizing clocks in distributed systems is well-understood, both in terms
of fault-tolerance in fully connected systems and the dependence of local and
global worst-case skews (i.e., maximum clock difference between neighbors and
arbitrary pairs of nodes, respectively) on the diameter of fault-free systems.
However, so far nothing non-trivial is known about the local skew that can be
achieved in topologies that are not fully connected even under a single
Byzantine fault. Put simply, in this work we show that the most powerful known
techniques for fault-tolerant and gradient clock synchronization are
compatible, in the sense that the best of both worlds can be achieved
simultaneously.
Concretely, we combine the Lynch-Welch algorithm [Welch1988] for synchronizing a clique of $n$ nodes despite up to $f<n/3$ Byzantine faults with the gradient clock synchronization (GCS) algorithm by Lenzen et al. [Lenzen2010] in order to render the latter resilient to faults. As this is not possible on general graphs, we augment an input graph $\mathcal{G}$ by replacing each node by $3f+1$ fully connected copies, which execute an instance of the Lynch-Welch algorithm. We then interpret these clusters as supernodes executing the GCS algorithm, where for each cluster its correct nodes' Lynch-Welch clocks provide estimates of the logical clock of the supernode in the GCS algorithm. By connecting clusters corresponding to neighbors in $\mathcal{G}$ in a fully bipartite manner, supernodes can inform each other about (estimates of) their logical clock values. This way, we achieve asymptotically optimal local skew, granted that no cluster contains more than $f$ faulty nodes, at factor $O(f)$ and $O(f^2)$ overheads in terms of nodes and edges, respectively. Note that tolerating $f$ faulty neighbors trivially requires degree larger than $f$, so this is asymptotically optimal as well.
Authors: John Iacono, Varunkumar Jayapaul, Ben Karsin
Download: PDF
Abstract: The performance of modern computation is characterized by locality of
reference, that is, it is cheaper to access data that has been accessed
recently than a random piece of data. This is due to many architectural
features including caches, lookahead, address translation and the physical
properties of a hard disk drive; attempting to model all the components that
constitute the performance of a modern machine is impossible, especially for
general algorithm design purposes. What if one could prove an algorithm is
asymptotically optimal on all systems that reward locality of reference, no
matter how it manifests itself within reasonable limits? We show that this is
possible, and that algorithms that are asymptotically optimal in the
cache-oblivious model are asymptotically optimal in any reasonable
locality-of-reference rewarding setting. This is surprising as the
cache-oblivious model envisions a particular architectural model involving
blocked memory transfer into a multi-level hierarchy of caches of varying
sizes, and was not designed to directly model locality-of-reference correlated
performance.
Authors: Anupam Gupta, Haotian Jiang, Ziv Scully, Sahil Singla
Download: PDF
Abstract: Suppose there are $n$ Markov chains and we need to pay a per-step
\emph{price} to advance them. The "destination" states of the Markov chains
contain rewards; however, we can only get rewards for a subset of them that
satisfy a combinatorial constraint, e.g., at most $k$ of them, or they are
acyclic in an underlying graph. What strategy should we choose to advance the
Markov chains if our goal is to maximize the total reward \emph{minus} the
total price that we pay?
In this paper we introduce a Markovian price of information model to capture settings such as the above, where the input parameters of a combinatorial optimization problem are given via Markov chains. We design optimal/approximation algorithms that jointly optimize the value of the combinatorial problem and the total paid price. We also study \emph{robustness} of our algorithms to the distribution parameters and how to handle the \emph{commitment} constraint.
Our work brings together two classical lines of investigation: getting optimal strategies for Markovian multi-armed bandits, and getting exact and approximation algorithms for discrete optimization problems using combinatorial as well as linear-programming relaxation ideas.
Authors: Lingxiao Huang, Nisheeth K. Vishnoi
Download: PDF
Abstract: Fair classification has been a topic of intense study in machine learning,
and several algorithms have been proposed towards this important task. However,
in a recent study, Friedler et al. observed that fair classification algorithms
may not be stable with respect to variations in the training dataset -- a
crucial consideration in several real-world applications. Motivated by their
work, we study the problem of designing classification algorithms that are both
fair and stable. We propose an extended framework based on fair classification
algorithms that are formulated as optimization problems, by introducing a
stability-focused regularization term. Theoretically, we prove a stability
guarantee, that was lacking in fair classification algorithms, and also provide
an accuracy guarantee for our extended framework. Our accuracy guarantee can be
used to inform the selection of the regularization parameter in our framework.
To the best of our knowledge, this is the first work that combines stability
and fairness in automated decision-making tasks. We assess the benefits of our
approach empirically by extending several fair classification algorithms that
are shown to achieve the best balance between fairness and accuracy over the
Adult dataset. Our empirical results show that our framework indeed improves
the stability at only a slight sacrifice in accuracy.
Authors: Therese Biedl
Download: PDF
Abstract: It is well-known that every $n$-vertex planar graph with minimum degree 3 has
a matching of size at least $\frac{n}{3}$. But all proofs of this use the
Tutte-Berge-formula for the size of a maximum matching. Hence these proofs are
not directly algorithmic, and to find such a matching one must apply a
general-purpose maximum matching algorithm, which has run-time
$O(n^{1.5}\alpha(n))$ for planar graphs. In contrast to this, this paper gives
a linear-time algorithm that finds a matching of size at least $\frac{n}{3}$ in
any planar graph with minimum degree 3.
Authors: Ashish Dwivedi, Rajat Mittal, Nitin Saxena
Download: PDF
Abstract: Finding an irreducible factor, of a polynomial $f(x)$ modulo a prime $p$, is
not known to be in deterministic polynomial time. Though there is such a
classical algorithm that {\em counts} the number of irreducible factors of
$f\bmod p$. We can ask the same question modulo prime-powers $p^k$. The
irreducible factors of $f\bmod p^k$ blow up exponentially in number; making it
hard to describe them. Can we count those irreducible factors $\bmod~p^k$ that
remain irreducible mod $p$? These are called {\em basic-irreducible}. A simple
example is in $f=x^2+px \bmod p^2$; it has $p$ many basic-irreducible factors.
Also note that, $x^2+p \bmod p^2$ is irreducible but not basic-irreducible!
We give an algorithm to count the number of basic-irreducible factors of $f\bmod p^k$ in deterministic poly(deg$(f),k\log p$)-time. This solves the open questions posed in (Cheng et al, ANTS'18 \& Kopp et al, Math.Comp.'19). In particular, we are counting roots $\bmod\ p^k$; which gives the first deterministic poly-time algorithm to compute Igusa zeta function of $f$. Also, our algorithm efficiently partitions the set of all basic-irreducible factors (possibly exponential) into merely deg$(f)$-many disjoint sets, using a compact tree data structure and {\em split} ideals.
Authors: P. A. M. Casares, M. A. Martin-Delgado
Download: PDF
Abstract: We introduce a new quantum optimization algorithm for Linear Programming (LP)
problems based on Interior Point (IP) Predictor-Corrector (PC) methods whose
(worst case) time complexity is $O(\sqrt{n}Ls^3 k \epsilon^{-1}\epsilon_s^{-1})
$. This represents a quantum speed-up in the number $n$ of variables in the
cost function with respect to the comparable classical Interior Point (IP)
algorithms that behave as $O((n+m)\sqrt{nk}L
s^3\log(\epsilon^{-1})\epsilon_s^{-1})$ or $O(\sqrt{n}(n+m)L)$ depending on the
technique employed, where $m$ is the number of constraints and the rest of the
variables are defined in the introduction. The average time complexity of our
algorithm is $O(\sqrt{n}s^3 k \epsilon^{-1}\epsilon_s^{-1})$, which equals the
behaviour on $n$ of quantum Semidefinite Programming (SDP) algorithms based on
multiplicative weight methods when restricted to LP problems and heavily
improves on the precision $\epsilon^{-1}$ of the algorithm. Unlike the quantum
SDP algorithm, the quantum PC algorithm does not depend on size parameters of
the primal and dual LP problems ($R,r$), and outputs a feasible and optimal
solution whenever it exists.
In the 1990s I published a series of papers on data structures for closest pairs. As long as you already know how to maintain dynamic sets of objects of some type, and answer nearest-neighbor queries among them, you can also keep track of the closest pair, and this can be used as a subroutine in many other computational geometry algorithms. But it turns out that many of those algorithms can now be simplified and sped up by using mutual nearest neighbors (pairs of objects that are each other’s nearest neighbors) instead of closest pairs.
My original motivation for studying these types of problems was to maintain minimum spanning trees of dynamic point sets, using closest red-blue pairs of Euclidean points,^{1} ^{2} and I later found more applications in hierarchical clustering, greedy matching, traveling salesperson heuristics,^{3} ^{4} and (with Jeff Erickson) motorcycle graphs and straight skeletons.^{5} But to use these closest pair data structures, you have to pay two logarithmic factors in time complexity over the time for the underlying nearest-neighbor data structure. So they’re not competitive with (uncolored) Euclidean closest pair data structures, which take only logarithmic time in any fixed dimension. Instead they make more sense to use with other distances than Euclidean, with objects more complicated than single points, or with variations like the red-blue closest pair for which the logarithmic-time solution doesn’t work.
For several variations of hierarchical clustering, an alternative and simpler technique has been known for quite a bit longer, based on finding mutual nearest neighbors (pairs of objects that are nearer to each other than to anything else) rather than closest pairs.^{6} ^{7} It’s called the nearest neighbor chain algorithm, but really it’s a data structure rather than an algorithm, one that allows you to maintain a dynamic point set and find pairs of mutual nearest neighbors, again based on calls to an underlying nearest neighbor data structure. The idea is to maintain a stack of shorter and shorter pairs of nearest neighbors, until the two objects whose distance is on the top of the stack have nothing nearer – they are mutual nearest neighbors. Whenever you want a pair of neighbors, you look at the top pair, an object A and its nearest neighbor B, and ask whether B’s nearest neighbor is A. If so, you have found a mutual nearest neighbor pair, and if not you have a new shorter distance to push onto the stack.
One can use this in a hierarchical clustering algorithm that repeatedly finds and merges the nearest two clusters, whenever the distance between clusters has a special property: a merged cluster is never closer to other clusters than the closer of the two clusters that were merged. This property implies both that the stack of distances remains valid after the merge, and that mutual nearest neighbors are always safe to merge. If two clusters are mutual nearest neighbors, then the closest-pair clustering algorithm will eventually merge them, because none of its actions can cause them to stop being mutual nearest neighbors. So we might as well merge them immediately once we discover them to be mutual nearest neighbors. (One way to formulate this mathematically is that the set of mutual nearest neighbor pairs merged by the clustering algorithm forms an antimatroid.) When this works, you get a clustering algorithm that uses a linear number of nearest neighbor queries, instead of the superlinear number of queries that you would get using my closest-pair data structures.
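To make the stack-and-merge idea concrete, here is a minimal sketch of the nearest neighbor chain procedure, using single linkage on distinct one-dimensional points (a cluster distance with the merge property described above). All names are mine, and brute-force nearest neighbor search stands in for a real nearest neighbor data structure:

```python
def single_linkage_dist(c1, c2):
    """Distance between clusters = distance of their closest pair of points."""
    return min(abs(p - q) for p in c1 for q in c2)

def nn_chain_clustering(points, k):
    """Merge mutual nearest neighbor clusters until only k clusters remain."""
    clusters = [(p,) for p in points]   # assumes the points are distinct
    stack = []                          # chain of closer and closer clusters
    while len(clusters) > k:
        if not stack:
            stack.append(clusters[0])   # start a chain anywhere
        top = stack[-1]
        # nearest neighbor of the cluster on top of the stack
        nbr = min((c for c in clusters if c != top),
                  key=lambda c: single_linkage_dist(top, c))
        if len(stack) >= 2 and nbr == stack[-2]:
            # top and nbr are mutual nearest neighbors: merge immediately
            stack.pop(); stack.pop()
            clusters = [c for c in clusters if c not in (top, nbr)]
            clusters.append(tuple(sorted(top + nbr)))
        else:
            stack.append(nbr)           # the chain gets closer
    return sorted(clusters)
```

Because single linkage is reducible, the merged cluster cannot invalidate the distances recorded lower in the stack, which is exactly the property described above.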
In more recent research with UCI student Nil Mamano (finishing his doctorate this year; hire him for a postdoc, he’s good!) we noticed that the nearest neighbor chain algorithm can also be applied to certain stable marriage problems with preferences coming from geometric distances.^{8} Our latest preprint, “Euclidean TSP, Motorcycle Graphs, and Other New Applications of Nearest-Neighbor Chains” (with Efrat, Frishberg, Goodrich, Kobourov, Mamano, Matias, and Polishchuk, arXiv:1902.06875) extends this to a much broader set of applications. As well as simplifying and speeding up my previous work on motorcycle graphs and TSP heuristics, we also use nearest neighbor chains in a bigger class of stable matching problems and in an approximate geometric set cover problem. In each case, we need to show either that the problem has an antimatroid-like property (so using mutual nearest neighbors produces the same solution as closest pairs) or that, even when it varies from that solution, it achieves the same quality. It’s not quite true that anything closest pairs can do, mutual nearest neighbors can do better, but it’s close.
Another idea in the paper is that to find (exact!) mutual nearest neighbor pairs one can sometimes get away with using approximate near neighbor structures. This is important if you’re using Euclidean distance, because the known time bounds for exact nearest neighbor queries degrade badly as the dimension gets large, while approximate nearest neighbors take only logarithmic time in all fixed dimensions. The idea is to build the stack of shorter distances by asking for a constant number of approximate near neighbors, the k-th of which is within a constant factor of the distance to the actual k-th nearest neighbor. By a packing argument for points in Euclidean space, either some two of these points are closer to each other than the distance on the current stack top (in which case you can build the stack one more level) or these approximate neighbors are guaranteed to contain the actual nearest neighbor (in which case you can either detect a mutual nearest neighbor pair or again build the stack). This idea leads, for instance, to a near-linear-time algorithm for the multi-fragment TSP heuristic in Euclidean spaces of any bounded dimension; the best previous time bound appears to be one (valid in any metric space) from one of my previous papers.^{3}
Agarwal, P. K., Eppstein, D., and Matoušek, J., “Dynamic algorithms for half-space reporting, proximity problems, and geometric minimum spanning trees”, FOCS, 1992, pp. 80–89.
Eppstein, D., “Dynamic Euclidean minimum spanning trees and extrema of binary functions”, Discrete Comput. Geom. 13: 111–122, 1995.
Eppstein, D., “Fast hierarchical clustering and other applications of dynamic closest pairs”, SODA, 1998, pp. 619–628, arXiv:cs.DS/9912014, J. Experimental Algorithmics 5 (1): 1–23, 2000.
Cardinal, J., and Eppstein, D., “Lazy algorithms for dynamic closest pair with arbitrary distance measures”, ALENEX, 2004, pp. 112–119.
Eppstein, D., and Erickson, J., “Raising roofs, crashing cycles, and playing pool: applications of a data structure for finding pairwise interactions”, SoCG, 1998, pp. 58–67, Discrete Comput. Geom. 22 (4): 569–592, 1999.
Benzécri, J.-P. (1982), “Construction d’une classification ascendante hiérarchique par la recherche en chaîne des voisins réciproques”, Les Cahiers de l’Analyse des Données, 7 (2): 209–218.
Juan, J. (1982), “Programme de classification hiérarchique par l’algorithme de la recherche en chaîne des voisins réciproques”, Les Cahiers de l’Analyse des Données, 7 (2): 219–225.
Eppstein, D., Goodrich, M. T., and Mamano, N., “Algorithms for stable matching and clustering in a grid”, arXiv:1704.02303, IWCIA 2017, LNCS 10256 (2017), pp. 117–131.
OpenAI built a text generator so good, it’s considered too dangerous to release, or
Researchers, scared by their own work, hold back “deepfakes for text” AI. There are concerns that OpenAI is overhyping solid but incremental work, that they’re disingenuously allowing for overhyped coverage in the way they released the information, or worse that they’re deliberately controlling hype as a publicity stunt.
“It’s really hard to watch the GPT-2 conversations unfold like so much else in tech. 1/” — MMitchell (@mmitchell_ai) February 18, 2019
To solving the big questions, that is
Cropped from Device Plus source
Tetsuya Miyamoto is a mathematics teacher who divides his time between Tokyo and Manhattan. He is known for creating in 2004 the popular KenKen puzzle, which the New York Times started running ten years ago. As with its sister puzzles Sudoku and Kakuro, unlimited-size versions of it are NP-complete.
Today we observe the 10th anniversary of this blog and ask what progress has been made on the P versus NP question.
The P versus NP problem is a question about asymptotic complexity. From time to time we have tried to raise corresponding questions about concrete complexity that might yield more progress. What catches our eye about the KenKen puzzles is that their generation is a full-blown application within concrete complexity. The NYT’s KenKen puzzles are all generated using software by David Levy that can tailor their hardness. Quoting the NYT anniversary article by Will Shortz:
[Levy’s] program knows every possible method for solving a KenKen, which he has rated in difficulty from easy to hard. Thus, when a KenKen has been made, the computer knows exactly how hard it is.
This seems to say there is a hardness measure that is objective—quite apart from the idea of having human testers try the puzzle and say how hard they found it to be. We surmise that it is lower for instances that have more forced plays at the start. We wonder whether Levy’s criteria can be generalized.
Incidentally, this is the same Levy who won a challenge chess match in 1978 against the computer Chess 4.7 to complete his win of a famous ten-year $1,000+ bet. He lost $1,000 back when he was defeated by Deep Thought in 1989. He later became president of the International Computer Games Association (ICGA), whose Journal published a nice paper on the NP-completeness of the aforementioned puzzles and many others.
GLL’s first post, on 12 Feb. 2009, featured Steven Rudich and his work on natural proofs. Two other posts that day covered other aspects of why the P versus NP question is hard. Our question, dear readers, is:
Has anything happened in the past ten years to make any part of those posts out-of-date in the slightest way?
We won’t claim any such progress, though we have tried to stir ideas. In the meantime, we have written 806 other posts.
To date, we’ve had 18,575 comments plus trackbacks on these posts and just over 2.1 million views. We are less able to quantify impacts, beyond occasionally seeing citations of articles on the blog as sources. We try for precision as well as readability and are grateful for reader comments with fixes when we slip up on the former.
There continue to be claims of proofs that $\mathsf{P}=\mathsf{NP}$ and some that $\mathsf{P}\neq\mathsf{NP}$. While these proofs do not seem to be correct, there is something that we wish to remark about them. Many argue as follows:
There is some problem that seems to require a search of exponentially many objects. Then the proof states that any algorithm for the problem must actually look at all or most of the exponentially many objects. This of course is where the proof is not complete.
There is some sense to these proofs. They seem related to the oracle proofs that, for example, show that for some oracle set $A$ it is the case that $\mathsf{P}^A \neq \mathsf{NP}^A$.
We have discussed these types of proofs before—we even said that we did not like them.
The trouble with these results that are rigorous is that they change $\mathsf{P}$ vs $\mathsf{NP}$ in a central manner, and this seems to make the results much less interesting. Roughly here is how they argue: Imagine that for each $n$ we either put a string of length $n$ into the oracle set $A$ or we do not. The point is that if we do this in an unpredictable manner then a polynomial time machine will not be able to decide whether, for a given $n$, there is or is not a string of length $n$ in $A$. But a nondeterministic machine will just use its power and guess. This shows, essentially, that
$$\mathsf{P}^A \neq \mathsf{NP}^A$$
is true.
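For readers who have not seen such oracle constructions, the standard diagonalization (due to Baker, Gill, and Solovay; only sketched here, with details elided) runs as follows:

```latex
\textbf{Setup.} For an oracle $A$, let
$L_A = \{\,1^n : A \text{ contains some string of length } n\,\}$.
Then $L_A \in \mathsf{NP}^A$ for every $A$: guess a string $x$ of length $n$
and ask the oracle whether $x \in A$.

\textbf{Diagonalization.} Enumerate the polynomial-time oracle machines
$M_1, M_2, \ldots$ and build $A$ in stages. At stage $i$, choose $n$ so large
that no string of length $\ge n$ has been decided yet and $M_i$ runs in fewer
than $2^n$ steps on input $1^n$. Run $M_i$ on $1^n$, answering oracle queries
consistently with earlier stages (and ``no'' for undecided strings). If $M_i$
accepts, keep all strings of length $n$ out of $A$; if it rejects, put into
$A$ some string of length $n$ that $M_i$ never queried (one exists, since
fewer than $2^n$ strings were queried). Either way $M_i^A$ errs on $1^n$, so
$L_A \notin \mathsf{P}^A$ and hence $\mathsf{P}^A \neq \mathsf{NP}^A$.
```

This is exactly the "complicated rule" phenomenon discussed next: the construction is free to make membership in $A$ adversarially hard to predict.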
There is some relationship to many attempts to show $\mathsf{P}\neq\mathsf{NP}$. The proofs often argue that one must look at all the objects. The counterpart here is that a polynomial time machine will not have enough time to check the $2^n$ strings of length $n$ to see if they are in $A$. But this works in the oracle case because we allow the rule that decides whether or not a string is in $A$ to be very complicated. In the real world, in the world where we study the real P versus NP question, we cannot assume that NP-complete problems use a complicated rule. That is precisely what we are trying to prove.
What can we say? Mostly the big open questions remain. We still have no non-linear lower bounds on circuit complexity and no progress of any definite kind on P versus NP. What do you think?
What is commonly hailed as one of the two biggest results in our field last year was a positive solution to what is intuitively a slightly weaker form of the Unique Games Conjecture (UGC). For UGC we can refer you to Wikipedia’s article:
The note [2] is in turn a reference to a 2010 post here. The new paper proves hardness for the relaxed situation where, roughly speaking, a trial assignment to a node in a constraint graph limits the other node on any connecting edge to at most two possible values, rather than a unique value as in UGC. This relaxation retains many properties that had caused disbelief in the original UGC, yet it was proved—in that sense a big deal.
Nevertheless we note that UGC, at its core, is just asserting that for arbitrarily small $\epsilon > 0$, with our power to make other parameter(s) as large as desired, we can execute an NP-hardness proof. We have been executing NP-hardness proofs for almost fifty years. That is something we in the field have proven good at. True, these hardness results become lower bound proofs if and when $\mathsf{P}\neq\mathsf{NP}$ is proved, and true, we have been as vocal as any on the standpoint that significant lower bounds will come from constructions that are usually thought of as being for upper bounds. But the new proof from a year ago doesn’t feel like that. We invite readers to tell us connections from UGC to the possibility of actually constructing lower bounds.
We at GLL thank you all for your help and support these ten years. Ken and I plan to continue doing what we have done in the past. Plan on a visit from our special friend on St. Patrick’s day, for example. Thanks again and let us know how we are doing.
[fixed date of first GLL post]
In this post I would like to tell you about three papers and three theorems. I am thankful to Moshe White and Imre Barany for helpful discussions.
a) Universality of vector sequences and universality of Tverberg partitions, by Attila Por;
Theorem (Por’s universality result, 2018): Every long enough sequence of points in general position in $\mathbb{R}^d$ contains a subsequence of length $n$ whose Tverberg partitions are exactly the so-called rainbow partitions.
b) Classifying unavoidable Tverberg partitions, by Boris Bukh, Po-Shen Loh, Gabriel Nivasch
Theorem (Bukh, Loh, and Nivasch, 2017): Let $H$ be a tree-like $r$-uniform simple hypergraph with $d+1$ edges and $n=(d+1)(r-1)+1$ vertices. It is possible to associate to the vertices of each such hypergraph $H$ a set $X$ of $n$ points in $\mathbb{R}^d$ so that the Tverberg partitions of $X$ correspond precisely to the rainbow colorings of the hypergraph $H$. Moreover, the number of rainbow colorings is $((r-1)!)^d$. (Here, we consider two colorings as the same if they differ by a permutation of the colors.)
c) On Tverberg partitions, by Moshe White
Theorem (White, 2015): For any partition $n_1 + n_2 + \cdots + n_r$ of $n = (r-1)(d+1)+1$, there exists a set $X$ of $n$ points in $\mathbb{R}^d$, such that every Tverberg partition of $X$ induces the same partition on $n$ given by the parts $n_1, n_2, \ldots, n_r$. Moreover, the number of Tverberg partitions of $X$ is $((r-1)!)^d$.
See the original abstracts for the papers at the end of the post.
Recall the beautiful theorem of Tverberg: (We devoted two posts (I, II) to its background and proof.)
Tverberg Theorem (1965): Let $x_1, x_2, \ldots, x_m$ be points in $\mathbb{R}^d$ with $m \ge (r-1)(d+1)+1$. Then there is a partition $S_1, S_2, \ldots, S_r$ of $\{1, 2, \ldots, m\}$ such that $\bigcap_{j=1}^{r} \mathrm{conv}(x_i : i \in S_j) \neq \emptyset$.
The (much easier) case $r=2$ of Tverberg’s theorem is Radon’s theorem.
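To make Radon’s theorem concrete: given $d+2$ points in $\mathbb{R}^d$, an affine dependence among them splits the points, by the signs of its coefficients, into two parts whose convex hulls meet. Here is a short exact-arithmetic sketch (my own illustrative code, not from any of the papers discussed here):

```python
from fractions import Fraction

def kernel_vector(rows, m):
    # one nonzero rational solution c of rows . c = 0 (more unknowns than rows)
    rows = [[Fraction(x) for x in row] for row in rows]
    pivots = {}  # column -> pivot row
    r = 0
    for col in range(m):
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        pv = rows[r][col]
        rows[r] = [x / pv for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots[col] = r
        r += 1
    free = next(c for c in range(m) if c not in pivots)  # a free column exists
    c = [Fraction(0)] * m
    c[free] = Fraction(1)
    for col, row in pivots.items():
        c[col] = -rows[row][free]
    return c

def radon_partition(points):
    # points: d+2 points in R^d; returns (I, J, radon_point) with the convex
    # hulls of {points[i] : i in I} and {points[j] : j in J} meeting at radon_point
    d, m = len(points[0]), len(points)
    # affine dependence: sum c_i p_i = 0 and sum c_i = 0, c nontrivial
    rows = [[p[k] for p in points] for k in range(d)] + [[1] * m]
    c = kernel_vector(rows, m)
    I = [i for i in range(m) if c[i] > 0]
    J = [i for i in range(m) if c[i] <= 0]
    s = sum(c[i] for i in I)
    radon_point = tuple(sum(c[i] * Fraction(points[i][k]) for i in I) / s
                        for k in range(d))
    return I, J, radon_point
```

For example, for the four points $(0,0), (3,0), (0,3), (1,1)$ in the plane, the partition is the inner point against the surrounding triangle, and the Radon point is $(1,1)$ itself.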
We devoted a post to seven open problems related to Tverberg’s theorem, and one of them was:
Sierksma Conjecture: The number of Tverberg’s $r$-partitions of a set of $(r-1)(d+1)+1$ points in $\mathbb{R}^d$ is at least $((r-1)!)^d$.
Gerard Sierksma’s construction with $((r-1)!)^d$ Tverberg partitions is obtained by taking $r-1$ copies of each vertex of a simplex containing the origin in its interior, and adding the origin itself. A configuration of $(r-1)(d+1)+1$ points in $\mathbb{R}^d$ with precisely $((r-1)!)^d$ Tverberg partitions into $r$ parts is called a Sierksma configuration.
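The count $((r-1)!)^d$ can be checked by brute force in the smallest interesting case, $d=1$ and $r=3$, where (under my reading of the construction) the Sierksma configuration is the multiset $\{-1,-1,0,1,1\}$ on a line and convex hulls are just intervals. This is illustrative code of mine, not from the papers:

```python
import math

def set_partitions(n, r):
    # set partitions of range(n) into exactly r blocks, via restricted growth strings
    def rec(i, labels, mx):
        if i == n:
            if mx == r - 1:
                blocks = [[] for _ in range(r)]
                for j, l in enumerate(labels):
                    blocks[l].append(j)
                yield blocks
            return
        for l in range(min(mx + 1, r - 1) + 1):
            yield from rec(i + 1, labels + [l], max(mx, l))
    yield from rec(0, [], -1)

def tverberg_count_1d(points, r):
    # in R^1, intervals intersect iff the largest minimum <= smallest maximum
    count = 0
    for blocks in set_partitions(len(points), r):
        mins = [min(points[j] for j in b) for b in blocks]
        maxs = [max(points[j] for j in b) for b in blocks]
        if max(mins) <= min(maxs):
            count += 1
    return count
```

With $d=1$, $r=3$ the formula predicts $((3-1)!)^1 = 2$ partitions: the origin alone, plus the two ways of pairing the $-1$’s with the $+1$’s.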
In 2015 Moshe White proved the following theorem which was an open problem for many years. White’s construction was surprisingly simple.
Theorem 1 (White, 2015): For any partition $n_1 + n_2 + \cdots + n_r$ of $n = (r-1)(d+1)+1$, there exists a set $X$ of $n$ points in $\mathbb{R}^d$, such that every Tverberg partition of $X$ induces the same partition on $n$ given by the parts $n_1, n_2, \ldots, n_r$. Moreover, the number of Tverberg partitions of $X$ is $((r-1)!)^d$.
Five tree-like simple hypergraphs that correspond to configurations of 11 points in 4-dimensional space.
Start with a tree-like hypergraph $H$ of $d+1$ blocks of size $r$ like the five examples in the figure above. The intersection of every two blocks has at most one element. The union of all blocks has $n=(d+1)(r-1)+1$ elements.
A rainbow coloring of an $r$-uniform hypergraph $H$ is a coloring of the vertices of $H$ with $r$ colors so that the vertices of every edge are colored by all $r$ colors.
Theorem 2 (Bukh, Loh, and Nivasch): It is possible to associate to the vertices of each such hypergraph $H$ a set $X$ of $n$ points in $\mathbb{R}^d$ so that the Tverberg partitions of $X$ correspond precisely to the rainbow colorings of the hypergraph $H$. Moreover, the number of rainbow colorings is $((r-1)!)^d$. (Here, we consider two colorings as the same if they differ by a permutation of the colors.)
For a star-like hypergraph where all blocks have a vertex in common we get the original Sierksma example. (Example (d) above.) White’s examples are obtained by considering such hypergraphs where there exists an edge $e$ such that all edges have nonempty intersection with $e$. (Examples (c), (d), and (e) above.)
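The rainbow-coloring count is also easy to verify by brute force on a small path-like example (again my own illustration): three blocks of size $r=3$, consecutive blocks sharing one vertex, so $d+1=3$ and $n=7$.

```python
from itertools import product
import math

def count_rainbow(edges, n, r):
    # count rainbow colorings up to permutation of the colors, by brute force
    seen = set()
    for coloring in product(range(r), repeat=n):
        if all(len({coloring[v] for v in e}) == r for e in edges):
            # canonical form: rename colors in order of first appearance
            rename, canon = {}, []
            for c in coloring:
                if c not in rename:
                    rename[c] = len(rename)
                canon.append(rename[c])
            seen.add(tuple(canon))
    return len(seen)

r, d = 3, 2
edges = [(0, 1, 2), (2, 3, 4), (4, 5, 6)]  # path-like tree hypergraph, d+1 edges
n = (d + 1) * (r - 1) + 1                  # 7 vertices
```

The count should be $((r-1)!)^d = 4$: the first block can be colored canonically in one way, and each later block has $(r-1)! = 2$ ways to color its $r-1$ fresh vertices.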
Rainbow colorings of our five examples
It is natural to consider $n$ points on the moment curve $(t, t^2, \ldots, t^d)$. It turns out that the set of Tverberg partitions for points on the moment curve depends on the precise location of the points. By stretched points on the moment curve I mean that you take the points $x_i = (t_i, t_i^2, \ldots, t_i^d)$ where $t_1 \ll t_2 \ll \cdots \ll t_n$, namely $t_2$ is much much larger than $t_1$, and $t_3$ is much much much much larger than $t_2$, etc. In this case, the configuration corresponds to a path-like hypergraph: you let the vertices be $1, 2, \ldots, n$ and the edges are blocks of $r$ consecutive vertices, with consecutive blocks sharing a single vertex. A stretched configuration of points on the moment curve has the property that every subset is also a stretched configuration of points on the moment curve.
The importance of Tverberg’s partitions for stretched points on the moment curve was realized by Barany and Por, by Bukh, Loh, and Nivasch, and by Perles and Sidron (See their paper Tverberg Partitions of Points on the Moment Curve), and perhaps by others as well.
Por’s universality theorem asserts that, in terms of Tverberg partitions, every large enough configuration of points in general position in $\mathbb{R}^d$ contains a configuration whose Tverberg partitions are those of a stretched configuration of points on the moment curve! Por’s universality result was conjectured independently by Bukh, Loh, and Nivasch (and they gave some partial results) and by Por himself.
Theorem 3 (Por’s universality result, 2018): Every long enough sequence of points in general position in $\mathbb{R}^d$ contains a subsequence of length $n$ whose Tverberg partitions are exactly the so-called rainbow partitions.
Por actually proved an apparently stronger statement: we can find a subsequence $Y$ so that the conclusion holds not only for $Y$ but also for every subsequence of $Y$.
The work of Bukh, Loh, and Nivasch relied on an important method of “staircase convexity”. An earlier 2001 application of the method (where it was introduced) was for lower bounds on weak epsilon-nets by Bukh, Matousek, and Nivasch (here are links to the paper, and to slides from a talk by Boris Bukh; see also this post and this one on the weak epsilon-net problem). Roughly the idea is this: consider a stretched grid whose sequences of coordinates grow very very fast. When you choose a configuration of points in such a grid, questions regarding their convex hulls translate to purely combinatorial problems.
Stairconvex sets explained by Boris Bukh
Let ES(n) be the smallest integer such that any set of ES(n) points in the plane in general position contains n points in convex position. In their seminal 1935 paper, Erdős and Szekeres showed that ES(n) is finite.
The finiteness of ES(n) can be stated as follows: given a sequence of N points in general position in the plane, there is a subsequence $y_1, y_2, \ldots, y_n$ such that for every $i < j < k < l$ the line segments $[y_i, y_k]$ and $[y_j, y_l]$ intersect. With this statement, the Erdős–Szekeres theorem can be seen as identifying a universal set of points in terms of its Radon partitions (or equivalently in terms of its order type).
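The segment-crossing criterion for convex position can be coded directly (an illustrative sketch of mine, assuming general position; it tests a sequence in the order given):

```python
from itertools import combinations

def orient(a, b, c):
    # sign of the cross product (b - a) x (c - a)
    x = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (x > 0) - (x < 0)

def segments_cross(p, q, r, s):
    # proper crossing of segments pq and rs (general position assumed)
    return (orient(p, q, r) * orient(p, q, s) < 0 and
            orient(r, s, p) * orient(r, s, q) < 0)

def in_convex_position(seq):
    # the criterion from the text: for every i < j < k < l the "diagonals"
    # seq[i]seq[k] and seq[j]seq[l] must intersect
    return all(segments_cross(seq[i], seq[k], seq[j], seq[l])
               for i, j, k, l in combinations(range(len(seq)), 4))
```

A convex pentagon listed in cyclic order passes the test, while four points with one inside the triangle of the others fail it.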
In higher dimensions we can define $ES_d(n)$ and replace “in convex position” by “in cyclic position”. The finiteness of $ES_d(n)$ (with terrible bounds) follows easily from various Ramsey results. In a series of papers very good lower and upper bounds were obtained: Imre Barany, Jiri Matousek, Attila Por: Curves in R^d intersecting every hyperplane at most d+1 times; Marek Eliáš, Jiří Matoušek, Edgardo Roldán-Pensado, Zuzana Safernová: Lower bounds on geometric Ramsey functions; Marek Elias, Jiri Matousek: Higher-order Erdős–Szekeres theorems.
Por’s result can be seen as a far-reaching strengthening of the finiteness of $ES_d(n)$.
Can you base a higher-order notion of “order types” on Tverberg partitions?
The order type of a sequence of points affinely spanning $\mathbb{R}^d$ is described by the vector of signs (0, +1, or -1) of the volumes of the simplices described by subsequences of length $d+1$. Equivalently, the order type can be described by the minimal Radon partitions of the points.
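In the plane ($d=2$) this sign vector is just the orientation of every triple of points; a small sketch of mine, for illustration:

```python
from itertools import combinations

def order_type(points):
    # map each increasing triple of indices to the sign of the oriented area
    # of the corresponding triangle (0 for collinear triples)
    def sgn(x):
        return (x > 0) - (x < 0)
    def det3(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return {ijk: sgn(det3(*[points[i] for i in ijk]))
            for ijk in combinations(range(len(points)), 3)}
```

Two sequences of points have the same order type exactly when these sign vectors agree.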
Another way to consider “higher” order types is to enlarge the family: start with a family of points, add to it all Radon points of minimal Radon partitions, and consider the order type of the new configuration. (This operation can be repeated $k$ times.) See this paper of Michael Kallay on point sets which contain their Radon points.
Understanding order types of configurations of points on the stretched grids of Bukh et al. is a very interesting problem. It is interesting to understand such configurations that are not in general position as well. (In particular, which matroids are supported on the stretched grid?) Of course, the method may well have many more applications.
Is the following true: for every sequence $X$ of $n$ points in $\mathbb{R}^d$, is there a Sierksma configuration $Y$ of $n$ points so that every Tverberg partition of $Y$ is a Tverberg partition of $X$?
An even stronger version is:
Is it true that for every sequence of points in $\mathbb{R}^d$ there is a tree-like simple hypergraph $H$ so that all the rainbow colorings of $H$ correspond to Tverberg partitions of the sequence? If true, this would be a fantastically strong version of Sierksma’s conjecture.
Erdős and Szekeres proved in 1935 that $ES(n) \le \binom{2n-4}{n-2} + 1$, and in 1960, they showed that $ES(n) \ge 2^{n-2} + 1$, and conjectured this to be optimal. Despite the efforts of many researchers, no improvement in the order of magnitude of the upper bound was made for 81 years. A recent breakthrough result by Andrew Suk (here are links to the paper, and to our post discussing the result) asserts that $ES(n) = 2^{n+o(n)}$. Some time ago I asked over MO a question on outrageous mathematical conjectures, and perhaps the conjecture that $ES(n) = 2^{n-2} + 1$ on the nose is an example.
Beware the Ides of February.
Holes and their reflections. (The reflections are in the curved surface of an espresso portafilter.)
The 2019 Bridges mathematical art gallery is online! Brian Hayes lists his favorites as being the warped notepaper of Matt Enlow and the Penrose quilt of Douglas G. Burkholder.
Some of my own favorites from this year’s Bridges mathematical art gallery: Fielding Brown’s 3d Lissajous wood ribbon sculpture, Diana Davis’s periodic pentagonal billiards patterns, Stephen Kenney’s illustration of triangle geometry, Elizabeth Paley’s stoneware Klein bottle, and Anduriel Widmark’s knotted glasswork.
Rod Downey, a New Zealand-based theoretical computer scientist who co-founded the theory of parameterized complexity, has won the Rutherford Medal, New Zealand’s highest science award. Somehow I missed this when it came around last October.
Hannah Bast’s slides on the European Symposium on Algorithms 2018 Track B experiment (two independent program committees decided on the same set of papers and then the conference accepted the union of their acceptances). Some conclusions: the initial scoring is remarkably consistent, and per-paper discussions to reconcile differences of scoring are useful, but the final decision on which “gray zone” papers to keep is random and could be replaced by a simple threshold.
Quanta writes up recent progress on the Erdős–Szemerédi sum-product problem, that any set of $n$ numbers must either have many distinct pairwise sums or many distinct products. Progress: “many” increased from $n^{4/3}$ to $n^{4/3+c}$ for a small constant $c > 0$.
How to handle journal referees who ask authors to add unjustified citations to their own papers? Is their misbehavior protected by the anonymity of peer review, or can they be publicly named and shamed?
The Cal Poly ag students have started selling these blood oranges at the local farmer’s market, as they do every year around this time, only $1 for five. In the summer they sell sweet corn on the cob.
Turing patterns in shark skin (original paper). Researchers at the University of Florida led by Gareth Fraser and his student Rory Cooper used reaction-diffusion patterns (also named Turing patterns after Turing’s early work) to model the distribution of scales on sharks, and performed knockdown experiments to validate their model in vivo.
Did you know that two different graphs with 81 vertices and 20 edges/vertex are famous enough to have Wikipedia articles? The strongly regular Brouwer–Haemers graph connects elements of GF(81) that differ by a fourth power. The Sudoku graph connects cells of a Sudoku grid that should be unequal. Sudoku puzzles are instances of precoloring extension on this graph. Unfortunately the natural graphs on the 81 cards of Set have degree ≠ 20…
Josh “cortex” Millard describes how he made a stained glass Menger sponge.
Jacob Siehler labels cubic graphs with binary strings of length 5 so that all labels appear once and each vertex is the xor of its neighbors. He can do three vertex-transitive 32-vertex graphs: the Dyck graph, an expansion of the vertices of into four cycles, and another one I don’t know.
Four of Conway’s five $1000-prize problems remain unsolved: the dead fly problem on spacing of point sets that touch all large convex sets, existence of a 99-vertex graph with each edge in a unique triangle and each non-edge the diagonal of a unique quadrilateral, the thrackle conjecture, on graphs drawn so all edges cross once, and who wins Sylver coinage after move 16?
The list of accepted papers from this year’s Symposium on Computational Geometry just came out.
Henry Cohn
A follow-up paper on the tight bounds for sphere packings in eight and 24 dimensions. (Thanks, again, Steve, for letting me know.)
For the 2016 breakthroughs see this post, this post of John Baez, this article by Erica Klarreich on Quanta Magazine, and a Notices AMS article by Henry Cohn A conceptual breakthrough in sphere packing. See also, Henry Cohn’s 2010 paper Order and disorder in energy minimization, and Maryna Viazovska’s ICM 2018 videotaped lecture.
Abstract: We prove that the $E_8$ root lattice and the Leech lattice are universally optimal among point configurations in Euclidean spaces of dimensions 8 and 24, respectively. In other words, they minimize energy for every potential function that is a completely monotonic function of squared distance (for example, inverse power laws or Gaussians), which is a strong form of robustness not previously known for any configuration in more than one dimension. This theorem implies their recently shown optimality as sphere packings, and broadly generalizes it to allow for long-range interactions.
The proof uses sharp linear programming bounds for energy. To construct the optimal auxiliary functions used to attain these bounds, we prove a new interpolation theorem, which is of independent interest. It reconstructs a radial Schwartz function $f$ from the values and radial derivatives of $f$ and its Fourier transform $\widehat{f}$ at the radii $\sqrt{2n}$ for integers $n \ge 1$ in eight dimensions and $n \ge 2$ in 24 dimensions. To prove this theorem, we construct an interpolation basis using integral transforms of quasimodular forms, generalizing Viazovska’s work on sphere packing and placing it in the context of a more conceptual theory.