This note represents a rare misstep by Dijkstra: the pigeonhole principle actually is a special and important principle worthy of a special name of its own. For example:
1. When you formulate the pigeonhole principle in propositional logic, its resolution proofs are exponentially large. Since modern SAT solvers are basically very fancy propositional resolution provers, this gives you a nice way to find hard instances for them. It also makes the question of when propositional proof systems can be more succinct than one another is also a fundamental question in complexity theory.
2. It also arises in geometry: a compactness for metric spaces is essentially the statement that the pigeonhole principle applies to the space.
I don't have a unified perspective of these two facts, but either one of them is really striking. As a result I'm okay with giving the pigeonhole principle its own natural language name.
I think neel_k was referring to the fact that when a DPLL or CDCL solver concludes UNSAT you can take the trace its backtracking activity and rewrite it from the bottom up as a resolution refutation. So in this sense the solvers are finding resolution proofs.
While they indeed don't use resolution directly, DPLL/CDCL is in many ways equivalent.
Most modern SAT solvers will still have trouble with the pigeon-hole style problems (although some have inbuilt explicit detection, or symmetry breaking, which will let them solve it efficiently).
The fact that every infinite sequence of points in a compact metric space has a cluster point is sort of like the pigeonhole principle. The grandparent comment in my opinion makes much more sense if the phrase "a compactness for metric spaces" is replaced by "sequential compactness for metric spaces".
Firstly, when you break down your proof on the pigeonhole principle, you've shown that the problem reduces to something simple and understandable. It's basically a nicer version of "q.e.d.".
Secondly, and much more important: Integrating the principle allows the reader to understand the end of the proof more easily. I.e., if I find the principle mentioned in the end, I know that I need to look for "pigeons" and "holes" and that the proof is a way to get to those - allowing me to understand the proof from the conclusion going backwards, or at least helps doing so.
Of course, this varies per person and depends a bit on how much you're trained in reading proofs, too. But it's enough to justify naming it in my opinion.
Strongly agreed. I've taught university-level discrete math several times, and beginners to the subject need pegs to hang their hat on, something that serves as a goal and a signal that they're done. The Pigeonhole Principle is an ideal example of this.
Conversely, in research papers, or in conversations among math researchers (at least in my discipline) the Pigeonhole Principle is seldom mentioned by name. The idea is considered too "obvious" to need a name.
a) The history of mathematics is littered with examples of assumed "obvious" results that turned out to have drastic implications. This is especially true of combinatorics, which was blissfully ignorant to the axiom of choice for so long and now must wrestle with it forever. Strict adherence to some formalism may be justifiably understood, and it's a little shocking to find EWD of all people arguing otherwise for such a pragmatic reason as pedagogical clarity.
b) Comparing 0) and 1) as EWD suggests, the first thing I notice is the sudden appearance of the word "finite" which never appears again in the article (nor in negation; also the inelegant but mostly harmless introduction of "nonempty"). The generalization of the principle to support reals costs it its trivial generalization over infinite bags, which is also often valuable (a classic example being Siegel's lemma).
This isn't exclusive to the Pigeon-hole Principle, either. Many named conjectures are pointlessly used, just because they're named, even when it makes the proof longer.
This isn't just exclusive to mathematics. It's similar to the reason people bring in single-line Node.JS packages; a fear of re-inventing the wheel when they don't actually need a wheel in the first place, or when the wheel is trivial to re-state in a more natural form for the proof / program.
I (or any math noob) can easily understand the pigeon hole principle. His more detailed arguments about max, avg and a bunch of other stuff is more obtuse and harder to follow. IMO, pigeon hole principle, where applicable, wins this argument. If you're dealing with a sequence of real numbers, you might need max >= avg.
I agree. I've seen some marvelous pigeon hole explanations of why you cannot write a general lossless compression algorithm that makes all files smaller. I have no idea what that explanation would look like in terms of maxes and averages, and I have to believe it would be much more difficult to understand.
Yes, talking about averages sounds like it might lead a prover astray into reasoning about the average size of compressed data. The average that matters here though is how many inputs map to the same output.
Compressing (n+1)-bit input to n-bit output makes the average output correspond to 2^(n+1)/2^n = 2 inputs. By Dijkstra's rephrasing of the pigeonhole principle, the maximum number of inputs colliding on the same output must be at least 2.
Or we could make it more robust and say we're compressing to n-bit-or-smaller output. Then the average output corresponds to 2^(n+1)/(2^(n+1)-1) inputs. That average is still strictly greater than 1, so the maximum number of inputs colliding on the same output must also be strictly greater than 1 (at least 2, since that's the smallest natural which is greater than 1).
That's really more about surjectivity and simple counting. Neither the pigeon hole principle not a max/average argument are really necessary.
Say you're trying to find a way to compress every n+1 bit string into an n bit string. Losslessness means that you have to come up with a function d from n bit string to n+1 bit string that decompresses. The number of n+1 bit string is larger, so there must be n+1 bit strings that are not in the image of d, which means that such a function d cannot exist.
This is still the pigeonhole principle, just inverted - If you have more holes than pigeons (and you can’t make fractional or quantum pigeons) at least one hole must have no pigeons.
It's more than similar, it's gone from talking about the necessary properties of a function with an image smaller than its domain, to the impossibility of an inverse function as the image cannot be larger than its domain.
We might as well call this "the whole-pigeon principle."
The prompt you were responding to was "I have no idea what that explanation would look like in terms of maxes and averages, and I have to believe it would be much more difficult to understand."
Specifically, they were looking for how you write the can't-always-compress proof using formulation (1)
I feel this is an instance of prof.dr. Edsger W.Dijkstra just being mean. If one works more on pigeon-hole principles you end up with Ramsey Theory, which covers some amazing results.
Yeah, this feel like a comment that might have been OK at a cocktail party or made offhand in the middle of an essay on something else.
Sure, maybe, in some instances, some people, might, perhaps make a little too much out of using the pigeon hole principle. But here, he make too much out of that situation.
I disagree with this note, even though I often read Dijkstra's notes for enlightenment. The pigeonhole principle is one of the basic tools we have to approach mathematics, and therefore deserves a name. We often assimilate proofs by "abbreviating" the major steps through naming them - "pigeonhole principle", "passing to subsequences" etc.
Even in the realm of "infinite" mathematics like analysis, theorems like the Heine-Borel Theorem and the Bolzano-Weierstrass theorem rely on pigeonhole-like arguments. Of course, that statement can be made rigorous in reverse mathematics.
Since a proof is a proof, the methods are a matter of taste. Driving away from pontless reductio ad absurdum arguments is a good thing, but sometimes putting in the spotlight the object that need to be matched or counted instead of bare numbers can be useful.
For instance, in the ranch example I'd ensure that I'm counting correctly by considering the cowboys and horses in the connected components of the bipartite graph whose edges represent exclusive horse ownership: each connected component contains 0 or 1 cowboys and 1 or more horses.
I hardly ever disagree with the man, but as long as at least 50% of freshman get this wrong in introductionary lectures of fundamental logic, a named principle is very appropriate imo.
The pigeonhole principle is about counting things.
Counting things is one of the simplest things in mathematics. Much simpler than "averages", "real numbers" (a very complicated subject) and other things Dijkstra mentions in his alternative formulation.
It seems like he's sacrificing simplicity and ease of understanding for other properties that aren't generally applicable.
1. When you formulate the pigeonhole principle in propositional logic, its resolution proofs are exponentially large. Since modern SAT solvers are basically very fancy propositional resolution provers, this gives you a nice way to find hard instances for them. It also makes the question of when propositional proof systems can be more succinct than one another is also a fundamental question in complexity theory.
2. It also arises in geometry: a compactness for metric spaces is essentially the statement that the pigeonhole principle applies to the space.
I don't have a unified perspective of these two facts, but either one of them is really striking. As a result I'm okay with giving the pigeonhole principle its own natural language name.
Rare? He was often biased by his preferences, teaching styles, and opinions, in his notes, as opposed to pure science/logic.
Most modern SAT solvers will still have trouble with the pigeon-hole style problems (although some have inbuilt explicit detection, or symmetry breaking, which will let them solve it efficiently).
https://williewong.wordpress.com/2010/03/18/compactness-part...
Firstly, when you break down your proof on the pigeonhole principle, you've shown that the problem reduces to something simple and understandable. It's basically a nicer version of "q.e.d.".
Secondly, and much more important: Integrating the principle allows the reader to understand the end of the proof more easily. I.e., if I find the principle mentioned in the end, I know that I need to look for "pigeons" and "holes" and that the proof is a way to get to those - allowing me to understand the proof from the conclusion going backwards, or at least helps doing so.
Of course, this varies per person and depends a bit on how much you're trained in reading proofs, too. But it's enough to justify naming it in my opinion.
Conversely, in research papers, or in conversations among math researchers (at least in my discipline) the Pigeonhole Principle is seldom mentioned by name. The idea is considered too "obvious" to need a name.
a) The history of mathematics is littered with examples of assumed "obvious" results that turned out to have drastic implications. This is especially true of combinatorics, which was blissfully ignorant to the axiom of choice for so long and now must wrestle with it forever. Strict adherence to some formalism may be justifiably understood, and it's a little shocking to find EWD of all people arguing otherwise for such a pragmatic reason as pedagogical clarity.
b) Comparing 0) and 1) as EWD suggests, the first thing I notice is the sudden appearance of the word "finite" which never appears again in the article (nor in negation; also the inelegant but mostly harmless introduction of "nonempty"). The generalization of the principle to support reals costs it its trivial generalization over infinite bags, which is also often valuable (a classic example being Siegel's lemma).
This isn't just exclusive to mathematics. It's similar to the reason people bring in single-line Node.JS packages; a fear of re-inventing the wheel when they don't actually need a wheel in the first place, or when the wheel is trivial to re-state in a more natural form for the proof / program.
Compressing (n+1)-bit input to n-bit output makes the average output correspond to 2^(n+1)/2^n = 2 inputs. By Dijkstra's rephrasing of the pigeonhole principle, the maximum number of inputs colliding on the same output must be at least 2.
Or we could make it more robust and say we're compressing to n-bit-or-smaller output. Then the average output corresponds to 2^(n+1)/(2^(n+1)-1) inputs. That average is still strictly greater than 1, so the maximum number of inputs colliding on the same output must also be strictly greater than 1 (at least 2, since that's the smallest natural which is greater than 1).
Say you're trying to find a way to compress every n+1 bit string into an n bit string. Losslessness means that you have to come up with a function d from n bit string to n+1 bit string that decompresses. The number of n+1 bit string is larger, so there must be n+1 bit strings that are not in the image of d, which means that such a function d cannot exist.
We might as well call this "the whole-pigeon principle."
When you compress a file, you are representing a big number with a small number.
There are more big numbers than there are small numbers.
Specifically, they were looking for how you write the can't-always-compress proof using formulation (1)
Sure, maybe, in some instances, some people, might, perhaps make a little too much out of using the pigeon hole principle. But here, he make too much out of that situation.
(I usually like Dijkstra's works but the quote does make a person smile.)
Even in the realm of "infinite" mathematics like analysis, theorems like the Heine-Borel Theorem and the Bolzano-Weierstrass theorem rely on pigeonhole-like arguments. Of course, that statement can be made rigorous in reverse mathematics.
I like the metaphor, it helps me to understand the principle in the first place. I doubt I'm the only one who feels this way.
Counting things is one of the simplest things in mathematics. Much simpler than "averages", "real numbers" (a very complicated subject) and other things Dijkstra mentions in his alternative formulation.
It seems like he's sacrificing simplicity and ease of understanding for other properties that aren't generally applicable.