Return to home page
More flaws in algorithmic democracy
4 January 2003
Kenneth Arrow showed in his classic monograph published 50 years ago, Social Choice and Individual Values, that there is no way, under broad and intuitively very plausible assumptions, of guaranteeing that a social welfare function based only on individual preferences be proof against cyclical majorities. Call this the Condorcet-Arrow paradox. As Arrow points out, the social welfare function represents an abstract generalisation of any possible mechanistic democratic process that proceeds directly from individual preference to collective choice; the flaw therefore applies to any proposal for mechanical direct democracy. I will call such schemes algorithmic democracy.
The demonstration is important in the history of ideas as a body blow against the extreme political rationalism of Bentham and his socialist and non-socialist heirs, struck using their own mathematical tools. The proof has stood. However, it is not by itself enough to show that algorithmic democracy is just a political fantasy. Apart from quibbling about Arrow's assumptions, Bentham has two good defences.
First, he could well ask whether cyclical majorities are likely to occur. In the real world, a voting system may be disrupted by a variety of causes. Some of these are sufficiently likely to call for preventive action: fire, corruption, human or computer error. Others, such as large meteorites, mass hallucinations, and nuclear war, are usually treated as being too remote to worry about in this context. So it is a fair question whether the Condorcet-Arrow paradox is a real-life risk or a practically negligible one.
It is essential for cyclical majorities to posit differences in preference maps. Where can these come from? Random variation is one such source. Heroically, suppose preferences have no structure at all, and are just chaotic. If the number of choices is very small, the risk of circularity is significant. But it is an intuitively appealing conjecture that the frequency of cyclical majorities converges to zero as the number of individuals and choices increases. Since polities of interest have from 102 to 109 members, and the number of possible states is enormous – i will return to other implications of this -, the conjecture would mean that the Condorcet-Arrow paradox could be ignored for all practical purposes, like a giant meteorite. The first step in proving or refuting the conjecture would be a statistical analysis of the frequency of cyclical majorities under various assumptions. This task is beyond me; however, I append a sketch of an argument pointing in the direction of the conjecture.
A second and more radical argument is that while the positive theory of a market economy depends on differences in endowments and tastes, for public policy the hypothesis is less obvious than it sounds. To a first approximation, people are all the same: we pursue like lemmings a restricted set of goods – health, sex, reproduction, status, comfort – making what use we can of our several endowments in the varied environments we face1. Refine this a little, and each of us can mark off a small group of kin, friends and workmates that we treat as individuals; but for any polity bigger than a village, most of our fellow-citizens are nameless strangers. If we propose to take account of their interests at all, the only way we can do so is by radical stereotyping. So the second approximation is that all strangers are the same to me; and in a given society individuals are likely to operate with a similar stereotype. The small individualised portions of our social preference maps will be equally unimportant under any even roughly egalitarian scheme of aggregation, and the regions of the social welfare function that are relevant to decisions will correspond to the cookie-cutter model, in which circular majorities cannot arise.
These arguments are not watertight, but they strengthen the view that to make the Condorcet-Arrow paradox a serious practical risk, it is necessary to adduce positive arguments that make circular patterns more probable than under random variation.
The aim of this short essay is to approach the problem in a roundabout way, by linking cyclical majorities to two other flaws in algorithmic democracy. The problem of cyclical majorities is a flaw under the the test of determinacy; but there are other issues in relation to tests of of computability and consistency, which are also minimum requirements for any rational decision-making system. The problems are related, in that solutions to one make the others worse. I will not refer here to the common-sense test of giving a sensible result, in the sense of the vicinity of a Pareto optimum. If a process gives non-sensible results, then there exists a large potential coalition for changing the rules by bribery or revolution. Such schemes have little merit and and if implemented are likely to collapse.
The starting-point on the computability test is that the neo-classical theory of individual choice is epistemologically impossible on combinatorial grounds. No human brain, or for that matter any conceivable computer, could process an individual welfare function that ordered all possible consumption patterns. Take the very restricted case of a shopping expedition to a small supermarket. A basket of 10 items has to be chosen out of the 1000 lines carried by the store. The number of possible baskets is
= n!/(n-r)!r! = 1000!/990!10! > 10010 = 1020
We need say 30 bytes to code each basket and another 3 for its ranking, so a list of the possible baskets requires
> 33 x 1020 bytes of memory = 3 billion gigabytes.
Sorting this list, at the implausible speed of 1 microsecond per basket, would take over 3 x 1013 years, which is 2000 times the age of the universe (about 1.5 x 109 years). A fortiori, the same is true for an individual social welfare function that orders all possible social states, and for a collective social welfare function that takes into account all the contents of individual social welfare functions.
As a foundation for microeconomics, the idealisation may not be too misleading for predictions of individual behaviour. But scaling up to collective choice involves going from metaphors to explicit public algorithms; and here the assumption of zero computational costs must be wrong. Indeed, the combinatorial impossibility of any method of comprehensive choice leads to the reverse conclusion: one major constraint on a model or scheme for decision-making is that it must be computable with the available resources, and lead to a definite result in a reasonable time.
We need a more realistic theory of choice under limited computation. Real shoppers (adults anyway) are clearly able to make consumption choices within the colossal variety thrown at them by a modern capitalist economy without getting stuck like Buridan's ass or spending all their waking hours on the task. More than that, most people seem to make fairly sensible choices; they buy airline tickets with hotel reservations, meat, potatoes and vegetables to eat, and only rarely go bankrupt by breaching budget constraints. People seem to proceed roughly like this. They define projects (meals, home maintenance, outings), using fuzzy knowledge about typical costs; draw up detailed shopping lists of complementary inputs required by the projects, using fuzzy knowledge about the best suppliers; and adapt the lists while shopping in the light of more accurate information about prices and availability. Small divergences from expectations lead to substitution (rump steak for fillet), major ones may lead to a review of the projects (not replacing the washing machine, changing the family holiday destination) or a search for new suppliers.
Let me try to formalise this a little. The decision-making process has the form of a branching search pattern, as in the exploration of a maze. The shopper sets up a decision tree that reflects in a fundamental way the distinction between means (which structure the branching) and ends (which supply the criteria for choice within it). Its fundamental components are therefore not allocations of resources but courses of action. These are linked in causal chains:
= going on a holiday
booking a hotel
hotel list from travel guide.....
or requires fixing accommodation with relatives
phoning relatives ....
booking a plane ticket
and so on.
At the many ends of the decision tree, we can identify fully specified courses of action, as it were by leaves or groups of leaves on the ends of the finest twigs. In the typical case, there are thousands of possibilities. This structure means that the relationships between courses of action are characterised by pervasive complementarity between means and ends and clusters of correlated means to a given end, and substitution between ends and between alternative means to a given end. The preference any individual attaches to a given course of action will depend markedly on the realisation of its complements and substitutes. These relations of complementarity and substitution are not matters of personal taste, but reflect our perception of the causal structure of our environment. Our perception, not necessarily the real structure of nature; for the Three Witches in Macbeth , eye of newt was an essential ingredient of a good curse.
The consumer's actions are not determined only by the tree structure. At each branching point the individual has to make a choice in the light of limited information, going deeper down the branch until reaching either a satisfactory end-point or an unsatisfactory one, a dead end. If the latter, the consumer retraces steps back to a higher branching point and tries again. What are the criteria for these choices? As a minimum, there must be a scale of satisfaction or satisfactions and an imputed cost of search time for deciding whether to stop, and a map of expectations constructed to meet a budget constraint. If the price of the coffee-maker is higher in Wal-Mart than expected, we will look elsewhere, if there is another store reasonably close; if not, we stop. We have a narrow-beam searchlight, not a 360° radar. It looks as if the the computational and informational demand of such a model is finite. It may require subjective cardinal utility for the benchmarking.
The model generates the indifference maps of textbook consumer choice theory, but only as an overall assessment applied to sets of worked-out choices. Allow prices to vary, watch consumer choices change, and ask the consumers whether they are better off overall than before. But our minds focus on our real environment, not remote contingencies. Middle-class Americans have no idea how they would live on the income of an African peasant tor a billionaire; nor it is reasonable to expect otherwise.
This model scales up in a far more satisfactory way to collective choice. A hospital building programme requires financing; national prestige requires advanced weaponry. In turn, these chains typically branch into several options: the programme can be financed by increased taxation, borrowing, or decreased spending elsewhere.
Complementarity and substitution are as pervasive features of collective choice as individual. What are the essential differences? First, the end-points of individual economic choices are typically single acts of consumption, work or saving, as in the classical theory of choice. Collective political decisions are typically more general, such as the adoption of a law or policy. In a small-scale democracy the town meeting may debate the painting of the fire station, but for larger units this is made impossible by decision-making costs. Nor is it desirable, unless you attach an extraordinarily high value, as 1960s radicals did, to political participation as such. Most theorists of democracy have advocated a “rule of laws not of men”; leaving the decisions how to implement laws and policies in individual cases to bureaucrats and judges is not only convenient, it is one of the few protections against abuse of government power. Only a totalitarian state, Inca or Stalinist, would ever purport to decide on fully-worked-out allocations of goods and services.
The second point of note is that preference trees will not only be more subjective in the public sphere than the private, they will be even more fragmentary. By combinatorial necessity, we only go to a lot of trouble to work out details of implementation for matters we think important. The universe of public choices is wider than that of our private ones, and our knowledge less, so we will only have detailed preference maps on our priority issues, and sketchy ones on others. A computer geek may have a preference between fibre-optical and coaxial cabling for broadband Internet access, while a feminist has no opinion; and vice versa for gender stereotyping in textbooks. The very hierarchy of the branches of preference trees reflects such personal priorities: a conservative might begin with cutting taxation, a social democrat with improved public services. Personal trees for public choices differ both in fuzziness and in structure.
The third point is that differences in perception about causality do not mean much in the overall organisation of a market economy: eccentric views about diet may have consequences for those who hold them, but not to others; the theory of externalities covers the exceptions. But the general nature of political decisions means that the relevant individual preferences must deal with high-order, abstract causal relationships in the functioning of society. Even very well-informed people have different understandings of the effectiveness of penal laws and economic policies, such as the impact of gun control on street crime, or that of a minimum wage on unemployment. The pattern of complements and substitutes that is so transparent for our private consumption is, in the public sphere, deeply puzzling to the finest minds. These relationships also lead to differences in structure in preference trees.
How can an algorithmic democracy deal with preferences that differ deeply in these ways? It looks as if it must include the union of the most detailed courses of action to be found in individual preference maps. But these courses of action are not found at the end of homologous branching structures. The collective algorithm has to impose a common structure of choices, not necessarily held by any citizens, and certainly not held by many of them. In this respect the scheme must fail to take into account a major feature of the political views of the citizens. It is therefore either infeasible or unacceptably dictatorial, as with the benign and superhumanly knowledgeable legislator of classical utilitarianism.
On the other hand, these differences are only to a small extent the result of the innate variation in personal tastes and dispositions. They reflect collective forces - ideology, group interest, religion, fashion, family tradition, education and schools of scientific thought. Consequently political views – especially their major options - are well-structured and highly predictable from other characteristics, as political consultants well know. This cohesiveness makes democracy possible, as well as imperfect.
The other class of algorithmic failure in decision-making lies in inconsistency between means and ends. As we have seen, in their private consumption choices individuals are quite skilled at dealing with causal constraints and relationships, but public choices are far less successful, whether in democracies or autocracies.
The simplest type of means-ends inconsistency in a democracy arises when many individuals have severally inconsistent views about ends and means. This is the autocrat's or oligarch's argument against democracy, but there is little evidence that non-democratic forms are any better – on the historical record, autocrats have been singularly prone to irrational decisions, oligarchs perhaps less so2. It is a reasonable assumption for algorithmic democracy – in fact any decision-making system at all – to insist on individual consistency: “wer A sagt, muss auch B sagen”.
A more interesting case is seen when many share views about ends, but have divergent views as to means. A social welfare function that independently aggregates preferences about ends and means would give a high ranking to the end, and a low ranking to each of the means. In this way consistent private choices can lead to inconsistent collective ones. Common observation suggests that such blockages are far from rare. A recent example from US history is the failure of the first Clinton administration to enact universal health coverage. Another, and far more serious, was the failure of the Founding Fathers to abolish slavery, during the early decades of US independence when this aim was shared by the majority of the population, of the states, and of educated opinion. Abolitionists could not agree on the strategy: stopping the slave trade first, or buying out all the slave-owners3. In both historical examples, the decision-makers we know of had individually consistent agendas.
The second step might be to avoid separate decisions on means and ends, and allow only consistent courses of action (sets of ends and means) as arguments in the algorithm.
But this is by itself only opens up a real possibility of circular majorities, as the following example shows.
a toy universe of three agents (1, 2, 3) evaluating four possible
states (a, b, x, y). The states a and b are mutually independent,
likewise x and y. But the two subsets are not mutually
independent. The interpretation is as ends and means. These are
related by a constraint that only pairs of ends and means are
allowed. So the objects to be ordered are ax, ay, bx, by.
Assume by is last choice for all and can be ignored. The
following dual patterns generate cyclical majorities under a simple
1: ax > ay > bx and its inverse 1a: bx > ay > ax
2: ay > bx > ax “ 2a: ax > bx > ay
3: bx > ax > ay “ 3a: ay > ax > bx
There is nothing in the model that makes these patterns infeasible, so the point is made.
than this, a cyclical pattern is in some circumstances intuitively
quite likely. Interpret the universe to be a legislature with three
factions, and the choice the adoption of a budget. Let a and b
be expenditure plans; a is high, b low. Let x be a
low tax law, y high. Then the choices can be interpreted as: ax:
high deficit budget; ay: high balanced budget; bx: low
balanced budget, and the despised by: low surplus budget. The
three factions can be read as follows, again with psychological
plausibility: Faction 1 are liberal Democrats (the “left”):
their priority is high spending, with a secondary preference for low
taxes; faction 2 are conservative Republicans (the “right”):
their priority is low taxes, with a secondary preference for low
spending; faction 3 are conservative Democrats (the ”centre-right”).
Their priority is a balanced budget, with a secondary preference for
Under the first pattern above, the interpretation of the paradox is then:
the high deficit budget is voted
down by an alliance of the left and centre-right:
▪ the high balanced budget is voted down by an alliance of the left and right:
▪ the low balanced budget is voted down by an alliance of the left and centre-left.
Such scenarios cannot be dismissed as remote mathematical contingencies. The US form of government, which allows any member of Congress to propose either tax or spending, has extreme difficulty in adopting a budget each year; in other systems, the initiative of legislators over the budget is constrained, solving one problem at the price of a considerable transfer of power to the executive branch, which is a local oligarchy or autocracy.
It may be objected that the example holds for a small universe of choice, and a true algorithmic democracy, cutting out the middleman, would be practically safe, if not theoretically immune, from the problem. But this leads immediately to the converse problem of aggregation. Since individuals have so many different consistent plans for society, how do you reach a single solution?
One strategy is to allow the social welfare function to take as arguments the first-choice complete courses of action. It is most unlikely that there will be any kind of majority for any one of them. If we take a simple plurality, the decision may well be determined by small coteries of like-minded cranks. An iterative algorithm could take into account the second, third, .... n th choices of the participants, and allow the progressive rallying of support, through elimination of the less attractive options.
Quite apart from the combinatorial problems in processing such a vast data set, a more significant objection is that as we have seen individual preference maps are incomplete. How do you take account of the empty or fuzzy views of many people on some particular issue? Where do you start the process of elimination? By starting with the trivial, would there not be a risk of the final result on big issues being sensitive to an arbitrary choice among insignificant ones?
The last paragraph shows the way forward. The story of successive elimination of marginal choices and convergence on a scheme acceptable to many is basically sound, provided we do not read it algorithmically. For the citizen, working out what one wants and how it might be achieved is only the first step. The only way to reach a collective decision starting from structurally different and incomplete preference trees is, it seems to me, an interactive process of mutual discovery, bargaining and coalition-building. The basic model of democratic politics as a learning process is the scenario of The Magnificent Seven: A meets B, they exchange views, find common ground, negotiate away key differences into a common agenda – which is still partial - and then try to recruit C. This discovery process driven by incomplete information and pervasive complementarities. There is no reason to think that the end-points are unique; small differences in the starting sequence in a clean-start model could presumably lead to different results; chance and initiative make a difference.
One important intermediate result is the establishment of a party electoral platform. Real party platforms may be inconsistent in detail but they are expected to set out a coherent set of national priorities, and often do. Parties are essential mediators between individual choices and political decisions, not because individuals are stupid or inconsistent but because the only reliable scheme of aggregation of their preferences is stepwise discovery and negotiation. Essential mediators are of course in a position to extract rents, so actual parties are often very unsatisfactory; but they deserve a much better press than democratic theorists usually give them. There is no reason to think that party coalitions are uniquely determined. However, the social investment in the complex task of assembling a party explains why parties live so long; the basic outline of the US or British party map has changed only once or twice a century4.
These arguments are of course only a sketch but are nevertheless already quite persuasive. It was already shown by Arrow that no scheme of algorithmic democracy can be guaranteed determinate. My conjecture is that no such scheme can meet the linked tests of determinacy, computability and consistency to the more realistic standard of a reasonable surety. Democratic politics cannot be reduced to a procedure, but has to be a way of life.
Note on the frequency of cyclical majorities
Take 3 individuals 1,2,3 ordering 3 alternatives a,b,c. Let the voting rule be straight majority.
For one individual, there are 3! = 6 possible orderings. The number of distinct permutations of sets of 3 orderings is 6P3 = 120; distinct combinations, 20.
Only the combinations are relevant, as it doesn't matter for determinacy which individuals hold particular patterns of preference, only what overall patterns there are.
brute force enumeration, there are only 2 cyclical combinations:
a>b>c, b>c>a, c>a>b
and its mirror image
c>b>a, a>c>b, b>a>c.
The proportion of cyclical majorities is 10%.
increase the number of alternative states to 4. The total number of
possible orderings for one individual is now 4! = 24.
For our three people, it is 24.23.22/6 = 2024.
Cyclical majorities of three can arise for any of the four subsets abc, abd, acd, and bcd.
For each of these there are, as above, 2 combinations with cyclical majorities, making 8 in all, or 0.4%.
Are there any cyclical majorities of 4 or more elements that do not include one of the cyclical subsets of three? I have been unable to find any and suspect that they do not exist.
In that case, the general formula for the frequency of cyclical majorities is
(n!/(n-3)!) / ((2 * n! * (n-3)!))/ 3!) = 3 / ((n-3)!)2
The terms of this series converge rapidly to zero.
I conjecture, but am unable to prove, that this convergence would hold for larger sets of individuals.
1For an amusing debunking of academic myths of Altérité, and a long list of cultural universals, see Steven Pinker, The Language Instinct, 1995 edn, pp 59-65, 411-415
2The best example of a long-term prudent oligarchy is the Republic of Venice.
3Joseph J. Ellis, Founding Brothers, 2000, Chapter 3.
4This only holds for a first-past-the-post system. Under proportional representation, parties are more ephemeral and closer to the initial preferences of the citizens. The bargaining takes place between parties in the parliament to form government coalitions.