When societies make decisions in the information age — is this a bit like video processing?

Nov 9, 2021

This article is about something I know well — video processing — and something I don’t really know at all — the role of information systems when societies make collective decisions (like voting in elections, or using a stock market to determine the values of companies).

I suggest that these things could be similar, look at traits of video processing systems, and ask whether these can tell us anything about collective decision-making in the modern world.

If you’re an expert in the role of information systems in society I’d be delighted and fascinated to hear what you think about this. But do please note the aim here is to share ideas that might stimulate discussion and exploration. Some or all of this analogy could turn out to be very wrong!

Algorithms

A video processing algorithm

In 2003 I finished my PhD in video processing, which focussed on tracking moving objects in video images. This was before some of the bigger advances in deep learning, and we worked with stochastic signal processing methods known as particle filters. For my application they worked like this:

1. Take a model of how you think something looks, and start with your best guesses at where it is. If you have no other information you may have a wide distribution of guesses (it could be anywhere); if you are tracking through a video sequence, the guesses might be constrained by where the object was in previous frames. Each guess location is known as a ‘particle’.
2. Jiggle these particles around a bit, and look for evidence that the object is at these modified particle locations by checking how well the model matches the real video there.
3. Keep a subset of the particles, but give priority to the ones that have the most evidence. Particles with weak evidence are not all thrown away, because there is still some chance the object is at those locations, but they are less likely to be chosen. The subset that remains represents the distribution of where the object might be.
4. Aggregate the results from the particles to give a best estimate.

Figure: looking for the torso in video images. The left image shows perturbed initial guesses as in step 2, and the right image shows the subset remaining after evaluating evidence as described in step 3.
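As a rough illustration, the four steps above can be sketched in Python on a toy one-dimensional ‘video’. This is a minimal sketch under stated assumptions, not the tracker from the PhD work: the appearance model `TEMPLATE`, the synthetic frames from `make_frame`, the Gaussian match score in `evidence`, and all parameter values are hypothetical choices made for the example. Step 3 is implemented as weighted resampling with replacement, which is one common way to keep the well-supported particles while still giving weak ones a chance.

```python
import math
import random

# Hypothetical appearance model: the intensity profile we expect the object to have.
TEMPLATE = [0.2, 0.6, 1.0, 0.6, 0.2]
HALF = len(TEMPLATE) // 2


def make_frame(object_centre, rng, width=100, noise=0.05):
    """Build a synthetic 1-D frame: noisy background with the object drawn in."""
    frame = [rng.gauss(0.0, noise) for _ in range(width)]
    for offset, value in enumerate(TEMPLATE):
        frame[object_centre - HALF + offset] += value
    return frame


def evidence(frame, position, sigma=0.3):
    """Score how well the model matches the frame at a particle's location."""
    centre = int(round(position))
    if centre - HALF < 0 or centre + HALF >= len(frame):
        return 0.0  # the particle has drifted outside the frame
    patch = frame[centre - HALF:centre + HALF + 1]
    squared_error = sum((p - t) ** 2 for p, t in zip(patch, TEMPLATE))
    return math.exp(-squared_error / (2.0 * sigma ** 2))


def filter_step(particles, frame, rng, jiggle=2.0):
    # Step 2: jiggle each particle, then look for evidence at its new location.
    moved = [p + rng.gauss(0.0, jiggle) for p in particles]
    weights = [evidence(frame, p) for p in moved]
    if sum(weights) <= 0.0:
        weights = [1.0] * len(moved)  # no evidence anywhere: treat all as equal
    # Step 3: resample in proportion to evidence. Weak particles can survive,
    # but they are less likely to be chosen.
    kept = rng.choices(moved, weights=weights, k=len(moved))
    # Step 4: aggregate the surviving particles into one best estimate.
    return kept, sum(kept) / len(kept)


def track(object_centres, n_particles=500, width=100, seed=0):
    rng = random.Random(seed)
    # Step 1: with no other information, guesses are spread over the whole frame.
    particles = [rng.uniform(0, width - 1) for _ in range(n_particles)]
    estimates = []
    for centre in object_centres:
        frame = make_frame(centre, rng, width)
        particles, estimate = filter_step(particles, frame, rng)
        estimates.append(estimate)
    return estimates


true_centres = [50 + t for t in range(10)]  # object moves one pixel per frame
estimates = track(true_centres)
print(f"true: {true_centres[-1]}, estimated: {estimates[-1]:.1f}")
```

After the first frame the particles are no longer spread out: each new frame starts from the survivors of the previous one, which is how earlier frames constrain later guesses.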

An algorithm for collective human decision-making

Let’s compare the above video algorithm steps with what happens when large groups of people make collective decisions. I’ll use a democratic election as the example, but we could imagine a similar process when investors select companies to buy or sell on a stock market.

1. An individual within a population has a set of mental connections that have been informed by their experience of the world and the information they’ve received up until now. We could think of each person like a particle in the video processing system.
2. Organisations, communities, and individuals provide information flows to affect the mental connections of each person (particle) before they cast their vote. This is like the act of jiggling the particles around.
3. The vote of each person is cast. When they do this their minds are in some way summing the internal signals that indicate which party or candidate they should choose.
4. Votes are counted to aggregate the individual decisions across the population.
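To make the comparison concrete, the same four steps can be written as a toy simulation. This is only a sketch of the analogy, not a model of real voters: the single-number ‘leaning’, the two candidates `A` and `B`, the Gaussian jiggle, and every parameter value are hypothetical simplifications.

```python
import random


def run_election(n_voters=10_000, jiggle=0.3, seed=0):
    """Toy election between hypothetical candidates A and B."""
    rng = random.Random(seed)
    # Step 1: each voter (particle) starts with a leaning shaped by past
    # experience: negative values favour A, positive values favour B.
    leanings = [rng.uniform(-1.0, 1.0) for _ in range(n_voters)]
    # Step 2: information flows jiggle each voter's leaning before the vote.
    leanings = [x + rng.gauss(0.0, jiggle) for x in leanings]
    # Step 3: each voter sums their internal signals into a single choice.
    votes = ["B" if x > 0 else "A" for x in leanings]
    # Step 4: votes are counted to aggregate the individual decisions.
    return {"A": votes.count("A"), "B": votes.count("B")}


tally = run_election()
print(tally)
```

Raising `jiggle` in this sketch is the crude equivalent of stronger information flows: more voters end up on the other side of zero from where their prior experience placed them.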

Traits of the stochastic video processing algorithm

The above algorithms sound similar, don’t they? But is this analogy of any use? Can we use traits of these stochastic video processing techniques to tell us something about democracy or about stock purchases? I’ll introduce a few traits that might apply so that we can explore the analogy further.

Video systems that succeed

The particle filter technique described depends on a computer model of how an object looks. If this model is accurate then the system can do a great job of tracking the object. It can even do so when evidence is ambiguous, because enough particles will ‘vote’ for the most appropriate outcome in aggregate.

Video systems that fail

Unfortunately, engineers using particle filters for video processing soon discover that this technique is irritatingly effective at seeking out the ‘holes’ in a model. What if there are other objects that look a bit like the one you’re trying to find? The jiggling around could cause some particles to latch onto the wrong one. If the model is decent then that should only happen to a few particles, and the aggregate decision will be correct. But if the model is weak then things can go wrong. Sometimes they go wrong for a few frames and snap back, but sometimes they go wrong irrecoverably.
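A small numerical sketch shows what a ‘hole’ in a model can look like. It compares two hypothetical evidence functions on a real object and a look-alike: one checks the whole appearance profile, the other only checks the brightness of the centre pixel. The profiles `TEMPLATE` and `DISTRACTOR`, both scoring functions, and the assumption that equal numbers of particles sit on each object are all invented for illustration.

```python
import math

TEMPLATE = [0.2, 0.6, 1.0, 0.6, 0.2]    # hypothetical model of the real object
DISTRACTOR = [0.5, 0.8, 1.0, 0.8, 0.5]  # hypothetical look-alike object


def strong_evidence(patch, sigma=0.3):
    """Compare the whole patch against the model."""
    err = sum((p - t) ** 2 for p, t in zip(patch, TEMPLATE))
    return math.exp(-err / (2.0 * sigma ** 2))


def weak_evidence(patch, sigma=0.3):
    """A model with a 'hole': it only checks the centre pixel's brightness."""
    mid = len(TEMPLATE) // 2
    err = (patch[mid] - TEMPLATE[mid]) ** 2
    return math.exp(-err / (2.0 * sigma ** 2))


def share_on_distractor(evidence):
    """Expected share of resampled particles that land on the look-alike,
    assuming equal numbers of particles currently sit on each object."""
    on_object = evidence(TEMPLATE)
    on_distractor = evidence(DISTRACTOR)
    return on_distractor / (on_object + on_distractor)


strong_share = share_on_distractor(strong_evidence)
weak_share = share_on_distractor(weak_evidence)
print(f"strong model: {strong_share:.2f} of particles on the look-alike")
print(f"weak model:   {weak_share:.2f} of particles on the look-alike")
```

Under the stronger model only a minority of particles favour the look-alike, so the aggregate stays on the real object. Under the weak model the two are indistinguishable, so which one wins comes down to chance.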

Systems can also fail if the input data is bad. Maybe the camera is noisy or faulty. Making sure that the system does not produce wildly spurious results in such a case is an important engineering challenge for these types of system.

Questions

With those traits in mind, here are questions to explore ways that the analogy might be useful. I wish I had answers but I don’t (yet!) — hope these are food for thought anyway. Do let me know what you think.

Firstly, is it even reasonable to imagine that a democratic or investment process has strengths and weaknesses of a similar nature to the stochastic signal processing system described? And if so, how might we model the system in more detail?
Can the analogy help us reach consensus on what a ‘strong’ model for our societies would look like, and on how we should build it into our systems? For example, does it tell us anything about the roles of civic education and deliberative democracy, or the regulation of financial markets?
In this analogy we could consider modern information systems (including social media) as having unprecedented power to ‘jiggle’ the particles, and therefore to cause models that even recently had appeared strong to make unexpected decisions. It is tempting to think those decisions might turn out to be inefficient, or even catastrophic. Is it helpful to think of the system in this way?
If so, how do we design systems that mitigate this problem? How much of it can be solved by strengthening the model, and how much by constraining the information perturbations?
Analogues of noisy input data in the video system might be poorly formed policy options to choose between, or corrupt companies on the stock market. Obvious fixes would be to have well-designed policy options and companies with strong codes of ethics. But if the analogy holds and the systems are not robust to these scenarios, then they could produce extreme outcomes. Is there a need to consider how to make the systems more fault-tolerant?