Social media and democracy: Can we learn from machine learning?

Marco Lardelli
Sep 17, 2019

In a democracy, decisions are not made by a single person (e.g. a king) but by a large number of people together. What are the benefits of this setup? Why has democracy been so successful historically? We might argue that democracies are more stable because everybody is allowed to participate in the decision process, so the result represents the wishes of the people. I personally don’t believe in this theory: there are many cases in history where the majority in a democratic country suppressed a minority. I rather assume that in a democracy, decisions tend to be much smarter than what a single king or a small group of leaders could achieve. It is the collective intelligence formed by the whole voting population that makes democracy superior to other systems of government.

Voting and electing are, from a machine learning perspective, classification problems: an “algorithm” in our brain has to choose one of several available options as the correct one. The options are candidates in the case of elections, or “yes/no” in the case of popular votes (as in Switzerland’s direct democracy).

The democratic approach is well known in machine learning under the name “ensemble methods”. Ensemble methods combine several weak classifiers (for instance, small decision trees) into a better classifier. In the “bagging” method, for instance, each (simple) classifier is trained on a different subset of the data. The outputs of these classifiers are then combined by a vote on the final result. The results are often much better than what a single complex classifier (like a large decision tree) can achieve with comparable effort.
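To make this concrete, here is a minimal sketch (my own illustrative example with a synthetic dataset, not part of any real voting study) of bagging shallow decision trees in scikit-learn and comparing the ensemble to a single weak tree:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Synthetic yes/no "voting" problem (placeholder data for illustration only).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

weak = DecisionTreeClassifier(max_depth=2)                       # a single weak classifier
bagged = BaggingClassifier(DecisionTreeClassifier(max_depth=2),  # many weak classifiers,
                           n_estimators=200, random_state=0)     # each on its own data subset

print("single weak tree:", cross_val_score(weak, X, y).mean())
print("bagged ensemble: ", cross_val_score(bagged, X, y).mean())
```

On this kind of data the bagged ensemble typically beats the single shallow tree by a clear margin, which is the whole point of the method.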

Assuming that this analogy between voters in a democracy and classifiers in a machine learning ensemble is to some extent valid, we can learn a few things about democracy and media:

“Stupid” voters

It is probably not a big problem that people are “stupid” (I personally don’t think they are, but it is an argument we often hear in discussions about why democracies don’t work well enough). With ensemble methods you can build a powerful classifier from a large number of dumb classifiers. Of course, the better the individual weak classifiers work, the better the result of the ensemble will be, so it does make sense to invest in the education of people. But in theory it should be possible to get smart election results from uneducated voters as well; you might just need more voters to get a good result.
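As a toy illustration of this point (a simulation I made up for this post, not a model of real voters), here is what happens when many independent “voters”, each right only 55% of the time, vote on a series of yes/no questions:

```python
import numpy as np

rng = np.random.default_rng(0)
p_correct, n_questions = 0.55, 10_000   # each voter is right 55% of the time

for n_voters in (1, 11, 101, 1001):
    votes = rng.random((n_questions, n_voters)) < p_correct  # True = voter got it right
    majority_right = votes.sum(axis=1) > n_voters / 2
    print(f"{n_voters:5d} voters -> majority correct {majority_right.mean():.1%} of the time")
```

With enough voters the majority is almost always right, even though each individual voter is barely better than a coin flip. Real voters are of course not independent of each other, which is exactly what the following sections are about.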

“People don’t read good newspapers anymore”

In ensemble methods it is important that the individual classifiers are different, which is usually achieved by training them on different subsets of the data. If we train all the weak classifiers on the same data, the performance of the ensemble will be no better than that of a single weak classifier (which is poor). This is because all the classifiers in the ensemble will be identical after training and will therefore always vote for the same class. It is therefore undesirable for the whole population to read the same newspaper (like the Pravda in the former Soviet Union). So if you believe that things would improve if everybody read your favorite quality newspaper, think again.
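Here is a small sketch of that effect (again my own toy example): the same kind of bagging ensemble as above, once with a different bootstrap subset per member and once with every member trained on the identical, full dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
stump = DecisionTreeClassifier(max_depth=2)

diverse = BaggingClassifier(stump, n_estimators=100,
                            bootstrap=True, random_state=0)      # different subsets
identical = BaggingClassifier(stump, n_estimators=100,
                              bootstrap=False, max_samples=1.0,  # everyone sees the
                              random_state=0)                    # same full dataset

print("trained on different subsets:", cross_val_score(diverse, X, y).mean())
print("trained on identical data:   ", cross_val_score(identical, X, y).mean())
```

In the second case all members learn the same tree, so the ensemble cannot do better than one weak tree on its own.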

“People get their information only from Facebook these days”

This (alone) can be an advantage. The social media feed of every person is different (depending on interests, friends and location), so the requirement that people consume different subsets of the data is fulfilled. In theory, social media could implement this much better than classic media (newspapers and TV), which are not personalized and feed the same information to a large number of people.

The filter bubble

What happens in machine learning if we train a classifier only on data we expect it to classify in a specific way (as agreement, for instance)? This is what happens when we let algorithms decide which information people see on social media, based on what they liked before. I have never tried this in a machine learning experiment, but it sounds like a very bad idea: the classifier will most probably deteriorate quickly. And if we do this with the classifiers of an ensemble, the performance of the ensemble will deteriorate as well.

In machine learning, the data subset used to train each weak classifier is selected randomly, so every subset still carries information about the whole data set.
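A rough sketch of the difference (a toy experiment of my own, so take the exact numbers with a grain of salt): one weak tree is trained only on the examples an initial model already “agrees with”, another on a random sample of the same size:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# An initial model, standing in for the voter's existing opinions.
seed = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_pool[:50], y_pool[:50])

# "Filter bubble": keep only the pool examples the seed model already puts in class 1.
in_bubble = seed.predict(X_pool) == 1
bubble_model = DecisionTreeClassifier(max_depth=2).fit(X_pool[in_bubble], y_pool[in_bubble])

# Control: a random sample of the same size from the same pool.
idx = np.random.default_rng(0).choice(len(X_pool), size=in_bubble.sum(), replace=False)
random_model = DecisionTreeClassifier(max_depth=2).fit(X_pool[idx], y_pool[idx])

print("trained inside the bubble: ", bubble_model.score(X_test, y_test))
print("trained on a random sample:", random_model.score(X_test, y_test))
```

The bubble-trained model only ever sees one corner of the data, so it typically generalizes worse than the one trained on a random sample, although the margin depends on the dataset.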

In ensemble methods it is also possible to make the weak classifiers different by feeding each of them a different subset of the data fields (the “random subspace” approach). Transferred to democracy, this means it is not a problem if each voter cares only about certain topics rather than all of them; it could even be an advantage. But again, if the data fields / topics are selected by an algorithm based on the classifier’s / voter’s past behavior, the performance of the classifier / voter will deteriorate.
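In scikit-learn this corresponds to the random-subspace options of bagging, where each member is handed a random subset of the feature columns; a minimal sketch (again my own illustrative example):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# 30 "topics" (features), of which only 10 actually carry information.
X, y = make_classification(n_samples=2000, n_features=30, n_informative=10, random_state=0)

subspace = BaggingClassifier(
    DecisionTreeClassifier(max_depth=3),
    n_estimators=100,
    max_features=0.4,         # each member sees only 40% of the topic columns ...
    bootstrap_features=True,  # ... drawn at random, not picked by the member itself
    random_state=0,
)
print("random-subspace ensemble:", cross_val_score(subspace, X, y).mean())
```

The crucial detail is that the columns are assigned at random, not chosen based on how the member behaved before.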

“Fake news”

What happens to an ensemble if we add noise to the training data, in the sense of mislabelled training records (for an image classifier, say, pictures of horses labelled as tigers)? The performance of all the weak classifiers would deteriorate, but it would still be possible to get good performance from the ensemble (we might just need more weak classifiers). If you feed all the classifiers the same data (the extreme classic-media “Pravda” case), the damage from fake news is much higher; remember that in this case the performance of the ensemble is equivalent to the deteriorated performance of a single weak classifier. Social media therefore may have the potential to make democracy more resilient to fake news. But if we allow fake news to be pushed via paid posts to a large fraction of the voters, we again have a problem.
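As a toy version of this (my own sketch, with an arbitrary 20% noise level), we can flip a fraction of the training labels and compare a single weak tree with a bagged ensemble of weak trees trained on the same noisy labels:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# "Fake news": mislabel 20% of the training records (the test labels stay clean).
rng = np.random.default_rng(0)
noisy = y_tr.copy()
flip = rng.random(len(noisy)) < 0.2
noisy[flip] = 1 - noisy[flip]

single = DecisionTreeClassifier(max_depth=2).fit(X_tr, noisy)
bagged = BaggingClassifier(DecisionTreeClassifier(max_depth=2),
                           n_estimators=200, random_state=0).fit(X_tr, noisy)

print("single weak tree on noisy labels:", single.score(X_te, y_te))
print("bagged ensemble on noisy labels: ", bagged.score(X_te, y_te))
```

The ensemble usually holds up at least as well as, and often better than, the single tree; how much better depends on the data and the amount of noise.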

The viral effect

In social media, some content can reach a large fraction of the population through viral effects. This again might be bad, as it “trains” the “ensemble” members on the same data. Forwarding limits, as recently introduced by WhatsApp, might therefore be a good thing for all social media platforms.

We can conclude from this analogy that social media could be a chance for democracy and could potentially offer substantial advantages over classic media. But only if we get it right! If social media platforms are not carefully designed, their impact on democracy might be adverse.

This topic still needs and deserves more research and I believe there is a lot to be learned from machine learning.

Marco Lardelli

Swiss physicist, software / machine learning engineer, data scientist. Twitter: @MarcoLardelli https://lardel.li