Whodunit? Maths Has The Answer

17th August 2015


Dame Agatha Mary Clarissa Christie

Dame Agatha Mary Clarissa Christie, DBE, is officially the most successful novelist of all time, and remains so decades after her death in 1976. With over 83 books published in her lifetime, 66 of them murder mysteries, she is still incredibly popular 125 years after her birth and has sold over 2 billion books (2 BILLION) in over 103 languages.

Her novel “And Then There Were None” has sold 100 million copies, making it the world's best-selling murder mystery ever, and one of the best-selling books of all time.

Readers gripped by the adventures of her fascinating characters have always loved trying to figure out who the murderer was or ‘whodunit’.

In the Doctor Who episode “The Unicorn And The Wasp” (editor’s note: if you don’t know what Doctor Who is then you really should), the time-travelling genius meets Dame Christie in 1926 and gives her high praise indeed, saying: “Oh, I love your stuff. What a mind. You fool me every time. Well, almost every time. Well, once or twice. Well, once. But it was a good once.”

Data Analysis

Most people would be stumped more than “once” trying to figure out ‘whodunit’ in an Agatha Christie novel. But researchers Dr Dominique Jeannerod and Dr James Bernthal, together with data analyst Brett Jacob, have devised a formula to predict who the murderer is in each book.

They told The Guardian newspaper that they analysed 27 of Christie’s books, looking at lots of critical factors such as location, murder method, which detective was the protagonist and even what modes of transport are prevalent to predict who the murderer is likely to be.

Agatha Christie fans may be horrified, and would argue that her inventive imagination cannot possibly be boiled down to numbers. It is certainly true that the success of her novels was not just due to the suspense of the ‘whodunit’ factor but the fact that she was a brilliant writer and an incredible observer of people. She understood what made them tick and this shows in her novels.

However, with such meticulous analysis, certain patterns do emerge and the team were able to identify them. As they explained: “We were able to discover patterns emerging in several aspects of Christie’s novels: trends formed when we grouped our data via year, detective, gender of culprit, motive, cause of death.”

The formula itself is rather dauntingly complex so we will be kind and summarise their findings for you (luckily the team themselves have done so as well). Here is the formula.

k(r, δ, θ, c) = f{rk + δ + θP,M, c(3 ≤ 4.5)}

The research was commissioned by the TV channel Drama, just as Professor Eugenia Cheng’s formulas on food were commissioned by various food companies. This in no way invalidates the work involved, and both are still very interesting to read about!

What To Watch Out For

One factor is the relationship, represented in the formula by the letter r, between the murderer and the victim. As Brett Jacob explains, “… in the majority of cases the victim is related by blood or a spouse of the killer…”, which is certainly a good place to start. The team also discovered that a “main clue” tends to show up halfway through the story.

Many of the results of the analysis concern the gender of the killer. For example, they found that if the victim was strangled or stabbed, the killer was more likely to be male but if they were poisoned the killer was more likely to be female.

This makes a slightly sexist kind of sense; men are seen to be more physical and violent whereas women would be seen as more devious.

However, some patterns seem to make no sense, yet the conclusions drawn from the data analysis seem to prove accurate: if the primary mode of transport featured in the novel is nautical rather than cars or trains, the murderer is likely to be male. Even the location gives a clue: if the novel is set in a country house, the killer is 75% more likely to be a woman than a man.

Female killers are more likely to be given away by ‘a domestic item’, such as a forgotten glove or hat, whereas male killers are more likely to be identified through the discovery of new information and the application of logic.
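To see how gender patterns like these could be combined into a guess, here is a minimal sketch of our own. It is not the researchers’ formula: the function name, the lookup tables and the one-hint-one-vote scoring are all assumptions made for illustration, and the study’s real weightings are not given here.

```python
from collections import Counter

# Hypothetical encoding of the gender patterns described above.
# Every hint counts as one equal vote; the real study's weights are unknown to us.
METHOD_HINTS = {"strangled": "male", "stabbed": "male", "poisoned": "female"}
TRANSPORT_HINTS = {"nautical": "male"}
SETTING_HINTS = {"country house": "female"}
CLUE_HINTS = {"domestic item": "female", "new information": "male"}


def guess_culprit_gender(method=None, transport=None, setting=None, clue=None):
    """Tally the hints and return 'male', 'female' or 'undecided'."""
    votes = Counter()
    for table, value in (
        (METHOD_HINTS, method),
        (TRANSPORT_HINTS, transport),
        (SETTING_HINTS, setting),
        (CLUE_HINTS, clue),
    ):
        hint = table.get(value)  # None when the value carries no known hint
        if hint is not None:
            votes[hint] += 1
    if votes["male"] == votes["female"]:
        return "undecided"
    return "male" if votes["male"] > votes["female"] else "female"


print(guess_culprit_gender(method="poisoned", setting="country house"))  # female
print(guess_culprit_gender(method="stabbed", transport="nautical"))      # male
print(guess_culprit_gender(method="poisoned", transport="nautical"))     # undecided
```

A simple vote like this treats every clue as equally strong, which is almost certainly too crude, but it shows how several weak patterns can point the same way.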

Sentiment

Even the choice of words and the sentiment behind them were telling.

Dominique Jeannerod explained that questions had long been asked about whether Christie followed a pattern, saying: “We gathered data including the number of culprit mentions per chapter, a ‘sentiment analysis’ of culprit mentions, transport mentions and several cross-references with other key concepts of the novels.”

“We also assessed the sentiment of the first mentions of the culprit in each work, using a sentiment analysis program, Semantria, to unmask themes in Christie’s word patterns and choices when mentioning the culprit. We found that, generally, for example, she employs more negative sentiment when the culprit is female, whereas a male culprit has higher levels of neutral or positive sentiment.”
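For the curious, here is a rough sketch of what counting culprit mentions per chapter and scoring the first mention might look like. This is not Semantria and does not use its API: the two tiny word lists, the function names and the sample text are all invented for illustration, and a real sentiment tool is far more sophisticated.

```python
import re

# Tiny made-up word lists for illustration only; real tools use much larger lexicons.
POSITIVE = {"charming", "kind", "gentle", "pleasant"}
NEGATIVE = {"cold", "sharp", "bitter", "sly"}


def mentions_per_chapter(chapters, name):
    """Count whole-word, case-insensitive mentions of `name` in each chapter."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
    return [len(pattern.findall(text)) for text in chapters]


def first_mention_sentiment(chapters, name):
    """Score the first sentence naming `name`: positive words minus negative words."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
    for text in chapters:
        # Naive sentence split on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if pattern.search(sentence):
                words = re.findall(r"[a-z']+", sentence.lower())
                positives = sum(word in POSITIVE for word in words)
                negatives = sum(word in NEGATIVE for word in words)
                return positives - negatives
    return None  # the name never appears


# Invented placeholder text, not taken from any Christie novel.
chapters = [
    "The guests arrived at noon. Mrs Quill gave a cold, sharp smile.",
    "Nobody spoke at dinner. Later, Mrs Quill left early. Mrs Quill said nothing.",
]
print(mentions_per_chapter(chapters, "Mrs Quill"))     # [1, 2]
print(first_mention_sentiment(chapters, "Mrs Quill"))  # -2
```

A score below zero means the first mention leans negative, which is the kind of signal the team describe looking for across the novels.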

We always thought of Christie as something of a proto-feminist, but it seems that subconsciously even she could not escape the prejudices and gender-based biases of her time!

To be fair she was born in 1890 so we won’t judge her too harshly for that.

We found this study fascinating, and it just shows the power that maths and data analysis have – they can even unravel the mind of the world’s greatest-ever crime writer! You could argue that it takes the fun out of the suspense, but we think it is amazing. Like hacking the world…