A newly developed algorithm can spot depression in Twitter users with 88.39 percent accuracy.
Developed by researchers at Brunel University London and the University of Leicester, the algorithm determines someone’s mental state by extracting and analysing 38 data points from their public Twitter profile, including the content of their posts, their posting times, and the other users in their social circle.
The research team say similar systems could have a range of different uses in the future across multiple platforms, such as early depression diagnosis, employment screening or police investigations.
“We tested the algorithm on two large databases and benchmarked our results against other depression detection techniques,” said Prof Abdul Sadka, Director of Brunel’s Institute of Digital Futures. “In all cases, we've managed to outperform existing techniques in terms of their classification accuracy.”
The algorithm was trained using two databases that contain the Twitter history of thousands of users, alongside additional information about those users’ mental health. Eighty percent of the information in each database was used to teach the bot, with the other 20 percent then used to test its accuracy.
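The 80/20 split described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the function name `split_users`, the fixed random seed and the use of plain user IDs are all assumptions made for the example.

```python
import random


def split_users(user_ids, train_fraction=0.8, seed=0):
    """Shuffle user IDs and split them into training and test lists.

    Hypothetical helper: the article only states that 80 percent of each
    database was used for training and 20 percent for testing.
    """
    shuffled = list(user_ids)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Example: 1,000 placeholder user IDs give 800 training and 200 test users.
train, test = split_users(range(1000))
```

Keeping the two groups disjoint matters because accuracy measured on users the model has already seen would overstate how well it generalises.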
The bot works by first excluding all users with fewer than five tweets and running the remaining profiles through natural language software to correct for misspellings and abbreviations.
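A minimal sketch of that clean-up step might look like the following. The five-tweet threshold comes from the description above; the abbreviation table is a made-up stand-in for the natural language software, which the article does not name, and both function names are hypothetical.

```python
MIN_TWEETS = 5  # threshold stated in the article

# Hypothetical lookup table; the actual normalisation software is not named.
ABBREVIATIONS = {"u": "you", "gr8": "great", "idk": "i do not know"}


def keep_active_users(timelines, min_tweets=MIN_TWEETS):
    """Drop users who have posted fewer than `min_tweets` tweets."""
    return {
        user: tweets
        for user, tweets in timelines.items()
        if len(tweets) >= min_tweets
    }


def expand_abbreviations(text, table=ABBREVIATIONS):
    """Lower-case the text and replace known abbreviations word by word."""
    return " ".join(table.get(word, word) for word in text.lower().split())
```

Here `timelines` is assumed to be a dictionary mapping each user name to a list of that user's tweets.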
It then considers 38 distinct factors – such as a user’s use of positive and negative words, the number of friends and followers they have, and their use of emojis – and makes a determination on that user’s mental and emotional state.
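To give a feel for what such factors look like in practice, the sketch below computes five of the kinds of signals mentioned above. It is not the researchers' feature set: the word lists are tiny placeholders, the emoji check covers only two common Unicode ranges, and the function and key names are invented for the example.

```python
# Placeholder word lists; a real system would use a full sentiment lexicon.
POSITIVE = {"happy", "great", "love"}
NEGATIVE = {"sad", "tired", "alone"}


def is_emoji(ch):
    """Rough emoji test covering two common Unicode ranges only."""
    cp = ord(ch)
    return 0x1F300 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF


def extract_features(tweets, friends, followers):
    """Turn one user's tweets and profile counts into a feature dictionary."""
    words = [w.strip(".,!?").lower() for t in tweets for w in t.split()]
    return {
        "positive_words": sum(w in POSITIVE for w in words),
        "negative_words": sum(w in NEGATIVE for w in words),
        "emojis": sum(is_emoji(ch) for t in tweets for ch in t),
        "friends": friends,
        "followers": followers,
    }
```

A classifier would then be trained on one such dictionary per user, alongside the label recording whether that user is known to have depression.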
Using the Tsinghua Twitter Depression Dataset, the team managed an accuracy of 88.39 percent, whilst an accuracy of 70.69 percent was achieved using Johns Hopkins University’s CLPsych 2015 dataset.
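Accuracy here means the share of test users whose label the algorithm predicted correctly. As a sketch with made-up labels (1 for depressed, 0 for not), and a hypothetical function name, it can be computed like this:

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Toy example: three of the four predictions match, giving 0.75.
score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

On the same scale, the reported 88.39 percent corresponds to a score of 0.8839.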
“Anything that's above 90 percent is considered excellent in machine learning. So, 88 percent for one of the two databases is fantastic,” said Prof Sadka.
“It's not 100 percent accurate, but I don't think at this level any machine learning solution can achieve 100 percent reliability. However, the closer you get to the 90 percent figure, the better.”
The team say that such a system could potentially flag a user’s depression before they post something into the public domain, paving the way for platforms such as Twitter and Facebook to proactively flag mental health concerns with users.
However, the bot can also be used after a post has made it into the public domain, potentially allowing employers and other businesses to assess a user’s mental state based on their social media posts. The researchers say it could be used for a number of purposes, including sentiment analysis, criminal investigations or employment screening.
“The proposed algorithm is platform independent, so can also be easily extended to other social media systems such as Facebook or WhatsApp,” said Prof Huiyu Zhou, Professor of Machine Learning at the University of Leicester.
“The next stage of this research will be to examine its validity in different environments or backgrounds, and more importantly, the technology raised from this investigation may be further developed to other applications, such as e-commerce, recruitment examination or candidacy screening.”
The research, Cost-sensitive Boosting Pruning Trees for depression detection on Twitter, was published by the Institute of Electrical and Electronics Engineers (IEEE).
Tim Pilgrim, Media Relations