Statistics catch Twitter bots in the act.

Seriously, Science?
By Seriously Science
Oct 24, 2013 9:00 PM
Nov 19, 2019 8:33 PM



Photo: flickr/steevithak

Ever wonder whether a Twitter account is run by a real person or a bot? So did these British scientists, who have developed a way to tell the two apart. In this study, the authors use Bayesian statistics to distinguish between human, corporate, and robot Twitter accounts, relying primarily on tweet timing. They found that robot accounts post more consistently throughout the day, while human accounts exhibit more highs and lows corresponding to daily routines. In particular, humans take a break from tweeting around noon and reach peak Twitter frenzy around 9pm. So, if you want your Twitter robot to appear more human, give it a lunch break.

Scaling-Laws of Human Broadcast Communication Enable Distinction between Human, Corporate and Robot Twitter Users

"Human behaviour is highly individual by nature, yet statistical structures are emerging which seem to govern the actions of human beings collectively. Here we search for universal statistical laws dictating the timing of human actions in communication decisions. We focus on the distribution of the time interval between messages in human broadcast communication, as documented in Twitter, and study a collection of over 160,000 tweets for three user categories: personal (controlled by one person), managed (typically PR agency controlled) and bot-controlled (automated system). To test our hypothesis, we investigate whether it is possible to differentiate between user types based on tweet timing behaviour, independently of the content in messages. For this purpose, we developed a system to process a large amount of tweets for reality mining and implemented two simple probabilistic inference algorithms: 1. a naive Bayes classifier, which distinguishes between two and three account categories with classification performance of 84.6% and 75.8%, respectively and 2. a prediction algorithm to estimate the time of a user's next tweet with an R^2=0.7. Our results show that we can reliably distinguish between the three user categories as well as predict the distribution of a user's inter-message time with reasonable accuracy. More importantly, we identify a characteristic power-law decrease in the tail of inter-message time distribution by human users which is different from that obtained for managed and automated accounts. This result is evidence of a universal law that permeates the timing of human decisions in broadcast communication and extends the findings of several previous studies of peer-to-peer communication."

Bonus quote from the full text:

"We can observe that personal accounts increase their tweeting activity level as the day progresses, peaking at 9pm. Managed accounts tend to tweet more during work hours, between 9am and 6pm. The dip in the distribution at 12pm can probably be explained by lunch hour breaks. Finally, the distribution for bot-controlled accounts exhibits a variety of peaks, which is probably because their behaviour is not associated with a structured daily routine."
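For the curious, here is a rough sketch of how a naive Bayes classifier can sort accounts by timing alone. It is not the authors' code: the abstract does not spell out their exact features, so this version assumes the only feature is the hour of day of each tweet, and the function names (`train`, `classify`) and the toy training data are invented purely for illustration.

```python
import math
from collections import Counter

HOURS = 24  # assumption: one feature bin per hour of the day


def train(accounts):
    """Fit a multinomial naive Bayes model.

    accounts: list of (label, [tweet hours in 0..23]).
    Returns (log_prior, log_like) dictionaries keyed by label.
    """
    label_counts = Counter(label for label, _ in accounts)
    # Start every hour bin at 1 (Laplace smoothing) so unseen hours
    # never get probability zero.
    hour_counts = {label: [1] * HOURS for label in label_counts}
    for label, hours in accounts:
        for h in hours:
            hour_counts[label][h] += 1

    total = sum(label_counts.values())
    log_prior = {lab: math.log(n / total) for lab, n in label_counts.items()}
    log_like = {}
    for lab, counts in hour_counts.items():
        s = sum(counts)
        log_like[lab] = [math.log(c / s) for c in counts]
    return log_prior, log_like


def classify(hours, log_prior, log_like):
    """Return the label with the highest posterior log score."""
    scores = {
        lab: log_prior[lab] + sum(log_like[lab][h] for h in hours)
        for lab in log_prior
    }
    return max(scores, key=scores.get)


# Toy data, made up for this example (NOT from the study):
# evening-heavy "personal", office-hours "managed", evenly spread "bot".
training = [
    ("personal", [8, 19, 20, 21, 21, 22]),
    ("personal", [9, 18, 20, 21, 21, 23]),
    ("managed", [9, 10, 11, 14, 15, 17]),
    ("managed", [9, 10, 13, 14, 16, 17]),
    ("bot", [0, 4, 8, 12, 16, 20]),
    ("bot", [2, 6, 10, 14, 18, 22]),
]
log_prior, log_like = train(training)

print(classify([20, 21, 21, 22], log_prior, log_like))  # evening tweets
print(classify([10, 11, 14, 15], log_prior, log_like))  # office-hours tweets
```

With this toy data, an account that tweets only in the evening scores highest as "personal", and one that tweets during office hours scores highest as "managed". A real attempt would need far more data and, going by the abstract, would probably also use the gaps between consecutive tweets rather than the clock hour alone.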

Related content: NCBI ROFL: Public health surveillance of dental pain via Twitter.

NCBI ROFL: Social networks lack useful content for incontinence.

NCBI ROFL: Who needs a doctor when you have Facebook?


Copyright © 2024 Kalmbach Media Co.