Bot accounts are a persistent problem on Twitter, where they can be used to spam out favorable news stories and influence politics more broadly. But as companies and third-party groups have tried to push back against bot campaigns, simply identifying the accounts has proven remarkably difficult.
A new system called the Botometer, built by researchers at Indiana University and Northeastern University, is the perfect example of why. The system looks at over 1,000 factors, ranging from the tweets themselves (including metadata on how and where they were posted) to the composition of an account’s followers. “We are using a wide range of signals to compute scores,” said Onur Varol, a Northeastern University researcher who worked on the project. “Depending on user behavior, different feature types might be revealing.”
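To give a rough sense of how that kind of feature-based scoring works, here is a minimal, purely illustrative sketch: each account is reduced to a handful of numeric features, and a simple logistic model maps them to a 0–100 bot-likelihood score. The feature names, weights, and example account below are all made up for illustration; they are not Botometer’s actual features or model.

```python
import math

# Illustrative weights for a toy bot classifier. These values are
# invented for this sketch -- the real system uses over 1,000 features
# and a trained machine-learning model.
EXAMPLE_WEIGHTS = {
    "tweets_per_day": 0.08,        # very high posting rates look bot-like
    "followers_to_friends": -0.4,  # organic accounts attract more followers
    "default_profile_image": 1.5,  # an unmodified profile is a weak bot signal
    "tweet_time_entropy": -0.9,    # humans post at irregular hours
}
BIAS = -1.0

def bot_score(features: dict) -> float:
    """Map account features to a bot likelihood in [0, 100]
    using a logistic (sigmoid) function."""
    z = BIAS + sum(EXAMPLE_WEIGHTS[name] * value
                   for name, value in features.items())
    return 100 * (1 / (1 + math.exp(-z)))

# A hypothetical, heavily automated account:
spammy = {
    "tweets_per_day": 40.0,
    "followers_to_friends": 0.1,
    "default_profile_image": 1.0,
    "tweet_time_entropy": 0.2,
}
print(round(bot_score(spammy), 1))  # prints a high score under these made-up weights
```

The point of the sketch is the quote above: no single feature is decisive, so the score is an aggregate, and which signals matter most depends on how a given account behaves.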
The early results have been alarming. When I ran a few accounts through the system earlier this week, my own came away with a 27 percent score — while our esteemed editor-in-chief weighed in at 40 percent. That’s just 3 percentage points lower than @Arguetron, Sarah Nyberg’s infamous counter-troll bot. (Nilay’s score has come down in the days since, presumably the result of more humane and emotional tweeting.)
In theory, anything under 40 percent basically counts as a human finding, so I don’t need to start questioning my humanity just yet. And crucially, Varol doesn’t claim the system is accurate enough to make a firm judgment. The system is still a work in progress, and the team is actively soliciting more data for its Bot Repository.
At the same time, Varol says bots themselves are growing more sophisticated, and have become significantly harder to spot since he began studying them in 2011. That’s bad news for governments, social networks, and anyone else hoping to crack down on bot spam. We’re getting better at spotting bots — just not fast enough to catch up with the bots themselves.