Twitter Analytics : Which usage behavior attracts many followers?

This is the first in a series of posts that apply Data Mining and Text Mining to Twitter usage data, both to extract potentially useful facts and to draw conclusions such as what makes a Twitter account interesting to other users.

The conclusions that will be presented here are from the analysis of 3651 Twitter accounts and are meant to show how Predictive Analytics can help. Please note that results are shown for informational purposes only.


First, the data used can be summarized with the following table :





You can immediately see problems in the ranges of the data, especially in the number of "followers" and "following". This is to be expected, since the users captured include Jack Dorsey (co-founder of Twitter), Sen. McCain and George Stephanopoulos - users who obviously have a huge number of followers.

Before finding which usage behavior attracts many followers, one should define what exactly a "popular Twitter account" is. Is it just the absolute number of followers? Perhaps it could be equally important -or at least interesting- to also look at :

1) The followers/following ratio

2) The number of followers per day
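As a minimal sketch, the two alternative criteria above could be computed as follows. The `Account` class and its field names are hypothetical, chosen only for illustration, and the handling of zero values is an assumption rather than something taken from the analysis.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical record; field names are illustrative only.
    followers: int
    following: int
    elapsed_days: float  # days since the account was created


def followers_following_ratio(acc: Account) -> float:
    """Followers divided by following."""
    # Assumption: an account that follows nobody gets an infinite ratio
    # if it has any followers, and 0.0 otherwise.
    if acc.following == 0:
        return float("inf") if acc.followers > 0 else 0.0
    return acc.followers / acc.following


def followers_per_day(acc: Account) -> float:
    """Average number of followers gained per day of usage."""
    # Assumption: an account with no elapsed time is treated as one day old.
    if acc.elapsed_days <= 0:
        return float(acc.followers)
    return acc.followers / acc.elapsed_days
```

Either value could then replace the absolute follower count as the target when defining a successful account.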

For our example the absolute number of followers was used as the only criterion of a successful Twitter account. The results can be summarized with the following decision tree :





Some usage patterns that raise the chance of having a successful Twitter account are the following :

  • Having a bio is an absolute must : 82.3% of unsuccessful Twitter accounts have their biography information missing.

  • You should provide more than 3 links per 20 tweets and also post more than 0.960 updates per day.

  • If you don't want to provide more than 3 links per 20 tweets, then try to post more than 5.857 updates per day.

  • Users that post more than 3 links per 20 tweets but post 0.960 or fewer updates per day will need more than 222.5 days of usage to get an adequate number of followers.
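The link, update and usage-time patterns above can be read as a small set of nested rules. The sketch below paraphrases only those three bullets; it is not the decision tree itself, and the function and parameter names are hypothetical. The bio split is left out because the bullet gives a percentage but not where that split sits relative to the others.

```python
def matches_successful_pattern(links_per_20_tweets: float,
                               updates_per_day: float,
                               elapsed_days: float) -> bool:
    """Return True when an account fits one of the listed usage patterns."""
    if links_per_20_tweets > 3:
        if updates_per_day > 0.960:
            # Many links and a reasonable posting frequency.
            return True
        # Many links but few updates: the account needs more time.
        return elapsed_days > 222.5
    # Few links: a high posting frequency is needed instead.
    return updates_per_day > 5.857
```

For example, an account with 4 links per 20 tweets and 0.5 updates per day fits a pattern only once it has been in use for more than 222.5 days.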

Feature Selection also lets us look at the relative importance of each parameter in attracting many followers. Here are the results from the ChiSquare, GainRatio and InfoGain attribute evaluators, in that order :



=== Attribute selection 10 fold cross-validation (stratified), seed: 1 ===

average merit       average rank   attribute
362.743 +-10.419    1   +- 0       4 numberOfLinks
319.397 +-10.133    2.4 +- 0.49    6 hasBlankProfile?
311.661 +- 8.612    2.6 +- 0.49    7 updatesPerDay
192.525 +- 7.481    4.1 +- 0.3     3 retweetsNumber
178.236 +- 5.963    4.9 +- 0.3     1 elapsedDays
 36.148 +- 3.579    6   +- 0       2 otherUsersTalk
 17.843 +- 4.475    7   +- 0       5 questionsAsked


average merit       average rank   attribute
0.1   +- 0.003      1   +- 0       6 hasBlankProfile?
0.042 +- 0.001      2.4 +- 0.49    4 numberOfLinks
0.039 +- 0.002      3.2 +- 0.6     3 retweetsNumber
0.04  +- 0.004      3.4 +- 0.92    7 updatesPerDay
0.025 +- 0.001      5   +- 0       1 elapsedDays
0.011 +- 0.001      6   +- 0       2 otherUsersTalk
0.005 +- 0.001      7   +- 0       5 questionsAsked

average merit       average rank   attribute
0.082 +- 0.002      1   +- 0       4 numberOfLinks
0.074 +- 0.003      2.1 +- 0.3     6 hasBlankProfile?
0.071 +- 0.002      2.9 +- 0.3     7 updatesPerDay
0.044 +- 0.002      4.1 +- 0.3     3 retweetsNumber
0.041 +- 0.001      4.9 +- 0.3     1 elapsedDays
0.008 +- 0.001      6   +- 0       2 otherUsersTalk
0.004 +- 0.001      7   +- 0       5 questionsAsked


We see that all three attribute evaluators agree that the number of links provided in tweets and whether the user's profile is filled in are the two most important parameters in attracting many followers. Notice also that sending messages to other users (otherUsersTalk) and asking questions (questionsAsked) are not as important as one would expect.
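To make the merit scores more concrete, here is a minimal sketch of how information gain and gain ratio can be computed for a single discrete attribute against a class label. It illustrates the standard definitions only, not the exact procedure behind the tables above. The function names are hypothetical, and a numeric attribute such as updatesPerDay would first have to be discretized into bins.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def info_gain(values, labels):
    """Reduction in label entropy obtained by splitting on the attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels within each attribute value.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalized by the entropy of the attribute itself."""
    split_info = entropy(values)
    if split_info <= 0:
        # An attribute with a single value carries no information.
        return 0.0
    return info_gain(values, labels) / split_info
```

Because gain ratio divides by the entropy of the attribute, an attribute with only two values, such as hasBlankProfile?, is penalized less than one split into many bins, which is one reason the evaluators can rank the same attributes differently.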

The analysis shown above gives many insights, but it does not take into account what the users say and how this affects the popularity of a Twitter account. Text Mining will try to give some answers to this question and also identify which keywords in Twitter profiles seem to be associated with many followers.

17 Responses to "Twitter Analytics : Which usage behavior attracts many followers?"

Matt Lourie Says :
May 5, 2009 8:19 PM

I like this and look forward to follow-on results. What were the exact criteria for deciding which accounts were "good" and "bad"? Also, do you have any theories on why asking questions doesn't help that much?

BTW, do you have a twitter account :-)

Themos Kalafatis Says :
May 6, 2009 12:47 AM

Matt,

The choice was made by performing a logarithm transformation on the number of followers and then using the right half of the distribution as the "good" accounts.

I have no idea why questions do not help but they don't seem to hurt either...

I must confess I do not have a Twitter account and probably never will because of a lack of free time!

John Grimes Says :
May 7, 2009 12:34 AM

This is really interesting stuff.

I think that there is going to be increasing demand for sophisticated Twitter analytics from brands and individuals going forward.

The people who are running what are still largely proof-of-concept Twitter marketing campaigns within organisations will increasingly need to start making concrete business cases to justify their investment in the medium.

How many followers did you end up defining as being the minimum number that a 'successful' Twitter account would have?

wrightee Says :
May 7, 2009 1:16 AM

It will be interesting to go the other way by modeling the "good" path on Twitter itself. Data like this is the start of a whole new industry to rival SEO...

Jaremy Says :
May 7, 2009 1:42 AM

Interesting look, but I still believe that a great deal of Twitter popularity comes from offline popularity. Ashton Kutcher, Oprah and Anderson Cooper have hundreds of thousands of followers due to their popularity and influence offline.

You're also not looking at UsersFollowed. I think if you take out the "celebrity" Twitterers, you'll find that a high number of followers is strongly correlated with a high number of following. Obviously I haven't looked at the numbers myself, but that's my hunch. Additionally, it's extremely important and helpful to have another platform to advertise your tweets. If there were any way to calculate this, I think finding backlinks to Twitter would also tell a story (some people use their blogs/websites to link to their twitter, while others use their celebrity status on TV).

Themos Kalafatis Says :
May 7, 2009 2:02 AM

John,

Thanks for your feedback. As stated in my post this analysis is an example of what can be done and not all available information was used. For this study, having more than 786 followers was the threshold value.

Themos Kalafatis Says :
May 7, 2009 2:11 AM

Jaremy,

Celebrities, singers, politicians etc. are a very small percentage of the "Twitter population". It is true that they have many followers because of outside influence. But when we analyze tens of thousands of users, we become more confident that our results hold for the majority, say 75%, of them. There are always exceptions to the rules and also things that we cannot measure.

Ben Martin, CAE @bkmcae Says :
May 7, 2009 3:31 PM

I LOVE this kind of stuff. Thanks for doing this analysis!

Themos Kalafatis Says :
May 7, 2009 3:35 PM

Ben,

Thanks! Many more coming up shortly...

Neicole Crepeau Says :
May 8, 2009 2:26 AM

This is really interesting and useful information. I'm wondering about "number of followers" as the criterion for success. If most people don't bother to winnow their follow lists, then once they follow someone, that person just remains on their list. If people do winnow their lists, then the unfollow rate might be a factor to consider. It would be nice to know how many times people actually click on the links that a popular tweeter sends.

Themos Kalafatis Says :
May 8, 2009 3:09 AM

Neicole,


Thank you for your insightful comments. The number of followers was chosen because this is what most people discuss as being important...however other "success metrics" could -and should- be used. This deserves more discussion.

Ulstrup Says :
May 8, 2009 10:00 AM

First glance impression by intuition:

Quantity:
# following
# followers
# updates

Quality:
Bio

Really nice to get some actual numbers and calculations on the above assumptions.

Looking forward to the next post, thanks

Themos Kalafatis Says :
May 13, 2009 7:51 PM

I decided to start a Twitter account just for the experience of it...

@Matt Lourie "Never say never."

Lisa Whelan Says :
May 15, 2009 12:23 AM

This is great stuff. I'd be curious to know whether there's a correlation between an individual's popularity and whether they post regularly about trending topics... i.e. Do the people who talk regularly about trending topics grow in popularity faster than those who don't?

Themos Kalafatis Says :
May 15, 2009 1:43 AM

Lisa,


Great idea, and to be honest I haven't looked at this. I will let you know!

Twitter Buzz Says :
December 30, 2009 12:39 PM

This comment has been removed by a blog administrator.

vizkr Says :
February 10, 2010 10:05 PM

I am curious what tools/methods were used to cull this data. It is a great report, and I am also curious whether anyone has seen any work incorporating some of the other good ideas presented in the comments.

Regards,
http://twitter.com/mikelking
