Of course I was aware of the scope of the data mining happening on the tubes. I mean, I am a “computer guy” after all. You know, market basket analysis, that kind of stuff. Business intelligence, CRM analyses, you name it. What else … ah! The American government hiring a private company to compile information on teenagers it can recruit for the army. All in all, with the intensity of the mining happening nowadays, Ted Stevens might as well have called the Internet “a series of tunnels” rather than “a series of tubes” (R.I.P. Ted Stevens, you were a funny guy).
Well (I have been setting this up, haven’t I?), I have to admit that I was pleasantly surprised to read about some quite cool applications of data mining, in social media in particular, that I was not aware of. In this post I will concentrate on some interesting recent analyses of sentiment in tweets.
As it turns out, a group of researchers at Rice University developed a program called SportSense that examines the tweets of NFL fans in real time to try to guess events in the games as well as their outcomes. Not only can the software tell within seconds when big plays such as touchdowns and interceptions occur, but it can also measure fans’ excitement about each game. As SportSense co-creator Lin Zhong notes, mining targeted audiences on Twitter has some very neat implications, not only for large audiences but on a local scale, too:
“We’re also interested in sensing things on a local scale. For example, when a storm hits and the power goes out in my neighborhood, I would like to know when it comes back on — even if I happen to be at work. People tweet about those types of events, so the signal is there in the data; it’s just a matter of finding it.”
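To give a feel for how such a signal might be pulled out of the noise, here is a minimal sketch in Python. To be clear, this is not SportSense’s actual algorithm, which I have not seen. It is just a simple burst detector that flags a second in which the number of tweets containing a keyword jumps far above its recent average. The function name `detect_bursts`, the window size, the threshold and the sample counts are all my own assumptions.

```python
from collections import deque
from statistics import mean, stdev


def detect_bursts(counts, window=10, threshold=3.0):
    """Return the indices where a count exceeds the trailing average
    by more than `threshold` standard deviations (a z-score rule)."""
    history = deque(maxlen=window)  # the last `window` counts seen
    bursts = []
    for i, count in enumerate(counts):
        if len(history) == window:
            baseline = mean(history)
            spread = stdev(history) or 1.0  # avoid dividing by zero on flat data
            if (count - baseline) / spread > threshold:
                bursts.append(i)
        history.append(count)
    return bursts


# Hypothetical per-second counts of tweets containing "touchdown"
counts = [2, 3, 2, 4, 3, 2, 3, 4, 2, 3, 3, 41, 38, 5, 3]
print(detect_bursts(counts))
```

On this made-up data the detector flags only the first second of the spike. The second one is not flagged because the spike itself has already inflated the trailing average and spread, which is probably what you want if the goal is to report each event once.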
Digging further into the topic, I discovered that scientists from the Technical University of Munich (TUM) have developed software that analyzes Twitter messages to… predict stock trends. Ka-ching. I mean seriously, since thousands of investors tweet about stock trends every day, why not exploit these absolutely free opinions and advice by automating the way they are read and analyzed? Well, you could be skeptical of the legitimacy and trustworthiness of content posted on Twitter. Also, parsing freely composed messages for useful stock signals is not a straightforward task.
The results, however, are astonishing. In a study conducted by TUM that monitored the stocks of the S&P 500 companies, “the sentiment from Twitter messages develops similar to the stock market and even leads by a day (…) If an investor had oriented his share purchases according to the Twitter sentiment in the first half of 2010, he would have achieved an average rate of return of up to 15 percent”. Are you thinking what I am thinking? This approach definitely deserves to be watched closely. It could also be incorporated into the algorithms of more sophisticated stock-price-predicting software.
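If you are wondering how one would even check a claim like “sentiment leads by a day”, one common approach is a lagged correlation: compare the sentiment on day t with the market return on day t + 1. The sketch below is my own illustration and not the TUM method. The function names and the toy numbers are made up, and the toy returns were deliberately constructed to follow the sentiment by exactly one day.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def lagged_correlation(sentiment, returns, lag=1):
    """Correlate sentiment on day t with returns on day t + lag."""
    if lag == 0:
        return pearson(sentiment, returns)
    return pearson(sentiment[:-lag], returns[lag:])


# Toy data (not real): each return is twice the previous day's sentiment
sentiment = [0.1, 0.4, -0.2, 0.3, -0.1, 0.2]
returns = [0.0, 0.2, 0.8, -0.4, 0.6, -0.2]

same_day = lagged_correlation(sentiment, returns, lag=0)
next_day = lagged_correlation(sentiment, returns, lag=1)
print(f"same day: {same_day:.2f}, next day: {next_day:.2f}")
```

With real data the next-day correlation would of course be nowhere near perfect. The point is only that a higher correlation at lag 1 than at lag 0 is what “leads by a day” would look like in the numbers.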
Another interesting analysis of Twitter messages conducted at TUM was used to predict the outcome of the 2009 federal election in Germany. The researchers used over 100,000 messages referencing either a politician or a political party. The results? It turns out that the mere number of references to each political party in the Twitter messages mirrored the election results. Investigating the sentiments contained in the tweets showed the researchers even more: the online sentiments closely corresponded to the politicians’ positions and campaigns. This indicates that political deliberation does indeed take place on Twitter, contrary to what skeptics believe. The platform should in no way be underestimated when studying future elections.
These are only a few examples of analyzing public social media data that I found particularly interesting; the trend of conducting such research is much larger, and I reckon it will (and should) keep growing. This trend is made possible in part by people’s tendency to be extremely open on social networking sites, as discussed by Anders Albrechtslund from the University of Illinois at Chicago in his article Online Social Networking as Participatory Surveillance:
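The “just count the mentions” idea is simple enough to sketch in a few lines of Python. This is only my illustration of the counting step, not the researchers’ code. The function name `mention_shares`, the shortened party list and the sample tweets are all assumptions made for the example.

```python
import re
from collections import Counter

# Illustrative subset of party names to look for
PARTIES = ["CDU", "SPD", "FDP", "Linke"]


def mention_shares(tweets, parties=PARTIES):
    """Return each party's share of all party mentions.
    A tweet counts at most once per party it mentions."""
    counts = Counter()
    for tweet in tweets:
        for party in parties:
            pattern = rf"\b{re.escape(party)}\b"  # whole-word match only
            if re.search(pattern, tweet, flags=re.IGNORECASE):
                counts[party] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {party: counts[party] / total for party in parties}


# Made-up sample tweets
tweets = [
    "Watching the CDU rally tonight",
    "SPD and CDU debate on TV",
    "Why I'm voting #FDP",
    "cdu again in the news",
]
shares = mention_shares(tweets)
print(shares)
```

The shares that come out of this can then be set side by side with the actual vote shares. Note that a naive count like this treats praise and complaints alike, which is exactly why the sentiment analysis mentioned above is the more interesting second step.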
“In public opinion and academia, many people have voiced concern and amazement about the openness, or perhaps thoughtlessness, expressed in the behavior of social networking site’s users. As Jon Callas, chief security officer at the encryption software maker, PGP, puts it: ‘I am continually shocked and appalled at the details people voluntarily post online about themselves.’”
I am going to leave you with a thought on how this methodology is relevant to people in the SMB sector. I asked myself this question before researching the answer and the first idea that popped into my head was that public posts on social media could be mined for “business+location” phrases to determine what is in demand in different areas. This is also facilitated by Facebook, Twitter and Google Plus allowing their users to specify the locations they post from. Well, it turns out that there is already a service that does a similar job.
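For the curious, here is roughly what my “business+location” idea could look like in Python. It is a bare-bones sketch under generous assumptions: the posts are already collected as (text, location) pairs, the list of business terms is hand-picked, and every name and sample post below is hypothetical. It has nothing to do with how the existing service works.

```python
from collections import Counter

# Hypothetical list of business types we care about
BUSINESS_TERMS = {"plumber", "bakery", "dentist"}


def demand_by_location(posts, terms=BUSINESS_TERMS):
    """Count how often each business term shows up per location.
    `posts` is an iterable of (text, location) pairs."""
    counts = Counter()
    for text, location in posts:
        # Lowercase the words and strip common punctuation and hashtags
        words = {word.strip(".,!?#").lower() for word in text.split()}
        for term in terms & words:
            counts[(term, location)] += 1
    return counts


# Made-up sample posts with the location each was posted from
posts = [
    ("Anyone know a good plumber?", "Austin"),
    ("Need a plumber asap!", "Austin"),
    ("Best bakery in town", "Boston"),
]
demand = demand_by_location(posts)
print(demand.most_common(1))
```

A real version would need to handle plurals, misspellings and multi-word phrases, but even this crude count hints at how a small business could spot what people nearby are asking for.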
Although I think that all the examples in this post are interesting applications of mining social media, I believe that future applications will be even more innovative. So let’s wait and see!