Using social media data for science: Till Keyling on ‘Facepager’
Social media hold a wealth of data. Data that users have made public can be accessed through application programming interfaces (APIs) and then evaluated scientifically with the aid of special computer programs. Junior scientist Till Keyling of the Institute for Communications Science and Media Research at Ludwig-Maximilians University in Munich has developed the ‘Facepager’ program together with colleagues.
Till Keyling, born in 1985, studied Communication Science and Modern German Literature at Ludwig-Maximilians University in Munich. Since 2010 he has undertaken research in the sub-project ‘The Emergence and Use of Political Media Agendas on YouTube and its Significance for Adolescents’ of the DFG research group ‘Political Communication in the Online World’. In his master's thesis he is working on automated content analyses in communication science.
Question: Mr Keyling, with ‘Facepager’ you have developed a small computer program with which data can be collected from social networks in order to analyse them scientifically from a variety of perspectives. What exactly is this program and how is Facepager different from other programs?
Till Keyling: Facepager enables you to collect public data from platforms on the social web (such as Facebook, Twitter or YouTube) which these platforms make available through application programming interfaces (APIs). Our tool is open source, so it is freely accessible; but most importantly, every step of data collection can be traced exactly and documented, which is particularly relevant for scientific work. Facepager lowers the technical barrier to collecting such data because it no longer requires programming skills. We also offer certain presets that make it easier to get started, but you still have to devote some time to the APIs and their documentation.
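As a rough illustration of what traceable, documented data collection can look like, here is a minimal Python sketch. It is not Facepager's own code and makes no claim about how Facepager works internally. The function names collect and fake_fetch, the URL on api.example.org and the response fields are all hypothetical. The idea is simply that every request is logged with its exact URL, a timestamp and a checksum of the response, so that the collection can be documented afterwards.

```python
import hashlib
import json
from datetime import datetime, timezone
from urllib.parse import urlencode


def collect(fetch, base_url, params, log):
    """Fetch one API resource and record how it was obtained.

    `fetch` is any callable that takes a URL and returns the response
    text, so a real HTTP client or a test stub can be plugged in.
    """
    # Sort the parameters so the same request always yields the same URL.
    url = f"{base_url}?{urlencode(sorted(params.items()))}"
    body = fetch(url)
    # Document the step: when it ran, what was requested, what came back.
    log.append({
        "requested_at": datetime.now(timezone.utc).isoformat(),
        "url": url,
        "sha256": hashlib.sha256(body.encode("utf-8")).hexdigest(),
    })
    return json.loads(body)


def fake_fetch(url):
    # Stand-in for a real HTTP call; returns a fixed, made-up response.
    return json.dumps({"id": "example-page", "likes": 42})


log = []
data = collect(
    fake_fetch,
    "https://api.example.org/pages/example-page",  # hypothetical endpoint
    {"fields": "likes"},
    log,
)
```

In real use, fake_fetch would be replaced by a function that performs the HTTP request against the platform's documented API, and the log would be saved alongside the collected data.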
Social media enable political communication to be analysed
Question: Your little tool has met with quite a bit of interest; mostly because the source code is freely accessible to everyone on the internet. Do you know who uses your program Facepager and for what purposes?
Till Keyling: Only when users contact us explicitly and ask us questions. Besides students who collect data for their final papers, other scientists use the tool for their own projects, particularly for analysing political communication on Twitter or Facebook. But we also know that companies use the program for market analyses, although of course we cannot and do not want to compete with commercial programs.
Question: Many users do not make the data they store on social networks like Facebook accessible to everyone but deliberately limit the circle of those who can access them. Others do not provide any details whatsoever about certain things in their profile. What data can then actually be accessed with Facepager?
Till Keyling: Generally, only the data the user has in fact made publicly accessible. While many users of Facebook do not make their profile information public (friends, likes and other details), this is quite different on Twitter or YouTube. The Facebook pages of politicians or companies, in particular, are publicly visible and therefore accessible for data collection. Collecting users' private data beyond that would require separate consent from each individual user.
Critical examination of data from the social web
Question: How valid are the results you can obtain from analysing these data with Facepager?
Till Keyling: Collecting behavioural data in particular, such as how many ‘likes’ a news article has received, offers the advantage that these are what is known as non-reactive measurements which, in addition, can be collected quickly and in large numbers. The benefit here is that people do not have to be explicitly asked whether they have read a particular article. Such a procedure would not only be complex and costly but may also be inaccurate because the people you ask may no longer remember.
Beyond such methodological effects, the question of validity does in fact arise. What exactly do ‘likes’ stand for? What does a ‘retweet’ on Twitter mean, and are Facebook ‘friends’ actually ‘real’ friends? Answering these questions requires a critical examination of the data. Fortunately, such debates are increasingly being conducted in the academic context.
Discussion of the scientific use of social media data