Online Social Media Analytics Software as a Tool for Automating Data Collection: Concurrent Validity and Feasibility Study

Red Thaddeus Dela Peña Miguel, Cara Isabella Macul Uy


Background: Facebook-based research is emerging, and social media analytics software may be a tool that could lower the cost of research. Unfortunately, the quality of the data it extracts has not been documented or validated.


Objectives: To test the accessibility and efficiency of social media analytics software in a Facebook-based study and to test the concurrent validity of the Likes extracted.


Methods: We conducted a review of accessible online social media analytics software and selected one for a case study comparing it to manual extraction procedures. Thereafter, we tested the concurrent validity of the social media analyzer as a method for extracting Likes from Facebook pages. Agreement in the Likes extracted was tested with the intraclass correlation coefficient (ICC), the concordance correlation coefficient (CCC), and a Bland and Altman plot.
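As an illustration of two of the agreement measures named above, the following is a minimal sketch of Lin's concordance correlation coefficient and the Bland and Altman limits of agreement. It is not the authors' analysis code (the reference list suggests the study used R). The function names `concordance_cc` and `bland_altman_limits` and the Like counts are hypothetical, chosen only for demonstration.

```python
from statistics import mean, stdev


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Uses population (1/n) variances and covariance, as in Lin (1989).
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mx - my) ** 2)


def bland_altman_limits(x, y):
    """Return (bias, lower limit, upper limit) of the 95% limits of agreement."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical Like counts for the same posts (illustrative, not study data).
software_likes = [120, 45, 300, 12, 78]
manual_likes = [121, 45, 298, 12, 80]

print("CCC:", concordance_cc(software_likes, manual_likes))
print("Bias and limits:", bland_altman_limits(software_likes, manual_likes))
```

A CCC near 1 indicates that paired values fall close to the line of perfect agreement, while the Bland and Altman limits give the range within which about 95% of the paired differences are expected to fall.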


Results: Eighteen software packages were found, five of which were completely free. The selected software completed the case study at no cost but took longer to extract data than manual extraction procedures. Exact data points matched in only a minority of pages (n=20, 33.9%), but the difference between Likes extracted by the software and Likes extracted manually was not statistically significant (p=0.471). The software had perfect ICC for half of the studies, with the rest having “almost perfect” agreement (ICC = 0.97 and ICC = 0.98 for the 3rd and 4th quartiles, respectively). Concurrent validity was high (CCC = 0.995), with the Bland and Altman plot showing only 5% of measurements outside the 95% limits of agreement.


Conclusion: Social media analyzer software is accessible and can be used at no cost. Facebook Likes extracted through software may not exactly match Likes extracted manually, but the two show strong agreement and validity.


Keywords: Facebook, Social Media Analyzer, Concurrent Validity, Concordance Correlation Coefficient, Intraclass Correlation Coefficient, Feasibility





Based at Tarleton State University in Stephenville, Texas, USA, The Journal of Social Media in Society is sponsored by the Colleges of Liberal and Fine Arts, Education, Business, and Graduate Studies.