Rion B. Correia1,2,3, Ian B. Wood2, Johan Bollen2, Luis M. Rocha1,2,*

1Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington IN, USA
2Instituto Gulbenkian de Ciencia, Portugal
3CAPES Foundation, Ministry of Education of Brazil
* To whom correspondence should be addressed.

Citation: R.B. Correia, I.B. Wood, J. Bollen, L.M. Rocha [2020]. "Mining social media data for biomedical signals and health-related behavior". Annual Review of Biomedical Data Science. 3(1): 433-458. DOI: 10.1146/annurev-biodatasci-030320-040844. The open-source full text PDF and the arXiv:2001.10285 preprint are also available.


Social media data have been increasingly used to study biomedical and health-related phenomena. From cohort-level discussions of a condition to population-level analyses of sentiment, social media have provided scientists with unprecedented amounts of data to study human behavior associated with a variety of health conditions and medical treatments. Here we review recent work in mining social media for biomedical, epidemiological, and social phenomena information relevant to the multilevel complexity of human health. We pay particular attention to topics where social media data analysis has shown the most progress, including pharmacovigilance and sentiment analysis, especially for mental health. We also discuss a variety of innovative uses of social media data for health-related applications as well as important limitations of social media data access and use.

Keywords: social media, healthcare, pharmacovigilance, sentiment analysis, biomedicine, public health, digital medicine