Machine Learning Models that Predict Mental Health Status on Twitter and Their Privacy Implications

Recent studies have shown that machine learning can identify individuals with mental illnesses by analyzing their social media posts. These findings open up possibilities for mental health research and the early detection of mental illnesses, but they also raise numerous privacy concerns. Our results show that such predictions are possible even when individuals do not actively talk about their mental illness on social media. To fully understand the implications of these findings, we need to analyze the features that make these predictions possible. We analyze bag-of-words features, word clusters, part-of-speech n-grams, and topic models to understand the machine learning model and to discover language patterns that differentiate individuals with mental illnesses from a control group. This analysis confirms some known language patterns and uncovers several new ones. We then discuss possible applications of machine learning for identifying mental illnesses, the feasibility of such applications, and the associated privacy implications. Finally, we suggest mitigating steps that policymakers, social media platforms, and users can take.

Presented by