Abstract
Social media platforms are increasingly being used to share and seek advice on mental health issues. In particular, Reddit users freely discuss such issues on various subreddits, whose structure and content can be leveraged to formally interpret and relate subreddits and their posts in terms of mental health diagnostic categories. There is prior research on the extraction of mental health-related information, including symptoms, diagnoses, and treatments, from social media; however, our approach can additionally provide actionable information to clinicians about the mental health of a patient in diagnostic terms for web-based intervention. Specifically, we provide a detailed analysis of the nature of subreddit content from a domain expert's perspective and introduce a novel approach to map each subreddit to the best-matching DSM-5 (Diagnostic and Statistical Manual of Mental Disorders, 5th Edition) category using a multi-class classifier. Our classification algorithm analyzes all the posts of a subreddit by adapting topic modeling and word-embedding techniques, and by utilizing curated medical knowledge bases to quantify their relationship to DSM-5 categories. Our semantic encoding-decoding optimization approach reduces the false alarm rate from 30% to 2.5% over a comparable heuristic baseline, and our mapping results have been verified by domain experts, achieving a kappa score of 0.84.
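For readers unfamiliar with the two evaluation figures quoted in the abstract, the following is a minimal sketch of how such quantities are commonly computed. It assumes that the kappa score is Cohen's kappa between two annotators and that the false alarm rate is FP / (FP + TN); the abstract states neither definition, so both are assumptions. The function names, labels, and counts are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (assumed metric)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


def false_alarm_rate(false_positives, true_negatives):
    """False alarm rate as FP / (FP + TN) (assumed definition)."""
    return false_positives / (false_positives + true_negatives)


# Hypothetical category labels assigned to four subreddits by two raters.
rater_a = ["depressive", "depressive", "anxiety", "anxiety"]
rater_b = ["depressive", "depressive", "anxiety", "depressive"]
kappa = cohen_kappa(rater_a, rater_b)

# Hypothetical counts: 3 false positives among 10 true-negative-class items.
far = false_alarm_rate(false_positives=3, true_negatives=7)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually preferred over plain percent agreement when reporting expert verification.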
Original language | American English |
---|---|
Journal | CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management |
State | Published - Oct 1 2018 |
Disciplines
- Bioinformatics
- Communication
- Communication Technology and New Media
- Computer Sciences
- Databases and Information Systems
- Life Sciences
- OS and Networks
- Physical Sciences and Mathematics
- Science and Technology Studies
- Social and Behavioral Sciences