HITS Team Publishes Computational Solutions to Addiction Crisis

Two new studies published by HITS (Health Information Technologies Studies) researchers, both led by Rachel Kornfield, offer computational health communication solutions to substance abuse.

The most recent study (available through the Journal of Medical Internet Research) analyzed a mobile phone-based health intervention for individuals in recovery from alcohol use disorder (AUD). Human coders labeled discussion forum messages according to whether or not the authors mentioned problems in their recovery process. Linguistic features of these messages were then extracted using three computational techniques: (1) a Bag-of-Words approach, (2) the dictionary-based Linguistic Inquiry and Word Count program, and (3) a hybrid approach combining the most important features from both Bag-of-Words and Linguistic Inquiry and Word Count. A boosted decision tree classifier using features from both Bag-of-Words and Linguistic Inquiry and Word Count performed best at identifying problems disclosed within the discussion forum, achieving 88% sensitivity and 82% specificity in a separate cohort of patients in recovery. The study demonstrates that differences in language use can distinguish messages disclosing recovery problems from other message types. Machine learning models based on language use can flag concerning content in real time, so that trained staff can engage more efficiently and focus their attention on time-sensitive issues.
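The feature-extraction and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the study's code: the word lists in `TOY_CATEGORIES` are hypothetical stand-ins for a dictionary-based lexicon (the actual Linguistic Inquiry and Word Count dictionaries are not reproduced here), the hybrid step simply merges all features rather than selecting the most important ones, and the boosted decision tree classifier itself is omitted. All function names are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon standing in for a dictionary-based tool;
# these categories and words are illustrative, not the real LIWC lists.
TOY_CATEGORIES = {
    "negative_emotion": {"sad", "angry", "afraid", "craving"},
    "cognitive": {"think", "because", "know", "realize"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """Count how often each token appears in the message."""
    return Counter(tokenize(text))


def category_rates(text):
    """Share of tokens in the message that fall into each lexicon category."""
    tokens = tokenize(text)
    total = len(tokens) or 1  # avoid division by zero on empty messages
    return {
        name: sum(token in words for token in tokens) / total
        for name, words in TOY_CATEGORIES.items()
    }


def hybrid_features(text):
    """Merge word counts and category rates into one feature dictionary."""
    features = {f"bow:{word}": count for word, count in bag_of_words(text).items()}
    features.update({f"cat:{name}": rate for name, rate in category_rates(text).items()})
    return features


def sensitivity_specificity(y_true, y_pred):
    """Compute sensitivity and specificity from binary labels (1 = problem disclosed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

In this sketch, sensitivity is the share of problem-disclosing messages that a classifier flags, and specificity is the share of other messages that it correctly leaves unflagged. The feature dictionaries returned by `hybrid_features` would be the input to whichever classifier is trained on the human-coded labels.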

This work builds on a previous study by Kornfield and colleagues (available through Health Communication) that investigated whether language use within a peer-to-peer discussion forum could predict future relapse among individuals treated for AUD. A logistic regression model was built to predict the likelihood that individuals would engage in risky drinking within a year based on their language use, while controlling for baseline characteristics and rates of use of the mobile system. Results show that all baseline characteristics and system use factors together explained just 13% of the variance in relapse, whereas a small number of linguistic cues, including swearing and cognitive mechanism words, accounted for an additional 32%, bringing the total variance in relapse explained by the model to 45%.

Both studies show that messages exchanged on AUD forums could provide an unobtrusive and cost-effective window into the future health outcomes of people with AUD and the psychological underpinnings of those outcomes. As online communication expands, models that use user-submitted text to predict relapse will become increasingly scalable and actionable.