The title of my thesis is: Detection and analysis of drug non-compliance in social media
The complete thesis can be downloaded here. It's in French.
Drug non-compliance refers to situations where a patient does not follow the instructions of medical authorities when taking medication. Such situations include taking too much (overuse) or too little (underuse) of a medication, drinking alcohol when it is contraindicated, or attempting suicide using medication. Improving drug compliance may have a bigger impact on public health than any other medical improvement. However, non-compliance data are difficult to obtain, since non-adherent patients are unlikely to report their behaviour to their healthcare providers. This is why we use data from social media to study drug non-compliance. Our study is applied to French-speaking forums.
First, we collect a corpus of messages written by users of medical forums. We build vocabularies of medication and disorder names as used by patients, and use these vocabularies to index medications and disorders in the corpus. Then we use supervised learning and information retrieval methods to detect messages that discuss non-compliance. With machine learning, we obtain an F-measure of 0.513, with up to 0.5 precision or 0.6 recall. With information retrieval, we identify specific situations such as drinking contraindicated alcohol or using neuroleptics for their psychotropic effect.
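The indexing step and the evaluation metric mentioned above can be illustrated with a minimal Python sketch. It is not the code used in the thesis: the vocabulary entries, the function names `index_medications` and `f_measure`, and the simple token lookup are all assumptions made for illustration. The real vocabularies are much larger and reflect how patients actually write medication names.

```python
import re

# Hypothetical toy vocabulary (assumption, not the thesis resource):
# surface form as a patient might write it -> normalized medication name.
MEDICATION_VOCAB = {
    "doliprane": "paracetamol",
    "paracetamol": "paracetamol",
    "xanax": "alprazolam",
}


def index_medications(message, vocab=MEDICATION_VOCAB):
    """Return the sorted normalized medication names found in a message."""
    # Lowercase and split into word tokens, then look each token up.
    tokens = re.findall(r"\w+", message.lower())
    return sorted({vocab[t] for t in tokens if t in vocab})


def f_measure(precision, recall, beta=1.0):
    """F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    print(index_medications("J'ai pris du Doliprane et du Xanax hier"))
    print(f_measure(0.5, 0.5))
```

The lookup only matches single tokens exactly, so misspellings and multi-word names would need fuzzy matching or a richer lexicon. The F-measure function is the standard definition and is shown only to make the reported metric concrete.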
After that, we study the content of the non-compliance messages. We identify various non-compliance situations and patients' motivations. We identify three main motivations: self-medication, seeking an effect other than the one the medication was prescribed for, or being in a situation of addiction or habituation. Self-medication is an umbrella term for several situations: avoiding an adverse effect, adjusting the medication's effect, underusing a medication seen as useless, or making decisions without a doctor's advice. Non-compliance can also result from errors or carelessness, without any particular motivation.
Our work provides several kinds of results: an annotated corpus of non-compliance messages, a classifier for the detection of non-compliance messages, a typology of non-compliance situations, and an analysis of the causes of non-compliance.