Advanced Models Using Text with Dr. Helge Marahrens
Abstract: Textual data are abundant, but extracting meaningful insights from them requires robust tools. Even simple tasks, such as identifying the most significant words in a text, require careful pre-processing and modeling. In this workshop, we cover several advanced models for identifying the most impactful words, categorizing documents by their thematic content, and predicting emotions in text. Our agenda includes a deep dive into “fightin’ words,” a survey of topic and topic-noise models, and an exploration of word embeddings. To conclude, we will touch upon the role of neural networks, using sentiment analysis and emotion detection as examples.
Bio: Dr. Helge Marahrens is a Postdoctoral Fellow at the Massive Data Institute and at the Institute for the Study of International Migration, where he develops indicators of forced migration using social media and digital archives. He received a PhD in Sociology and an MS in Applied Statistics from Indiana University, where he taught several Python workshops.