July 16th, 2017
Deep Representation of Lyrical Style and Semantics for Music Recommendation
Abhay L. Kashyap
11:00-1:00 Thursday, 20 July 2017, ITE 346
In the age of music streaming, effective recommendations are important for music discovery and a personalized user experience. Recommenders based on collaborative filtering suffer from popularity bias and the cold-start problem, both of which are commonly mitigated by content features. For music, research on content-based methods has mainly focused on the acoustic domain, while lyrical content has received little attention. Lyrics contain information about a song’s topic and sentiment that cannot be easily extracted from the audio. This is especially important for lyrics-centric genres like Rap, which was the most streamed genre in 2016. The goal of this dissertation is to explore and evaluate different lyrical content features that could be useful for content-, context- and emotion-based models for music recommendation systems.
With Rap as the primary use case, this dissertation focuses on featurizing two main aspects of lyrics: their artistic style of composition and their semantic content. For lyrical style, a suite of high-level rhyme density features is extracted, in addition to literary features like the use of figurative language, profanity and vocabulary strength. In contrast to these engineered features, Convolutional Neural Networks (CNNs) are used to automatically learn rhyme patterns and other relevant features. For semantics, lyrics are represented using both traditional IR techniques and the more recent neural embedding methods.
These lyrical features are evaluated for artist identification and compared with artist and song similarity measures from Last.fm, a real-world recommendation system based on collaborative filtering. It is shown that both rhyme and literary features serve as strong indicators for characterizing artists, with feature learning methods like CNNs achieving comparable results. For artist and song similarity, a strong relationship was observed between these features and the way users consume music, while neural embedding methods significantly outperformed LSA. Finally, this work is accompanied by a web application, Rapalytics.com, that is dedicated to visualizing all these lyrical features and has been featured in a number of media outlets, most notably Vox, attn: and Metro.
Committee: Drs. Tim Finin (chair), Anupam Joshi, Tim Oates, Cynthia Matuszek and Pranam Kolari (Walmart Labs)
July 12th, 2017
Analysis of Irregular Event Sequences using Deep Learning, Reinforcement Learning, and Visualization
11:00-1:00 Thursday, 13 July 2017, ITE 346, UMBC
History is nothing but a catalogued series of events organized into data. Amazon, the largest online retailer in the world, processes over 2,000 orders per minute. Orders come from customers on a recurring basis through subscriptions or as one-off spontaneous purchases, so each customer exhibits their own behavioral pattern in the way they place orders throughout the year. For a company such as Amazon, which generates over $130 billion of revenue each year, understanding and uncovering the hidden patterns and trends within this data is paramount in improving the efficiency of its infrastructure, from the management of inventory within its warehouses and the distribution of its labor force to the preparation of its online systems for user load. With the ever-increasing availability of big data, problems such as these are no longer limited to large corporations but are experienced across a wide range of domains and faced by analysts and researchers every day.
While many event analysis and time series tools have been developed for analyzing such datasets, most approaches target clean and evenly spaced data. For noisy or irregular data, the usual recommendation is a pre-processing step that transforms the data into a regular form. This transformation arguably interferes at a fundamental level with how the data is represented, and may irrevocably bias the results obtained. Therefore, operating on raw data, in its noisy natural form, is necessary to ensure that the insights gathered through analysis are accurate and valid.
In this dissertation, novel approaches are presented for analyzing irregular event sequences using a variety of techniques, including deep learning, reinforcement learning, and visualization. We show how common tasks in event analysis can be performed directly on an irregular event dataset without requiring a transformation that alters the natural representation of the process from which the data was captured. The three tasks that we showcase are: (i) summarization of large event datasets, (ii) modeling the processes that create events, and (iii) predicting future events that will occur.
Committee: Drs. Tim Oates (Chair), Jesus Caban, Penny Rheingans, Jian Chen, Tim Finin