Research & methodologies

After collecting 1.6 million user-tagged feelings from 12 million online posts that mention emotions, our research team used machine learning techniques to build an automatic "feelings meter": a tool that automatically detects emotional dimensions in text, mapping out the emotional signatures within conversations.

The detection of emotions from text is theoretically informed (by a theory of social data and traditions of emotion research) and technically robust (via NLP and visual analytics) [1]. Development of the emotionVis tool has likewise been methodologically rigorous (following Action Design Research) and is continuously evaluated empirically (in real-world campaigns and trainee evaluations) [2].

Theoretically informed

This research combines the fields of information systems with human behaviour and emotion. In doing so, we also merge the disciplines of natural language processing (NLP) and visual analytics. The design of the resulting artifact is further informed by the applied theories of design science research (DSR).

In 2013, Facebook launched a feature allowing users to add a feeling tag to their posts as part of their daily interactions online. Our research leverages the text accompanying these volunteered feeling tags to map the semantic space of 'Facebook feelings' as they are catalogued by the crowd.

This crowd-sourced "folksonomy of feelings" allowed us to contrast its semantic space with the dimensional mappings of emotions proposed by over 100 years of psychological research.

This enormous collection of emotions also allowed us to show temporal and social patterns in the most commonly shared feelings. For example, weekday distributions show that people most often report being busy on Mondays, among several other trends (Zimmerman et al., 2015).

Zimmerman et al. (2015) "Emergence of Things Felt: Harnessing the Semantic Space of Facebook Feeling Tags".

Technically robust

After collecting these 1.6 million user-tagged feelings from 12 million online posts that mention emotions, we used machine learning techniques to build an automatic "feelings meter": a tool for both researchers and practitioners to automatically detect emotional dimensions in text.

The arousal and valence classifiers (shown right) provide two-dimensional detection. The six-way core emotion classifier (below) leverages the feeling families and identifies the feelings most commonly used with each emotion.

Note: this is not a tool that counts the smilies or current use of feeling tags in your data. It has already learned from millions of those, and now knows which words are most used in association with each feeling, even when no tag is explicitly volunteered. The classifier looks for such words in your text.
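To illustrate how a classifier can learn word-feeling associations from tagged posts, here is a minimal naive Bayes sketch in pure Python. The toy training data, function names, and smoothing choice are ours for illustration; the actual emotionVis classifiers are trained on millions of posts and are more sophisticated.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train(tagged_posts):
    """Learn word-feeling associations from (text, feeling) pairs."""
    label_counts = Counter()   # how often each feeling was tagged
    word_counts = {}           # feeling -> Counter of word frequencies
    vocab = set()
    for text, feeling in tagged_posts:
        label_counts[feeling] += 1
        counts = word_counts.setdefault(feeling, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def classify(model, text):
    """Pick the feeling whose learned word distribution best explains the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_feeling, best_score = None, float("-inf")
    for feeling, n in label_counts.items():
        score = math.log(n / total)  # prior from tag frequency
        counts = word_counts[feeling]
        denom = sum(counts.values()) + len(vocab)
        for word in tokenize(text):
            # Laplace smoothing so unseen words do not zero out the score
            score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_feeling, best_score = feeling, score
    return best_feeling
```

Trained on tagged posts, `classify` can then label new text that carries no feeling tag at all, which is the key point of the note above.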

Zimmerman et al. (2016) "emotionVis: Designing a Tool for Emotion Text Inference and Visual Analytics", in Design Science Research in Information Systems and Technologies (DESRIST) Conference, St. John's, NL, Canada, May 24-25, 2016. Springer International Publishing AG.

Methodologically rigorous

Following several iterations, the test version has now taken shape as emotionVis, a dashboard prototype that lets users infer emotions from their own textual datasets while presenting the results for visual analysis.

The resulting prototype consists of a back-end and a front-end. The back-end includes two Python Flask applications: an API interacting with the classifier and a web app interacting with the API. The web app receives a CSV file from the user, extracts the necessary data, sends it to the emotionVis API, adds the classification scores to the CSV file, computes the chart data for the dashboards, and returns the results to the user. The front-end (UI) is built in HTML5, CSS, and JavaScript: Bootstrap provides the layout, while the D3.js and NVD3 visualization libraries are leveraged for the charts.
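The CSV round-trip at the heart of the web app can be sketched as follows. The `score_text` stub stands in for the real call to the emotionVis API; its word list and "valence" output are hypothetical placeholders, and only the flow (read CSV, classify each row, append score columns, write CSV back) mirrors the description above.

```python
import csv
import io

def score_text(text):
    """Stub standing in for the emotionVis API call; the word list
    and 'valence' score here are hypothetical."""
    positive = {"happy", "great", "love", "joy"}
    words = text.lower().split()
    hits = sum(1 for w in words if w in positive)
    return {"valence": round(hits / max(len(words), 1), 2)}

def process_csv(csv_text, text_column="text"):
    """Mirror the back-end flow: read the CSV, classify each row's text,
    append the scores as new columns, and write the CSV back out."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row.update(score_text(row[text_column]))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue(), rows
```

In the deployed prototype this logic sits behind a Flask route, and the per-row scores also feed the aggregate chart data for the dashboards.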

Offline installations have also been created to handle larger volumes of data, while the API is available for users to connect to directly.
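A direct connection to the API might look like the following sketch. The endpoint URL, path, and JSON payload shape are assumptions for illustration, not the documented interface of the emotionVis API.

```python
import json
import urllib.request

# Hypothetical endpoint; the real deployment's host and path may differ.
API_URL = "http://localhost:5000/classify"

def build_request(texts):
    """Build a POST request carrying a batch of texts as JSON
    (the payload shape is an assumption, not the documented interface)."""
    payload = json.dumps({"texts": texts}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit the batch for classification, assuming the service is reachable at the configured address.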