This drawing (HQ version) shows a result from our research published today in PLoS ONE. It is a simple plot of the difference between the emotions of Joy and Sadness (as represented by the relevant word taxonomies taken from WordNet Affect) in approx. 5 million books published in the 20th century (1900-2000). You may have noticed that the peak for Sadness, and equivalently the minimum level of Joy, occurred during World War II.
However, for people who despise simplicity, we have also included some more elaborate (and perhaps more interesting) results in our paper. I'm not going to repeat them here, of course; that's the purpose of the publication!
The title "Detecting Events and Patterns on Twitter" may not be the original title of my Ph.D. Thesis, but it is a good human-readable alternative for the actual "Detecting Events and Patterns in Large-Scale User Generated Textual Streams with Statistical Learning Methods".
The main contributions of this research can be summarised in two simple points:
- The Social Web (Twitter, Facebook etc.) does contain useful information
- It is possible to extract portions of this information automatically with Artificial Intelligence
(For a more formal report on this, please refer to Section 6.2 of my Ph.D. Thesis.)
2011 was an interesting year for many obvious reasons. Well, this is my opinion. However, individuals perceive a year very differently, and therefore the mean tendencies in the population are of interest.
In this direction, that is, extracting emotional tendencies from the general population, I have analysed Twitter content geolocated in the UK and have extracted affective (mood) scores using an approach which is presented in my Ph.D. Thesis. Four emotions have been considered, namely anger, fear, sadness and joy. In the attached plot, the scores of anger, fear and sadness have been merged and joy's scores have been subtracted from them. The red line is the exact extracted signal, whereas the black line is its smoothed version (using a 7-point moving average to derive a weekly tendency).
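For readers who want to see what the merging and smoothing steps amount to, here is a minimal Python sketch. It is not the code used for the plot: it assumes that "merged" means a plain sum of the three negative scores, that the moving average is centred, and that the edges are averaged over whatever points are available. The function names and the toy numbers are made up for illustration.

```python
from typing import Dict, List


def combined_mood(scores: Dict[str, List[float]]) -> List[float]:
    """Daily (anger + fear + sadness) - joy.

    Assumption: the negative emotions are merged by simple addition.
    """
    return [
        a + f + s - j
        for a, f, s, j in zip(
            scores["anger"], scores["fear"], scores["sadness"], scores["joy"]
        )
    ]


def moving_average(signal: List[float], window: int = 7) -> List[float]:
    """Centred moving average over `window` points.

    Assumption: near the edges the window is truncated and the mean is
    taken over the points that exist.
    """
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


if __name__ == "__main__":
    # Toy daily scores, invented purely to exercise the functions.
    toy = {
        "anger":   [0.2, 0.3, 0.2, 0.9, 0.3, 0.2, 0.2, 0.3],
        "fear":    [0.1, 0.1, 0.2, 0.8, 0.2, 0.1, 0.1, 0.1],
        "sadness": [0.2, 0.2, 0.2, 0.7, 0.3, 0.2, 0.2, 0.2],
        "joy":     [0.5, 0.5, 0.4, 0.1, 0.4, 0.5, 0.5, 0.5],
    }
    raw = combined_mood(toy)          # the "red line"
    weekly = moving_average(raw, 7)   # the "black line"
    for r, w in zip(raw, weekly):
        print(f"{r:6.2f} {w:6.2f}")
```

With daily scores, a 7-point window spans one week, which is why the smoothed line can be read as a weekly tendency.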
(For a more formal report on this, please refer to Section 7.1 of my Ph.D. Thesis.)
This is a network showing the average daily influence on Twitter for some urban centres in the UK. It is based on a set of 70 million tweets posted within a 6-month window and geolocated in the UK.
Starting from London and going clockwise, the importance or influence of a location (node) decreases. The network is directed; nodeA → nodeB means that nodeA influences the content of nodeB.
A quick and easy observation is that London, Manchester and Liverpool are the UK's most influential cities in terms of Twitter content on a daily basis.