Research highlights

My current research focuses on developing methods that exploit user-generated content posted on Social Media platforms to perform various types of inference or to extract patterns in an unsupervised manner (see, for example, my Ph.D. Thesis). I am also interested in interdisciplinary research that brings together Computer Science, Statistics and Social Sciences. Publications are listed elsewhere.

Academic Tags: Data Mining; Machine Learning; Natural Language Processing; Social Networks


Is text regression always a linear problem? Would a nonlinear approach improve the inferred model?

A user-centric model of voting intention from Social Media (Vasileios Lampos, Daniel Preotiuc-Pietro and Trevor Cohn, in ACL 2013). In this paper, we propose a bilinear method for modelling text regression based on Social Media content.  We also extend this approach in order to perform multi-task learning, i.e. model multiple target variables using a shared optimisation function. Our case studies, drawn from the domain of politics, are the inference of voting intention percentages in the United Kingdom (Conservative, Labour and Liberal Democrat parties) as well as in Austria (SPÖ, ÖVP, FPÖ, Green parties). The proposed method is language independent and could be easily applied to other domains (such as health or finance).

Bilinear model for multi-task and multi-output text regression

A bilinear model of the group ℓ1/ℓ2 regulariser for multi-task learning and multi-output text regression from Social Media (u and w denote user and word weights respectively)
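
To make the caption more concrete, here is a minimal Python sketch of a bilinear prediction and of a group ℓ1/ℓ2 penalty. It assumes that the prediction for one time interval has the form uᵀQw plus a bias, where Q is a users-by-words frequency matrix, and that the penalty is the sum of the row-wise ℓ2 norms of a weight matrix with one column per task. The function names, the bias term and the toy numbers are illustrative assumptions, not the implementation used in the paper.

```python
import math


def bilinear_predict(Q, u, w, bias=0.0):
    """Return u^T Q w + bias for a single time interval.

    Q is a users-by-words matrix (a list of rows) of word frequencies,
    u holds one weight per user and w holds one weight per word.
    The bias term is an assumption made for this sketch.
    """
    if len(Q) != len(u):
        raise ValueError("Q needs one row per user weight")
    total = bias
    for user_weight, row in zip(u, Q):
        if len(row) != len(w):
            raise ValueError("each row of Q needs one entry per word weight")
        # Word-weighted score of this user's text, scaled by the user's weight.
        total += user_weight * sum(q * ww for q, ww in zip(row, w))
    return total


def group_l1_l2(W):
    """Sum of row-wise l2 norms of W (rows: users or words, columns: tasks)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in W)


# Toy data: 2 users, 3 words. All numbers are made up for illustration.
Q = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 1.0]]
u = [0.5, 2.0]
w = [1.0, -1.0, 0.5]
prediction = bilinear_predict(Q, u, w, bias=0.1)

# Hypothetical word weights for 2 tasks; the second word is unused by both.
penalty = group_l1_l2([[3.0, 4.0], [0.0, 0.0]])
```

Because each row's ℓ2 norm enters the sum unsquared, a penalty of this kind tends to set entire rows to zero, so a user or a word is either kept or discarded jointly across all tasks.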


What can 5 million books tell us about human emotions?

The expression of emotions in 20th century books (Alberto Acerbi, Vasileios Lampos, Philip Garnett and Alexander Bentley, in PLoS ONE, 2013): We report here trends in the usage of mood words, that is, words carrying emotional content, in 20th century English language books, using the data set provided by Google that includes word frequencies in roughly 4% of all books published up to the year 2008...
 
Press releases are available from PLoS ONE, University of Bristol and University of Sheffield. This work has been covered by a multitude of media in various countries (Huffington Post, redOrbit, Popular Science, Slate, BBC Radio 4, The Telegraph, Die Welt, Galileo, etc.), including the news section of Nature.

Joy minus Sadness

Differences of z-scores between Joy and Sadness in books from 1900 to 2000
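
As a rough illustration of the quantity in this caption, the Python sketch below standardises two yearly mood series and subtracts one from the other. It assumes that each series holds one mood score per year and that z-scores are taken over the whole period using the population standard deviation; both choices, the function names and the toy numbers are assumptions made for this example and may differ from the procedure in the paper.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise a series to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("z-scores are undefined for a constant series")
    return [(v - mu) / sigma for v in values]


def joy_minus_sadness(joy, sadness):
    """Per-year difference between the z-scores of two mood series."""
    if len(joy) != len(sadness):
        raise ValueError("both series must cover the same years")
    return [j - s for j, s in zip(z_scores(joy), z_scores(sadness))]


# Toy yearly mood scores for three consecutive years (made-up numbers).
joy = [1.0, 2.0, 3.0]
sadness = [3.0, 2.0, 1.0]
difference = joy_minus_sadness(joy, sadness)
```

Positive values mark years in which Joy is high relative to its own average while Sadness is comparatively low, and negative values mark the opposite.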


Is it possible to use content from the Social Media to infer the result of an election? Could we model voting intention polls using the Twittersphere as input information?

On voting intentions inference from Twitter content: a case study on UK 2010 General Election (Vasileios Lampos, CoRR, arXiv:1204.0423, 2012): This is a report, where preliminary work regarding the topic of voting intention inference from Social Media ‒ such as Twitter ‒ is presented. Our case study is the UK 2010 General Election and we are focusing on predicting the percentages of voting intention polls (conducted by YouGov) for the three major political parties ‒ Conservatives, Labour and Liberal Democrats ‒ during a 5-month period before the election date...

Voting Intentions inference from Twitter

Voting intention percentages inferred from Twitter content vs. voting intention polls conducted by YouGov (UK 2010 General Election)


Mood of the Nation uses more than half a million geolocated tweets on a daily basis to detect mood and affect trends in the UK population. We focus on four categories, namely Joy, Sadness, Anger and Fear, and make comparable inferences for several regions in the United Kingdom (such as South and North England).

Effects of the Recession on Public Mood in the UK (Thomas Lansdall-Welfare, Vasileios Lampos and Nello Cristianini, in WWW '12): Large scale analysis of social media content allows for real time discovery of macro-scale patterns in public opinion and sentiment. In this paper we analyse a collection of 484 million tweets generated by more than 9.8 million users from the UK over the past 31 months...
 
A press release on this work is available here. This work has been featured in several mainstream and science media outlets, including Mashable, New Scientist, Dradio and BBC World News.

Mood in the UK

Mood of the Nation infers Mood and Affect scores for several regions in the UK based on Twitter content. A static mood score figure is available here.

Using a similar method we have also extracted circadian patterns of several mood types based on UK Twitter users.
 


This work proposes a generic methodology that exploits user-generated content for nowcasting events emerging in real life. In addition to the inference of flu rates, a second case study is presented, in which rainfall rates in several UK cities are inferred from geolocated tweets. Inferring precipitation figures is a much harder problem, given that such a quantity does not behave smoothly (especially in the UK).

Nowcasting Events from the Social Web with Statistical Learning (Vasileios Lampos and Nello Cristianini, in ACM TIST, 2011): We present a general methodology for inferring the occurrence and magnitude of an event or phenomenon by exploring the rich amount of unstructured textual information on the social part of the web...
    Note: An extended version is presented in Chapter 5 of my Ph.D. Thesis.
 
A press release on this work is available here. This work has been featured in several mainstream or science media (examples listed here), including ScienceDaily, Natural Hazards Observer, BBC Radio 4 and ITV. It has also been selected by EPSRC as a highlight for the research conducted in 2011.

Word Cloud with automatically extracted n-grams used to track rainfall rates from Twitter content. Font size is proportional to an n-gram's importance weight and flipped words take negative weights.


Flu Detector is a tool that uses the content of Twitter for nowcasting the level of flu-like illness in several UK regions. The applied methodology is presented in the following publications:

Tracking the flu pandemic by monitoring the Social Web (Vasileios Lampos and Nello Cristianini, in CIP 2010): We report on a monitoring method to measure the prevalence of disease in a population by analysing the contents of social networking tools, such as Twitter...
    Note: An extended version is presented in Chapter 4 of my Ph.D. Thesis.

Flu Detector - Tracking Epidemics on Twitter (Vasileios Lampos, Tijl De Bie and Nello Cristianini, in ECML/PKDD 2010): Flu Detector is an automated tool with a web interface for tracking the prevalence of Influenza-like Illness (ILI) in several regions of the United Kingdom using the contents of Twitter's microblogging service...

This work has been featured in or referenced by mainstream media covering advances in technology and science, such as MIT Technology Review, New Scientist and the Communications of the ACM.

    
 


Weather talk exploited the content of blogs and news articles as well as road traffic data available on the web to infer weather states in several locations in the UK. The applied methodology has been presented in my MSc Thesis:

Weather talk - extracting weather information by text mining (Vasileios Lampos, MSc Thesis advised by Nello Cristianini, 2008)

A web page that visualises some results of this work.