A look into Covid-19 spreading through social media and the lodging sector

Kernel Density Estimate representations.

Objective:

To analyse the potential of data from social media and the lodging industry (Airbnb) for studying the spread of Covid-19.
KDE footprints and clusters for AOIs

Description:

Areas of interest (AOIs) are regions that are relevant to people for different reasons. In urban environments, these areas attract citizens' attention because of specific infrastructure, services or landmarks that determine the functionality of the space.

Nowadays, huge amounts of volunteered geographic information (VGI) are available for research in the form of georeferenced posts in social media or data from other sources such as lodging services. Geosocial media (GSM) such as Twitter, Flickr and Foursquare, or lodging services like Airbnb, offer access to part of these data through application programming interfaces (APIs). In other cases, large datasets are already available for processing and analysis.

VGI posts contain terms that can be mined to determine semantics of interest to users. These semantics can be associated with the locations to which the posts have been referenced. It is then possible to model AOIs that represent common semantics shared by groups of users.

Natural Language Processing (NLP) can be used to extract semantics from the natural language contained in the posts. Subsequently, geolocated posts can be grouped based on their characteristic terms of interest with methods such as Kernel Density Estimation (KDE) (McKenzie & Adams, 2017) or with clustering algorithms like DBSCAN (Hu et al., 2015).
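As a rough illustration of the two grouping approaches named above, the following minimal sketch implements a Gaussian KDE and a basic DBSCAN in plain Python. It is not the method used by the cited authors. The sample coordinates, the bandwidth, `eps` and `min_pts` are invented for illustration, and the points are assumed to be in a projected (planar) coordinate system so that Euclidean distances are meaningful.

```python
import math
from collections import deque


def gaussian_kde(points, query, bandwidth):
    """Density estimate at `query` from 2-D `points` (isotropic Gaussian kernel)."""
    qx, qy = query
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for x, y in points:
        d2 = (x - qx) ** 2 + (y - qy) ** 2
        total += math.exp(-d2 / (2.0 * bandwidth ** 2))
    return norm * total


def dbscan(points, eps, min_pts):
    """Return one label per point: cluster id 0, 1, ... or -1 for noise."""
    def neighbours(i):
        xi, yi = points[i]
        return [j for j, (x, y) in enumerate(points)
                if math.hypot(x - xi, y - yi) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:   # j is a core point, keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical post locations (planar coordinates, arbitrary units).
posts = [(0, 0), (1, 0), (0, 1), (1, 1),
         (10, 10), (11, 10), (10, 11),
         (50, 50)]

labels = dbscan(posts, eps=1.5, min_pts=3)
dense = gaussian_kde(posts, (0.5, 0.5), bandwidth=1.0)
sparse = gaussian_kde(posts, (30.0, 30.0), bandwidth=1.0)
print(labels, dense > sparse)
```

For real datasets, established implementations such as those in scikit-learn or SciPy would normally be preferred over hand-written loops like these, which scale quadratically with the number of posts.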

Among other semantics, it is possible to extract information about opinions or emotions. Sentiment analysis is a technique that uses NLP to mine opinions, attitudes or emotions from textual content such as that contained in user posts (Bagheri & Islam, 2017; Kharde & Sonawane, 2016). There are multiple open-source options for visualising these results spatially.
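To make the idea of sentiment analysis concrete, here is a minimal lexicon-based sketch. The two word lists are tiny, hypothetical examples and not a real sentiment lexicon; the approaches surveyed in the cited works are considerably more elaborate (for instance, machine-learning classifiers).

```python
import re

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"good", "great", "safe", "happy", "relieved"}
NEGATIVE = {"bad", "afraid", "sick", "sad", "worried"}


def sentiment_score(text):
    """Return a polarity in [-1, 1]; 0.0 if no lexicon word is found."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


print(sentiment_score("Feeling safe and happy today"))
print(sentiment_score("So worried, everyone is sick"))
```

Because each post is georeferenced, a score like this could be attached to the post's location and then aggregated or mapped.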

We can assume that the recent Covid-19 situation has had an impact on the content generated by users. Hence, depending on the student’s interest, this work could focus on posts containing Covid-related terms.
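Selecting posts that contain Covid-related terms could start from a simple keyword filter like the sketch below. The term list, the field names (`text`, `lon`, `lat`) and the sample posts are assumptions made for illustration; a real study would need a carefully curated, possibly multilingual, vocabulary.

```python
import re

# Hypothetical keyword list; matches words that start with any of these terms.
COVID_TERMS = ("covid", "coronavirus", "lockdown", "quarantine")
COVID_PATTERN = re.compile(r"\b(?:" + "|".join(COVID_TERMS) + r")", re.IGNORECASE)


def is_covid_related(text):
    """True if the text contains a word starting with one of COVID_TERMS."""
    return COVID_PATTERN.search(text) is not None


# Invented sample posts with an assumed structure.
posts = [
    {"text": "Second week of lockdown in Vienna", "lon": 16.37, "lat": 48.21},
    {"text": "Sunset over the Danube", "lon": 16.41, "lat": 48.23},
    {"text": "Stay home! #Covid19", "lon": 16.35, "lat": 48.20},
]

covid_posts = [p for p in posts if is_covid_related(p["text"])]
print(len(covid_posts))
```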

The aim of this thesis is to analyse the potential of data from social media and the lodging industry for studying the spread of Covid-19, and to explore alternatives for visualising sentiment analysis results.

This work will tackle some of the following questions, depending on the student's interests and background:

  • How suitable are Twitter, Flickr and Airbnb data for identifying patterns in the spread of Covid-19? How can they be combined?
  • How suitable are Twitter, Flickr and Airbnb data for analysing and visualising users’ stance towards Covid-19?
  • How similar are the sentiment analysis visualisations obtained from these three sources at a European scale?
  • How can sentiment analysis results be visualised for a large area?

The master's thesis candidate must have knowledge of Python or Java, as well as of a DBMS such as PostgreSQL or MySQL.

Staff working in the domain:
Francisco Porras Bernárdez
francisco.porras.bernardez@tuwien.ac.at

References:

  • Bagheri, H., & Islam, M. J. (2017). Sentiment analysis of twitter data. arXiv:1711.10377 [cs]. http://arxiv.org/abs/1711.10377

  • Gao, S., Janowicz, K., Montello, D. R., Hu, Y., Yang, J. A., McKenzie, G., Ju, Y., Gong, L., Adams, B., & Yan, B. (2017). A data-synthesis-driven method for detecting and extracting vague cognitive regions. International Journal of Geographical Information Science, 31(6), 1245-1271. https://doi.org/10.1080/13658816.2016.1273357

  • Hu, Y., Gao, S., Janowicz, K., Yu, B., Li, W., & Prasad, S. (2015). Extracting and understanding urban areas of interest using geotagged photos. Computers, Environment and Urban Systems, 54, 240-254. https://doi.org/10.1016/j.compenvurbsys.2015.09.001

  • Kharde, V. A., & Sonawane, P. S. (2016). Sentiment Analysis of Twitter Data: A Survey of Techniques. International Journal of Computer Applications, 139(11), 5-15. https://doi.org/10/ghj9f7

  • McKenzie, G., & Adams, B. (2017). Juxtaposing Thematic Regions Derived from Spatial and Platial User-Generated Content. Leibniz International Proceedings in Informatics (LIPIcs), September, 1-13. https://doi.org/10.4230/LIPIcs.COSIT.2017.20

Domain(s):

Study Program(s):

  • MSc. Cartography (EXCLUSIVELY externally advertised)