1A. Data Science for Official Statistics

12:10 - 13:20, Aula 9

Organizer: Orietta Luzi

Chair: Orietta Luzi

Data science at Istat for urban green

Fabrizio De Fausti, Marco Di Zio, Giuseppe Lancioni, Stefano Mugnoli, Alberto Sabbi and Francesco Sisti

Abstract: This paper discusses a machine learning-based approach to urban green estimation that exploits information from orthophotos. The techniques discussed aim to automatically determine a threshold of the Normalized Difference Vegetation Index (NDVI), a metric that quantifies vegetation density from spectral reflectance measurements, in order to classify urban greenery while accounting for the variability inherent in remote sensing data.
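
Illustrative note: the abstract does not say how the threshold is chosen, so the following is only a minimal sketch of the general idea, not the authors' method. It computes NDVI per pixel as (NIR - Red) / (NIR + Red) and then picks a cut with Otsu's method, which is swapped in here purely as an assumption. The function names, the bin count and the pixel values are all hypothetical.

```python
def ndvi(nir, red):
    """NDVI for one pixel: (NIR - Red) / (NIR + Red), a value in [-1, 1]."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def otsu_threshold(values, bins=64):
    """Return the NDVI cut that maximizes between-class variance.

    Otsu's method is an assumption made for illustration; the abstract
    does not name the thresholding technique actually used.
    """
    lo, hi = -1.0, 1.0
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        hist[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]

    total = len(values)
    total_sum = sum(c * h for c, h in zip(centers, hist))
    best_var, best_cut = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so this cut separates nothing
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var = between
            best_cut = lo + (i + 1) * width  # upper edge of bin i
    return best_cut


# Synthetic (NIR, Red) reflectance pairs, invented for this example.
pixels = [(200, 50), (180, 60), (220, 40), (100, 90), (80, 85), (120, 110)]
ndvi_values = [ndvi(nir, red) for nir, red in pixels]
cut = otsu_threshold(ndvi_values)
is_green = [v > cut for v in ndvi_values]
```

A data-driven cut of this kind adapts to each image, which is one simple way to cope with the scene-to-scene variability the abstract mentions; a fixed threshold would not.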

Twitter (X) as a Data Source for Official Statistics: Monitoring Italian Debate on Immigration through Text Analysis

Elena Catanese, Gerarda Grippo, Francesco Ortame and Maria Clelia Romano

Abstract: Official Statistics (OS) is increasingly interested in investigating new data sources, including social media websites, to enrich its production. The aim of this work is to study public opinion towards immigrants in Italy by exploiting Twitter data. A large dataset of Italian tweets posted between 2018 and 2022 and containing a set of immigration-related keywords is analyzed using text mining techniques such as topic modelling. This kind of analysis provides valuable insights into the debate on immigration, serving both as a preliminary step for the construction of sentiment-based synthetic indices and as a continuous process to enhance their interpretability. Emerging patterns in the immigration debate are discussed, along with possible generalizations of the chosen methodology.

Online Job Advertisements for official statistics: methodological challenges, solutions and open issues

Francesca Inglese, Annalisa Lucarelli, Renato Magistro, Giulio Massacci and Giuseppina Ruocco

Abstract: In the last few years, National Statistical Institutes (NSIs) have started exploring big data sources to enrich and complement official statistics. In this context, online advertisements on job portals and company websites provide relevant information, highlighting job market trends. Because Online Job Advertisements (OJAs) reflect changes in the labour market in a timely way, they meet the need to track sudden shifts in the skills required by companies in a context of major IT developments. To exploit the potential of web data sources, the European Statistical System (ESS) has launched the WIN (Web Intelligence Network) project. Part of the activities carried out within this initiative aims at defining a quality and methodological framework for web data. The use of OJAs is one of the most mature use cases of the WIN project. The article describes the quality issues that must be addressed before OJA data can be used for statistical purposes and to produce experimental statistics on labour demand dynamics.
