Complex data: new methodologies and applications

10:50 - 12:05, Aula 9

Organizer: Francesca Greselin

Discussant: Cinzia Viroli

A cluster-weighted model for COVID-19 hospital admissions

Daniele Spinelli, Paolo Berta, Salvatore Ingrassia and Giorgio Vittadini

Abstract: We propose a cluster-weighted model to analyze the mortality and the latent heterogeneity of COVID-19 patients. We focus on administrative data collected during the earliest phases of the COVID-19 pandemic. Results highlight that a model-based clustering approach helps detect unobserved clusters of COVID-19 patients.


Multi-class text classification of news data

Maurizio Romano and Maria Paola Priola

Abstract: Several multi-class text classification (MCC) strategies, namely One-Vs-Rest (OVA), One-Vs-One (OVO), Best-of-Best (BOB), and Error-Correcting-Output-Codes (ECOC), are compared in terms of accuracy and computational efficiency. Each strategy is implemented using several classifiers: Naïve Bayes, Random Forest, Logistic Regression, Neural Networks, Linear Discriminant Analysis, Support Vector Machine, and the recently introduced Threshold-based Naïve Bayes (Tb-NB). We run a horse race on the 20News-Group dataset, well known in the literature for its complexity. Our results highlight the importance of choosing the right classifier and pairing it with an optimal strategy, providing valuable insights for optimizing classifier performance in MCC tasks, considering both environmental implications and the need for accurate predictions.



A work by Gianluca Sottile

(on behalf of the local organizing committee)