
Statistical Analytics and Regional Representation Learning for COVID-19 Pandemic Understanding

This paper processes and combines an extensive collection of publicly available datasets into a unified information source for representing geographical regions with regard to their pandemic-related behavior. The features are grouped into categories that reflect the higher-level concepts associated with them, so that their impact can be assessed at the group level. Several correlation analysis techniques are used to observe value and order relationships between features, feature groups, and COVID-19 occurrences. Dimensionality reduction techniques and projection methodologies are used to elaborate on the individual and group importance of these representative features. In addition, an RNN-based inference pipeline called DoubleWindowLSTM-CP is designed for predictive event modeling; it utilizes sequential patterns and enables concise record representation.
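
To illustrate the distinction between value and order relationships mentioned above, the following is a minimal sketch that contrasts a value-based coefficient (Pearson) with an order-based one (Spearman rank correlation). The choice of these two coefficients is an assumption made for illustration; the abstract does not name the specific techniques used. The variable names `mobility` and `cases` and their values are hypothetical and are not taken from the datasets described in this work.

```python
from math import sqrt


def pearson(x, y):
    """Value-based (linear) correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Order-based correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical regional feature and outcome (illustrative values only).
mobility = [1.0, 2.0, 3.0, 4.0, 5.0]
cases = [1.0, 8.0, 27.0, 64.0, 125.0]  # monotone but nonlinear in mobility

value_corr = pearson(mobility, cases)
order_corr = spearman(mobility, cases)
print(f"value (Pearson): {value_corr:.3f}, order (Spearman): {order_corr:.3f}")
```

In this toy example the relationship is perfectly monotone but not linear, so the order-based coefficient reaches its maximum while the value-based coefficient stays below it, which is why examining both kinds of relationship can be informative.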