PaperView: TabNN: A Universal Neural Network Solution for Tabular Data


Please check out my video explaining this interesting paper by clicking on the “Video” button above.

Abstract:

Neural Networks (NN) have achieved state-of-the-art performance in many tasks within image, speech, and text domains. Such great success is mainly due to special structure design to fit the particular data patterns, such as CNN capturing spatial locality and RNN modeling sequential dependency. Essentially, these specific NNs achieve good performance by leveraging the prior knowledge over corresponding domain data. Nevertheless, there are many applications with all kinds of tabular data in other domains. Since there are no shared patterns among these diverse tabular data, it is hard to design specific structures to fit them all. Without careful architecture design based on domain knowledge, it is quite challenging for NN to reach satisfactory performance in these tabular data domains. To fill the gap of NN in tabular data learning, we propose a universal neural network solution, called TabNN, to derive effective NN architectures for tabular data in all kinds of tasks automatically. Specifically, the design of TabNN follows two principles: to explicitly leverage expressive feature combinations and to reduce model complexity. Since GBDT has empirically proven its strength in modeling tabular data, we use GBDT to power the implementation of TabNN. Comprehensive experimental analysis on a variety of tabular datasets demonstrates that TabNN can achieve much better performance than many baseline solutions.
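To make the two design principles in the abstract a little more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes, purely for illustration, that a "feature combination" is the set of features that one tree of a trained GBDT splits on, and that complexity is reduced by wiring each such group to its own small block of hidden units instead of connecting every feature to every unit. The tree representation and the helper names `feature_groups`, `grouped_first_layer_weights`, and `dense_first_layer_weights` are hypothetical.

```python
from typing import List

# Hypothetical stand-in for a trained GBDT: each tree is described only by
# the indices of the features it splits on (assumption for illustration).
TreeSplits = List[int]


def feature_groups(trees: List[TreeSplits]) -> List[List[int]]:
    """Collect the distinct sets of features that appear together in a tree."""
    seen = set()
    groups: List[List[int]] = []
    for splits in trees:
        group = tuple(sorted(set(splits)))
        if group and group not in seen:
            seen.add(group)
            groups.append(list(group))
    return groups


def grouped_first_layer_weights(groups: List[List[int]], units_per_group: int) -> int:
    """Weight count when each feature group feeds only its own block of units."""
    return sum(len(group) * units_per_group for group in groups)


def dense_first_layer_weights(n_features: int, n_units: int) -> int:
    """Weight count when every feature is connected to every hidden unit."""
    return n_features * n_units


if __name__ == "__main__":
    # Four toy trees over five features (indices 0 to 4).
    trees = [[0, 3, 3], [1, 2], [3, 0], [4, 1, 2]]
    groups = feature_groups(trees)
    units_per_group = 4
    grouped = grouped_first_layer_weights(groups, units_per_group)
    dense = dense_first_layer_weights(5, len(groups) * units_per_group)
    print("feature groups:", groups)
    print("grouped weights:", grouped, "dense weights:", dense)
```

In this toy setting the grouped layer needs fewer first-layer weights than a fully connected layer with the same number of hidden units, which is one simple way to read "explicitly leverage expressive feature combinations" together with "reduce model complexity".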

Shayan Fazeli
Ph.D. Candidate in Computer Science

Ph.D. candidate researcher at the eHealth and Data Analytics Lab - CS [at] UCLA