PREDICTION OF DATA IN THE INSURANCE INDUSTRY BASED ON NEURAL NETWORK METHODS
Abstract
The paper compares the generalized linear regression model with a leading machine learning method, the feed-forward neural network (FFNN), for predicting count data. The two models are described and compared from both a theoretical and a practical point of view. Their stability is checked on a bicycle rental data set, their accuracy is evaluated, and learning curves are built for the training and test sets. To improve the interpretability of the models, the importance of the input variables is evaluated. Because the FFNN is a "black box" method, the importance of its variables cannot be evaluated directly. A new indirect method for assessing variable importance in deep neural networks, based on the principles of information theory, is proposed. The FFNN is shown to provide much better predictive power than the generalized linear regression model, at the cost of a slight increase in model complexity.
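To make the count-data setting concrete, the following minimal sketch fits a Poisson regression with a log link and scores it by mean Poisson deviance. It is an illustration under stated assumptions, not the procedure used in the paper. The data are synthetic rather than the bicycle rental set. The log link, the gradient-ascent fitting, the deviance metric, and all function names are assumptions introduced here.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def predict(beta, X):
    """Predicted means under a log link: mu = exp(b0 + sum_j b_j * x_j)."""
    return [math.exp(beta[0] + sum(b * v for b, v in zip(beta[1:], xi))) for xi in X]


def fit_poisson_glm(X, y, lr=0.1, epochs=500):
    """Fit the coefficients by gradient ascent on the Poisson log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        mu = predict(beta, X)
        grad = [0.0] * (p + 1)
        for xi, yi, mi in zip(X, y, mu):
            resid = yi - mi  # score contribution of one observation
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def mean_poisson_deviance(y, mu):
    """Average Poisson deviance; lower is better, 0 means a perfect fit."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += 2.0 * (log_term - (yi - mi))
    return total / len(y)


# Synthetic count data (hypothetical, for illustration only).
rng = random.Random(0)
true_beta = [0.5, 0.8, -0.4]
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(400)]
y = [sample_poisson(m, rng) for m in predict(true_beta, X)]

beta = fit_poisson_glm(X, y)
dev_glm = mean_poisson_deviance(y, predict(beta, X))

# Intercept-only baseline: predict the sample mean for every observation.
y_bar = sum(y) / len(y)
dev_null = mean_poisson_deviance(y, [y_bar] * len(y))

print("fitted coefficients:", [round(b, 3) for b in beta])
print("deviance (GLM):", round(dev_glm, 4), "deviance (baseline):", round(dev_null, 4))
```

The same `mean_poisson_deviance` function could score the predictions of any other model, a neural network included, which puts both models on a common scale for count data.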