Loss pattern recognition and profitability prediction for insurers through machine learning

Wang, Ziyu, S.M. Massachusetts Institute of Technology

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

David Simchi-Levi.

Terms of use

MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

For an insurance company, assessing risk exposure for Property Damage (PD) and Business Interruption (BI) for large commercial clients is difficult because that exposure is heterogeneous, both within a single client (account) and across the divisions and regions where the client is active. Traditional risk assessment models attempt to scale up the single-location approach used in personal lines: a large amount of data is collected to profile a sample of the locations, and based on this information the risk is inferred, and somewhat subjectively assessed, for the whole account. The assumption is that the risk characteristics at the largest locations are representative of all locations and, moreover, that risk is proportional to the size of the location. This approach is both ineffective and inefficient.

Our first goal is therefore to build a better risk assessment model through machine learning, based on clients' data from internal sources. We further define a new problem: predicting whether a specific contract will be profitable or unprofitable for the insurance company. This turns out to be an imbalanced classification problem, which is the focus of the second half of this thesis.

In Chapter 2, we first review related literature on state-of-the-art risk assessment models in insurance, then turn to imbalanced classification problems and review some popular and effective solutions that researchers have proposed. In Chapter 3, we describe the data structure, provide preliminary analysis of certain attributes, and discuss the preprocessing techniques used for feature construction. In Chapter 4, we propose a new model whose objective is to develop a new risk index representing clients' potential future risk level. We then compare the new index with the original risk index used by the insurance company; computational results show that the new index successfully captures clients' financial loss patterns, while the original risk score fails to do so. In Chapter 5, we propose a multi-layer algorithm to predict whether a specific contract will be profitable or unprofitable for the insurance company. Simulation shows that we can accurately label more than 83 percent of the contracts on record and that the proposed algorithm outperforms traditional classifiers such as Support Vector Machines and Random Forests. Later in the chapter, we define a new imbalanced classification problem and propose a hybrid method to improve the recall and prediction accuracy of Support Vector Machines. The method incorporates unsupervised learning techniques into the classical Support Vector Machine algorithm and achieves satisfactory results. In Chapter 6, we conclude the thesis and suggest directions for future research.

This thesis builds models and trains algorithms on real-world business data from a leading global insurance and reinsurance company.
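
The abstract reports both recall and prediction accuracy for the imbalanced contract-classification problem. The short sketch below illustrates, on made-up toy labels, why the two metrics are usually reported together when one class is rare. Everything in it is an assumption for illustration only: the labels are not the thesis data, the choice of "unprofitable" as the minority class is hypothetical, and the functions `accuracy` and `recall` are plain textbook definitions rather than the thesis's evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of all labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positive-class items that were predicted positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred)
                   if t == positive and p == positive)
    false_neg = sum(1 for t, p in zip(y_true, y_pred)
                    if t == positive and p != positive)
    total_pos = true_pos + false_neg
    return true_pos / total_pos if total_pos else 0.0


# Toy, hypothetical labels: 0 = profitable (majority), 1 = unprofitable (minority).
y_true = [0] * 90 + [1] * 10

# Baseline that always predicts the majority class.
always_majority = [0] * 100

# Hypothetical classifier: 10 false alarms on the majority class,
# but 7 of the 10 minority contracts are caught.
minority_aware = [0] * 80 + [1] * 10 + [1] * 7 + [0] * 3

for name, y_pred in [("always_majority", always_majority),
                     ("minority_aware", minority_aware)]:
    print(name,
          "accuracy=%.2f" % accuracy(y_true, y_pred),
          "recall=%.2f" % recall(y_true, y_pred))
```

In this toy setup the majority-only baseline reaches high accuracy while detecting none of the minority class, which is why recall on the rare class is tracked alongside accuracy.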

Description

Thesis: S.M., Massachusetts Institute of Technology, Computation for Design and Optimization Program, 2017.

S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 91-94).

Date issued

2017

URI

http://hdl.handle.net/1721.1/111514

Department

Massachusetts Institute of Technology. Computation for Design and Optimization Program

Publisher

Massachusetts Institute of Technology

Keywords

Computation for Design and Optimization Program, Electrical Engineering and Computer Science.

Collections

Graduate Theses