Loss pattern recognition and profitability prediction for insurers through machine learning
Author(s): Wang, Ziyu, S.M. Massachusetts Institute of Technology
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
For an insurance company, assessing risk exposure for Property Damage (PD) and Business Interruption (BI) for large commercial clients is difficult because that exposure is heterogeneous, both within a single client (account) and across the divisions and regions where the client is active. Traditional risk assessment models attempt to scale up the single-location approach used in personal lines: a large amount of data is collected to profile a sample of the locations, and based on this information the risk is inferred and somewhat subjectively assessed for the whole account. The assumption is that the risk characteristics at the largest locations are representative of all locations and, moreover, that risk is proportional to the size of the location. This approach is both ineffective and inefficient. Our first goal is therefore to build a better risk assessment model through machine learning, based on clients' data from internal sources. Further, we define a new problem: predicting whether a specific contract will be profitable or unprofitable for the insurance company. This turns out to be an imbalanced classification problem, which is the focus of the second half of this thesis.
In Chapter 2, we first review related literature on state-of-the-art risk assessment models in the field of insurance. Later in the chapter we turn to imbalanced classification problems and review some popular and effective solutions that researchers have proposed. In Chapter 3, we describe the data structure, provide some preliminary analysis of certain attributes and discuss the preprocessing techniques used for feature construction. In Chapter 4, we propose a new model whose objective is to develop a new risk index that represents clients' potential future risk level. We then compare the performance of our new index with the original risk index used by the insurance company. Computational results show that our new index successfully captures clients' financial loss pattern, while the original risk score used by the insurance company fails to do so. In Chapter 5, we propose a multi-layer algorithm to predict whether a specific contract will be profitable or unprofitable for the insurance company. Simulations show that we can accurately label more than 83 percent of the contracts on record and that our proposed algorithm outperforms traditional classifiers such as Support Vector Machines and Random Forests. Later in the chapter, we define a new imbalanced classification problem and propose a hybrid method to improve the recall and prediction accuracy of Support Vector Machines. The method incorporates unsupervised learning techniques into the classical Support Vector Machines algorithm and achieves satisfactory results. In Chapter 6, we conclude the thesis and suggest directions for future research. This thesis builds models and trains algorithms based on real-world business data from a leading global insurance and reinsurance company.
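The abstract singles out recall alongside accuracy for the imbalanced contract classification problem. The following is a minimal sketch of why both are reported, not the thesis's method: the labels are synthetic, the 90/10 class split is an assumption, and treating "unprofitable" as the minority class (label 1) is also an assumption made only for illustration.

```python
# Illustrative only: synthetic labels, not data or results from the thesis.

def accuracy(y_true, y_pred):
    """Fraction of contracts whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive contracts that the classifier flags."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical portfolio: 90 profitable contracts (0), 10 unprofitable (1).
y_true = [0] * 90 + [1] * 10

# Trivial classifier: labels every contract as profitable.
always_majority = [0] * 100

# Hypothetical classifier: 8 false alarms, 7 of 10 unprofitable contracts found.
better = [1] * 8 + [0] * 82 + [1] * 7 + [0] * 3

for name, pred in [("always_majority", always_majority), ("better", better)]:
    print(name, "accuracy:", accuracy(y_true, pred), "recall:", recall(y_true, pred))
```

Under these assumed numbers, the trivial classifier has slightly higher accuracy than the second one yet identifies none of the minority-class contracts, which is why accuracy alone can be misleading on imbalanced data.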
Thesis: S.M., Massachusetts Institute of Technology, Computation for Design and Optimization Program, 2017. Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017. Cataloged from PDF version of thesis. Includes bibliographical references (pages 91-94).
Department: Massachusetts Institute of Technology. Computation for Design and Optimization Program; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Massachusetts Institute of Technology
Computation for Design and Optimization Program; Electrical Engineering and Computer Science.