Integrating Machine Learning into Data Analysis and Plant Performance
Author(s): Morey, Zachariah Keith
In the current manufacturing environment, the push for high levels of plant performance has led to scrutinizing, optimizing, and improving every step of the manufacturing process. While improvements are being made in physical and software technology that enable advancements like automated robots or additive manufacturing, data management and analysis continue to be an area of opportunity. Challenges with data analysis are exacerbated by the ever-increasing influx of data from every point of the product manufacturing process, as well as by the integration of that data with legacy and novel equipment, software, and employee capabilities. Identifying improvements in processing and utilizing data can contribute to a better understanding of the data itself, as well as insights to drive improved manufacturing and plant performance. This thesis shows, drawing from a recent project at Nissan's Canton, Mississippi manufacturing facility and utilizing data from a global group of Nissan manufacturing plants, that machine learning can be applied to plant performance data to identify and prioritize metrics and to better understand the impact of those metrics on overall plant performance. Nissan already benchmarks plant performance between its manufacturing facilities and uses that benchmarking to drive improvement and investment opportunities. By examining the data set used for that benchmarking analysis, we gain an understanding of both how plants have performed in recent history and what successful plants are doing that contributes to better performance. We then run this data through a linear regression model and an XGBoost machine learning model to compare the machine learning model's performance against a standard linear regression. We show that while both models perform well, the machine learning model outperforms the linear regression model.
Specifically, the machine learning model achieves an R squared value of 0.88 while the linear regression achieves an R squared value of 0.80, a 10% relative improvement. In addition, the machine learning model better handles missing data and shows that the Design Standard Time Ratio and Delivery Scheduled Time Achievement Ratio are metrics that need to be prioritized for better plant performance. This thesis argues that while our project focused on a small benchmarking data set, machine learning and its benefits can be applied more broadly to data from the manufacturing facilities. We conclude by presenting some examples and opportunities for how a manufacturing company like Nissan can set up its data, utilize models, and train employees to take advantage of the growing knowledge base around data management, machine learning, and plant performance.
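As a reading aid for the R squared figures quoted in the abstract, the following is a minimal sketch of how the coefficient of determination and the relative improvement between two models can be computed. It is not the thesis's code: the function names `r_squared` and `relative_improvement` are illustrative assumptions, and only the two values 0.80 and 0.88 come from the abstract.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared errors of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviations from the mean of the targets.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_improvement(baseline, candidate):
    """Fractional gain of `candidate` over `baseline` (hypothetical helper)."""
    return (candidate - baseline) / baseline


if __name__ == "__main__":
    # R squared values reported in the abstract: linear regression 0.80, XGBoost 0.88.
    gain = relative_improvement(0.80, 0.88)
    print(f"Relative improvement in R squared: {gain:.0%}")
```

Under this reading, the 10% figure is a relative gain (0.08 divided by 0.80), not a 10 percentage point difference in R squared.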
Department: Massachusetts Institute of Technology. Department of Mechanical Engineering; Sloan School of Management
Massachusetts Institute of Technology