Multimodal Data Fusion for Estimating Electricity Access and Demand

Lee, Stephen J.

Author(s)

Lee, Stephen J.

Advisor

Pérez-Arriaga, Ignacio J.

Fisher III, John W.

Terms of use

In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/

Abstract

Electric power is a key enabler for economic development; nevertheless, 770 million people live without electricity access and 3.5 billion have unreliable connections. There is general consensus that the global community is off-track from realizing the United Nations’ Sustainable Development Goal #7 (SDG7) target of “universal access to affordable, reliable and modern energy services” by the year 2030. Under the International Energy Agency’s (IEA) central “Stated Policies Scenario,” 670 million people are expected to still be without electricity access in 2030. Simultaneously, we as a global community are off-track from achieving the Paris Agreement ambition to limit global warming to 1.5 degrees Celsius compared to preindustrial levels. A 2021 U.N. report notes that national mitigation pledges for 2030 will collectively produce only one-seventh of the emissions reductions necessary to achieve the 1.5 degree goal. While electricity and heat together account for 31.9% of all greenhouse gas (GHG) emissions globally, the electric power sector is expected to play a significant role in virtually all credible pathways towards climate stabilization: power sector emissions must be cut to near-zero by mid-century, and the power sector must also expand to electrify and therefore decarbonize a larger share of total energy use. The IEA’s “Net Zero by 2050” roadmap projects that electricity demand in “emerging market and developing economies” will need to exceed double the electricity demand in “advanced economies” by mid-century. Our development and climate imperatives both rest upon electricity demand in low- and middle-income countries. This dissertation attempts to advance the state of the art in understanding, estimating, and forecasting electricity demand in underserved contexts. We present four technical chapters towards these ends.
First, we assess the importance of accurately estimating aggregate demand levels by performing sensitivity analyses using technoeconomic optimization models. We find that efforts to improve methods for demand forecasting are essential to prospects for right-sizing system designs. Over the domain of aggregate demand values modeled, the average cost of service provision ranges from $0.13/kWh to $0.37/kWh. This nearly three-fold difference demonstrates the critical influence of economies of scale and improved grid utilization on cost. We additionally find that characterizing building-level consumer type diversity plays a critical role in the outcome of high-resolution infrastructure plans. For our “central demand case,” we show that modeling a diversity of consumer types results in least-cost plans that are 9% less costly than plans modeled under the assumption that there is only one customer type. When comparing supply technology shares for cost-optimal designs, modeling consumer type diversity decreases prescribed grid extension shares from 89% to 77%. In our second technical chapter, we apply machine learning systems for probabilistic data fusion to the problem of forecasting annual electricity demand at the country level for all African countries. We provide a novel set of probabilistic forecasts for the continent while addressing missing data issues and employing a rigorous framework for cross-validation and backtesting model results. In our third technical chapter, we show how machine learning systems for probabilistic data fusion can be used for estimating electricity access rates at building-level resolutions in low-access countries. Estimating electricity access is a key component of understanding electricity demand because aggregated consumption statistics only reflect demand from buildings with electricity access. Without access information, there is significant ambiguity when attempting to attribute aggregated consumption values to individual buildings.
We train and evaluate our model using data describing electrified and non-electrified buildings in Rwanda, and we achieve state-of-the-art results relative to existing methods in the literature. For our test set in Rwanda, our method achieves an accuracy score of 80.7% while the closest published baseline in the literature achieves 70.9%. Our system additionally enables explicit uncertainty quantification and has the potential to be scaled across the whole African continent. In our final technical chapter, we develop novel methods for estimating building-level electricity demand. Challengingly, ground truth metered consumption datasets in low-access countries are often only accompanied by noisy geolocation data. This issue is exacerbated by the fact that meter and building connections reflect many-to-many relationships. There may be many electricity meters residing within a single building, and there may also be many buildings that are connected to a single meter. While our consumption data is logged at the meter level, machine learning features of interest can only be extracted at the building level. Because standard supervised machine learning models cannot express this complexity, we develop an application-tailored model based on a neural network (NN)-embedded probabilistic graphical model (PGM) for probabilistic data fusion. The PGM-based approach allows us to explicitly define potential relationships between meters and nearby buildings, while the NN models employed enable us to effectively extract information from multimodal features at the building level. As a result, our model reflects a principled approach to training and running building-level demand estimation models using only meter-level ground truth information.
We also make a few additional contributions: we provide probabilistic building-level output; we train and test in Rwanda, a country for which building-level estimates are not currently available; and we provide demand estimates for commercial and industrial consumers in addition to residential consumers. From a methodological standpoint, ours is the first machine learning model that embeds and trains NNs within PGMs employing Markov chain Monte Carlo (MCMC) sampling algorithms for inference. This application serves as an example of the novel combination of these individually important classes of algorithms. Taken together, the methods and studies presented in this dissertation enable the improved deployment of continuous electricity infrastructure planning across all low- and middle-income countries worldwide. We hope the research community continues to catalyze progress towards enabling continuous planning methodologies and to map efficient pathways for achieving our global climate and development goals.

Date issued

2023-09

URI

https://hdl.handle.net/1721.1/152664

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Doctoral Theses