Statistical approaches to leak detection for geological sequestration
Author(s)
Haidari, Arman S
Other Contributors
Massachusetts Institute of Technology. Dept. of Chemical Engineering.
Advisor
Gregory J. McRae.
Abstract
Geological sequestration has been proposed as a way to remove CO₂ from the atmosphere by injecting it into deep saline aquifers. Detecting leaks to the atmosphere will be important for ensuring the safety and effectiveness of storage, yet a standard set of monitoring tools does not yet exist. The basic problem for leak detection - and eventually for the inverse problem of determining where a leak is and how big it is, given measurements - is to detect shifts in the mean of atmospheric CO₂ data. Because the data are uncertain, statistical approaches are necessary. The traditional way to detect a shift would be to apply a hypothesis test, such as a Z- or t-test, directly to the data. These methods implicitly assume the data are Gaussian and independent. Analysis of atmospheric CO₂ data suggests these assumptions are often poor: the data are highly variable, non-Gaussian, and exhibit obvious systematic trends. Simple Z- or t-tests will therefore produce higher false positive rates than the operator desires, so Bayesian methods and methods for handling autocorrelation are needed to control false positives.

A model-based framework for shift detection is introduced that can cope with non-Gaussian data and autocorrelation. Given baseline data, the framework estimates parameters and chooses the best model. When new data arrive, they are compared with forecasts from the baseline model and testing is performed to determine whether a shift is present. The key questions are how to estimate parameters, which model to use for detrending, and how to test for shifts. The framework is applied to atmospheric CO₂ data from three existing monitoring sites: Mauna Loa Observatory in Hawaii, Harvard Forest in central Massachusetts, and a site from the Salt Lake CO₂ Network in Utah. These sites were chosen to represent a spectrum of possible monitoring scenarios. The data exhibit obvious trends, including interannual growth and seasonal cycles.

Several physical models are proposed for capturing interannual and seasonal trends in atmospheric CO₂ data. The simplest correlates increases in atmospheric CO₂ with global annual emissions of CO₂ from fossil fuel combustion. Solar radiation and leaf area index models are proposed as alternative ways to explain seasonality in the data. Quantitative normality tests reject normality of the CO₂ data, and the proposed seasonal models are nonlinear. A simple reaction kinetics example demonstrates that nonlinearity in the detrending model can lead to non-Gaussian posterior distributions, so Bayesian estimation methods will be necessary; here, nonlinear least squares is used to reduce computational effort. A Bayesian model selection method, the deviance information criterion (DIC), is introduced as a way to avoid overfitting. DIC is used to choose between the proposed models, and a model combining a straight line for emissions-driven growth, the solar radiation model, and a 6-month harmonic term is found to explain the data best. Improving the model has two important consequences: reduced variability in the residuals and reduced autocorrelation. Variability in the residuals translates into uncertainty in CO₂ forecasts, so by reducing the spread of the residuals, improving the model increases the signal-to-noise ratio and improves the ability to detect shifts.
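The emissions-driven, solar radiation, and leaf area index models from the thesis are not reproduced here. As a minimal sketch of the baseline-and-forecast workflow described above, the following assumes a simplified baseline of a straight-line growth term plus annual and 6-month harmonics, fit by ordinary least squares, with a crude Z-test on forecast residuals standing in for the thesis's Bayesian shift tests. The function names (design_matrix, fit_baseline, detect_shift) are illustrative, not from the thesis.

```python
# Minimal sketch (not the thesis code): detrend monthly CO2 with a straight-line
# growth term plus annual and 6-month harmonics, then compare new data with
# baseline forecasts to look for a shift in the mean.
import numpy as np

def design_matrix(t):
    """t in decimal years; columns: intercept, trend, annual and semiannual harmonics."""
    return np.column_stack([
        np.ones_like(t),
        t,
        np.cos(2 * np.pi * t), np.sin(2 * np.pi * t),   # 12-month cycle
        np.cos(4 * np.pi * t), np.sin(4 * np.pi * t),   # 6-month cycle
    ])

def fit_baseline(t, co2):
    """Ordinary least squares fit of the baseline model; returns coefficients and residuals."""
    X = design_matrix(t)
    beta, *_ = np.linalg.lstsq(X, co2, rcond=None)
    resid = co2 - X @ beta
    return beta, resid

def detect_shift(t_new, co2_new, beta, resid):
    """Crude Z statistic for the mean departure of new data from baseline forecasts.
    Assumes roughly Gaussian, independent residuals, which the thesis shows is often
    a poor assumption for raw CO2 data (hence the Bayesian methods it develops)."""
    forecast = design_matrix(t_new) @ beta
    d = co2_new - forecast
    return d.mean() / (resid.std(ddof=1) / np.sqrt(len(d)))
```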
A least squares example using CO₂ data from Mauna Loa illustrates the effect of autocorrelation, caused by systematic seasonal variability, on the ability to detect shifts. The issue is that ordinary least squares tends to underestimate uncertainty when data are serially correlated, implying high false positive rates. Improving the model reduces autocorrelation in the residuals by eliminating systematic trends. Because the data contain gaps, Lomb periodograms are used to test the residuals for systematic signals. The model chosen by DIC removes all of the growth and seasonal trends originally present at the 5% level of significance. Thus improving the model is a way to reduce the effect of autocorrelation on false positives.

A key issue for future monitoring sites will be demonstrating the ability to detect shifts in the absence of leaks. The urban weekend-weekday effect on atmospheric CO₂ is introduced to illustrate how this might be done. A seasonal detrending model is used to remove systematic trends in data at Mauna Loa, Harvard Forest, and Salt Lake. Residuals indicate positive shifts at the latter two sites, as expected, with the magnitude of the shift larger at the urban site than at the rural one (~8 ppm versus ~1 ppm). Normality tests indicate the residuals are non-Gaussian, so a Bayesian method based on Bayes factors is proposed for determining the amount of data needed to detect shifts in non-Gaussian data. The method is demonstrated on the Harvard Forest and Salt Lake CO₂ data. The results are sensitive to the form of the error distribution; empirical distributions should be used to avoid false positives. The weekend-weekday shift in CO₂ is detectable in 48-120 samples at the urban site; more samples are required at the rural one. Finally, back-of-the-envelope calculations suggest the weekend-weekday shift in emissions detected in Salt Lake is ~O(0.01) MtCO₂ km⁻² yr⁻¹, equivalent to 1% of 1 MtCO₂ stored belowground leaking over an area of 1 km².

The framework developed in this thesis can be used to detect shifts in atmospheric CO₂ (or other types of) data once data are available. Further research is needed to address questions about what data to collect: for example, what sensors should be used, where should they be located, and how frequently should they be sampled? Optimal monitoring network design at a given location will require balancing the need to gather more information (for example, by adding sensors) against operational constraints including cost, safety, and regulatory requirements.
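Because the records contain gaps, the residual check described above relies on Lomb periodograms rather than ordinary FFT-based spectra. A minimal sketch of that check, assuming detrended residuals at irregularly spaced decimal-year times and using scipy.signal.lombscargle (the thesis's exact implementation and its 5% significance test may differ), might look like:

```python
# Minimal sketch (not the thesis code): scan detrended residuals for leftover
# seasonal signals with a Lomb-Scargle periodogram, which handles gappy sampling.
import numpy as np
from scipy.signal import lombscargle

def residual_periodogram(t, resid, periods_years=np.linspace(0.2, 2.0, 400)):
    """t: irregular sample times in decimal years; resid: detrended CO2 residuals.
    Returns the angular-frequency grid and normalized Lomb-Scargle power; a strong
    peak near 1.0 or 0.5 yr would indicate seasonal structure the model missed."""
    omega = 2.0 * np.pi / periods_years          # angular frequencies to scan
    power = lombscargle(t, resid - resid.mean(), omega, normalize=True)
    return omega, power
```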
Description
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Chemical Engineering, June 2011. "April 2011." Cataloged from PDF version of thesis. Includes bibliographical references (p. 181-189).
Date issued
2011
Department
Massachusetts Institute of Technology. Department of Chemical Engineering
Publisher
Massachusetts Institute of Technology
Keywords
Chemical Engineering.