Deep learning with physical and power-spectral priors for robust image inversion

Deng, Mo, Ph. D., Massachusetts Institute of Technology.

Download: 1191624347-MIT.pdf (28.25Mb)

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

George Barbastathis.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Computational imaging is the class of imaging systems that use inverse algorithms to recover unknown objects of interest from physical measurements. Deep learning has been used in computational imaging, typically in a supervised mode and in an End-to-End fashion. However, treating the machine learning algorithm as a mere black box is inefficient, because the measurement formation process (a.k.a. the forward operator), which depends on the optical apparatus, is known to us. It is therefore wasteful to make the neural network learn to explain, even partly, the system physics. In addition, prior knowledge of the class of objects of interest can be leveraged to make training more efficient. The main theme of this thesis is to design more efficient deep learning algorithms with the help of physical and power-spectral priors.

We first propose the learning to synthesize by DNN (LS-DNN) scheme, a dual-channel DNN architecture in which one channel is dedicated to the low-frequency band and the other to the high-frequency band. The scheme splits the two bands, processes them separately, and then learns to recombine them for better inversion. Results show that the LS-DNN scheme substantially improves reconstruction quality in many applications, especially in the most severely ill-posed cases. In this application, we have implicitly incorporated the system physics through data pre-processing, and the power-spectral prior through the design of the band-splitting configuration. We then propose to use the Phase Extraction Neural Network (PhENN) trained with a perceptual loss, which is based on feature maps extracted from pre-trained classification neural networks, to tackle the problem of phase retrieval under low-light conditions.
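The band-splitting idea behind LS-DNN can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function name `split_bands`, the hard radial mask, and the `cutoff` value are assumptions chosen for illustration, and the learned recombination stage is replaced here by a plain sum that only checks the two bands are complementary.

```python
import numpy as np

def split_bands(image, cutoff=0.1):
    """Split a 2D image into low- and high-frequency parts.

    `cutoff` is a radial frequency in cycles per pixel. Its value is a
    hypothetical choice for this sketch, not a setting from the thesis.
    """
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    low_mask = np.sqrt(fx**2 + fy**2) <= cutoff   # True inside the low band
    spectrum = np.fft.fft2(image)
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high

rng = np.random.default_rng(0)
image = rng.random((64, 64))
low, high = split_bands(image)

# In LS-DNN each band would be processed by its own network and a trained
# synthesizer would recombine them; here we only confirm that the two
# bands add back up to the input.
recombined = low + high
```

Because the two masks are complementary, summing the bands reproduces the input; a learned synthesizer can instead weight the bands unevenly, which is where the power-spectral prior would enter.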

Training with the perceptual loss essentially transfers the knowledge, or features relevant to classification and thus corresponding to human perceptual quality, to the image-transformation network (such as PhENN). We find that the commonly defined perceptual loss needs to be refined for low-light applications, to avoid strengthened "grid-like" artifacts and achieve superior reconstruction quality. Moreover, we investigate empirically the interplay between the physical and content priors in using deep learning for computational imaging. More specifically, we investigate the effect of the training examples on the learning of the underlying physical map. We find that training datasets with higher Shannon entropy better guide the training to correspond to the system physics, so that the trained model generalizes better to test examples disjoint from the training set.
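One common way to quantify the Shannon entropy of an image is through its gray-level histogram. The sketch below assumes that definition, along with 256 gray levels and pixel values in [0, 1]; the abstract does not specify how entropy is computed in the thesis, so the function `shannon_entropy` and the two synthetic images are purely illustrative.

```python
import numpy as np

def shannon_entropy(image, levels=256):
    """Entropy, in bits, of the gray-level histogram of an image in [0, 1]."""
    counts, _ = np.histogram(image, bins=levels, range=(0.0, 1.0))
    p = counts[counts > 0] / counts.sum()   # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
# Pixel values spread over all gray levels (a stand-in for content-rich images).
rich = rng.random((128, 128))
# Mostly-black binary image (a stand-in for a restricted image class).
sparse = (rng.random((128, 128)) > 0.9).astype(float)

h_rich = shannon_entropy(rich)
h_sparse = shannon_entropy(sparse)
```

Under this definition the image with a broad gray-level distribution has the higher entropy, while the binary image is bounded by one bit per pixel.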

Conversely, if more restricted examples are used for training, the training can be guided to undesirably "remember" to produce outputs similar to those seen in training, making cross-domain generalization problematic. Next, we propose to use deep learning to greatly accelerate the optical diffraction tomography algorithm. Unlike previous algorithms that involve iterative optimization, we present significant progress towards reconstructing 3D refractive index (RI) maps from a single-shot angle-multiplexing interferogram. Last but not least, we propose to use cascaded neural networks to incorporate the system physics directly into the machine learning algorithms, while leaving the trainable architectures to learn to function as the ideal proximal mapping associated with the efficient regularization of the data. We show that this unrolled scheme significantly outperforms the End-to-End scheme in low-light imaging applications.
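The structure of such an unrolled scheme can be sketched as a fixed number of stages, each applying a physics-based gradient step followed by a proximal step. The sketch below is an assumption-laden illustration, not the thesis method: it uses a random linear forward operator `A`, and it substitutes plain soft-thresholding (the classical ISTA proximal operator for an L1 penalty) where the thesis would place a trainable network. All names and parameter values are hypothetical.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1; a stand-in for a learned block."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def unrolled_reconstruction(A, y, n_stages=200, tau=0.01, prox=soft_threshold):
    """Run `n_stages` cascaded stages of gradient step plus proximal step."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_stages):
        # Physics step: gradient of 0.5 * ||A x - y||^2 with the known operator A.
        x = x - step * A.T @ (A @ x - y)
        # Prior step: in an unrolled network this would be a trainable module.
        x = prox(x, step * tau)
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 100)) / np.sqrt(40)   # hypothetical forward operator
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [1.0, -0.7, 0.5]         # sparse test object
y = A @ x_true                                 # noiseless measurement
x_hat = unrolled_reconstruction(A, y)
```

Keeping the forward operator explicit in every stage means the trainable part only has to model the prior, which is the division of labor the abstract describes.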

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May, 2020

Cataloged from the official PDF of thesis.

Includes bibliographical references (pages 169-182).

Date issued

2020

URI

https://hdl.handle.net/1721.1/127013

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses