Differentiable visual computing

Lo, Tzu-Mao

Author(s)

Lo, Tzu-Mao

Download1122780012-MIT.pdf (40.84Mb)

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Frédo Durand.

Terms of use

MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582

Metadata

Show full item record

Abstract

Derivatives of computer graphics, image processing, and deep learning algorithms have tremendous use in guiding parameter space searches, or solving inverse problems. As the algorithms become more sophisticated, we no longer only need to differentiate simple mathematical functions, but have to deal with general programs which encode complex transformations of data. This dissertation introduces three tools, for addressing the challenges that arise when obtaining and applying the derivatives for complex graphics algorithms. Traditionally, practitioners have been constrained to composing programs with a limited set of coarse-grained operators, or hand-deriving derivatives. We extend the image processing language Halide with reverse-mode automatic differentiation, and the ability to automatically optimize the gradient computations. This enables automatic generation of the gradients of arbitrary Halide programs, at high performance, with little programmer effort.

We demonstrate several applications, including how our system enables quality improvements of even traditional, feed-forward image processing algorithms, blurring the distinction between classical and deep learning methods. In 3D rendering, the gradient is required with respect to variables such as camera parameters, light sources, geometry, and appearance. However, computing the gradient is challenging because the rendering integral includes visibility terms that are not differentiable. We introduce, to our knowledge, the first general-purpose differentiable ray tracer that solves the full rendering equation, while correctly taking the geometric discontinuities into account. We show prototype applications in inverse rendering and the generation of adversarial examples for neural networks. Finally, we demonstrate that the derivatives of light path throughput, especially the second-order ones, can also be useful for guiding sampling in forward rendering.

Simulating light transport in the presence of multi-bounce glossy effects and motion in 3D rendering is challenging due to the high-dimensional integrand and narrow high-contribution areas. We extend the Metropolis Light Transport algorithm by adapting to the local shape of the integrand, thereby increasing sampling efficiency. In particular, the Hessian is able to capture the strong anisotropy of the integrand. We use ideas from Hamiltonian Monte Carlo and simulate physics in Taylor expansion to draw samples from high-contribution region.

Description

This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019

Cataloged from student-submitted PDF version of thesis.

Includes bibliographical references (pages 131-148).

Date issued

2019

URI

https://hdl.handle.net/1721.1/122486

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses