Applications of deep learning and computer vision in large scale quantification of tree canopy cover and real-time estimation of street parking

Cai, Bill Yang.

Author(s)

Cai, Bill Yang.

Other Contributors

Massachusetts Institute of Technology. Computation for Design and Optimization Program.

Advisor

Carlo Ratti.

Terms of use

MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

A modern city generates a large volume of digital information, especially in the form of unstructured image and video data. Recent advances in deep learning have enabled effective learning and estimation of high-level attributes and meaningful features from large digital datasets of images and videos. In this thesis, I explore the potential of applying deep learning to image and video data to quantify urban tree cover and street parking utilization.

Large-scale, accurate quantification of urban tree cover is important for informing government agencies in their public greenery efforts, and is useful for modelling and analyzing city ecology and urban heat island effects. We apply state-of-the-art deep learning models and compare their performance against a previously established benchmark based on an unsupervised method.
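As a rough illustration of what quantifying tree cover from a street-level image can mean in practice, the sketch below computes the fraction of pixels labelled as vegetation in a segmentation mask, and the mean absolute error between predicted and hand-labelled fractions. It assumes the model output has already been reduced to a per-pixel label grid; the class id `VEGETATION` and both function names are hypothetical and are not taken from the thesis.

```python
from typing import Sequence

# Hypothetical class id for vegetation pixels; real datasets define their own.
VEGETATION = 1


def vegetation_fraction(mask: Sequence[Sequence[int]],
                        vegetation_label: int = VEGETATION) -> float:
    """Return the share of pixels in `mask` labelled as vegetation."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    green = sum(1 for row in mask for label in row if label == vegetation_label)
    return green / total


def mean_absolute_error(predicted: Sequence[float],
                        reference: Sequence[float]) -> float:
    """Average absolute gap between predicted and reference fractions."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Toy 2x4 mask: three of eight pixels are vegetation.
toy_mask = [[1, 1, 0, 0],
            [1, 0, 0, 0]]
toy_fraction = vegetation_fraction(toy_mask)
```

An error measure of this kind is one simple way to compare a learned model against a benchmark method on the same labelled images; the evaluation metrics actually used are described in the thesis itself.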

Our training procedure for deep learning models is novel: we use the abundance of openly available, similarly labelled street-level image datasets to pre-train our model, and then perform additional training on a small dataset of Google Street View (GSV) images. We also employ a recently developed method, gradient-weighted class activation mapping (Grad-CAM), to interpret the features learned by the end-to-end model. The results demonstrate that deep learning models are highly accurate, can be interpretable, and can be efficient in terms of data-labelling effort and computational resources.

Accurate parking quantification would inform developers and municipalities in space allocation and design, while real-time measurements would provide drivers and parking enforcement with information that saves time and resources. We propose an accurate, real-time video system for future Internet of Things (IoT) and smart-city applications. Using recent developments in deep convolutional neural networks (DCNNs) and a novel intelligent vehicle tracking filter, the proposed system combines information across multiple image frames in a video sequence to remove noise introduced by occlusions and detection failures. We demonstrate that the proposed system achieves higher accuracy than pure image-based instance segmentation, and is comparable in performance to industry benchmark systems that rely on more expensive sensors such as radar. Furthermore, the proposed system can be easily configured for deployment in different parking scenarios, and can provide spatial information beyond traditional binary occupancy statistics.
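To make the idea of combining information across frames concrete, the sketch below smooths a stream of per-frame detections for a single parking spot with a sliding-window majority vote, so that a brief occlusion or a missed detection does not flip the reported occupancy. This is a deliberately simple stand-in and not the tracking filter developed in the thesis; the function name `smooth_occupancy` and the `window` parameter are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Iterable, List


def smooth_occupancy(detections: Iterable[bool], window: int = 5) -> List[bool]:
    """Majority vote over the most recent `window` per-frame detections.

    A spot is reported as occupied only when strictly more than half of
    the frames currently in the window contain a detection.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent: Deque[bool] = deque(maxlen=window)
    smoothed: List[bool] = []
    for detected in detections:
        recent.append(detected)
        smoothed.append(sum(recent) * 2 > len(recent))
    return smoothed


# A parked car is missed in one frame (for example, a passing truck hides it).
raw = [True, True, False, True, True]
stable = smooth_occupancy(raw, window=3)
```

A larger window suppresses more noise but delays the response to a car actually arriving or leaving, which is the basic trade-off any multi-frame filter has to balance.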

Description

Thesis: S.M., Massachusetts Institute of Technology, Computation for Design and Optimization Program, 2018

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 73-77).

Date issued

2018

URI

https://hdl.handle.net/1721.1/122317

Department

Massachusetts Institute of Technology. Computation for Design and Optimization Program

Publisher

Massachusetts Institute of Technology

Keywords

Computation for Design and Optimization Program.

Collections

Graduate Theses