Applications of deep learning and computer vision in large scale quantification of tree canopy cover and real-time estimation of street parking
Author(s): Cai, Bill Yang.
Massachusetts Institute of Technology. Computation for Design and Optimization Program.
A modern city generates a large volume of digital information, especially in the form of unstructured image and video data. Recent advancements in deep learning techniques have enabled effective learning and estimation of high-level attributes and meaningful features from large digital datasets of images and videos. In my thesis, I explore the potential of applying deep learning to image and video data to quantify urban tree cover and street parking utilization.
Large-scale and accurate quantification of urban tree cover is important for informing government agencies in their public greenery efforts, and useful for modelling and analyzing city ecology and urban heat island effects. We apply state-of-the-art deep learning models and compare their performance to a previously established benchmark based on an unsupervised method. Our training procedure for deep learning models is novel: we use the abundance of openly available and similarly labelled street-level image datasets to pre-train our model, then perform additional training on a small dataset of Google Street View (GSV) images. We also employ a recently developed method called gradient-weighted class activation mapping (Grad-CAM) to interpret the features learned by the end-to-end model. The results demonstrate that deep learning models are highly accurate, can be interpretable, and can also be efficient in terms of data-labelling effort and computational resources.
Accurate parking quantification would inform developers and municipalities in space allocation and design, while real-time measurements would provide drivers and parking enforcement with information that saves time and resources. We propose an accurate and real-time video system for future Internet of Things (IoT) and smart cities applications. Using recent developments in deep convolutional neural networks (DCNNs) and a novel intelligent vehicle tracking filter, the proposed system combines information across multiple image frames in a video sequence to remove noise introduced by occlusions and detection failures. We demonstrate that the proposed system achieves higher accuracy than pure image-based instance segmentation, and is comparable in performance to industry benchmark systems that utilize more expensive sensors such as radar. Furthermore, the proposed system can be easily configured for deployment in different parking scenarios, and can provide spatial information beyond traditional binary occupancy statistics.
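The idea of combining detections across frames to suppress occlusion noise can be illustrated with a small sketch. This is not the thesis's vehicle tracking filter, whose details are not given in this record; it is a plain sliding-window majority vote over per-frame detections for a single parking spot. The function name `smooth_occupancy`, the `window` parameter, and the sample data are all hypothetical.

```python
from collections import deque


def smooth_occupancy(frames, window=5):
    """Majority-vote each frame's detection with the frames before it.

    frames: per-frame detections for one spot (truthy = vehicle detected).
    window: number of most recent frames to vote over (an assumed value).
    A tie counts as "not occupied".
    """
    recent = deque(maxlen=window)  # holds only the last `window` detections
    smoothed = []
    for detected in frames:
        recent.append(bool(detected))
        # Occupied only if strictly more than half of the recent frames agree.
        smoothed.append(sum(recent) * 2 > len(recent))
    return smoothed


# Hypothetical detections: a one-frame dropout (index 2), then the car leaves.
raw = [1, 1, 0, 1, 1, 1, 0, 0, 0, 0]
print(smooth_occupancy(raw, window=5))
```

In this sketch the single missed detection is voted away, while the real departure is reported a couple of frames late. A larger window removes more noise at the cost of more lag, which is the basic trade-off any multi-frame filter has to manage.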
Thesis: S.M., Massachusetts Institute of Technology, Computation for Design and Optimization Program, 2018
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 73-77).
Department: Massachusetts Institute of Technology. Computation for Design and Optimization Program
Massachusetts Institute of Technology
Computation for Design and Optimization Program.