DSpace@MIT

A Trainable System for Object Detection in Images and Video Sequences

Author(s)
Papageorgiou, Constantine P.
Downloads
AITR-1685.ps (69.17 MB)
AITR-1685.pdf (15.17 MB)
Abstract
This thesis presents a general, trainable system for object detection in static images and video sequences. The core system finds a certain class of objects in static images of completely unconstrained, cluttered scenes without using motion, tracking, or handcrafted models, and without making any assumptions about the scene structure or the number of objects in the scene. The system takes as input a training set of positive and negative example images, transforms the pixel images into a Haar wavelet representation, and uses a support vector machine classifier to learn the difference between in-class and out-of-class patterns. To detect objects in out-of-sample images, we perform a brute-force search over all subwindows in the image. The system is applied to face, people, and car detection with excellent results. For our extensions to video sequences, we augment the core static detection system in several ways: (1) extending the representation to five frames, (2) implementing an approximation to a Kalman filter, and (3) modeling detections in an image as a density and propagating this density through time according to measured features. In addition, we present a real-time version of the system that is currently running in a DaimlerChrysler experimental vehicle. As part of this thesis, we also present a system that, instead of detecting full patterns, uses a component-based approach; for people detection, we find it to be more robust than the full-body version to occlusions, rotations in depth, and severe lighting conditions. We also experiment with various other representations, including pixels and principal components, and show results that quantify how the number of features, color, and gray level affect performance.
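The core pipeline the abstract describes (Haar wavelet features scored by a learned classifier over every subwindow) can be sketched as follows. This is an illustrative reconstruction, not the thesis code: the three-orientation wavelet features, the window size, and the pre-trained weight vector `w` and bias `b` are simplified stand-ins for the dense wavelet dictionary and trained SVM used in the actual system.

```python
import numpy as np

def integral_image(img):
    """Summed-area table: ii[y, x] = sum of img[:y, :x]."""
    return np.pad(img, ((1, 0), (1, 0))).cumsum(0).cumsum(1)

def box_sum(ii, y, x, h, w):
    """Sum of pixels in the h x w box whose top-left corner is (y, x)."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_features(ii, y, x, size):
    """Three coarse Haar wavelet responses (vertical, horizontal,
    diagonal) over one size x size subwindow: a tiny stand-in for the
    full wavelet representation described in the abstract."""
    h = size // 2
    vert = box_sum(ii, y, x, size, h) - box_sum(ii, y, x + h, size, h)
    horiz = box_sum(ii, y, x, h, size) - box_sum(ii, y + h, x, h, size)
    diag = (box_sum(ii, y, x, h, h) + box_sum(ii, y + h, x + h, h, h)
            - box_sum(ii, y, x + h, h, h) - box_sum(ii, y + h, x, h, h))
    return np.array([vert, horiz, diag])

def detect(img, w, b, size=16, stride=4, thresh=0.0):
    """Brute-force search: score every subwindow with a linear decision
    function w . f + b (the learned SVM boundary) and keep detections
    above the threshold.  w and b are assumed to come from training."""
    ii = integral_image(img.astype(np.float64))
    hits = []
    for y in range(0, img.shape[0] - size + 1, stride):
        for x in range(0, img.shape[1] - size + 1, stride):
            f = haar_features(ii, y, x, size)
            if w @ f + b > thresh:
                hits.append((y, x))
    return hits
```

In practice the search would also run over multiple scales, and `w`, `b` would be the output of SVM training on the positive and negative example set; both are omitted here to keep the sketch small.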
Date issued
2000-05-01
URI
http://hdl.handle.net/1721.1/5566
Other identifiers
AITR-1685
CBCL-186
Series/Report no.
AITR-1685; CBCL-186
Keywords
AI, MIT, Artificial Intelligence, object detection, pattern recognition, people detection, face detection, car detection

Collections
  • AI Technical Reports (1964 - 2004)
  • CBCL Memos (1993 - 2004)

Related items

Showing items related by title, author, creator and subject.

  • Detect What You Can: Detecting and Representing Objects using Holistic Models and Body Parts

    Chen, Xianjie; Mottaghi, Roozbeh; Liu, Xiaobai; Fidler, Sanja; Urtasun, Raquel; et al. (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-10)
    Detecting objects becomes difficult when we need to deal with large shape deformation, occlusion and low resolution. We propose a novel approach to i) handle large deformations and partial occlusions in animals (as examples ...
