Universal artificial intelligence: evaluation and benchmarks

Author(s)

Mishra, Pallavi

Other Contributors

Massachusetts Institute of Technology. Engineering Systems Division.

Advisor

James M. Utterback.

Terms of use

MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

The field of artificial intelligence has struggled since its inception with the fundamental question of what intelligence means and how to measure it. Defining intelligence and measuring it formally are sensitive issues in human culture, both with respect to humans and even more so with respect to machines. Several attempts have been made to generalize the definition of universal intelligence and to derive formal benchmark tests from such definitions. In this thesis, we review the definition of universal intelligence and attempt to aggregate the salient features of the mathematically formalized tests proposed for it. The combined theoretical features for a benchmark are then used to analyze one promising platform, the Arcade Learning Environment (ALE), which integrates Atari 2600 games to test domain-independent artificial agents. We suggest practical ways to incorporate these features into the ALE platform so as to manage the limitations of the computing resources used to generate the required environments for agents. The limitation of resources is not only a practical constraint but also a factor that should be included in defining any practically useful measure of intelligence. We learn from the exercise that defining intelligence by generalizing it is a self-defeating goal, and that intelligence is best defined with respect to the physical, time, and computing-resource constraints under which the agent operates. An agent with unlimited resources can adapt to an infinite set of environments, but there can be no practical implementation of such an agent. Since the physical universe itself has a limited, although large, set of information encoded in the environment, with a possibly finite set of non-repeating states, benchmark tests should account for physical resources as well as physical time in order to be of practical use. This constraint-related view calls for a context-specific measure of intelligence rather than a cumulative total-reward-based measure across a defined set of environments.

Description

Thesis: S.M. in Engineering and Management, Massachusetts Institute of Technology, School of Engineering, System Design and Management Program, Engineering and Management Program, 2016.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 60-64).

Date issued

2016

URI

http://hdl.handle.net/1721.1/107604

Department

Massachusetts Institute of Technology. Engineering and Management Program; System Design and Management Program.

Publisher

Massachusetts Institute of Technology

Keywords

Engineering and Management Program; System Design and Management Program; Engineering Systems Division.

Collections

Graduate Theses