Universal artificial intelligence: evaluation and benchmarks
Author(s)
Mishra, Pallavi
Other Contributors
Massachusetts Institute of Technology. Engineering Systems Division.
Advisor
James M. Utterback.
Abstract
The field of artificial intelligence has struggled since its inception with the fundamental question of what intelligence means and how to measure it. Defining intelligence and formally measuring it are sensitive issues in human culture, both with respect to humans and, even more so, with respect to machines. Several attempts have been made to generalize the definition of universal intelligence and to derive formal benchmark tests from such definitions. In this thesis, we review the definition of universal intelligence and attempt to aggregate the salient features of the mathematically formalized tests proposed for it. The combined theoretical features for a benchmark are then used to analyze one promising platform, the Arcade Learning Environment (ALE), which integrates Atari 2600 games to test domain-independent artificial agents. We suggest practical ways to incorporate these features into the ALE platform to manage the limitations of the computing resources used to generate the required environments for agents. The limitation of resources is not only a practical constraint but also a factor that should be included in defining any practically useful measure of intelligence. We learn from the exercise that defining intelligence by generalizing it is a self-defeating goal and that intelligence is best defined with respect to the physical, time, and computing-resource constraints under which the agent operates. An agent with unlimited resources can adapt to an infinite set of environments, but there can be no practical implementation of such an agent. Since the physical universe itself has a limited, although large, set of information encoded in the environment, with a possibly finite set of non-repeating states, benchmark tests should account for physical resources as well as physical time in order to be of practical use.
This constraint-related view calls for a context-specific measure of intelligence rather than a cumulative, total-reward-based measure across a defined set of environments.
Description
Thesis: S.M. in Engineering and Management, Massachusetts Institute of Technology, School of Engineering, System Design and Management Program, Engineering and Management Program, 2016. Cataloged from PDF version of thesis. Includes bibliographical references (pages 60-64).
Date issued
2016
Department
Massachusetts Institute of Technology. Engineering and Management Program; System Design and Management Program.
Publisher
Massachusetts Institute of Technology
Keywords
Engineering and Management Program., System Design and Management Program., Engineering Systems Division.