An apparatus enabling easier and more automated dietary pattern tracking

Wang, Guolong, S.M. Massachusetts Institute of Technology

Other Contributors

Program in Media Arts and Sciences (Massachusetts Institute of Technology)

Advisor

Deb Roy.

Terms of use

MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Nutritional assessment is an important problem for every American. Studies suggest that as many as 90% of Americans fall short on vitamins D and E as a result of their regular dietary habits, and up to 50% do not get enough vitamin A and calcium. On the other hand, 68.8% of American adults over 20 were considered overweight or obese (BMI over 25), with excessive consumption of added sugars, fats, and carbohydrates being a key factor. Two challenges, if solved, may help many ordinary Americans manage their diets healthily. The first is recording dietary intake, so that there is sufficient information about an individual's dietary pattern; the second is interpreting nutritional profiles from the foods people eat. Only after these two steps can nutritional intake be inferred and insights into dietary balance be gained.

This thesis focuses on the first challenge: enabling more convenient tracking of dietary patterns, supported by automatic image recognition. Our goal was to provide an improved alternative to current mainstream methods of keeping dietary records, such as written records for clinical studies or text-input-based digital trackers such as MyFitnessPal. Both methods are tiresome, and we saw an opportunity to use computer vision to automate the recognition of what a user is eating, reducing the need for manual input and making the process easier. In practice, we implemented an image classifier based on the Inception architecture of GoogLeNet and trained it on the Food-101 dataset. The classifier achieved around 87% top-5 accuracy on the validation set. We then deployed our image recognition apparatus as a mobile application to examine its performance in an in-field setting with actual consumer eating patterns.

The overall in-field recognition accuracy was around 28% (top-5). However, only 30% of the meals we observed belonged to the 101 classes the classifier had been trained to recognize; for meals that did belong to those classes, in-field accuracy was around 92%. Furthermore, in subjective user surveys, 67% of users preferred our computer-vision-based apparatus to existing text-input-based digital trackers such as MyFitnessPal, and 22% were neutral. We therefore believe this approach to diet tracking is a promising one to explore: the low in-field recognition performance appears to be caused mainly by a lack of coverage in the training data, and if we can curate a training set that captures the visual food domain appropriately, the approach can yield high in-field accuracy and provide a tangibly more convenient tool for users to log and track their diets.
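The in-field figures quoted above are consistent with one another under a simplifying assumption that the abstract does not state: that a meal outside the 101 trained classes is never recognized correctly. The following is a minimal sketch of that arithmetic, not code from the thesis; the function name `overall_accuracy` and its parameters are hypothetical.

```python
def overall_accuracy(coverage: float,
                     in_class_accuracy: float,
                     out_of_class_accuracy: float = 0.0) -> float:
    """Combine per-group accuracies, weighted by how often each group occurs.

    coverage: fraction of observed meals that belong to a trained class.
    in_class_accuracy: top-5 accuracy on meals that belong to a trained class.
    out_of_class_accuracy: accuracy on all other meals. Defaults to 0.0,
        which is an assumption made for this sketch.
    """
    return (coverage * in_class_accuracy
            + (1.0 - coverage) * out_of_class_accuracy)


# Approximate figures reported in the abstract.
coverage = 0.30        # share of meals within the 101 trained classes
in_class_top5 = 0.92   # in-field top-5 accuracy on those meals

estimate = overall_accuracy(coverage, in_class_top5)
print(f"Estimated overall in-field top-5 accuracy: {estimate:.1%}")
```

Under that assumption the estimate rounds to the reported overall figure of around 28%, which supports the reading that coverage of the training data, rather than classifier quality, limits in-field performance.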

Description

Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2017.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 44-47).

Date issued

2017

URI

http://hdl.handle.net/1721.1/112562

Department

Program in Media Arts and Sciences (Massachusetts Institute of Technology)

Publisher

Massachusetts Institute of Technology

Keywords

Program in Media Arts and Sciences.

Collections

Graduate Theses