An apparatus enabling easier and more automated dietary pattern tracking
Author(s): Wang, Guolong, S.M. Massachusetts Institute of Technology
Program in Media Arts and Sciences (Massachusetts Institute of Technology)
Nutritional assessment is an important problem for every American. Studies suggest that as many as 90% of Americans fall short of Vitamins D and E as a result of their regular dietary habits, and up to 50% do not get enough Vitamin A and calcium. On the other hand, 68.8% of American adults over 20 were considered overweight or obese (BMI over 25), with excessive consumption of added sugars, fats, and carbohydrates being a key factor. Two challenges, if solved, may help many ordinary Americans manage their diets healthily. The first is recording dietary intake so that sufficient information about an individual's dietary pattern is available; the second is interpreting nutritional profiles from the foods people eat. Only after these two steps can nutritional intake be inferred and insights into dietary balance be gained. This thesis focuses on the first challenge: enabling more convenient tracking of dietary patterns, supported by automatic image recognition. Our goal was to provide an improved alternative to current mainstream methods of keeping dietary records, such as written records for clinical studies or text-input-based digital trackers such as MyFitnessPal. Both methods are tiresome, and we saw an opportunity to use computer vision to automate recognition of what a user is eating, reducing the need for manual input and making the process easier. In practice, we implemented an image classifier based on the Inception architecture of GoogLeNet and trained it on the Food-101 dataset. The classifier achieved around 87% top-5 accuracy on the validation set. We then deployed our image recognition apparatus as a mobile application to examine its performance in an in-field setting with actual consumer eating patterns.
The overall in-field recognition performance was around 28% (top 5). However, only 30% of the meals we observed were of foods belonging to the 101 classes the classifier had been trained to recognize; for meals that did belong to those classes, the in-field recognition accuracy was around 92%. Furthermore, in subjective user surveys, 67% of users preferred our computer-vision-based apparatus to existing text-input-based digital trackers like MyFitnessPal, with 22% being neutral. We therefore believe this approach to diet tracking is a promising one to explore in the future: the low in-field recognition performance appears to be caused mainly by a lack of coverage in the training data, and if we can curate a training set that captures the visual food domain appropriately, this approach can yield high in-field results and provide a tangibly more convenient tool for users to log and track their diets.
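The three in-field figures in the abstract (about 28% overall, 30% coverage, about 92% on covered foods) are related by simple arithmetic, which the following minimal sketch illustrates. It assumes, hypothetically, that each logged meal is stored as a true label plus a ranked list of predicted labels, and that the classifier can only output labels from its trained classes. The function name, record format, and sample data are illustrative and are not taken from the thesis.

```python
def in_field_summary(records, trained_classes, k=5):
    """Summarise top-k accuracy overall and on meals within the trained classes.

    records: list of (true_label, ranked_predictions) pairs (hypothetical format).
    trained_classes: set of labels the classifier was trained to recognize.
    """
    total = len(records)
    # Meals whose true food is one of the trained classes ("covered" meals).
    covered = [(t, p) for t, p in records if t in trained_classes]
    # A meal counts as a hit if its true label is among the first k predictions.
    hits_all = sum(1 for t, p in records if t in p[:k])
    hits_cov = sum(1 for t, p in covered if t in p[:k])
    return {
        "overall": hits_all / total if total else 0.0,
        "coverage": len(covered) / total if total else 0.0,
        "in_coverage": hits_cov / len(covered) if covered else 0.0,
    }


# Toy data (made up for illustration): two covered meals, two uncovered meals.
trained = {"pizza", "sushi", "ramen"}
meals = [
    ("pizza", ["pizza", "sushi", "ramen"]),
    ("sushi", ["ramen", "pizza", "sushi"]),
    ("dal", ["ramen", "pizza", "sushi"]),
    ("pho", ["ramen", "sushi", "pizza"]),
]
summary = in_field_summary(meals, trained, k=2)

# Uncovered meals can never be hits, so overall = coverage * in_coverage.
# Applying the same identity to the rounded figures in the abstract:
estimated_in_coverage = 0.28 / 0.30  # roughly 0.93
```

Because uncovered meals can never be recognized, overall accuracy equals coverage multiplied by accuracy on covered meals. Dividing the rounded abstract figures gives roughly 0.93, which is in line with the reported value of around 92% given that all three numbers are approximate.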
Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2017. Cataloged from PDF version of thesis. Includes bibliographical references (pages 44-47).
Department: Program in Media Arts and Sciences (Massachusetts Institute of Technology)
Massachusetts Institute of Technology