Google BigQuery for Education
Author(s)
Lopez, Glenn; Seaton, Daniel T.; Ang, Andrew; Tingley, Dustin; Chuang, Isaac L.
Open Access Policy
Creative Commons Attribution-Noncommercial-Share Alike
Alternative title
Framework for Parsing and Analyzing edX MOOC Data
Abstract
© 2017 ACM. The size and complexity of MOOC data present overwhelming challenges to many institutions. This paper details the functionality of edx2bigquery, an open source Python package developed by Harvard and MIT to ingest and report on hundreds of MITx and HarvardX course datasets from edX, making use of Google BigQuery to handle multiple terabytes of learner data. For this application, we find that Google BigQuery provides ease of use in loading the multi-faceted MOOC datasets and near real-time interactive querying of data, including large clickstream datasets; moreover, we are able to provide flexible research and reporting dashboards, visualizing and aggregating data, by interfacing services associated with BigQuery. This framework makes it feasible for edx2bigquery to be open source, following standards which emphasize the importance of data products that transcend a particular data science platform and allow teams with diverse backgrounds to interact with data. edx2bigquery is being adopted by other institutions with an aim toward future collaboration.
Date issued
2017-04
Department
MIT-IBM Watson AI Lab
Publisher
ACM
Citation
Lopez, Glenn, Seaton, Daniel T., Ang, Andrew, Tingley, Dustin and Chuang, Isaac. 2017. "Google BigQuery for Education."
Version: Author's final manuscript