dc.contributor.author: Singh, Anjali
dc.contributor.author: Fariha, Anna
dc.contributor.author: Brooks, Christopher
dc.contributor.author: Soares, Gustavo
dc.contributor.author: Henley, Austin Z.
dc.contributor.author: Tiwari, Ashish
dc.contributor.author: M, Chethan
dc.contributor.author: Choi, Heeryung
dc.contributor.author: Gulwani, Sumit
dc.date.accessioned: 2024-04-04T16:05:56Z
dc.date.available: 2024-04-04T16:05:56Z
dc.date.issued: 2024-03-07
dc.identifier.isbn: 979-8-4007-0423-9
dc.identifier.uri: https://hdl.handle.net/1721.1/154064
dc.description: SIGCSE 2024, March 20–23, 2024, Portland, OR, USA [en_US]
dc.description.abstract: Data Science (DS) has emerged as a new academic discipline where students are introduced to data-centric thinking and generating data-driven insights through programming. Unlike traditional introductory Computer Science (CS) education, which focuses on program syntax and core CS topics (e.g., algorithms and data structures), introductory DS education emphasizes skills such as analyzing data to gain insights by making effective use of programming libraries (e.g., re, NumPy, pandas, scikit-learn). To better understand learners' needs and pain points when they are introduced to DS programming, we investigated a large online course on data manipulation designed for graduate students who do not have a CS or Statistics undergraduate degree. We qualitatively analyzed students' incorrect code submissions for computational notebook-based assignments in Python. We identified common mistakes and grouped them into the following themes: (1) programming language and environment misconceptions, (2) logical mistakes due to data or problem-statement misunderstanding or incorrectly dealing with missing values, (3) semantic mistakes due to incorrect use of DS libraries, and (4) suboptimal coding. Our work provides instructors insights to understand student needs in introductory DS courses and improve course pedagogy, and recommendations for developing assessment and feedback tools to support students in large courses. [en_US]
dc.publisher: ACM [en_US]
dc.relation.isversionof: 10.1145/3626252.3630884 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: ACM [en_US]
dc.title: Investigating Student Mistakes in Introductory Data Science Programming [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Singh, Anjali, Fariha, Anna, Brooks, Christopher, Soares, Gustavo, Henley, Austin Z. et al. 2024. "Investigating Student Mistakes in Introductory Data Science Programming."
dc.contributor.department: MIT Open Learning
dc.identifier.mitlicense: PUBLISHER_POLICY
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2024-04-01T07:48:15Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2024-04-01T07:48:15Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]