| dc.contributor.author | Singh, Anjali | |
| dc.contributor.author | Fariha, Anna | |
| dc.contributor.author | Brooks, Christopher | |
| dc.contributor.author | Soares, Gustavo | |
| dc.contributor.author | Henley, Austin Z. | |
| dc.contributor.author | Tiwari, Ashish | |
| dc.contributor.author | M, Chethan | |
| dc.contributor.author | Choi, Heeryung | |
| dc.contributor.author | Gulwani, Sumit | |
| dc.date.accessioned | 2024-04-04T16:05:56Z | |
| dc.date.available | 2024-04-04T16:05:56Z | |
| dc.date.issued | 2024-03-07 | |
| dc.identifier.isbn | 979-8-4007-0423-9 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/154064 | |
| dc.description | SIGCSE 2024, March 20–23, 2024, Portland, OR, USA | en_US |
| dc.description.abstract | Data Science (DS) has emerged as a new academic discipline where students are introduced to data-centric thinking and generating data-driven insights through programming. Unlike traditional introductory Computer Science (CS) education, which focuses on program syntax and core CS topics (e.g., algorithms and data structures), introductory DS education emphasizes skills such as analyzing data to gain insights by making effective use of programming libraries (e.g., re, NumPy, pandas, scikit-learn). To better understand learners' needs and pain points when they are introduced to DS programming, we investigated a large online course on data manipulation designed for graduate students who do not have a CS or Statistics undergraduate degree. We qualitatively analyzed students' incorrect code submissions for computational notebook-based assignments in Python. We identified common mistakes and grouped them into the following themes: (1) programming language and environment misconceptions, (2) logical mistakes due to misunderstanding the data or problem statement or incorrectly handling missing values, (3) semantic mistakes due to incorrect use of DS libraries, and (4) suboptimal coding. Our work provides instructors with insights to understand student needs in introductory DS courses and improve course pedagogy, as well as recommendations for developing assessment and feedback tools to support students in large courses. | en_US |
| dc.publisher | ACM | en_US |
| dc.relation.isversionof | 10.1145/3626252.3630884 | en_US |
| dc.rights | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. | en_US |
| dc.source | ACM | en_US |
| dc.title | Investigating Student Mistakes in Introductory Data Science Programming | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Singh, Anjali, Fariha, Anna, Brooks, Christopher, Soares, Gustavo, Henley, Austin Z. et al. 2024. "Investigating Student Mistakes in Introductory Data Science Programming." | |
| dc.contributor.department | MIT Open Learning | |
| dc.identifier.mitlicense | PUBLISHER_POLICY | |
| dc.eprint.version | Final published version | en_US |
| dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
| eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
| dc.date.updated | 2024-04-01T07:48:15Z | |
| dc.language.rfc3066 | en | |
| dc.rights.holder | The author(s) | |
| dspace.date.submission | 2024-04-01T07:48:15Z | |
| mit.license | PUBLISHER_CC | |
| mit.metadata.status | Authority Work and Publication Information Needed | en_US |