Analysis of Student Pair Teamwork Using GitHub Activities

  • Author(s): Gitinabard, Niki (ORCID 0000-0002-9975-1677); Gao, Zhikai; Heckman, Sarah (ORCID 0000-0003-4351-8611); Barnes, Tiffany; Lynch, Collin F. (ORCID 0000-0001-6958-9368)
  • Language:
    English
  • Source:
    Journal of Educational Data Mining. 2023 15(1):32-62.
  • Publication Date:
    2023
  • Document Type:
    Journal Articles
    Reports - Research
  • Additional Information
    • Availability:
      International Educational Data Mining. e-mail: [email protected]; Web site: https://jedm.educationaldatamining.org/index.php/JEDM
    • Peer Reviewed:
      Y
    • Source:
      31
    • Sponsoring Agency:
      National Science Foundation (NSF)
    • Contract Number:
      1714538
    • Education Level:
      Higher Education
      Postsecondary Education
    • Subject Terms:
    • ISSN:
      2157-2100
    • Abstract:
      Few studies have analyzed students' teamwork (pairwork) habits in programming projects due to the challenges and high cost of analyzing complex, long-term collaborative processes. In this work, we analyze student teamwork data collected from the GitHub platform with the goal of identifying specific pair teamwork styles. This analysis builds on an initial corpus of commit message data that was manually labeled by subject matter experts. We then extend this annotation through the use of self-supervised, semi-supervised learning to develop a large-scale annotated dataset that covers multiple course offerings from a second-semester CS2 course. Further, we develop a series of predictive models to automatically identify student teamwork styles. Finally, we compare trends in students' performance and team selection for each teamwork style to see if any of them reflected better student outcomes or different trends of help-seeking among students. Our analysis showed that applying self-supervised semi-supervised methods helps us to label larger subsets of data automatically and maintains and even sometimes improves the performance of the fully supervised models on a held-out validation set. Our analysis also showed that members of teams in which all members have significant contributions tend to have better performance in class, but their help-seeking behaviors are not significantly different.
    • Abstractor:
      As Provided
    • Accession Number:
      EJ1383369