Measuring creativity in computer programming: A code distance approach

Open Access

Chou, Elijah (Spring 2023)

Permanent URL: https://etd.library.emory.edu/concern/etds/b8515p78f?locale=en
Published

Abstract

We propose a novel approach to measuring student creativity in computer programming. We collected a set of Java programming problems together with the solutions submitted by multiple students. We parsed each student's code into an abstract syntax tree and calculated the pairwise distances between code submissions within each problem group using a tree edit distance algorithm. We estimated each student's creativity as the normalized average distance between their code and the other students' code. Pearson correlation analysis revealed a negative correlation between students' coding performance (i.e., the degree of correctness of their code) and their programming creativity in some circumstances. Further analysis comparing state (features of the problem set) and trait (features of the students) explanations for this measure revealed a correlation with trait and no correlation with state. This suggests that our proposed measure likely captures a specific trait of the student, possibly originality, rather than some coincidental feature of our problem set. We also examined the validity of the measure by observing how often human graders agree with it when ranking the originality of pairs of programs. The measure achieved moderate agreement with the majority vote of human graders in ranking creativity. The Pearson correlation and state vs. trait analyses were repeated on student code written in Python, and similar findings were observed in that dataset.
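The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes Python's standard `ast` module as the parser, a textbook unit-cost ordered tree edit distance computed by memoized forest recursion, and normalization by the largest average distance in the problem group. The function names (`to_tree`, `tree_edit_distance`, `creativity_scores`) and the normalization choice are hypothetical and chosen only for illustration.

```python
import ast
from functools import lru_cache


def to_tree(node):
    """Convert an ast node into a hashable (label, children) tuple."""
    children = tuple(to_tree(child) for child in ast.iter_child_nodes(node))
    return (type(node).__name__, children)


def size(tree):
    """Number of nodes in a (label, children) tree."""
    return 1 + sum(size(child) for child in tree[1])


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f and not g:
        return 0
    if not f:
        return sum(size(t) for t in g)  # insert every node of g
    if not g:
        return sum(size(t) for t in f)  # delete every node of f
    (v_label, v_kids), (w_label, w_kids) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + v_kids, g) + 1
    # Insert the rightmost root of g; its children take its place.
    insert = forest_distance(f, g[:-1] + w_kids) + 1
    # Match the two rightmost roots, relabeling if their labels differ.
    match = (forest_distance(v_kids, w_kids)
             + forest_distance(f[:-1], g[:-1])
             + (v_label != w_label))
    return min(delete, insert, match)


def tree_edit_distance(a, b):
    """Edit distance between two single trees."""
    return forest_distance((a,), (b,))


def creativity_scores(sources):
    """Map each student to a normalized average distance from the others.

    `sources` maps a student name to source code for ONE problem and must
    contain at least two students. Dividing by the group maximum is an
    assumption of this sketch, not the thesis's stated normalization.
    """
    trees = {name: to_tree(ast.parse(src)) for name, src in sources.items()}
    average = {}
    for name, tree in trees.items():
        distances = [tree_edit_distance(tree, other)
                     for other_name, other in trees.items()
                     if other_name != name]
        average[name] = sum(distances) / len(distances)
    top = max(average.values()) or 1  # avoid dividing by zero
    return {name: value / top for name, value in average.items()}
```

Called on a dictionary of submissions to one problem, `creativity_scores` returns a value in [0, 1] per student, with the submission that differs most from the rest scoring 1. The memoized recursion is adequate for short programs only; larger inputs would call for a dedicated tree edit distance algorithm.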

Table of Contents

1 Introduction

2 Background

2.1 Computational Thinking

2.2 Creativity

2.3 Creativity and Computational Thinking

2.4 Calculating Creativity in Programming

2.5 Tree Edit Distance

3 Approach

3.1 Proposed Method and Definitions

3.2 Creativity Measure Standardization

3.3 State or Trait Analysis

3.4 Creativity Measure Validation

3.5 Creativity Measure Stability

4 Experiments

4.1 Dataset

4.1.1 Primary Dataset: Java

4.1.2 Java Sample Coding Question and Student Programs

4.1.3 Secondary Dataset: Python

4.1.4 Python Sample Coding Question and Student Programs

4.2 Preprocessing

4.3 Tree Edit Distance Calculations

4.4 Aggregate by Student

4.5 Examining Data Distributions

4.5.1 Measuring Clustering Tendency of Data

4.6 Length vs. Uniqueness of Code

4.7 State or Trait Analysis

4.8 Validation of Student Creativity Measure

4.9 Proposed Creativity Measure for Python Programs

5 Results

5.1 Student Creativity Measure vs. Programming Performance

5.1.1 Evaluating Pearson Correlation by Quadrants

5.2 State vs. Trait Analysis

5.3 Validation of Creativity Measure through Ranking

5.3.1 Ranking Agreement Among Human Graders

5.3.2 Ranking Agreement Between Humans and System

5.4 Python Creativity Measure Analysis

5.4.1 Pearson Correlation in Python

5.4.2 State vs. Trait Analysis in Python

6 Discussion

6.1 Negative Correlation between Average Tree Edit Distance and Programming Performance

6.2 Trait over State Explanations for Creativity Measure

6.3 Moderate Agreement Between Humans and Creativity Measure

6.4 Creativity Measure Stability

6.4.1 Negative Correlation in Python Data between Average Tree Edit Distance and Programming Performance

6.4.2 Trait over State Explanations for Creativity Measure in Python

7 Conclusion

7.1 Limitations and Future Work

Bibliography

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English
