From 765d5a82f6b5fbdb01f41c5989e62fd16be04db3 Mon Sep 17 00:00:00 2001
From: gguu1314 <88154940+gguu1314@users.noreply.github.com>
Date: Tue, 10 Aug 2021 11:12:58 +0800
Subject: [PATCH 01/10] [docs] add data analysis to KDD Cup 2010
---
docs/KDD Cup 2010.md | 413 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 413 insertions(+)
create mode 100644 docs/KDD Cup 2010.md
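
Note on cells In [12] and In [13] below: both compute a correct rate, Corrects / (Corrects + Hints + Incorrects), per group, and keep only the groups above a transaction threshold by counting qualifying rows in a loop. As a rough sketch, the same selection can be written with one `groupby` and a boolean mask. The helper name `correct_rate_by` and the toy frame are assumptions made for illustration; only the column names come from the data set.

```python
import pandas as pd


def correct_rate_by(frame, key, min_transactions):
    """Per-group correct rate for groups with more than min_transactions.

    Hypothetical helper, not part of the patch. Rows whose key is missing
    are dropped by groupby, which matches the dropna step in cell In [13].
    """
    totals = frame.groupby(key)[["Corrects", "Hints", "Incorrects"]].sum()
    # Sum the three transaction columns before adding any derived column.
    totals["total transactions"] = totals.sum(axis=1)
    totals["correct rate"] = totals["Corrects"] / totals["total transactions"]
    kept = totals[totals["total transactions"] > min_transactions]
    return kept.sort_values("correct rate").reset_index()


# Toy data, made up for illustration only.
toy = pd.DataFrame({
    "Problem Name": ["A", "A", "B", "B", "C"],
    "Corrects":     [1, 1, 1, 1, 1],
    "Hints":        [0, 1, 0, 0, 0],
    "Incorrects":   [0, 1, 1, 0, 0],
})
rates = correct_rate_by(toy, "Problem Name", min_transactions=2)
print(rates)
```

With the real table, `correct_rate_by(data, 'Problem Name', 500)` and `correct_rate_by(data, 'KC(Default)', 300)` would correspond to the two thresholds used in the patch; `.head(20)` and `.tail(20)` then give the bottom and top groups.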
diff --git a/docs/KDD Cup 2010.md b/docs/KDD Cup 2010.md
new file mode 100644
index 0000000..eaddc2a
--- /dev/null
+++ b/docs/KDD Cup 2010.md
@@ -0,0 +1,413 @@
+# KDD Cup 2010: Data Analysis on algebra_2006_2007_train
+
+## Data Description
+
+### Column Description
+
+| Attribute | Annotation |
+| --- | --- |
+| Row | The row number |
+| Anon Student Id | Unique, anonymous identifier for a student |
+| Problem Hierarchy | The hierarchy of curriculum levels containing the problem |
+| Problem Name | Unique identifier for a problem |
+| Problem View | The total number of times the student encountered the problem so far |
+| Step Name | Unique identifier for one of the steps in a problem |
+| Step Start Time | The starting time of the step (Can be null) |
+| First Transaction Time | The time of the first transaction toward the step |
+| Correct Transaction Time | The time of the correct attempt toward the step, if there was one |
+| Step End Time | The time of the last transaction toward the step |
+| Step Duration (sec) | The elapsed time of the step in seconds, calculated by adding all of the durations for transactions that were attributed to the step (Can be null if step start time is null) |
+| Correct Step Duration (sec) | The step duration if the first attempt for the step was correct |
+| Error Step Duration (sec) | The step duration if the first attempt for the step was an error (incorrect attempt or hint request) |
+| Correct First Attempt | The tutor's evaluation of the student's first attempt on the step—1 if correct, 0 if an error |
+| Incorrects | Total number of incorrect attempts by the student on the step |
+| Hints | Total number of hints requested by the student for the step |
+| Corrects | Total correct attempts by the student for the step (only increases if the step is encountered more than once) |
+| KC(KC Model Name) | The identified skills that are used in a problem, where available |
+| Opportunity(KC Model Name) | A count that increases by one each time the student encounters a step with the listed knowledge component |
+| | Additional KC models, which exist for the challenge data sets, will appear as additional pairs of columns (KC and Opportunity columns for each model) |
+
+For the test portion of the challenge data sets, values will not be provided for the following columns:
+
+♦ Step Start Time
+
+♦ First Transaction Time
+
+♦ Correct Transaction Time
+
+♦ Step End Time
+
+♦ Step Duration (sec)
+
+♦ Correct Step Duration (sec)
+
+♦ Error Step Duration (sec)
+
+♦ Correct First Attempt
+
+♦ Incorrects
+
+♦ Hints
+
+♦ Corrects
+
+In [1]:
+
+```python
+import pandas as pd
+import plotly.express as px
+```
+
+In [2]:
+
+```python
+path = "algebra_2006_2007_train.txt"
+data = pd.read_table(path, encoding="ISO-8859-15", low_memory=False)
+```
+
+## Record Examples
+
+In [3]:
+
+```python
+pd.set_option('display.max_columns', 500)
+print(data.head())
+```
+
+Out [3]:
+
+| | Row | Anon Student Id | Problem Hierarchy | Problem Name | Problem View | Step Name | Step Start Time | First Transaction Time | Correct Transaction Time | Step End Time | Step Duration (sec) | Correct Step Duration (sec) | Error Step Duration (sec) | Correct First Attempt | Incorrects | Hints|Corrects|KC(Default) | Opportunity(Default)|
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| 0 | 1 | JG4Tz | Unit CTA1_01, Section CTA1_01-1 | LDEMO_WKST | 1 | R1C1 | 2006-10-26 09:51:58.0 | 2006-10-26 09:52:30.0 | 2006-10-26 09:53:30.0 | 2006-10-26 09:53:30.0 | 92.0 | NaN | 92.0 | 0 | 2 | 0 | 1 | NaN | NaN |
+| 1 | 2 | JG4Tz | Unit CTA1_01, Section CTA1_01-1 | LDEMO_WKST | 1 | R1C2 | 2006-10-26 09:53:30.0 | 2006-10-26 09:53:41.0 | 2006-10-26 09:53:41.0 | 2006-10-26 09:53:41.0 | 11.0 | 11.0 | NaN | 1 | 0 | 0 | 1 | NaN | NaN |
+| 2 | 3 | JG4Tz | Unit CTA1_01, Section CTA1_01-1 | LDEMO_WKST | 1 | R2C1 | 2006-10-26 09:53:41.0 | 2006-10-26 09:53:46.0 | 2006-10-26 09:53:46.0 | 2006-10-26 09:53:46.0 | 5.0 | 5.0 | NaN | 1 | 0 | 0 | 1 | Identifying units | 1 |
+| 3 | 4 | JG4Tz | Unit CTA1_01, Section CTA1_01-1 | LDEMO_WKST | 1 | R2C2 | 2006-10-26 09:53:46.0 | 2006-10-26 09:53:50.0 | 2006-10-26 09:53:50.0 | 2006-10-26 09:53:50.0 | 4.0 | 4.0 | NaN | 1 | 0 | 0 | 1 | Identifying units | 2 |
+| 4 | 5 | JG4Tz | Unit CTA1_01, Section CTA1_01-1 | LDEMO_WKST | 1 | R4C1 | 2006-10-26 09:53:50.0 | 2006-10-26 09:54:05.0 | 2006-10-26 09:54:05.0 | 2006-10-26 09:54:05.0 | 15.0 | 15.0 | NaN | 1 | 0 | 0 | 1 | Entering a given | 1 |
+
+## General features
+
+In [4]:
+
+```python
+print(data.describe())
+```
+
+Out [4]:
+
+| | Row | Problem View | Step Duration (sec) | Correct Step Duration (sec) | Error Step Duration (sec) | Correct First Attempt | Incorrects | Hints | Corrects |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| count| 2.270384e+06 | 2.270384e+06 | 2.267551e+06 | 1.751638e+06 | 515913.000000 | 2.270384e+06 | 2.270384e+06 | 2.270384e+06 | 2.270384e+06 |
+| mean | 1.513120e+06 | 1.092910e+00 | 1.958364e+01 | 1.171716e+01 | 46.292087 | 7.722359e-01 | 4.455044e-01 | 1.184311e-01 | 1.062878e+00 |
+| std | 8.736198e+05 | 3.448857e-01 | 4.768345e+01 | 2.645318e+01 | 81.817794 | 4.193897e-01 | 2.000914e+00 | 6.199071e-01 | 6.894285e-01 |
+| min | 1.000000e+00 | 1.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 |
+| 25% | 7.577408e+05 | 1.000000e+00 | 3.000000e+00 | 3.000000e+00 | 11.000000 | 1.000000e+00 | 0.000000e+00 | 0.000000e+00 | 1.000000e+00 |
+| 50% | 1.511844e+06 | 1.000000e+00 | 7.000000e+00 | 5.000000e+00 | 22.000000 | 1.000000e+00 | 0.000000e+00 | 0.000000e+00 | 1.000000e+00 |
+| 75% | 2.269432e+06 | 1.000000e+00 | 1.700000e+01 | 1.100000e+01 | 47.000000 | 1.000000e+00 | 0.000000e+00 | 0.000000e+00 | 1.000000e+00 |
+| max | 3.025933e+06 | 1.000000e+01 | 3.208000e+03 | 1.204000e+03 | 3208.000000 | 1.000000e+00 | 3.600000e+02 | 1.020000e+02 | 9.200000e+01 |
+
+In [5]:
+
+```python
+print("Part of missing values for every column")
+print(data.isnull().sum() / len(data))
+```
+
+Out [5]:
+
+```
+Part of missing values for every column
+Row 0.000000
+Anon Student Id 0.000000
+Problem Hierarchy 0.000000
+Problem Name 0.000000
+Problem View 0.000000
+Step Name 0.000000
+Step Start Time 0.001103
+First Transaction Time 0.000000
+Correct Transaction Time 0.034757
+Step End Time 0.000000
+Step Duration (sec) 0.001248
+Correct Step Duration (sec) 0.228484
+Error Step Duration (sec) 0.772764
+Correct First Attempt 0.000000
+Incorrects 0.000000
+Hints 0.000000
+Corrects 0.000000
+KC(Default) 0.203407
+Opportunity(Default) 0.203407
+dtype: float64
+```
+
+In [6]:
+
+```python
+print("the number of records:")
+print(len(data))
+```
+
+Out [6]:
+
+```
+the number of records:
+2270384
+```
+
+In [7]:
+
+```python
+print("how many students are there in the table:")
+print(len(data['Anon Student Id'].unique()))
+```
+
+Out [7]:
+
+```
+how many students are there in the table:
+1338
+```
+
+In [8]:
+
+```python
+print("how many problems are there in the table:")
+print(len(data['Problem Name'].unique()))
+```
+
+Out [8]:
+
+```
+how many problems are there in the table:
+91913
+```
+
+## Sort by Anon Student Id
+
+In [9]:
+
+```python
+ds = data['Anon Student Id'].value_counts().reset_index()
+ds.columns = [
+ 'Anon Student Id',
+ 'count'
+]
+ds['Anon Student Id'] = ds['Anon Student Id'].astype(str) + '-'
+ds = ds.sort_values('count').tail(40)
+
+fig = px.bar(
+ ds,
+ x='count',
+ y='Anon Student Id',
+ orientation='h',
+ title='Top 40 students by number of steps they have done'
+)
+fig.show()
+```
+
+Out [9]:
+
+![1.png](https://i.loli.net/2021/08/10/HZ1wjWaurLzesp8.png)
+
+## Percent of corrects, hints and incorrects
+
+In [10]:
+
+```python
+count_corrects = data['Corrects'].sum()
+count_hints = data['Hints'].sum()
+count_incorrects = data['Incorrects'].sum()
+
+total = count_corrects + count_hints + count_incorrects
+
+percent_corrects = count_corrects / total
+percent_hints = count_hints / total
+percent_incorrects = count_incorrects / total
+
+dfl = [['corrects', percent_corrects], ['hints', percent_hints], ['incorrects', percent_incorrects]]
+
+df = pd.DataFrame(dfl, columns=['transaction type', 'percent'])
+
+fig = px.pie(
+ df,
+ names='transaction type',
+ values='percent',
+ title='Percent of corrects, hints and incorrects'
+)
+fig.show()
+```
+
+Out [10]:
+
+![2.png](https://i.loli.net/2021/08/10/bGOVywLHvBlQW7o.png)
+
+## Sort by Problem Name
+
+In [11]:
+
+```python
+storeProblemCount = [1]
+storeProblemName = [data['Problem Name'][0]]
+currentProblemName = data['Problem Name'][0]
+currentStepName = [data['Step Name'][0]]
+lastIndex = 0
+
+for i in range(1, len(data), 1):
+    pbNameI = data['Problem Name'][i]
+    stNameI = data['Step Name'][i]
+    if pbNameI != data['Problem Name'][lastIndex]:
+        currentStepName = [stNameI]
+        currentProblemName = pbNameI
+        if pbNameI not in storeProblemName:
+            storeProblemName.append(pbNameI)
+            storeProblemCount.append(1)
+        else:
+            storeProblemCount[storeProblemName.index(pbNameI)] += 1
+        lastIndex = i
+    elif stNameI not in currentStepName:
+        currentStepName.append(stNameI)
+        lastIndex = i
+    else:
+        currentStepName = [stNameI]
+        storeProblemCount[storeProblemName.index(pbNameI)] += 1
+        lastIndex = i
+
+dfData = {
+ 'Problem Name': storeProblemName,
+ 'count': storeProblemCount
+}
+df = pd.DataFrame(dfData).sort_values('count').tail(40)
+df["Problem Name"] += '-'
+
+fig = px.bar(
+ df,
+ x='count',
+ y='Problem Name',
+ orientation='h',
+ title='Top 40 useful problem'
+)
+fig.show()
+```
+
+Out [11]:
+
+![3.png](https://i.loli.net/2021/08/10/1cY7IhaFelbt35j.png)
+
+In [12]:
+
+```python
+data['total transactions'] = data['Incorrects'] + data['Hints'] + data['Corrects']
+df1 = data.groupby('Problem Name')['total transactions'].sum().reset_index()
+df2 = data.groupby('Problem Name')['Corrects'].sum().reset_index()
+df1['Corrects'] = df2['Corrects']
+df1['Correct rate'] = df1['Corrects'] / df1['total transactions']
+
+df1 = df1.sort_values('total transactions')
+count = 0
+standard = 500
+for i in df1['total transactions']:
+    if i > standard:
+        count += 1
+df1 = df1.tail(count)
+
+df1 = df1.sort_values('Correct rate')
+
+df1['Problem Name'] = df1['Problem Name'].astype(str) + "-"
+
+df_px = df1.tail(20)
+
+fig = px.bar(
+ df_px,
+ x='Correct rate',
+ y='Problem Name',
+ orientation='h',
+ title='Correct rate of each problem (top 20) (total transactions of \
+each problem are required to be more than 500)',
+ text='Problem Name'
+)
+fig.show()
+
+df_px = df1.head(20)
+
+fig = px.bar(
+ df_px,
+ x='Correct rate',
+ y='Problem Name',
+ orientation='h',
+ title='Correct rate of each problem (bottom 20) (total transactions of \
+each problem are required to be more than 500)',
+ text='Problem Name'
+)
+fig.show()
+```
+
+Out [12]:
+
+![4.png](https://i.loli.net/2021/08/10/3ihSFvYnc6Lgp1Z.png)
+
+![5.png](https://i.loli.net/2021/08/10/GCjJ7zWAPaLdu5m.png)
+
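+These two figures show the 20 problems with the highest and the 20 with the lowest correct rate (Corrects divided by total transactions), restricted to problems with more than 500 total transactions. Problems with a low correct rate deserve more attention from teachers and students.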
+
+## Sort by KC
+
+In [13]:
+
+```python
+data.dropna(subset=['KC(Default)'], inplace=True)
+
+data['total transactions'] = data['Corrects'] + data['Hints'] + data['Incorrects']
+df1 = data.groupby('KC(Default)')['total transactions'].sum().reset_index()
+df2 = data.groupby('KC(Default)')['Corrects'].sum().reset_index()
+df1['Corrects'] = df2['Corrects']
+df1['correct rate'] = df1['Corrects'] / df1['total transactions']
+
+count = 0
+standard = 300
+for i in df1['total transactions']:
+    if i > standard:
+        count += 1
+df1 = df1.sort_values('total transactions').tail(count)
+
+df1 = df1.sort_values('correct rate')
+
+df1['KC(Default)'] = df1['KC(Default)'].astype(str) + '-'
+
+df_px = df1.tail(20)
+
+fig = px.bar(
+ df_px,
+ x='correct rate',
+ y='KC(Default)',
+ orientation='h',
+ title='Correct rate of each KC(Default) (top 20) (total transactions of \
+each KC are required to be more than 300)',
+ text='KC(Default)'
+)
+fig.update_yaxes(visible=False)
+fig.show()
+
+df_px = df1.head(20)
+
+fig = px.bar(
+ df_px,
+ x='correct rate',
+ y='KC(Default)',
+ orientation='h',
+ title='Correct rate of each KC(Default) (bottom 20) (total transactions of \
+each KC are required to be more than 300)',
+ text='KC(Default)'
+)
+fig.update_yaxes(visible=False)
+fig.show()
+```
+
+Out [13]:
+
+![6.png](https://i.loli.net/2021/08/10/OtpSM63K7uUwzJL.png)
+
+![7.png](https://i.loli.net/2021/08/10/koi8rvUqncDONZ1.png)
+
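+These two figures show the 20 KCs with the highest and the 20 with the lowest correct rate (Corrects divided by total transactions), restricted to KCs with more than 300 total transactions. KCs with a low correct rate deserve more attention from teachers and students.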
+
+## Postscript
+
+The full KDD Cup 2010 data package consists of 5 data sets whose data files share the same format. The analysis above, based on "algebra_2006_2007_train", is therefore only an example: the same code can be applied to the other data files after small changes to the file path and column names.
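
Note on cell In [11] above: the loop finds each problem with `storeProblemName.index(...)`, which scans a Python list on every lookup. A minimal sketch of the same counting rule with a `Counter` and a set follows, assuming the intent is to count one encounter each time the problem name changes or a step name repeats within the current encounter. The function name `count_problem_encounters` and the toy rows are illustrative and not part of the patch.

```python
from collections import Counter


def count_problem_encounters(rows):
    """Count encounters per problem from (problem_name, step_name) pairs.

    Hypothetical helper mirroring the rule in cell In [11]: a new encounter
    starts when the problem name differs from the previous row, or when a
    step name repeats within the current encounter.
    """
    counts = Counter()
    previous_problem = object()  # sentinel that equals no real problem name
    seen_steps = set()
    for problem, step in rows:
        if problem != previous_problem or step in seen_steps:
            counts[problem] += 1
            seen_steps = {step}
        else:
            seen_steps.add(step)
        previous_problem = problem
    return counts


# Toy rows, made up for illustration; with the real table one could pass
# zip(data['Problem Name'], data['Step Name']) instead.
toy_rows = [("P1", "s1"), ("P1", "s2"), ("P1", "s1"), ("P2", "s1"), ("P1", "s1")]
encounters = count_problem_encounters(toy_rows)
print(encounters)
```

Dictionary and set lookups take roughly constant time, so the pass over the rows stays linear in the number of rows, whereas the list-based version slows down as the list of distinct problem names grows.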
From bcf6eb36d796cfa65493610f23b6e45d1de8f6d1 Mon Sep 17 00:00:00 2001
From: gguu1314 <88154940+gguu1314@users.noreply.github.com>
Date: Sat, 14 Aug 2021 12:15:20 +0800
Subject: [PATCH 02/10] Delete KDD Cup 2010.md
---
docs/KDD Cup 2010.md | 413 -------------------------------------------
1 file changed, 413 deletions(-)
delete mode 100644 docs/KDD Cup 2010.md
diff --git a/docs/KDD Cup 2010.md b/docs/KDD Cup 2010.md
deleted file mode 100644
index eaddc2a..0000000
--- a/docs/KDD Cup 2010.md
+++ /dev/null
@@ -1,413 +0,0 @@
-# KDD Cup 2010: Data Analysis on algebra_2006_2007_train
-
-## Data Description
-
-### Column Description
-
-| Attribute | Annotation |
-| --- | --- |
-| Row | The row number |
-| Anon Student Id | Unique, anonymous identifier for a student |
-| Problem Hierarchy | The hierarchy of curriculum levels containing the problem |
-| Problem Name | Unique identifier for a problem |
-| Problem View | The total number of times the student encountered the problem so far |
-| Step Name | Unique identifier for one of the steps in a problem |
-| Step Start Time | The starting time of the step (Can be null) |
-| First Transaction Time | The time of the first transaction toward the step |
-| Correct Transaction Time | The time of the correct attempt toward the step, if there was one |
-| Step End Time | The time of the last transaction toward the step |
-| Step Duration (sec) | The elapsed time of the step in seconds, calculated by adding all of the durations for transactions that were attributed to the step (Can be null if step start time is null) |
-| Correct Step Duration (sec) | The step duration if the first attempt for the step was correct |
-| Error Step Duration (sec) | The step duration if the first attempt for the step was an error (incorrect attempt or hint request) |
-| Correct First Attempt | The tutor's evaluation of the student's first attempt on the step—1 if correct, 0 if an error |
-| Incorrects | Total number of incorrect attempts by the student on the step |
-| Hints | Total number of hints requested by the student for the step |
-| Corrects | Total correct attempts by the student for the step (only increases if the step is encountered more than once) |
-| KC(KC Model Name) | The identified skills that are used in a problem, where available |
-| Opportunity(KC Model Name) | A count that increases by one each time the student encounters a step with the listed knowledge component |
-| | Additional KC models, which exist for the challenge data sets, will appear as additional pairs of columns (KC and Opportunity columns for each model) |
-
-For the test portion of the challenge data sets, values will not be provided for the following columns:
-
-♦ Step Start Time
-
-♦ First Transaction Time
-
-♦ Correct Transaction Time
-
-♦ Step End Time
-
-♦ Step Duration (sec)
-
-♦ Correct Step Duration (sec)
-
-♦ Error Step Duration (sec)
-
-♦ Correct First Attempt
-
-♦ Incorrects
-
-♦ Hints
-
-♦ Corrects
-
-In [1]:
-
-```python
-import pandas as pd
-import plotly.express as px
-```
-
-In [2]:
-
-```python
-path = "algebra_2006_2007_train.txt"
-data = pd.read_table(path, encoding="ISO-8859-15", low_memory=False)
-```
-
-## Record Examples
-
-In [3]:
-
-```python
-pd.set_option('display.max_columns', 500)
-print(data.head())
-```
-
-Out [3]:
-
-| | Row | Anon Student Id | Problem Hierarchy | Problem Name | Problem View | Step Name | Step Start Time | First Transaction Time | Correct Transaction Time | Step End Time | Step Duration (sec) | Correct Step Duration (sec) | Error Step Duration (sec) | Correct First Attempt | Incorrects | Hints|Corrects|KC(Default) | Opportunity(Default)|
-| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-| 0 | 1 | JG4Tz | Unit CTA1_01, Section CTA1_01-1 | LDEMO_WKST | 1 | R1C1 | 2006-10-26 09:51:58.0 | 2006-10-26 09:52:30.0 | 2006-10-26 09:53:30.0 | 2006-10-26 09:53:30.0 | 92.0 | NaN | 92.0 | 0 | 2 | 0 | 1 | NaN | NaN |
-| 1 | 2 | JG4Tz | Unit CTA1_01, Section CTA1_01-1 | LDEMO_WKST | 1 | R1C2 | 2006-10-26 09:53:30.0 | 2006-10-26 09:53:41.0 | 2006-10-26 09:53:41.0 | 2006-10-26 09:53:41.0 | 11.0 | 11.0 | NaN | 1 | 0 | 0 | 1 | NaN | NaN |
-| 2 | 3 | JG4Tz | Unit CTA1_01, Section CTA1_01-1 | LDEMO_WKST | 1 | R2C1 | 2006-10-26 09:53:41.0 | 2006-10-26 09:53:46.0 | 2006-10-26 09:53:46.0 | 2006-10-26 09:53:46.0 | 5.0 | 5.0 | NaN | 1 | 0 | 0 | 1 | Identifying units | 1 |
-| 3 | 4 | JG4Tz | Unit CTA1_01, Section CTA1_01-1 | LDEMO_WKST | 1 | R2C2 | 2006-10-26 09:53:46.0 | 2006-10-26 09:53:50.0 | 2006-10-26 09:53:50.0 | 2006-10-26 09:53:50.0 | 4.0 | 4.0 | NaN | 1 | 0 | 0 | 1 | Identifying units | 2 |
-| 4 | 5 | JG4Tz | Unit CTA1_01, Section CTA1_01-1 | LDEMO_WKST | 1 | R4C1 | 2006-10-26 09:53:50.0 | 2006-10-26 09:54:05.0 | 2006-10-26 09:54:05.0 | 2006-10-26 09:54:05.0 | 15.0 | 15.0 | NaN | 1 | 0 | 0 | 1 | Entering a given | 1 |
-
-## General features
-
-In [4]:
-
-```python
-print(data.describe())
-```
-
-Out [4]:
-
-| | Row | Problem View | Step Duration (sec) | Correct Step Duration (sec) | Error Step Duration (sec) | Correct First Attempt | Incorrects | Hints | Corrects |
-| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-| count| 2.270384e+06 | 2.270384e+06 | 2.267551e+06 | 1.751638e+06 | 515913.000000 | 2.270384e+06 | 2.270384e+06 | 2.270384e+06 | 2.270384e+06 |
-| mean | 1.513120e+06 | 1.092910e+00 | 1.958364e+01 | 1.171716e+01 | 46.292087 | 7.722359e-01 | 4.455044e-01 | 1.184311e-01 | 1.062878e+00 |
-| std | 8.736198e+05 | 3.448857e-01 | 4.768345e+01 | 2.645318e+01 | 81.817794 | 4.193897e-01 | 2.000914e+00 | 6.199071e-01 | 6.894285e-01 |
-| min | 1.000000e+00 | 1.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 |
-| 25% | 7.577408e+05 | 1.000000e+00 | 3.000000e+00 | 3.000000e+00 | 11.000000 | 1.000000e+00 | 0.000000e+00 | 0.000000e+00 | 1.000000e+00 |
-| 50% | 1.511844e+06 | 1.000000e+00 | 7.000000e+00 | 5.000000e+00 | 22.000000 | 1.000000e+00 | 0.000000e+00 | 0.000000e+00 | 1.000000e+00 |
-| 75% | 2.269432e+06 | 1.000000e+00 | 1.700000e+01 | 1.100000e+01 | 47.000000 | 1.000000e+00 | 0.000000e+00 | 0.000000e+00 | 1.000000e+00 |
-| max | 3.025933e+06 | 1.000000e+01 | 3.208000e+03 | 1.204000e+03 | 3208.000000 | 1.000000e+00 | 3.600000e+02 | 1.020000e+02 | 9.200000e+01 |
-
-In [5]:
-
-```python
-print("Part of missing values for every column")
-print(data.isnull().sum() / len(data))
-```
-
-Out [5]:
-
-```
-Part of missing values for every column
-Row 0.000000
-Anon Student Id 0.000000
-Problem Hierarchy 0.000000
-Problem Name 0.000000
-Problem View 0.000000
-Step Name 0.000000
-Step Start Time 0.001103
-First Transaction Time 0.000000
-Correct Transaction Time 0.034757
-Step End Time 0.000000
-Step Duration (sec) 0.001248
-Correct Step Duration (sec) 0.228484
-Error Step Duration (sec) 0.772764
-Correct First Attempt 0.000000
-Incorrects 0.000000
-Hints 0.000000
-Corrects 0.000000
-KC(Default) 0.203407
-Opportunity(Default) 0.203407
-dtype: float64
-```
-
-In [6]:
-
-```python
-print("the number of records:")
-print(len(data))
-```
-
-Out [6]:
-
-```
-the number of records:
-2270384
-```
-
-In [7]:
-
-```python
-print("how many students are there in the table:")
-print(len(data['Anon Student Id'].unique()))
-```
-
-Out [7]:
-
-```
-how many students are there in the table:
-1338
-```
-
-In [8]:
-
-```python
-print("how many problems are there in the table:")
-print(len(data['Problem Name'].unique()))
-```
-
-Out [8]:
-
-```
-how many problems are there in the table:
-91913
-```
-
-## Sort by Anon Student Id
-
-In [9]:
-
-```python
-ds = data['Anon Student Id'].value_counts().reset_index()
-ds.columns = [
- 'Anon Student Id',
- 'count'
-]
-ds['Anon Student Id'] = ds['Anon Student Id'].astype(str) + '-'
-ds = ds.sort_values('count').tail(40)
-
-fig = px.bar(
- ds,
- x='count',
- y='Anon Student Id',
- orientation='h',
- title='Top 40 students by number of steps they have done'
-)
-fig.show()
-```
-
-Out [9]:
-
-![1.png](https://i.loli.net/2021/08/10/HZ1wjWaurLzesp8.png)
-
-## Percent of corrects, hints and incorrects
-
-In [10]:
-
-```python
-count_corrects = data['Corrects'].sum()
-count_hints = data['Hints'].sum()
-count_incorrects = data['Incorrects'].sum()
-
-total = count_corrects + count_hints + count_incorrects
-
-percent_corrects = count_corrects / total
-percent_hints = count_hints / total
-percent_incorrects = count_incorrects / total
-
-dfl = [['corrects', percent_corrects], ['hints', percent_hints], ['incorrects', percent_incorrects]]
-
-df = pd.DataFrame(dfl, columns=['transaction type', 'percent'])
-
-fig = px.pie(
- df,
- names='transaction type',
- values='percent',
- title='Percent of corrects, hints and incorrects'
-)
-fig.show()
-```
-
-Out [10]:
-
-![2.png](https://i.loli.net/2021/08/10/bGOVywLHvBlQW7o.png)
-
-## Sort by Problem Name
-
-In [11]:
-
-```python
-storeProblemCount = [1]
-storeProblemName = [data['Problem Name'][0]]
-currentProblemName = data['Problem Name'][0]
-currentStepName = [data['Step Name'][0]]
-lastIndex = 0
-
-for i in range(1, len(data), 1):
-    pbNameI = data['Problem Name'][i]
-    stNameI = data['Step Name'][i]
-    if pbNameI != data['Problem Name'][lastIndex]:
-        currentStepName = [stNameI]
-        currentProblemName = pbNameI
-        if pbNameI not in storeProblemName:
-            storeProblemName.append(pbNameI)
-            storeProblemCount.append(1)
-        else:
-            storeProblemCount[storeProblemName.index(pbNameI)] += 1
-        lastIndex = i
-    elif stNameI not in currentStepName:
-        currentStepName.append(stNameI)
-        lastIndex = i
-    else:
-        currentStepName = [stNameI]
-        storeProblemCount[storeProblemName.index(pbNameI)] += 1
-        lastIndex = i
-
-dfData = {
- 'Problem Name': storeProblemName,
- 'count': storeProblemCount
-}
-df = pd.DataFrame(dfData).sort_values('count').tail(40)
-df["Problem Name"] += '-'
-
-fig = px.bar(
- df,
- x='count',
- y='Problem Name',
- orientation='h',
- title='Top 40 useful problem'
-)
-fig.show()
-```
-
-Out [11]:
-
-![3.png](https://i.loli.net/2021/08/10/1cY7IhaFelbt35j.png)
-
-In [12]:
-
-```python
-data['total transactions'] = data['Incorrects'] + data['Hints'] + data['Corrects']
-df1 = data.groupby('Problem Name')['total transactions'].sum().reset_index()
-df2 = data.groupby('Problem Name')['Corrects'].sum().reset_index()
-df1['Corrects'] = df2['Corrects']
-df1['Correct rate'] = df1['Corrects'] / df1['total transactions']
-
-df1 = df1.sort_values('total transactions')
-count = 0
-standard = 500
-for i in df1['total transactions']:
-    if i > standard:
-        count += 1
-df1 = df1.tail(count)
-
-df1 = df1.sort_values('Correct rate')
-
-df1['Problem Name'] = df1['Problem Name'].astype(str) + "-"
-
-df_px = df1.tail(20)
-
-fig = px.bar(
- df_px,
- x='Correct rate',
- y='Problem Name',
- orientation='h',
- title='Correct rate of each problem (top 20) (total transactions of \
-each problem are required to be more than 500)',
- text='Problem Name'
-)
-fig.show()
-
-df_px = df1.head(20)
-
-fig = px.bar(
- df_px,
- x='Correct rate',
- y='Problem Name',
- orientation='h',
- title='Correct rate of each problem (bottom 20) (total transactions of \
-each problem are required to be more than 500)',
- text='Problem Name'
-)
-fig.show()
-```
-
-Out [12]:
-
-![4.png](https://i.loli.net/2021/08/10/3ihSFvYnc6Lgp1Z.png)
-
-![5.png](https://i.loli.net/2021/08/10/GCjJ7zWAPaLdu5m.png)
-
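-These two figures show the 20 problems with the highest and the 20 with the lowest correct rate (Corrects divided by total transactions), restricted to problems with more than 500 total transactions. Problems with a low correct rate deserve more attention from teachers and students.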
-
-## Sort by KC
-
-In [13]:
-
-```python
-data.dropna(subset=['KC(Default)'], inplace=True)
-
-data['total transactions'] = data['Corrects'] + data['Hints'] + data['Incorrects']
-df1 = data.groupby('KC(Default)')['total transactions'].sum().reset_index()
-df2 = data.groupby('KC(Default)')['Corrects'].sum().reset_index()
-df1['Corrects'] = df2['Corrects']
-df1['correct rate'] = df1['Corrects'] / df1['total transactions']
-
-count = 0
-standard = 300
-for i in df1['total transactions']:
-    if i > standard:
-        count += 1
-df1 = df1.sort_values('total transactions').tail(count)
-
-df1 = df1.sort_values('correct rate')
-
-df1['KC(Default)'] = df1['KC(Default)'].astype(str) + '-'
-
-df_px = df1.tail(20)
-
-fig = px.bar(
- df_px,
- x='correct rate',
- y='KC(Default)',
- orientation='h',
- title='Correct rate of each KC(Default) (top 20) (total transactions of \
-each KC are required to be more than 300)',
- text='KC(Default)'
-)
-fig.update_yaxes(visible=False)
-fig.show()
-
-df_px = df1.head(20)
-
-fig = px.bar(
- df_px,
- x='correct rate',
- y='KC(Default)',
- orientation='h',
- title='Correct rate of each KC(Default) (bottom 20) (total transactions of \
-each KC are required to be more than 300)',
- text='KC(Default)'
-)
-fig.update_yaxes(visible=False)
-fig.show()
-```
-
-Out [13]:
-
-![6.png](https://i.loli.net/2021/08/10/OtpSM63K7uUwzJL.png)
-
-![7.png](https://i.loli.net/2021/08/10/koi8rvUqncDONZ1.png)
-
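-These two figures show the 20 KCs with the highest and the 20 with the lowest correct rate (Corrects divided by total transactions), restricted to KCs with more than 300 total transactions. KCs with a low correct rate deserve more attention from teachers and students.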
-
-## Postscript
-
-The full KDD Cup 2010 data package consists of 5 data sets whose data files share the same format. The analysis above, based on "algebra_2006_2007_train", is therefore only an example: the same code can be applied to the other data files after small changes to the file path and column names.
From 0b4125ad727503674bf90f5da3dcc60c76871582 Mon Sep 17 00:00:00 2001
From: gguu1314 <88154940+gguu1314@users.noreply.github.com>
Date: Sat, 14 Aug 2021 12:16:02 +0800
Subject: [PATCH 03/10] Add files via upload
---
docs/KDD Cup 2010.ipynb | 4925 +++++++++++++++++++++++++++++++++++++++
1 file changed, 4925 insertions(+)
create mode 100644 docs/KDD Cup 2010.ipynb
diff --git a/docs/KDD Cup 2010.ipynb b/docs/KDD Cup 2010.ipynb
new file mode 100644
index 0000000..bd7876d
--- /dev/null
+++ b/docs/KDD Cup 2010.ipynb
@@ -0,0 +1,4925 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "e002fdf8",
+ "metadata": {},
+ "source": [
+ "# KDD Cup 2010: Data Analysis on algebra_2006_2007_train"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "429152ff",
+ "metadata": {},
+ "source": [
+ "## Data Description"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2c89d116",
+ "metadata": {},
+ "source": [
+ "### Column Description"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f590eee5",
+ "metadata": {},
+ "source": [
+ "| Attribute | Annotation |\n",
+ "|:--:|---|\n",
+ "|Row|The row number|\n",
+ "| Anon Student Id | Unique, anonymous identifier for a student |\n",
+ "| Problem Hierarchy | The hierarchy of curriculum levels containing the problem |\n",
+ "| Problem Name | Unique identifier for a problem |\n",
+ "| Problem View | The total number of times the student encountered the problem so far |\n",
+ "| Step Name | Unique identifier for one of the steps in a problem |\n",
+ "| Step Start Time | The starting time of the step (Can be null) |\n",
+ "| First Transaction Time | The time of the first transaction toward the step |\n",
+ "| Correct Transaction Time | The time of the correct attempt toward the step, if there was one |\n",
+ "| Step End Time | The time of the last transaction toward the step |\n",
+ "| Step Duration (sec) | The elapsed time of the step in seconds, calculated by adding all of the durations for transactions that were attributed to the step (Can be null if step start time is null) |\n",
+ "| Correct Step Duration (sec) | The step duration if the first attempt for the step was correct |\n",
+ "| Error Step Duration (sec) | The step duration if the first attempt for the step was an error (incorrect attempt or hint request) |\n",
+ "| Correct First Attempt | The tutor's evaluation of the student's first attempt on the step—1 if correct, 0 if an error |\n",
+ "| Incorrects | Total number of incorrect attempts by the student on the step |\n",
+ "| Hints | Total number of hints requested by the student for the step |\n",
+ "| Corrects | Total correct attempts by the student for the step (only increases if the step is encountered more than once) |\n",
+ "| KC(KC Model Name) | The identified skills that are used in a problem, where available |\n",
+ "| Opportunity(KC Model Name) | A count that increases by one each time the student encounters a step with the listed knowledge component |\n",
+ "|| Additional KC models, which exist for the challenge data sets, will appear as additional pairs of columns (KC and Opportunity columns for each model) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2c2a2d3e",
+ "metadata": {},
+ "source": [
+ "For the test portion of the challenge data sets, values will not be provided for the following columns:"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f19eb949",
+ "metadata": {},
+ "source": [
+ "♦ Step Start Time\n",
+ "\n",
+ "♦ First Transaction Time\n",
+ "\n",
+ "♦ Correct Transaction Time\n",
+ "\n",
+ "♦ Step End Time\n",
+ "\n",
+ "♦ Step Duration (sec)\n",
+ "\n",
+ "♦ Correct Step Duration (sec)\n",
+ "\n",
+ "♦ Error Step Duration (sec)\n",
+ "\n",
+ "♦ Correct First Attempt\n",
+ "\n",
+ "♦ Incorrects\n",
+ "\n",
+ "♦ Hints\n",
+ "\n",
+ "♦ Corrects"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "123674b7",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "import plotly.express as px"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "efa6be16",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "path = \"algebra_2006_2007_train.txt\"\n",
+ "data = pd.read_table(path, encoding=\"ISO-8859-15\", low_memory=False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "993f1986",
+ "metadata": {},
+ "source": [
+ "## Record Examples"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "8b2af14e",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\">\n",
+ "      <th></th>\n",
+ "      <th>Row</th>\n",
+ "      <th>Anon Student Id</th>\n",
+ "      <th>Problem Hierarchy</th>\n",
+ "      <th>Problem Name</th>\n",
+ "      <th>Problem View</th>\n",
+ "      <th>Step Name</th>\n",
+ "      <th>Step Start Time</th>\n",
+ "      <th>First Transaction Time</th>\n",
+ "      <th>Correct Transaction Time</th>\n",
+ "      <th>Step End Time</th>\n",
+ "      <th>Step Duration (sec)</th>\n",
+ "      <th>Correct Step Duration (sec)</th>\n",
+ "      <th>Error Step Duration (sec)</th>\n",
+ "      <th>Correct First Attempt</th>\n",
+ "      <th>Incorrects</th>\n",
+ "      <th>Hints</th>\n",
+ "      <th>Corrects</th>\n",
+ "      <th>KC(Default)</th>\n",
+ "      <th>Opportunity(Default)</th>\n",
+ "    </tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr>\n",
+ "      <th>0</th>\n",
+ "      <td>1</td>\n",
+ "      <td>JG4Tz</td>\n",
+ "      <td>Unit CTA1_01, Section CTA1_01-1</td>\n",
+ "      <td>LDEMO_WKST</td>\n",
+ "      <td>1</td>\n",
+ "      <td>R1C1</td>\n",
+ "      <td>2006-10-26 09:51:58.0</td>\n",
+ "      <td>2006-10-26 09:52:30.0</td>\n",
+ "      <td>2006-10-26 09:53:30.0</td>\n",
+ "      <td>2006-10-26 09:53:30.0</td>\n",
+ "      <td>92.0</td>\n",
+ "      <td>NaN</td>\n",
+ "      <td>92.0</td>\n",
+ "      <td>0</td>\n",
+ "      <td>2</td>\n",
+ "      <td>0</td>\n",
+ "      <td>1</td>\n",
+ "      <td>NaN</td>\n",
+ "      <td>NaN</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1</th>\n",
+ "      <td>2</td>\n",
+ "      <td>JG4Tz</td>\n",
+ "      <td>Unit CTA1_01, Section CTA1_01-1</td>\n",
+ "      <td>LDEMO_WKST</td>\n",
+ "      <td>1</td>\n",
+ "      <td>R1C2</td>\n",
+ "      <td>2006-10-26 09:53:30.0</td>\n",
+ "      <td>2006-10-26 09:53:41.0</td>\n",
+ "      <td>2006-10-26 09:53:41.0</td>\n",
+ "      <td>2006-10-26 09:53:41.0</td>\n",
+ "      <td>11.0</td>\n",
+ "      <td>11.0</td>\n",
+ "      <td>NaN</td>\n",
+ "      <td>1</td>\n",
+ "      <td>0</td>\n",
+ "      <td>0</td>\n",
+ "      <td>1</td>\n",
+ "      <td>NaN</td>\n",
+ "      <td>NaN</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>2</th>\n",
+ "      <td>3</td>\n",
+ "      <td>JG4Tz</td>\n",
+ "      <td>Unit CTA1_01, Section CTA1_01-1</td>\n",
+ "      <td>LDEMO_WKST</td>\n",
+ "      <td>1</td>\n",
+ "      <td>R2C1</td>\n",
+ "      <td>2006-10-26 09:53:41.0</td>\n",
+ "      <td>2006-10-26 09:53:46.0</td>\n",
+ "      <td>2006-10-26 09:53:46.0</td>\n",
+ "      <td>2006-10-26 09:53:46.0</td>\n",
+ "      <td>5.0</td>\n",
+ "      <td>5.0</td>\n",
+ "      <td>NaN</td>\n",
+ "      <td>1</td>\n",
+ "      <td>0</td>\n",
+ "      <td>0</td>\n",
+ "      <td>1</td>\n",
+ "      <td>Identifying units</td>\n",
+ "      <td>1</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>3</th>\n",
+ "      <td>4</td>\n",
+ "      <td>JG4Tz</td>\n",
+ "      <td>Unit CTA1_01, Section CTA1_01-1</td>\n",
+ "      <td>LDEMO_WKST</td>\n",
+ "      <td>1</td>\n",
+ "      <td>R2C2</td>\n",
+ "      <td>2006-10-26 09:53:46.0</td>\n",
+ "      <td>2006-10-26 09:53:50.0</td>\n",
+ "      <td>2006-10-26 09:53:50.0</td>\n",
+ "      <td>2006-10-26 09:53:50.0</td>\n",
+ "      <td>4.0</td>\n",
+ "      <td>4.0</td>\n",
+ "      <td>NaN</td>\n",
+ "      <td>1</td>\n",
+ "      <td>0</td>\n",
+ "      <td>0</td>\n",
+ "      <td>1</td>\n",
+ "      <td>Identifying units</td>\n",
+ "      <td>2</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>4</th>\n",
+ "      <td>5</td>\n",
+ "      <td>JG4Tz</td>\n",
+ "      <td>Unit CTA1_01, Section CTA1_01-1</td>\n",
+ "      <td>LDEMO_WKST</td>\n",
+ "      <td>1</td>\n",
+ "      <td>R4C1</td>\n",
+ "      <td>2006-10-26 09:53:50.0</td>\n",
+ "      <td>2006-10-26 09:54:05.0</td>\n",
+ "      <td>2006-10-26 09:54:05.0</td>\n",
+ "      <td>2006-10-26 09:54:05.0</td>\n",
+ "      <td>15.0</td>\n",
+ "      <td>15.0</td>\n",
+ "      <td>NaN</td>\n",
+ "      <td>1</td>\n",
+ "      <td>0</td>\n",
+ "      <td>0</td>\n",
+ "      <td>1</td>\n",
+ "      <td>Entering a given</td>\n",
+ "      <td>1</td>\n",
+ "    </tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " Row Anon Student Id Problem Hierarchy Problem Name \\\n",
+ "0 1 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "1 2 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "2 3 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "3 4 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "4 5 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "\n",
+ " Problem View Step Name Step Start Time First Transaction Time \\\n",
+ "0 1 R1C1 2006-10-26 09:51:58.0 2006-10-26 09:52:30.0 \n",
+ "1 1 R1C2 2006-10-26 09:53:30.0 2006-10-26 09:53:41.0 \n",
+ "2 1 R2C1 2006-10-26 09:53:41.0 2006-10-26 09:53:46.0 \n",
+ "3 1 R2C2 2006-10-26 09:53:46.0 2006-10-26 09:53:50.0 \n",
+ "4 1 R4C1 2006-10-26 09:53:50.0 2006-10-26 09:54:05.0 \n",
+ "\n",
+ " Correct Transaction Time Step End Time Step Duration (sec) \\\n",
+ "0 2006-10-26 09:53:30.0 2006-10-26 09:53:30.0 92.0 \n",
+ "1 2006-10-26 09:53:41.0 2006-10-26 09:53:41.0 11.0 \n",
+ "2 2006-10-26 09:53:46.0 2006-10-26 09:53:46.0 5.0 \n",
+ "3 2006-10-26 09:53:50.0 2006-10-26 09:53:50.0 4.0 \n",
+ "4 2006-10-26 09:54:05.0 2006-10-26 09:54:05.0 15.0 \n",
+ "\n",
+ " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
+ "0 NaN 92.0 \n",
+ "1 11.0 NaN \n",
+ "2 5.0 NaN \n",
+ "3 4.0 NaN \n",
+ "4 15.0 NaN \n",
+ "\n",
+ " Correct First Attempt Incorrects Hints Corrects KC(Default) \\\n",
+ "0 0 2 0 1 NaN \n",
+ "1 1 0 0 1 NaN \n",
+ "2 1 0 0 1 Identifying units \n",
+ "3 1 0 0 1 Identifying units \n",
+ "4 1 0 0 1 Entering a given \n",
+ "\n",
+ " Opportunity(Default) \n",
+ "0 NaN \n",
+ "1 NaN \n",
+ "2 1 \n",
+ "3 2 \n",
+ "4 1 "
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "pd.set_option('display.max_columns', 500)\n",
+ "data.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "9d5e5859",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr><th></th><th>Row</th><th>Problem View</th><th>Step Duration (sec)</th><th>Correct Step Duration (sec)</th><th>Error Step Duration (sec)</th><th>Correct First Attempt</th><th>Incorrects</th><th>Hints</th><th>Corrects</th></tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr><th>count</th><td>2.270384e+06</td><td>2.270384e+06</td><td>2.267551e+06</td><td>1.751638e+06</td><td>515913.000000</td><td>2.270384e+06</td><td>2.270384e+06</td><td>2.270384e+06</td><td>2.270384e+06</td></tr>\n",
+ "    <tr><th>mean</th><td>1.513120e+06</td><td>1.092910e+00</td><td>1.958364e+01</td><td>1.171716e+01</td><td>46.292087</td><td>7.722359e-01</td><td>4.455044e-01</td><td>1.184311e-01</td><td>1.062878e+00</td></tr>\n",
+ "    <tr><th>std</th><td>8.736198e+05</td><td>3.448857e-01</td><td>4.768345e+01</td><td>2.645318e+01</td><td>81.817794</td><td>4.193897e-01</td><td>2.000914e+00</td><td>6.199071e-01</td><td>6.894285e-01</td></tr>\n",
+ "    <tr><th>min</th><td>1.000000e+00</td><td>1.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td><td>0.000000</td><td>0.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td></tr>\n",
+ "    <tr><th>25%</th><td>7.577408e+05</td><td>1.000000e+00</td><td>3.000000e+00</td><td>3.000000e+00</td><td>11.000000</td><td>1.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td><td>1.000000e+00</td></tr>\n",
+ "    <tr><th>50%</th><td>1.511844e+06</td><td>1.000000e+00</td><td>7.000000e+00</td><td>5.000000e+00</td><td>22.000000</td><td>1.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td><td>1.000000e+00</td></tr>\n",
+ "    <tr><th>75%</th><td>2.269432e+06</td><td>1.000000e+00</td><td>1.700000e+01</td><td>1.100000e+01</td><td>47.000000</td><td>1.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td><td>1.000000e+00</td></tr>\n",
+ "    <tr><th>max</th><td>3.025933e+06</td><td>1.000000e+01</td><td>3.208000e+03</td><td>1.204000e+03</td><td>3208.000000</td><td>1.000000e+00</td><td>3.600000e+02</td><td>1.020000e+02</td><td>9.200000e+01</td></tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " Row Problem View Step Duration (sec) \\\n",
+ "count 2.270384e+06 2.270384e+06 2.267551e+06 \n",
+ "mean 1.513120e+06 1.092910e+00 1.958364e+01 \n",
+ "std 8.736198e+05 3.448857e-01 4.768345e+01 \n",
+ "min 1.000000e+00 1.000000e+00 0.000000e+00 \n",
+ "25% 7.577408e+05 1.000000e+00 3.000000e+00 \n",
+ "50% 1.511844e+06 1.000000e+00 7.000000e+00 \n",
+ "75% 2.269432e+06 1.000000e+00 1.700000e+01 \n",
+ "max 3.025933e+06 1.000000e+01 3.208000e+03 \n",
+ "\n",
+ " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
+ "count 1.751638e+06 515913.000000 \n",
+ "mean 1.171716e+01 46.292087 \n",
+ "std 2.645318e+01 81.817794 \n",
+ "min 0.000000e+00 0.000000 \n",
+ "25% 3.000000e+00 11.000000 \n",
+ "50% 5.000000e+00 22.000000 \n",
+ "75% 1.100000e+01 47.000000 \n",
+ "max 1.204000e+03 3208.000000 \n",
+ "\n",
+ " Correct First Attempt Incorrects Hints Corrects \n",
+ "count 2.270384e+06 2.270384e+06 2.270384e+06 2.270384e+06 \n",
+ "mean 7.722359e-01 4.455044e-01 1.184311e-01 1.062878e+00 \n",
+ "std 4.193897e-01 2.000914e+00 6.199071e-01 6.894285e-01 \n",
+ "min 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 \n",
+ "25% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "50% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "75% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "max 1.000000e+00 3.600000e+02 1.020000e+02 9.200000e+01 "
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "data.describe()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "92cc0aab",
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Fraction of missing values in each column\n",
+ "Row 0.000000\n",
+ "Anon Student Id 0.000000\n",
+ "Problem Hierarchy 0.000000\n",
+ "Problem Name 0.000000\n",
+ "Problem View 0.000000\n",
+ "Step Name 0.000000\n",
+ "Step Start Time 0.001103\n",
+ "First Transaction Time 0.000000\n",
+ "Correct Transaction Time 0.034757\n",
+ "Step End Time 0.000000\n",
+ "Step Duration (sec) 0.001248\n",
+ "Correct Step Duration (sec) 0.228484\n",
+ "Error Step Duration (sec) 0.772764\n",
+ "Correct First Attempt 0.000000\n",
+ "Incorrects 0.000000\n",
+ "Hints 0.000000\n",
+ "Corrects 0.000000\n",
+ "KC(Default) 0.203407\n",
+ "Opportunity(Default) 0.203407\n",
+ "dtype: float64\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"Fraction of missing values in each column\")\n",
+ "print(data.isnull().sum() / len(data))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "0187b3b5",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "the number of records:\n",
+ "2270384\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"the number of records:\")\n",
+ "print(len(data))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "701b6633",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "how many students are there in the table:\n",
+ "1338\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"how many students are there in the table:\")\n",
+ "print(data['Anon Student Id'].nunique())"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "bf7b246f",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "how many problems are there in the table:\n",
+ "91913\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"how many problems are there in the table:\")\n",
+ "print(data['Problem Name'].nunique())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e0602c47",
+ "metadata": {},
+ "source": [
+ "## Sort by Anon Student Id"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "id": "8051cc2b",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart, 'Top 40 students by number of steps they have done'; x-axis: count (0 to 8000), y-axis: Anon Student Id]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "ds = data['Anon Student Id'].value_counts().reset_index()\n",
+ "ds.columns = [\n",
+ "    'Anon Student Id',\n",
+ "    'count'\n",
+ "]\n",
+ "ds['Anon Student Id'] = ds['Anon Student Id'].astype(str) + '-'\n",
+ "ds = ds.sort_values('count').tail(40)\n",
+ "\n",
+ "fig = px.bar(\n",
+ "    ds,\n",
+ "    x='count',\n",
+ "    y='Anon Student Id',\n",
+ "    orientation='h',\n",
+ "    title='Top 40 students by number of steps they have done'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "66ef53ad",
+ "metadata": {},
+ "source": [
+ "## Percent of corrects, hints and incorrects"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "id": "c8f1539c",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: pie chart, 'Percent of corrects, hints and incorrects'; corrects 65.3%, incorrects 27.4%, hints 7.28%]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Total number of transactions of each type over all steps\n",
+ "count_corrects = data['Corrects'].sum()\n",
+ "count_hints = data['Hints'].sum()\n",
+ "count_incorrects = data['Incorrects'].sum()\n",
+ "\n",
+ "total = count_corrects + count_hints + count_incorrects\n",
+ "\n",
+ "# Share of each transaction type in all transactions\n",
+ "percent_corrects = count_corrects / total\n",
+ "percent_hints = count_hints / total\n",
+ "percent_incorrects = count_incorrects / total\n",
+ "\n",
+ "dfl = [['corrects', percent_corrects], ['hints', percent_hints], ['incorrects', percent_incorrects]]\n",
+ "\n",
+ "df = pd.DataFrame(dfl, columns=['transaction type', 'percent'])\n",
+ "\n",
+ "fig = px.pie(\n",
+ "    df,\n",
+ "    names='transaction type',\n",
+ "    values='percent',\n",
+ "    title='Percent of corrects, hints and incorrects'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3b097141",
+ "metadata": {},
+ "source": [
+ "## Sort by Problem Name"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "id": "6d668c43",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart, 'Top 40 useful problem'; x-axis: count (0 to 1000), y-axis: Problem Name]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Count how often each problem was worked on. Rows are scanned in file order;\n",
+ "# a new occurrence starts when the problem name changes or when a step name\n",
+ "# repeats within the same problem.\n",
+ "problemNames = data['Problem Name'].tolist()\n",
+ "stepNames = data['Step Name'].tolist()\n",
+ "\n",
+ "problemCount = {problemNames[0]: 1}  # problem name -> number of occurrences\n",
+ "currentStepName = {stepNames[0]}     # steps seen in the current occurrence\n",
+ "\n",
+ "for i in range(1, len(problemNames)):\n",
+ "    pbNameI = problemNames[i]\n",
+ "    stNameI = stepNames[i]\n",
+ "    if pbNameI != problemNames[i - 1]:\n",
+ "        # A different problem begins.\n",
+ "        currentStepName = {stNameI}\n",
+ "        problemCount[pbNameI] = problemCount.get(pbNameI, 0) + 1\n",
+ "    elif stNameI not in currentStepName:\n",
+ "        # Same problem, new step: still the same occurrence.\n",
+ "        currentStepName.add(stNameI)\n",
+ "    else:\n",
+ "        # A repeated step means the same problem was started again.\n",
+ "        currentStepName = {stNameI}\n",
+ "        problemCount[pbNameI] += 1\n",
+ "\n",
+ "dfData = {\n",
+ "    'Problem Name': list(problemCount.keys()),\n",
+ "    'count': list(problemCount.values())\n",
+ "}\n",
+ "df = pd.DataFrame(dfData).sort_values('count').tail(40)\n",
+ "df[\"Problem Name\"] += '-'\n",
+ "\n",
+ "fig = px.bar(\n",
+ "    df,\n",
+ "    x='count',\n",
+ "    y='Problem Name',\n",
+ "    orientation='h',\n",
+ "    title='Top 40 useful problem'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "id": "1b965aa4",
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ " \n",
+ " "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "application/vnd.plotly.v1+json": {
+ "config": {
+ "plotlyServerURL": "https://plot.ly"
+ },
+ "data": [
+ {
+ "alignmentgroup": "True",
+ "hovertemplate": "Correct rate=%{x} Problem Name=%{text} ",
+ "legendgroup": "",
+ "marker": {
+ "color": "#636efa",
+ "pattern": {
+ "shape": ""
+ }
+ },
+ "name": "",
+ "offsetgroup": "",
+ "orientation": "h",
+ "showlegend": false,
+ "text": [
+ "SYLT-2X&YGE-2X+9-",
+ "ROOTS1-001-",
+ "SY=2X&Y=-3X+5-",
+ "PROBABILITY1-006-",
+ "EG-RE-DISTRIB-05 (2x+4)/(3x+6)-",
+ "PROBABILITY1-070-",
+ "G3X-YLE5&3X-YGE15-",
+ "BUSES-",
+ "PEANUTS-CASHEWS-",
+ "EXPONENT2-012-",
+ "TVS3-",
+ "PROBABILITY5-001-",
+ "EXPONENT3-071-",
+ "EXPONENT2-046-",
+ "EXPONENT5-001-",
+ "PROBABILITY2-001-",
+ "GLFM-BUSES-",
+ "EG-EPS01-CONSTANT (4(x^2)*y^4)(3(x^3)*y^3)-",
+ "PROBABILITY6-002-",
+ "PROBABILITY6-001-"
+ ],
+ "textposition": "auto",
+ "type": "bar",
+ "x": [
+ 0.84,
+ 0.8442477876106195,
+ 0.848,
+ 0.8492462311557789,
+ 0.8505096262740657,
+ 0.8557142857142858,
+ 0.8559622195985832,
+ 0.8581843429960077,
+ 0.8608278344331134,
+ 0.8679504814305364,
+ 0.86878612716763,
+ 0.878392305049811,
+ 0.8799212598425197,
+ 0.8896797153024911,
+ 0.8966360856269113,
+ 0.9195046439628483,
+ 0.924791086350975,
+ 0.934416715031921,
+ 0.9473684210526315,
+ 0.9579413392363033
+ ],
+ "xaxis": "x",
+ "y": [
+ "SYLT-2X&YGE-2X+9-",
+ "ROOTS1-001-",
+ "SY=2X&Y=-3X+5-",
+ "PROBABILITY1-006-",
+ "EG-RE-DISTRIB-05 (2x+4)/(3x+6)-",
+ "PROBABILITY1-070-",
+ "G3X-YLE5&3X-YGE15-",
+ "BUSES-",
+ "PEANUTS-CASHEWS-",
+ "EXPONENT2-012-",
+ "TVS3-",
+ "PROBABILITY5-001-",
+ "EXPONENT3-071-",
+ "EXPONENT2-046-",
+ "EXPONENT5-001-",
+ "PROBABILITY2-001-",
+ "GLFM-BUSES-",
+ "EG-EPS01-CONSTANT (4(x^2)*y^4)(3(x^3)*y^3)-",
+ "PROBABILITY6-002-",
+ "PROBABILITY6-001-"
+ ],
+ "yaxis": "y"
+ }
+ ],
+ "layout": {
+ "barmode": "relative",
+ "legend": {
+ "tracegroupgap": 0
+ },
+ "template": {
+ "data": {
+ "bar": [
+ {
+ "error_x": {
+ "color": "#2a3f5f"
+ },
+ "error_y": {
+ "color": "#2a3f5f"
+ },
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "bar"
+ }
+ ],
+ "barpolar": [
+ {
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "barpolar"
+ }
+ ],
+ "carpet": [
+ {
+ "aaxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "baxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "type": "carpet"
+ }
+ ],
+ "choropleth": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "choropleth"
+ }
+ ],
+ "contour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "contour"
+ }
+ ],
+ "contourcarpet": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "contourcarpet"
+ }
+ ],
+ "heatmap": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmap"
+ }
+ ],
+ "heatmapgl": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmapgl"
+ }
+ ],
+ "histogram": [
+ {
+ "marker": {
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "histogram"
+ }
+ ],
+ "histogram2d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2d"
+ }
+ ],
+ "histogram2dcontour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2dcontour"
+ }
+ ],
+ "mesh3d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "mesh3d"
+ }
+ ],
+ "parcoords": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "parcoords"
+ }
+ ],
+ "pie": [
+ {
+ "automargin": true,
+ "type": "pie"
+ }
+ ],
+ "scatter": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter"
+ }
+ ],
+ "scatter3d": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter3d"
+ }
+ ],
+ "scattercarpet": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattercarpet"
+ }
+ ],
+ "scattergeo": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergeo"
+ }
+ ],
+ "scattergl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergl"
+ }
+ ],
+ "scattermapbox": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattermapbox"
+ }
+ ],
+ "scatterpolar": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolar"
+ }
+ ],
+ "scatterpolargl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolargl"
+ }
+ ],
+ "scatterternary": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterternary"
+ }
+ ],
+ "surface": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "surface"
+ }
+ ],
+ "table": [
+ {
+ "cells": {
+ "fill": {
+ "color": "#EBF0F8"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "header": {
+ "fill": {
+ "color": "#C8D4E3"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "type": "table"
+ }
+ ]
+ },
+ "layout": {
+ "annotationdefaults": {
+ "arrowcolor": "#2a3f5f",
+ "arrowhead": 0,
+ "arrowwidth": 1
+ },
+ "autotypenumbers": "strict",
+ "coloraxis": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "colorscale": {
+ "diverging": [
+ [
+ 0,
+ "#8e0152"
+ ],
+ [
+ 0.1,
+ "#c51b7d"
+ ],
+ [
+ 0.2,
+ "#de77ae"
+ ],
+ [
+ 0.3,
+ "#f1b6da"
+ ],
+ [
+ 0.4,
+ "#fde0ef"
+ ],
+ [
+ 0.5,
+ "#f7f7f7"
+ ],
+ [
+ 0.6,
+ "#e6f5d0"
+ ],
+ [
+ 0.7,
+ "#b8e186"
+ ],
+ [
+ 0.8,
+ "#7fbc41"
+ ],
+ [
+ 0.9,
+ "#4d9221"
+ ],
+ [
+ 1,
+ "#276419"
+ ]
+ ],
+ "sequential": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "sequentialminus": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ]
+ },
+ "colorway": [
+ "#636efa",
+ "#EF553B",
+ "#00cc96",
+ "#ab63fa",
+ "#FFA15A",
+ "#19d3f3",
+ "#FF6692",
+ "#B6E880",
+ "#FF97FF",
+ "#FECB52"
+ ],
+ "font": {
+ "color": "#2a3f5f"
+ },
+ "geo": {
+ "bgcolor": "white",
+ "lakecolor": "white",
+ "landcolor": "#E5ECF6",
+ "showlakes": true,
+ "showland": true,
+ "subunitcolor": "white"
+ },
+ "hoverlabel": {
+ "align": "left"
+ },
+ "hovermode": "closest",
+ "mapbox": {
+ "style": "light"
+ },
+ "paper_bgcolor": "white",
+ "plot_bgcolor": "#E5ECF6",
+ "polar": {
+ "angularaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "radialaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "scene": {
+ "xaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "yaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "zaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ }
+ },
+ "shapedefaults": {
+ "line": {
+ "color": "#2a3f5f"
+ }
+ },
+ "ternary": {
+ "aaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "baxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "caxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "title": {
+ "x": 0.05
+ },
+ "xaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ },
+ "yaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ }
+ }
+ },
+ "title": {
+ "text": "Correct rate of each problem (top 20) (total transactions of each problem are required to be more than 500)"
+ },
+ "xaxis": {
+ "anchor": "y",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "Correct rate"
+ }
+ },
+ "yaxis": {
+ "anchor": "x",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "Problem Name"
+ }
+ }
+ }
+ },
+ "text/html": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "application/vnd.plotly.v1+json": {
+ "config": {
+ "plotlyServerURL": "https://plot.ly"
+ },
+ "data": [
+ {
+ "alignmentgroup": "True",
+ "hovertemplate": "Correct rate=%{x} Problem Name=%{text} ",
+ "legendgroup": "",
+ "marker": {
+ "color": "#636efa",
+ "pattern": {
+ "shape": ""
+ }
+ },
+ "name": "",
+ "offsetgroup": "",
+ "orientation": "h",
+ "showlegend": false,
+ "text": [
+ "RATIONAL1-091-",
+ "RATIONAL1-165-",
+ "RATIONAL1-034-",
+ "RATIONAL1-058-",
+ "RATIONAL1-075-",
+ "RATIONAL1-035-",
+ "RATIONAL1-281-",
+ "RATIONAL1-261-",
+ "RATIONAL1-177-",
+ "RATIONAL1-064-",
+ "RATIONAL1-147-",
+ "RATIONAL1-021-",
+ "RATIONAL1-121-",
+ "RATIONAL1-008-",
+ "RATIONAL1-288-",
+ "RATIONAL1-109-",
+ "BH1T31B-",
+ "RXMX_3C-",
+ "EG-RE-CANCEL-04 ((x-9)^3)/(x-9)^4-",
+ "RATIONAL2-205-"
+ ],
+ "textposition": "auto",
+ "type": "bar",
+ "x": [
+ 0.07532467532467532,
+ 0.07955596669750231,
+ 0.0807799442896936,
+ 0.0975609756097561,
+ 0.09904153354632587,
+ 0.1016566265060241,
+ 0.1198501872659176,
+ 0.1217564870259481,
+ 0.1288888888888889,
+ 0.13678756476683937,
+ 0.14023732470334413,
+ 0.1404303510758777,
+ 0.14186851211072665,
+ 0.16796875,
+ 0.17764471057884232,
+ 0.24912280701754386,
+ 0.3157172271791352,
+ 0.329510366122629,
+ 0.3380703066566941,
+ 0.3424479166666667
+ ],
+ "xaxis": "x",
+ "y": [
+ "RATIONAL1-091-",
+ "RATIONAL1-165-",
+ "RATIONAL1-034-",
+ "RATIONAL1-058-",
+ "RATIONAL1-075-",
+ "RATIONAL1-035-",
+ "RATIONAL1-281-",
+ "RATIONAL1-261-",
+ "RATIONAL1-177-",
+ "RATIONAL1-064-",
+ "RATIONAL1-147-",
+ "RATIONAL1-021-",
+ "RATIONAL1-121-",
+ "RATIONAL1-008-",
+ "RATIONAL1-288-",
+ "RATIONAL1-109-",
+ "BH1T31B-",
+ "RXMX_3C-",
+ "EG-RE-CANCEL-04 ((x-9)^3)/(x-9)^4-",
+ "RATIONAL2-205-"
+ ],
+ "yaxis": "y"
+ }
+ ],
+ "layout": {
+ "barmode": "relative",
+ "legend": {
+ "tracegroupgap": 0
+ },
+ "template": {
+ "data": {
+ "bar": [
+ {
+ "error_x": {
+ "color": "#2a3f5f"
+ },
+ "error_y": {
+ "color": "#2a3f5f"
+ },
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "bar"
+ }
+ ],
+ "barpolar": [
+ {
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "barpolar"
+ }
+ ],
+ "carpet": [
+ {
+ "aaxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "baxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "type": "carpet"
+ }
+ ],
+ "choropleth": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "choropleth"
+ }
+ ],
+ "contour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "contour"
+ }
+ ],
+ "contourcarpet": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "contourcarpet"
+ }
+ ],
+ "heatmap": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmap"
+ }
+ ],
+ "heatmapgl": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmapgl"
+ }
+ ],
+ "histogram": [
+ {
+ "marker": {
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "histogram"
+ }
+ ],
+ "histogram2d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2d"
+ }
+ ],
+ "histogram2dcontour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2dcontour"
+ }
+ ],
+ "mesh3d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "mesh3d"
+ }
+ ],
+ "parcoords": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "parcoords"
+ }
+ ],
+ "pie": [
+ {
+ "automargin": true,
+ "type": "pie"
+ }
+ ],
+ "scatter": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter"
+ }
+ ],
+ "scatter3d": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter3d"
+ }
+ ],
+ "scattercarpet": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattercarpet"
+ }
+ ],
+ "scattergeo": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergeo"
+ }
+ ],
+ "scattergl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergl"
+ }
+ ],
+ "scattermapbox": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattermapbox"
+ }
+ ],
+ "scatterpolar": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolar"
+ }
+ ],
+ "scatterpolargl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolargl"
+ }
+ ],
+ "scatterternary": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterternary"
+ }
+ ],
+ "surface": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "surface"
+ }
+ ],
+ "table": [
+ {
+ "cells": {
+ "fill": {
+ "color": "#EBF0F8"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "header": {
+ "fill": {
+ "color": "#C8D4E3"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "type": "table"
+ }
+ ]
+ },
+ "layout": {
+ "annotationdefaults": {
+ "arrowcolor": "#2a3f5f",
+ "arrowhead": 0,
+ "arrowwidth": 1
+ },
+ "autotypenumbers": "strict",
+ "coloraxis": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "colorscale": {
+ "diverging": [
+ [
+ 0,
+ "#8e0152"
+ ],
+ [
+ 0.1,
+ "#c51b7d"
+ ],
+ [
+ 0.2,
+ "#de77ae"
+ ],
+ [
+ 0.3,
+ "#f1b6da"
+ ],
+ [
+ 0.4,
+ "#fde0ef"
+ ],
+ [
+ 0.5,
+ "#f7f7f7"
+ ],
+ [
+ 0.6,
+ "#e6f5d0"
+ ],
+ [
+ 0.7,
+ "#b8e186"
+ ],
+ [
+ 0.8,
+ "#7fbc41"
+ ],
+ [
+ 0.9,
+ "#4d9221"
+ ],
+ [
+ 1,
+ "#276419"
+ ]
+ ],
+ "sequential": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "sequentialminus": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ]
+ },
+ "colorway": [
+ "#636efa",
+ "#EF553B",
+ "#00cc96",
+ "#ab63fa",
+ "#FFA15A",
+ "#19d3f3",
+ "#FF6692",
+ "#B6E880",
+ "#FF97FF",
+ "#FECB52"
+ ],
+ "font": {
+ "color": "#2a3f5f"
+ },
+ "geo": {
+ "bgcolor": "white",
+ "lakecolor": "white",
+ "landcolor": "#E5ECF6",
+ "showlakes": true,
+ "showland": true,
+ "subunitcolor": "white"
+ },
+ "hoverlabel": {
+ "align": "left"
+ },
+ "hovermode": "closest",
+ "mapbox": {
+ "style": "light"
+ },
+ "paper_bgcolor": "white",
+ "plot_bgcolor": "#E5ECF6",
+ "polar": {
+ "angularaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "radialaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "scene": {
+ "xaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "yaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "zaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ }
+ },
+ "shapedefaults": {
+ "line": {
+ "color": "#2a3f5f"
+ }
+ },
+ "ternary": {
+ "aaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "baxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "caxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "title": {
+ "x": 0.05
+ },
+ "xaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ },
+ "yaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ }
+ }
+ },
+ "title": {
+ "font": {
+ "size": 15
+ },
+ "text": "Correct rate of each problem (bottom 20) (total transactions of each problem are required to be more than 500)"
+ },
+ "xaxis": {
+ "anchor": "y",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "Correct rate"
+ }
+ },
+ "yaxis": {
+ "anchor": "x",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "Problem Name"
+ }
+ }
+ }
+ },
+ "text/html": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "data['total transactions'] = data['Incorrects'] + data['Hints'] + data['Corrects']\n",
+ "df1 = data.groupby('Problem Name')['total transactions'].sum().reset_index()\n",
+ "df2 = data.groupby('Problem Name')['Corrects'].sum().reset_index()\n",
+ "df1['Corrects'] = df2['Corrects']\n",
+ "df1['Correct rate'] = df1['Corrects'] / df1['total transactions']\n",
+ "\n",
+ "df1 = df1.sort_values('total transactions')\n",
+ "count = 0\n",
+ "standard = 500\n",
+ "for i in df1['total transactions']:\n",
+ "    if i > standard:\n",
+ "        count += 1\n",
+ "df1 = df1.tail(count)\n",
+ "\n",
+ "df1 = df1.sort_values('Correct rate')\n",
+ "\n",
+ "df1['Problem Name'] = df1['Problem Name'].astype(str) + \"-\"\n",
+ "\n",
+ "df_px = df1.tail(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ "    df_px,\n",
+ "    x='Correct rate',\n",
+ "    y='Problem Name',\n",
+ "    orientation='h',\n",
+ "    title='Correct rate of each problem (top 20) (total transactions of \\\n",
+ "each problem are required to be more than 500)',\n",
+ "    text='Problem Name'\n",
+ ")\n",
+ "fig.show()\n",
+ "\n",
+ "df_px = df1.head(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ "    df_px,\n",
+ "    x='Correct rate',\n",
+ "    y='Problem Name',\n",
+ "    orientation='h',\n",
+ "    title='Correct rate of each problem (bottom 20) (total transactions of \\\n",
+ "each problem are required to be more than 500)',\n",
+ "    text='Problem Name'\n",
+ ")\n",
+ "fig.update_layout(title_font_size=15)\n",
+ "fig.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2bee6e04",
+ "metadata": {},
+ "source": [
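+ "These two figures show the correct rate of each problem with more than 500 total transactions: the 20 highest rates first, then the 20 lowest. The correct rate is the number of correct attempts divided by the total number of transactions (incorrect attempts, hints, and correct attempts). Problems with a low correct rate deserve more attention from teachers and students."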
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0047e839",
+ "metadata": {},
+ "source": [
+ "## Sort by KC"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "c6e910e0",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.plotly.v1+json": {
+ "config": {
+ "plotlyServerURL": "https://plot.ly"
+ },
+ "data": [
+ {
+ "alignmentgroup": "True",
+ "hovertemplate": "correct rate=%{x} KC(Default)=%{text} ",
+ "legendgroup": "",
+ "marker": {
+ "color": "#636efa",
+ "pattern": {
+ "shape": ""
+ }
+ },
+ "name": "",
+ "offsetgroup": "",
+ "orientation": "h",
+ "showlegend": false,
+ "text": [
+ "[Rule: RF right ([SolverOperation rf],{components with property canReduceFractionsNoMultWhole of right})]-",
+ "Enter second extreme in equation-",
+ "Proportional-Constant-Expression-Gla:Student-Modeling-Analysis-",
+ "Enter Calculated value of rate-",
+ "[SkillRule: Combine like terms, no var; CLT]-",
+ "Enter square of leg label-",
+ "Compare medians - removed outlier-",
+ "Enter number of total outcomes in table-",
+ "Find square of given leg-",
+ "Enter ratio quantity to right of \"to\"-",
+ "Enter fractional probability of event-",
+ "[SkillRule: Select Multiply; {MT; MT no fraction coeff}]-",
+ "Write base of exponential from given whole number as product-",
+ "Write decimal multiplier from given scientific notation-",
+ "Enter ratio quantity to right of colon-",
+ "PM-ROW-1-",
+ "Changing axis intervals-",
+ "unspecified-",
+ "[Rule: unnec-elems ([SolverOperation unnec-elems],)]-",
+ "Select second event-"
+ ],
+ "textposition": "auto",
+ "type": "bar",
+ "x": [
+ 0.9094659771050032,
+ 0.9125055236411843,
+ 0.9135446685878963,
+ 0.9155947136563877,
+ 0.9156931229676021,
+ 0.9194915254237288,
+ 0.9218241042345277,
+ 0.9247135842880524,
+ 0.9323943661971831,
+ 0.9386369807222373,
+ 0.9389944134078212,
+ 0.9399566697616837,
+ 0.9405666897028334,
+ 0.940814757878555,
+ 0.9422934648581998,
+ 0.9431375603676585,
+ 0.9706148701992291,
+ 0.9769823066841415,
+ 0.9838308457711443,
+ 0.987603305785124
+ ],
+ "xaxis": "x",
+ "y": [
+ "[Rule: RF right ([SolverOperation rf],{components with property canReduceFractionsNoMultWhole of right})]-",
+ "Enter second extreme in equation-",
+ "Proportional-Constant-Expression-Gla:Student-Modeling-Analysis-",
+ "Enter Calculated value of rate-",
+ "[SkillRule: Combine like terms, no var; CLT]-",
+ "Enter square of leg label-",
+ "Compare medians - removed outlier-",
+ "Enter number of total outcomes in table-",
+ "Find square of given leg-",
+ "Enter ratio quantity to right of \"to\"-",
+ "Enter fractional probability of event-",
+ "[SkillRule: Select Multiply; {MT; MT no fraction coeff}]-",
+ "Write base of exponential from given whole number as product-",
+ "Write decimal multiplier from given scientific notation-",
+ "Enter ratio quantity to right of colon-",
+ "PM-ROW-1-",
+ "Changing axis intervals-",
+ "unspecified-",
+ "[Rule: unnec-elems ([SolverOperation unnec-elems],)]-",
+ "Select second event-"
+ ],
+ "yaxis": "y"
+ }
+ ],
+ "layout": {
+ "barmode": "relative",
+ "legend": {
+ "tracegroupgap": 0
+ },
+ "template": {
+ "data": {
+ "bar": [
+ {
+ "error_x": {
+ "color": "#2a3f5f"
+ },
+ "error_y": {
+ "color": "#2a3f5f"
+ },
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "bar"
+ }
+ ],
+ "barpolar": [
+ {
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "barpolar"
+ }
+ ],
+ "carpet": [
+ {
+ "aaxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "baxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "type": "carpet"
+ }
+ ],
+ "choropleth": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "choropleth"
+ }
+ ],
+ "contour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "contour"
+ }
+ ],
+ "contourcarpet": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "contourcarpet"
+ }
+ ],
+ "heatmap": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmap"
+ }
+ ],
+ "heatmapgl": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmapgl"
+ }
+ ],
+ "histogram": [
+ {
+ "marker": {
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "histogram"
+ }
+ ],
+ "histogram2d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2d"
+ }
+ ],
+ "histogram2dcontour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2dcontour"
+ }
+ ],
+ "mesh3d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "mesh3d"
+ }
+ ],
+ "parcoords": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "parcoords"
+ }
+ ],
+ "pie": [
+ {
+ "automargin": true,
+ "type": "pie"
+ }
+ ],
+ "scatter": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter"
+ }
+ ],
+ "scatter3d": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter3d"
+ }
+ ],
+ "scattercarpet": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattercarpet"
+ }
+ ],
+ "scattergeo": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergeo"
+ }
+ ],
+ "scattergl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergl"
+ }
+ ],
+ "scattermapbox": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattermapbox"
+ }
+ ],
+ "scatterpolar": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolar"
+ }
+ ],
+ "scatterpolargl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolargl"
+ }
+ ],
+ "scatterternary": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterternary"
+ }
+ ],
+ "surface": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "surface"
+ }
+ ],
+ "table": [
+ {
+ "cells": {
+ "fill": {
+ "color": "#EBF0F8"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "header": {
+ "fill": {
+ "color": "#C8D4E3"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "type": "table"
+ }
+ ]
+ },
+ "layout": {
+ "annotationdefaults": {
+ "arrowcolor": "#2a3f5f",
+ "arrowhead": 0,
+ "arrowwidth": 1
+ },
+ "autotypenumbers": "strict",
+ "coloraxis": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "colorscale": {
+ "diverging": [
+ [
+ 0,
+ "#8e0152"
+ ],
+ [
+ 0.1,
+ "#c51b7d"
+ ],
+ [
+ 0.2,
+ "#de77ae"
+ ],
+ [
+ 0.3,
+ "#f1b6da"
+ ],
+ [
+ 0.4,
+ "#fde0ef"
+ ],
+ [
+ 0.5,
+ "#f7f7f7"
+ ],
+ [
+ 0.6,
+ "#e6f5d0"
+ ],
+ [
+ 0.7,
+ "#b8e186"
+ ],
+ [
+ 0.8,
+ "#7fbc41"
+ ],
+ [
+ 0.9,
+ "#4d9221"
+ ],
+ [
+ 1,
+ "#276419"
+ ]
+ ],
+ "sequential": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "sequentialminus": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ]
+ },
+ "colorway": [
+ "#636efa",
+ "#EF553B",
+ "#00cc96",
+ "#ab63fa",
+ "#FFA15A",
+ "#19d3f3",
+ "#FF6692",
+ "#B6E880",
+ "#FF97FF",
+ "#FECB52"
+ ],
+ "font": {
+ "color": "#2a3f5f"
+ },
+ "geo": {
+ "bgcolor": "white",
+ "lakecolor": "white",
+ "landcolor": "#E5ECF6",
+ "showlakes": true,
+ "showland": true,
+ "subunitcolor": "white"
+ },
+ "hoverlabel": {
+ "align": "left"
+ },
+ "hovermode": "closest",
+ "mapbox": {
+ "style": "light"
+ },
+ "paper_bgcolor": "white",
+ "plot_bgcolor": "#E5ECF6",
+ "polar": {
+ "angularaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "radialaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "scene": {
+ "xaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "yaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "zaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ }
+ },
+ "shapedefaults": {
+ "line": {
+ "color": "#2a3f5f"
+ }
+ },
+ "ternary": {
+ "aaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "baxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "caxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "title": {
+ "x": 0.05
+ },
+ "xaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ },
+ "yaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ }
+ }
+ },
+ "title": {
+ "text": "Correct rate of each KC(Default) (top 20) (total transactions of each KC are required to be more than 300)"
+ },
+ "xaxis": {
+ "anchor": "y",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "correct rate"
+ }
+ },
+ "yaxis": {
+ "anchor": "x",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "KC(Default)"
+ },
+ "visible": false
+ }
+ }
+ },
+ "text/html": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "application/vnd.plotly.v1+json": {
+ "config": {
+ "plotlyServerURL": "https://plot.ly"
+ },
+ "data": [
+ {
+ "alignmentgroup": "True",
+ "hovertemplate": "correct rate=%{x} KC(Default)=%{text} ",
+ "legendgroup": "",
+ "marker": {
+ "color": "#636efa",
+ "pattern": {
+ "shape": ""
+ }
+ },
+ "name": "",
+ "offsetgroup": "",
+ "orientation": "h",
+ "showlegend": false,
+ "text": [
+ "unknown bug element-",
+ "[SkillRule: Done?; {doneleft; doneright; doneleft, no menu; doneright, nomenu; done no solution; Done No Solution, domain exception; Done No Solution, range exception; done infinite solutions; done expr, standard form; done expr, standard form, no menu}]-",
+ "unknown bug element~~CLT-ROW-1-COEFF-",
+ "unknown bug element~~CLT-ROW-1-",
+ "Plot terminating improper fractions-",
+ "Plot decimal - thousandths-",
+ "Plot terminating mixed number-",
+ "Plot imperfect radical-",
+ "Plot decimal - hundredths-",
+ "Setting the slope-",
+ "Plot terminating proper fraction-",
+ "Plot percent-",
+ "Plot non-terminating proper fraction-",
+ "Plot non-terminating improper fraction-",
+ "Entering slope, GLF-",
+ "Placing coordinate point-",
+ "Plot decimal - tenths-",
+ "Finding the intersection, SIF-",
+ "Finding the intersection, Mixed-",
+ "[Rule: CLT nested, no LCD ([SolverOperation clt],)]-"
+ ],
+ "textposition": "auto",
+ "type": "bar",
+ "x": [
+ 0,
+ 0,
+ 0.014072847682119206,
+ 0.014477766287487074,
+ 0.14357160282995854,
+ 0.19899605593402653,
+ 0.20798727690404664,
+ 0.2222222222222222,
+ 0.22783707253321717,
+ 0.2335486778846154,
+ 0.24001726742931145,
+ 0.2440093512565751,
+ 0.26698670605613,
+ 0.29059485530546625,
+ 0.3,
+ 0.3083741984156922,
+ 0.3113859585303747,
+ 0.3129411764705882,
+ 0.31905781584582443,
+ 0.3395784543325527
+ ],
+ "xaxis": "x",
+ "y": [
+ "unknown bug element-",
+ "[SkillRule: Done?; {doneleft; doneright; doneleft, no menu; doneright, nomenu; done no solution; Done No Solution, domain exception; Done No Solution, range exception; done infinite solutions; done expr, standard form; done expr, standard form, no menu}]-",
+ "unknown bug element~~CLT-ROW-1-COEFF-",
+ "unknown bug element~~CLT-ROW-1-",
+ "Plot terminating improper fractions-",
+ "Plot decimal - thousandths-",
+ "Plot terminating mixed number-",
+ "Plot imperfect radical-",
+ "Plot decimal - hundredths-",
+ "Setting the slope-",
+ "Plot terminating proper fraction-",
+ "Plot percent-",
+ "Plot non-terminating proper fraction-",
+ "Plot non-terminating improper fraction-",
+ "Entering slope, GLF-",
+ "Placing coordinate point-",
+ "Plot decimal - tenths-",
+ "Finding the intersection, SIF-",
+ "Finding the intersection, Mixed-",
+ "[Rule: CLT nested, no LCD ([SolverOperation clt],)]-"
+ ],
+ "yaxis": "y"
+ }
+ ],
+ "layout": {
+ "barmode": "relative",
+ "legend": {
+ "tracegroupgap": 0
+ },
+ "template": {
+ "data": {
+ "bar": [
+ {
+ "error_x": {
+ "color": "#2a3f5f"
+ },
+ "error_y": {
+ "color": "#2a3f5f"
+ },
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "bar"
+ }
+ ],
+ "barpolar": [
+ {
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "barpolar"
+ }
+ ],
+ "carpet": [
+ {
+ "aaxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "baxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "type": "carpet"
+ }
+ ],
+ "choropleth": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "choropleth"
+ }
+ ],
+ "contour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "contour"
+ }
+ ],
+ "contourcarpet": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "contourcarpet"
+ }
+ ],
+ "heatmap": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmap"
+ }
+ ],
+ "heatmapgl": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmapgl"
+ }
+ ],
+ "histogram": [
+ {
+ "marker": {
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "histogram"
+ }
+ ],
+ "histogram2d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2d"
+ }
+ ],
+ "histogram2dcontour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2dcontour"
+ }
+ ],
+ "mesh3d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "mesh3d"
+ }
+ ],
+ "parcoords": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "parcoords"
+ }
+ ],
+ "pie": [
+ {
+ "automargin": true,
+ "type": "pie"
+ }
+ ],
+ "scatter": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter"
+ }
+ ],
+ "scatter3d": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter3d"
+ }
+ ],
+ "scattercarpet": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattercarpet"
+ }
+ ],
+ "scattergeo": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergeo"
+ }
+ ],
+ "scattergl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergl"
+ }
+ ],
+ "scattermapbox": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattermapbox"
+ }
+ ],
+ "scatterpolar": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolar"
+ }
+ ],
+ "scatterpolargl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolargl"
+ }
+ ],
+ "scatterternary": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterternary"
+ }
+ ],
+ "surface": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "surface"
+ }
+ ],
+ "table": [
+ {
+ "cells": {
+ "fill": {
+ "color": "#EBF0F8"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "header": {
+ "fill": {
+ "color": "#C8D4E3"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "type": "table"
+ }
+ ]
+ },
+ "layout": {
+ "annotationdefaults": {
+ "arrowcolor": "#2a3f5f",
+ "arrowhead": 0,
+ "arrowwidth": 1
+ },
+ "autotypenumbers": "strict",
+ "coloraxis": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "colorscale": {
+ "diverging": [
+ [
+ 0,
+ "#8e0152"
+ ],
+ [
+ 0.1,
+ "#c51b7d"
+ ],
+ [
+ 0.2,
+ "#de77ae"
+ ],
+ [
+ 0.3,
+ "#f1b6da"
+ ],
+ [
+ 0.4,
+ "#fde0ef"
+ ],
+ [
+ 0.5,
+ "#f7f7f7"
+ ],
+ [
+ 0.6,
+ "#e6f5d0"
+ ],
+ [
+ 0.7,
+ "#b8e186"
+ ],
+ [
+ 0.8,
+ "#7fbc41"
+ ],
+ [
+ 0.9,
+ "#4d9221"
+ ],
+ [
+ 1,
+ "#276419"
+ ]
+ ],
+ "sequential": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "sequentialminus": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ]
+ },
+ "colorway": [
+ "#636efa",
+ "#EF553B",
+ "#00cc96",
+ "#ab63fa",
+ "#FFA15A",
+ "#19d3f3",
+ "#FF6692",
+ "#B6E880",
+ "#FF97FF",
+ "#FECB52"
+ ],
+ "font": {
+ "color": "#2a3f5f"
+ },
+ "geo": {
+ "bgcolor": "white",
+ "lakecolor": "white",
+ "landcolor": "#E5ECF6",
+ "showlakes": true,
+ "showland": true,
+ "subunitcolor": "white"
+ },
+ "hoverlabel": {
+ "align": "left"
+ },
+ "hovermode": "closest",
+ "mapbox": {
+ "style": "light"
+ },
+ "paper_bgcolor": "white",
+ "plot_bgcolor": "#E5ECF6",
+ "polar": {
+ "angularaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "radialaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "scene": {
+ "xaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "yaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "zaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ }
+ },
+ "shapedefaults": {
+ "line": {
+ "color": "#2a3f5f"
+ }
+ },
+ "ternary": {
+ "aaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "baxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "caxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "title": {
+ "x": 0.05
+ },
+ "xaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ },
+ "yaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ }
+ }
+ },
+ "title": {
+ "font": {
+ "size": 15
+ },
+ "text": "Correct rate of each KC(Default) (bottom 20) (total transactions of each KC are required to be more than 300)"
+ },
+ "xaxis": {
+ "anchor": "y",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "correct rate"
+ }
+ },
+ "yaxis": {
+ "anchor": "x",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "KC(Default)"
+ },
+ "visible": false
+ }
+ }
+ },
+ "text/html": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "data.dropna(subset=['KC(Default)'], inplace=True)\n",
+ "\n",
+ "data['total transactions'] = data['Corrects'] + data['Hints'] + data['Incorrects']\n",
+ "df1 = data.groupby('KC(Default)')['total transactions'].sum().reset_index()\n",
+ "df2 = data.groupby('KC(Default)')['Corrects'].sum().reset_index()\n",
+ "df1['Corrects'] = df2['Corrects']\n",
+ "df1['correct rate'] = df1['Corrects'] / df1['total transactions']\n",
+ "\n",
+ "count = 0\n",
+ "standard = 300\n",
+ "for i in df1['total transactions']:\n",
+ "    if i > standard:\n",
+ "        count += 1\n",
+ "df1 = df1.sort_values('total transactions').tail(count)\n",
+ "\n",
+ "df1 = df1.sort_values('correct rate')\n",
+ "\n",
+ "df1['KC(Default)'] = df1['KC(Default)'].astype(str) + '-'\n",
+ "\n",
+ "df_px = df1.tail(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ "    df_px,\n",
+ "    x='correct rate',\n",
+ "    y='KC(Default)',\n",
+ "    orientation='h',\n",
+ "    title='Correct rate of each KC(Default) (top 20) (total transactions of \\\n",
+ "each KC are required to be more than 300)',\n",
+ "    text='KC(Default)'\n",
+ ")\n",
+ "fig.update_yaxes(visible=False)\n",
+ "fig.show()\n",
+ "\n",
+ "df_px = df1.head(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ "    df_px,\n",
+ "    x='correct rate',\n",
+ "    y='KC(Default)',\n",
+ "    orientation='h',\n",
+ "    title='Correct rate of each KC(Default) (bottom 20) (total transactions of \\\n",
+ "each KC are required to be more than 300)',\n",
+ "    text='KC(Default)'\n",
+ ")\n",
+ "fig.update_yaxes(visible=False)\n",
+ "fig.update_layout(title_font_size=15)\n",
+ "fig.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0feef8a1",
+ "metadata": {},
+ "source": [
+ "These two figures show the correct rate of each KC, for the 20 KCs with the highest rates and the 20 with the lowest. KCs with a low correct rate deserve more attention from teachers and students."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "22d99527",
+ "metadata": {},
+ "source": [
+ "## Postscript"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "09bc0903",
+ "metadata": {},
+ "source": [
+ "The whole data package is composed of 5 data sets, and the data files in them that can be used for data analysis share the same data format. The analysis above, based on \"algebra_2006_2007_train\", is therefore just one example of data analysis on KDD Cup 2010; the same code can be used to analyse the other data files with small changes to the file path and column names.\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.9.6"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
From a536ee34ff426576880742b6c6541d591f3b014f Mon Sep 17 00:00:00 2001
From: gguu1314 <88154940+gguu1314@users.noreply.github.com>
Date: Sat, 14 Aug 2021 12:20:39 +0800
Subject: [PATCH 04/10] Delete KDD Cup 2010.ipynb
---
docs/KDD Cup 2010.ipynb | 4925 ---------------------------------------
1 file changed, 4925 deletions(-)
delete mode 100644 docs/KDD Cup 2010.ipynb
diff --git a/docs/KDD Cup 2010.ipynb b/docs/KDD Cup 2010.ipynb
deleted file mode 100644
index bd7876d..0000000
--- a/docs/KDD Cup 2010.ipynb
+++ /dev/null
@@ -1,4925 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "id": "e002fdf8",
- "metadata": {},
- "source": [
- "# KDD Cup 2010 —— Data Analysis on algebra_2006_2007_train"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "429152ff",
- "metadata": {},
- "source": [
- "## Data Description"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "2c89d116",
- "metadata": {},
- "source": [
- "### Column Description"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f590eee5",
- "metadata": {},
- "source": [
- "| Attribute | Annotaion |\n",
- "|:--:|---|\n",
- "|Row|The row number|\n",
- "| Anon Student Id | Unique, anonymous identifier for a student |\n",
- "| Problem Hierarchy | The hierarchy of curriculum levels containing the problem |\n",
- "| Problem Name | Unique identifier for a problem |\n",
- "| Problem View | The total number of times the student encountered the problem so far |\n",
- "| Step Name | Unique identifier for one of the steps in a problem |\n",
- "| Step Start Time | The starting time of the step (Can be null) |\n",
- "| First Transaction Time | The time of the first transaction toward the step |\n",
- "| Correct Transaction Time | The time of the correct attempt toward the step, if there was one |\n",
- "| Step End Time | The time of the last transaction toward the step |\n",
- "| Step Duration (sec) | The elapsed time of the step in seconds, calculated by adding all of the durations for transactions that were attributed to the step (Can be null if step start time is null) |\n",
- "| Correct Step Duration (sec) | The step duration if the first attempt for the step was correct |\n",
- "| Error Step Duration (sec) | The step duration if the first attempt for the step was an error (incorrect attempt or hint request) |\n",
- "| Correct First Attempt | The tutor's evaluation of the student's first attempt on the step—1 if correct, 0 if an error |\n",
- "| Incorrects | Total number of incorrect attempts by the student on the step |\n",
- "| Hints | Total number of hints requested by the student for the step |\n",
- "| Corrects | Total correct attempts by the student for the step (only increases if the step is encountered more than once) |\n",
- "| KC(KC Model Name) | The identified skills that are used in a problem, where available |\n",
- "| Opportunity(KC Model Name) | A count that increases by one each time the student encounters a step with the listed knowledge component |\n",
- "|| Additional KC models, which exist for the challenge data sets, will appear as additional pairs of columns (KC and Opportunity columns for each model) |"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "2c2a2d3e",
- "metadata": {},
- "source": [
- "For the test portion of the challenge data sets, values will not be provided for the following columns:"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f19eb949",
- "metadata": {},
- "source": [
- "♦ Step Start Time\n",
- "\n",
- "♦ First Transaction Time\n",
- "\n",
- "♦ Correct Transaction Time\n",
- "\n",
- "♦ Step End Time\n",
- "\n",
- "♦ Step Duration (sec)\n",
- "\n",
- "♦ Correct Step Duration (sec)\n",
- "\n",
- "♦ Error Step Duration (sec)\n",
- "\n",
- "♦ Correct First Attempt\n",
- "\n",
- "♦ Incorrects\n",
- "\n",
- "♦ Hints\n",
- "\n",
- "♦ Corrects"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "id": "123674b7",
- "metadata": {},
- "outputs": [],
- "source": [
- "import pandas as pd\n",
- "import plotly.express as px"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "id": "efa6be16",
- "metadata": {},
- "outputs": [],
- "source": [
- "path = \"algebra_2006_2007_train.txt\"\n",
- "data = pd.read_table(path, encoding=\"ISO-8859-15\", low_memory=False)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "993f1986",
- "metadata": {},
- "source": [
- "## Record Examples"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "id": "8b2af14e",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<!-- HTML table rendering of data.head(); same content as the text/plain output below -->"
- ],
- "text/plain": [
- " Row Anon Student Id Problem Hierarchy Problem Name \\\n",
- "0 1 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "1 2 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "2 3 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "3 4 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "4 5 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "\n",
- " Problem View Step Name Step Start Time First Transaction Time \\\n",
- "0 1 R1C1 2006-10-26 09:51:58.0 2006-10-26 09:52:30.0 \n",
- "1 1 R1C2 2006-10-26 09:53:30.0 2006-10-26 09:53:41.0 \n",
- "2 1 R2C1 2006-10-26 09:53:41.0 2006-10-26 09:53:46.0 \n",
- "3 1 R2C2 2006-10-26 09:53:46.0 2006-10-26 09:53:50.0 \n",
- "4 1 R4C1 2006-10-26 09:53:50.0 2006-10-26 09:54:05.0 \n",
- "\n",
- " Correct Transaction Time Step End Time Step Duration (sec) \\\n",
- "0 2006-10-26 09:53:30.0 2006-10-26 09:53:30.0 92.0 \n",
- "1 2006-10-26 09:53:41.0 2006-10-26 09:53:41.0 11.0 \n",
- "2 2006-10-26 09:53:46.0 2006-10-26 09:53:46.0 5.0 \n",
- "3 2006-10-26 09:53:50.0 2006-10-26 09:53:50.0 4.0 \n",
- "4 2006-10-26 09:54:05.0 2006-10-26 09:54:05.0 15.0 \n",
- "\n",
- " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
- "0 NaN 92.0 \n",
- "1 11.0 NaN \n",
- "2 5.0 NaN \n",
- "3 4.0 NaN \n",
- "4 15.0 NaN \n",
- "\n",
- " Correct First Attempt Incorrects Hints Corrects KC(Default) \\\n",
- "0 0 2 0 1 NaN \n",
- "1 1 0 0 1 NaN \n",
- "2 1 0 0 1 Identifying units \n",
- "3 1 0 0 1 Identifying units \n",
- "4 1 0 0 1 Entering a given \n",
- "\n",
- " Opportunity(Default) \n",
- "0 NaN \n",
- "1 NaN \n",
- "2 1 \n",
- "3 2 \n",
- "4 1 "
- ]
- },
- "execution_count": 3,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "pd.set_option('display.max_column', 500)\n",
- "data.head()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "id": "9d5e5859",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<!-- HTML table rendering of data.describe(); same content as the text/plain output below -->"
- ],
- "text/plain": [
- " Row Problem View Step Duration (sec) \\\n",
- "count 2.270384e+06 2.270384e+06 2.267551e+06 \n",
- "mean 1.513120e+06 1.092910e+00 1.958364e+01 \n",
- "std 8.736198e+05 3.448857e-01 4.768345e+01 \n",
- "min 1.000000e+00 1.000000e+00 0.000000e+00 \n",
- "25% 7.577408e+05 1.000000e+00 3.000000e+00 \n",
- "50% 1.511844e+06 1.000000e+00 7.000000e+00 \n",
- "75% 2.269432e+06 1.000000e+00 1.700000e+01 \n",
- "max 3.025933e+06 1.000000e+01 3.208000e+03 \n",
- "\n",
- " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
- "count 1.751638e+06 515913.000000 \n",
- "mean 1.171716e+01 46.292087 \n",
- "std 2.645318e+01 81.817794 \n",
- "min 0.000000e+00 0.000000 \n",
- "25% 3.000000e+00 11.000000 \n",
- "50% 5.000000e+00 22.000000 \n",
- "75% 1.100000e+01 47.000000 \n",
- "max 1.204000e+03 3208.000000 \n",
- "\n",
- " Correct First Attempt Incorrects Hints Corrects \n",
- "count 2.270384e+06 2.270384e+06 2.270384e+06 2.270384e+06 \n",
- "mean 7.722359e-01 4.455044e-01 1.184311e-01 1.062878e+00 \n",
- "std 4.193897e-01 2.000914e+00 6.199071e-01 6.894285e-01 \n",
- "min 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 \n",
- "25% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
- "50% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
- "75% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
- "max 1.000000e+00 3.600000e+02 1.020000e+02 9.200000e+01 "
- ]
- },
- "execution_count": 4,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "data.describe()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "id": "92cc0aab",
- "metadata": {
- "scrolled": false
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Part of missing values for every column\n",
- "Row 0.000000\n",
- "Anon Student Id 0.000000\n",
- "Problem Hierarchy 0.000000\n",
- "Problem Name 0.000000\n",
- "Problem View 0.000000\n",
- "Step Name 0.000000\n",
- "Step Start Time 0.001103\n",
- "First Transaction Time 0.000000\n",
- "Correct Transaction Time 0.034757\n",
- "Step End Time 0.000000\n",
- "Step Duration (sec) 0.001248\n",
- "Correct Step Duration (sec) 0.228484\n",
- "Error Step Duration (sec) 0.772764\n",
- "Correct First Attempt 0.000000\n",
- "Incorrects 0.000000\n",
- "Hints 0.000000\n",
- "Corrects 0.000000\n",
- "KC(Default) 0.203407\n",
- "Opportunity(Default) 0.203407\n",
- "dtype: float64\n"
- ]
- }
- ],
- "source": [
- "print(\"Part of missing values for every column\")\n",
- "print(data.isnull().sum() / len(data))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "id": "0187b3b5",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "the number of records:\n",
- "2270384\n"
- ]
- }
- ],
- "source": [
- "print(\"the number of records:\")\n",
- "print(len(data))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "id": "701b6633",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "how many students are there in the table:\n",
- "1338\n"
- ]
- }
- ],
- "source": [
- "print(\"how many students are there in the table:\")\n",
- "print(len(data['Anon Student Id'].unique()))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "id": "bf7b246f",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "how many problems are there in the table:\n",
- "91913\n"
- ]
- }
- ],
- "source": [
- "print(\"how many problems are there in the table:\")\n",
- "print(len(data['Problem Name'].unique()))"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "e0602c47",
- "metadata": {},
- "source": [
- "## Sort by Anon Student Id"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "id": "8051cc2b",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "image/svg+xml": [
- "<!-- SVG bar chart: Top 40 students by number of steps they have done (x: count, y: Anon Student Id) -->"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "ds = data['Anon Student Id'].value_counts().reset_index()\n",
- "ds.columns = [\n",
- "    'Anon Student Id',\n",
- "    'count'\n",
- "]\n",
- "ds['Anon Student Id'] = ds['Anon Student Id'].astype(str) + '-'\n",
- "ds = ds.sort_values('count').tail(40)\n",
- "\n",
- "fig = px.bar(\n",
- "    ds,\n",
- "    x='count',\n",
- "    y='Anon Student Id',\n",
- "    orientation='h',\n",
- "    title='Top 40 students by number of steps they have done'\n",
- ")\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "66ef53ad",
- "metadata": {},
- "source": [
- "## Percent of corrects, hints and incorrects"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "id": "c8f1539c",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "image/svg+xml": [
- "<!-- SVG pie chart: Percent of corrects, hints and incorrects (corrects 65.3%, incorrects 27.4%, hints 7.28%) -->"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "count_corrects = data['Corrects'].sum()\n",
- "count_hints = data['Hints'].sum()\n",
- "count_incorrects = data['Incorrects'].sum()\n",
- "\n",
- "total = count_corrects + count_hints + count_incorrects\n",
- "\n",
- "percent_corrects = count_corrects / total\n",
- "percent_hints = count_hints / total\n",
- "percent_incorrects = count_incorrects / total\n",
- "\n",
- "dfl = [['corrects', percent_corrects], ['hints', percent_hints], ['incorrects', percent_incorrects]]\n",
- "\n",
- "df = pd.DataFrame(dfl, columns=['transaction type', 'percent'])\n",
- "\n",
- "fig = px.pie(\n",
- "    df,\n",
- "    names=['corrects', 'hints', 'incorrects'],\n",
- "    values='percent',\n",
- "    title='Percent of corrects, hints and incorrects'\n",
- ")\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "3b097141",
- "metadata": {},
- "source": [
- "## Sort by Problem Name"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "id": "6d668c43",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "image/svg+xml": [
- "<!-- SVG bar chart: Top 40 useful problem (x: count, y: Problem Name) -->"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "storeProblemCount = [1]\n",
- "storeProblemName = [data['Problem Name'][0]]\n",
- "currentProblemName = data['Problem Name'][0]\n",
- "currentStepName = [data['Step Name'][0]]\n",
- "lastIndex = 0\n",
- "\n",
- "for i in range(1, len(data), 1):\n",
- "    pbNameI = data['Problem Name'][i]\n",
- "    stNameI = data['Step Name'][i]\n",
- "    if pbNameI != data['Problem Name'][lastIndex]:\n",
- "        currentStepName = [stNameI]\n",
- "        currentProblemName = pbNameI\n",
- "        if pbNameI not in storeProblemName:\n",
- "            storeProblemName.append(pbNameI)\n",
- "            storeProblemCount.append(1)\n",
- "        else:\n",
- "            storeProblemCount[storeProblemName.index(pbNameI)] += 1\n",
- "        lastIndex = i\n",
- "    elif stNameI not in currentStepName:\n",
- "        currentStepName.append(stNameI)\n",
- "        lastIndex = i\n",
- "    else:\n",
- "        currentStepName = [stNameI]\n",
- "        storeProblemCount[storeProblemName.index(pbNameI)] += 1\n",
- "        lastIndex = i\n",
- "\n",
- "dfData = {\n",
- "    'Problem Name': storeProblemName,\n",
- "    'count': storeProblemCount\n",
- "}\n",
- "df = pd.DataFrame(dfData).sort_values('count').tail(40)\n",
- "df[\"Problem Name\"] += '-'\n",
- "\n",
- "fig = px.bar(\n",
- "    df,\n",
- "    x='count',\n",
- "    y='Problem Name',\n",
- "    orientation='h',\n",
- "    title='Top 40 useful problem'\n",
- ")\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "id": "1b965aa4",
- "metadata": {
- "scrolled": false
- },
- "outputs": [
- {
- "data": {
- "text/html": [
- " \n",
- " "
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.plotly.v1+json": {
- "config": {
- "plotlyServerURL": "https://plot.ly"
- },
- "data": [
- {
- "alignmentgroup": "True",
- "hovertemplate": "Correct rate=%{x} Problem Name=%{text} ",
- "legendgroup": "",
- "marker": {
- "color": "#636efa",
- "pattern": {
- "shape": ""
- }
- },
- "name": "",
- "offsetgroup": "",
- "orientation": "h",
- "showlegend": false,
- "text": [
- "SYLT-2X&YGE-2X+9-",
- "ROOTS1-001-",
- "SY=2X&Y=-3X+5-",
- "PROBABILITY1-006-",
- "EG-RE-DISTRIB-05 (2x+4)/(3x+6)-",
- "PROBABILITY1-070-",
- "G3X-YLE5&3X-YGE15-",
- "BUSES-",
- "PEANUTS-CASHEWS-",
- "EXPONENT2-012-",
- "TVS3-",
- "PROBABILITY5-001-",
- "EXPONENT3-071-",
- "EXPONENT2-046-",
- "EXPONENT5-001-",
- "PROBABILITY2-001-",
- "GLFM-BUSES-",
- "EG-EPS01-CONSTANT (4(x^2)*y^4)(3(x^3)*y^3)-",
- "PROBABILITY6-002-",
- "PROBABILITY6-001-"
- ],
- "textposition": "auto",
- "type": "bar",
- "x": [
- 0.84,
- 0.8442477876106195,
- 0.848,
- 0.8492462311557789,
- 0.8505096262740657,
- 0.8557142857142858,
- 0.8559622195985832,
- 0.8581843429960077,
- 0.8608278344331134,
- 0.8679504814305364,
- 0.86878612716763,
- 0.878392305049811,
- 0.8799212598425197,
- 0.8896797153024911,
- 0.8966360856269113,
- 0.9195046439628483,
- 0.924791086350975,
- 0.934416715031921,
- 0.9473684210526315,
- 0.9579413392363033
- ],
- "xaxis": "x",
- "y": [
- "SYLT-2X&YGE-2X+9-",
- "ROOTS1-001-",
- "SY=2X&Y=-3X+5-",
- "PROBABILITY1-006-",
- "EG-RE-DISTRIB-05 (2x+4)/(3x+6)-",
- "PROBABILITY1-070-",
- "G3X-YLE5&3X-YGE15-",
- "BUSES-",
- "PEANUTS-CASHEWS-",
- "EXPONENT2-012-",
- "TVS3-",
- "PROBABILITY5-001-",
- "EXPONENT3-071-",
- "EXPONENT2-046-",
- "EXPONENT5-001-",
- "PROBABILITY2-001-",
- "GLFM-BUSES-",
- "EG-EPS01-CONSTANT (4(x^2)*y^4)(3(x^3)*y^3)-",
- "PROBABILITY6-002-",
- "PROBABILITY6-001-"
- ],
- "yaxis": "y"
- }
- ],
- "layout": {
- "barmode": "relative",
- "legend": {
- "tracegroupgap": 0
- },
- "template": {
- "data": {
- "bar": [
- {
- "error_x": {
- "color": "#2a3f5f"
- },
- "error_y": {
- "color": "#2a3f5f"
- },
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "bar"
- }
- ],
- "barpolar": [
- {
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "barpolar"
- }
- ],
- "carpet": [
- {
- "aaxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "baxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "type": "carpet"
- }
- ],
- "choropleth": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "choropleth"
- }
- ],
- "contour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "contour"
- }
- ],
- "contourcarpet": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "contourcarpet"
- }
- ],
- "heatmap": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmap"
- }
- ],
- "heatmapgl": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmapgl"
- }
- ],
- "histogram": [
- {
- "marker": {
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "histogram"
- }
- ],
- "histogram2d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2d"
- }
- ],
- "histogram2dcontour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2dcontour"
- }
- ],
- "mesh3d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "mesh3d"
- }
- ],
- "parcoords": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "parcoords"
- }
- ],
- "pie": [
- {
- "automargin": true,
- "type": "pie"
- }
- ],
- "scatter": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter"
- }
- ],
- "scatter3d": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter3d"
- }
- ],
- "scattercarpet": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattercarpet"
- }
- ],
- "scattergeo": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergeo"
- }
- ],
- "scattergl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergl"
- }
- ],
- "scattermapbox": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattermapbox"
- }
- ],
- "scatterpolar": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolar"
- }
- ],
- "scatterpolargl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolargl"
- }
- ],
- "scatterternary": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterternary"
- }
- ],
- "surface": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "surface"
- }
- ],
- "table": [
- {
- "cells": {
- "fill": {
- "color": "#EBF0F8"
- },
- "line": {
- "color": "white"
- }
- },
- "header": {
- "fill": {
- "color": "#C8D4E3"
- },
- "line": {
- "color": "white"
- }
- },
- "type": "table"
- }
- ]
- },
- "layout": {
- "annotationdefaults": {
- "arrowcolor": "#2a3f5f",
- "arrowhead": 0,
- "arrowwidth": 1
- },
- "autotypenumbers": "strict",
- "coloraxis": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "colorscale": {
- "diverging": [
- [
- 0,
- "#8e0152"
- ],
- [
- 0.1,
- "#c51b7d"
- ],
- [
- 0.2,
- "#de77ae"
- ],
- [
- 0.3,
- "#f1b6da"
- ],
- [
- 0.4,
- "#fde0ef"
- ],
- [
- 0.5,
- "#f7f7f7"
- ],
- [
- 0.6,
- "#e6f5d0"
- ],
- [
- 0.7,
- "#b8e186"
- ],
- [
- 0.8,
- "#7fbc41"
- ],
- [
- 0.9,
- "#4d9221"
- ],
- [
- 1,
- "#276419"
- ]
- ],
- "sequential": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "sequentialminus": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ]
- },
- "colorway": [
- "#636efa",
- "#EF553B",
- "#00cc96",
- "#ab63fa",
- "#FFA15A",
- "#19d3f3",
- "#FF6692",
- "#B6E880",
- "#FF97FF",
- "#FECB52"
- ],
- "font": {
- "color": "#2a3f5f"
- },
- "geo": {
- "bgcolor": "white",
- "lakecolor": "white",
- "landcolor": "#E5ECF6",
- "showlakes": true,
- "showland": true,
- "subunitcolor": "white"
- },
- "hoverlabel": {
- "align": "left"
- },
- "hovermode": "closest",
- "mapbox": {
- "style": "light"
- },
- "paper_bgcolor": "white",
- "plot_bgcolor": "#E5ECF6",
- "polar": {
- "angularaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "radialaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "scene": {
- "xaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "yaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "zaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- }
- },
- "shapedefaults": {
- "line": {
- "color": "#2a3f5f"
- }
- },
- "ternary": {
- "aaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "baxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "caxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "title": {
- "x": 0.05
- },
- "xaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- },
- "yaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- }
- }
- },
- "title": {
- "text": "Correct rate of each problem (top 20) (total transactions of each problem are required to be more than 500)"
- },
- "xaxis": {
- "anchor": "y",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "Correct rate"
- }
- },
- "yaxis": {
- "anchor": "x",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "Problem Name"
- }
- }
- }
- },
- "text/html": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.plotly.v1+json": {
- "config": {
- "plotlyServerURL": "https://plot.ly"
- },
- "data": [
- {
- "alignmentgroup": "True",
- "hovertemplate": "Correct rate=%{x} Problem Name=%{text} ",
- "legendgroup": "",
- "marker": {
- "color": "#636efa",
- "pattern": {
- "shape": ""
- }
- },
- "name": "",
- "offsetgroup": "",
- "orientation": "h",
- "showlegend": false,
- "text": [
- "RATIONAL1-091-",
- "RATIONAL1-165-",
- "RATIONAL1-034-",
- "RATIONAL1-058-",
- "RATIONAL1-075-",
- "RATIONAL1-035-",
- "RATIONAL1-281-",
- "RATIONAL1-261-",
- "RATIONAL1-177-",
- "RATIONAL1-064-",
- "RATIONAL1-147-",
- "RATIONAL1-021-",
- "RATIONAL1-121-",
- "RATIONAL1-008-",
- "RATIONAL1-288-",
- "RATIONAL1-109-",
- "BH1T31B-",
- "RXMX_3C-",
- "EG-RE-CANCEL-04 ((x-9)^3)/(x-9)^4-",
- "RATIONAL2-205-"
- ],
- "textposition": "auto",
- "type": "bar",
- "x": [
- 0.07532467532467532,
- 0.07955596669750231,
- 0.0807799442896936,
- 0.0975609756097561,
- 0.09904153354632587,
- 0.1016566265060241,
- 0.1198501872659176,
- 0.1217564870259481,
- 0.1288888888888889,
- 0.13678756476683937,
- 0.14023732470334413,
- 0.1404303510758777,
- 0.14186851211072665,
- 0.16796875,
- 0.17764471057884232,
- 0.24912280701754386,
- 0.3157172271791352,
- 0.329510366122629,
- 0.3380703066566941,
- 0.3424479166666667
- ],
- "xaxis": "x",
- "y": [
- "RATIONAL1-091-",
- "RATIONAL1-165-",
- "RATIONAL1-034-",
- "RATIONAL1-058-",
- "RATIONAL1-075-",
- "RATIONAL1-035-",
- "RATIONAL1-281-",
- "RATIONAL1-261-",
- "RATIONAL1-177-",
- "RATIONAL1-064-",
- "RATIONAL1-147-",
- "RATIONAL1-021-",
- "RATIONAL1-121-",
- "RATIONAL1-008-",
- "RATIONAL1-288-",
- "RATIONAL1-109-",
- "BH1T31B-",
- "RXMX_3C-",
- "EG-RE-CANCEL-04 ((x-9)^3)/(x-9)^4-",
- "RATIONAL2-205-"
- ],
- "yaxis": "y"
- }
- ],
- "layout": {
- "barmode": "relative",
- "legend": {
- "tracegroupgap": 0
- },
- "template": {
- "data": {
- "bar": [
- {
- "error_x": {
- "color": "#2a3f5f"
- },
- "error_y": {
- "color": "#2a3f5f"
- },
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "bar"
- }
- ],
- "barpolar": [
- {
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "barpolar"
- }
- ],
- "carpet": [
- {
- "aaxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "baxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "type": "carpet"
- }
- ],
- "choropleth": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "choropleth"
- }
- ],
- "contour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "contour"
- }
- ],
- "contourcarpet": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "contourcarpet"
- }
- ],
- "heatmap": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmap"
- }
- ],
- "heatmapgl": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmapgl"
- }
- ],
- "histogram": [
- {
- "marker": {
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "histogram"
- }
- ],
- "histogram2d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2d"
- }
- ],
- "histogram2dcontour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2dcontour"
- }
- ],
- "mesh3d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "mesh3d"
- }
- ],
- "parcoords": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "parcoords"
- }
- ],
- "pie": [
- {
- "automargin": true,
- "type": "pie"
- }
- ],
- "scatter": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter"
- }
- ],
- "scatter3d": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter3d"
- }
- ],
- "scattercarpet": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattercarpet"
- }
- ],
- "scattergeo": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergeo"
- }
- ],
- "scattergl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergl"
- }
- ],
- "scattermapbox": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattermapbox"
- }
- ],
- "scatterpolar": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolar"
- }
- ],
- "scatterpolargl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolargl"
- }
- ],
- "scatterternary": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterternary"
- }
- ],
- "surface": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "surface"
- }
- ],
- "table": [
- {
- "cells": {
- "fill": {
- "color": "#EBF0F8"
- },
- "line": {
- "color": "white"
- }
- },
- "header": {
- "fill": {
- "color": "#C8D4E3"
- },
- "line": {
- "color": "white"
- }
- },
- "type": "table"
- }
- ]
- },
- "layout": {
- "annotationdefaults": {
- "arrowcolor": "#2a3f5f",
- "arrowhead": 0,
- "arrowwidth": 1
- },
- "autotypenumbers": "strict",
- "coloraxis": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "colorscale": {
- "diverging": [
- [
- 0,
- "#8e0152"
- ],
- [
- 0.1,
- "#c51b7d"
- ],
- [
- 0.2,
- "#de77ae"
- ],
- [
- 0.3,
- "#f1b6da"
- ],
- [
- 0.4,
- "#fde0ef"
- ],
- [
- 0.5,
- "#f7f7f7"
- ],
- [
- 0.6,
- "#e6f5d0"
- ],
- [
- 0.7,
- "#b8e186"
- ],
- [
- 0.8,
- "#7fbc41"
- ],
- [
- 0.9,
- "#4d9221"
- ],
- [
- 1,
- "#276419"
- ]
- ],
- "sequential": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "sequentialminus": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ]
- },
- "colorway": [
- "#636efa",
- "#EF553B",
- "#00cc96",
- "#ab63fa",
- "#FFA15A",
- "#19d3f3",
- "#FF6692",
- "#B6E880",
- "#FF97FF",
- "#FECB52"
- ],
- "font": {
- "color": "#2a3f5f"
- },
- "geo": {
- "bgcolor": "white",
- "lakecolor": "white",
- "landcolor": "#E5ECF6",
- "showlakes": true,
- "showland": true,
- "subunitcolor": "white"
- },
- "hoverlabel": {
- "align": "left"
- },
- "hovermode": "closest",
- "mapbox": {
- "style": "light"
- },
- "paper_bgcolor": "white",
- "plot_bgcolor": "#E5ECF6",
- "polar": {
- "angularaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "radialaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "scene": {
- "xaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "yaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "zaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- }
- },
- "shapedefaults": {
- "line": {
- "color": "#2a3f5f"
- }
- },
- "ternary": {
- "aaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "baxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "caxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "title": {
- "x": 0.05
- },
- "xaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- },
- "yaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- }
- }
- },
- "title": {
- "font": {
- "size": 15
- },
- "text": "Correct rate of each problem (bottom 20) (total transactions of each problem are required to be more than 500)"
- },
- "xaxis": {
- "anchor": "y",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "Correct rate"
- }
- },
- "yaxis": {
- "anchor": "x",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "Problem Name"
- }
- }
- }
- },
- "text/html": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "data['total transactions'] = data['Incorrects'] + data['Hints'] + data['Corrects']\n",
- "df1 = data.groupby('Problem Name')[['total transactions', 'Corrects']].sum().reset_index()\n",
- "df1['Correct rate'] = df1['Corrects'] / df1['total transactions']\n",
- "\n",
- "# Keep only problems with more than `standard` total transactions\n",
- "standard = 500\n",
- "df1 = df1[df1['total transactions'] > standard]\n",
- "\n",
- "df1 = df1.sort_values('Correct rate')\n",
- "\n",
- "df1['Problem Name'] = df1['Problem Name'].astype(str) + \"-\"\n",
- "\n",
- "df_px = df1.tail(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='Correct rate',\n",
- " y='Problem Name',\n",
- " orientation='h',\n",
- " title='Correct rate of each problem (top 20) (total transactions of \\\n",
- "each problem are required to be more than 500)',\n",
- " text='Problem Name'\n",
- ")\n",
- "fig.show()\n",
- "\n",
- "df_px = df1.head(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='Correct rate',\n",
- " y='Problem Name',\n",
- " orientation='h',\n",
- " title='Correct rate of each problem (bottom 20) (total transactions of \\\n",
- "each problem are required to be more than 500)',\n",
- " text='Problem Name'\n",
- ")\n",
- "fig.update_layout(title_font_size=15)\n",
- "fig.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "2bee6e04",
- "metadata": {},
- "source": [
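- "These two figures show the correct rate of each problem (correct transactions divided by total transactions) for the 20 problems with the highest rates and the 20 with the lowest, restricted to problems with more than 500 total transactions. Problems with a low correct rate deserve more attention from teachers and students."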
- ]
- },
- {
- "cell_type": "markdown",
- "id": "0047e839",
- "metadata": {},
- "source": [
- "## Sort by KC"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "id": "c6e910e0",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "application/vnd.plotly.v1+json": {
- "config": {
- "plotlyServerURL": "https://plot.ly"
- },
- "data": [
- {
- "alignmentgroup": "True",
- "hovertemplate": "correct rate=%{x} KC(Default)=%{text} ",
- "legendgroup": "",
- "marker": {
- "color": "#636efa",
- "pattern": {
- "shape": ""
- }
- },
- "name": "",
- "offsetgroup": "",
- "orientation": "h",
- "showlegend": false,
- "text": [
- "[Rule: RF right ([SolverOperation rf],{components with property canReduceFractionsNoMultWhole of right})]-",
- "Enter second extreme in equation-",
- "Proportional-Constant-Expression-Gla:Student-Modeling-Analysis-",
- "Enter Calculated value of rate-",
- "[SkillRule: Combine like terms, no var; CLT]-",
- "Enter square of leg label-",
- "Compare medians - removed outlier-",
- "Enter number of total outcomes in table-",
- "Find square of given leg-",
- "Enter ratio quantity to right of \"to\"-",
- "Enter fractional probability of event-",
- "[SkillRule: Select Multiply; {MT; MT no fraction coeff}]-",
- "Write base of exponential from given whole number as product-",
- "Write decimal multiplier from given scientific notation-",
- "Enter ratio quantity to right of colon-",
- "PM-ROW-1-",
- "Changing axis intervals-",
- "unspecified-",
- "[Rule: unnec-elems ([SolverOperation unnec-elems],)]-",
- "Select second event-"
- ],
- "textposition": "auto",
- "type": "bar",
- "x": [
- 0.9094659771050032,
- 0.9125055236411843,
- 0.9135446685878963,
- 0.9155947136563877,
- 0.9156931229676021,
- 0.9194915254237288,
- 0.9218241042345277,
- 0.9247135842880524,
- 0.9323943661971831,
- 0.9386369807222373,
- 0.9389944134078212,
- 0.9399566697616837,
- 0.9405666897028334,
- 0.940814757878555,
- 0.9422934648581998,
- 0.9431375603676585,
- 0.9706148701992291,
- 0.9769823066841415,
- 0.9838308457711443,
- 0.987603305785124
- ],
- "xaxis": "x",
- "y": [
- "[Rule: RF right ([SolverOperation rf],{components with property canReduceFractionsNoMultWhole of right})]-",
- "Enter second extreme in equation-",
- "Proportional-Constant-Expression-Gla:Student-Modeling-Analysis-",
- "Enter Calculated value of rate-",
- "[SkillRule: Combine like terms, no var; CLT]-",
- "Enter square of leg label-",
- "Compare medians - removed outlier-",
- "Enter number of total outcomes in table-",
- "Find square of given leg-",
- "Enter ratio quantity to right of \"to\"-",
- "Enter fractional probability of event-",
- "[SkillRule: Select Multiply; {MT; MT no fraction coeff}]-",
- "Write base of exponential from given whole number as product-",
- "Write decimal multiplier from given scientific notation-",
- "Enter ratio quantity to right of colon-",
- "PM-ROW-1-",
- "Changing axis intervals-",
- "unspecified-",
- "[Rule: unnec-elems ([SolverOperation unnec-elems],)]-",
- "Select second event-"
- ],
- "yaxis": "y"
- }
- ],
- "layout": {
- "barmode": "relative",
- "legend": {
- "tracegroupgap": 0
- },
- "template": {
- "data": {
- "bar": [
- {
- "error_x": {
- "color": "#2a3f5f"
- },
- "error_y": {
- "color": "#2a3f5f"
- },
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "bar"
- }
- ],
- "barpolar": [
- {
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "barpolar"
- }
- ],
- "carpet": [
- {
- "aaxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "baxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "type": "carpet"
- }
- ],
- "choropleth": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "choropleth"
- }
- ],
- "contour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "contour"
- }
- ],
- "contourcarpet": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "contourcarpet"
- }
- ],
- "heatmap": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmap"
- }
- ],
- "heatmapgl": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmapgl"
- }
- ],
- "histogram": [
- {
- "marker": {
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "histogram"
- }
- ],
- "histogram2d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2d"
- }
- ],
- "histogram2dcontour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2dcontour"
- }
- ],
- "mesh3d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "mesh3d"
- }
- ],
- "parcoords": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "parcoords"
- }
- ],
- "pie": [
- {
- "automargin": true,
- "type": "pie"
- }
- ],
- "scatter": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter"
- }
- ],
- "scatter3d": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter3d"
- }
- ],
- "scattercarpet": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattercarpet"
- }
- ],
- "scattergeo": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergeo"
- }
- ],
- "scattergl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergl"
- }
- ],
- "scattermapbox": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattermapbox"
- }
- ],
- "scatterpolar": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolar"
- }
- ],
- "scatterpolargl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolargl"
- }
- ],
- "scatterternary": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterternary"
- }
- ],
- "surface": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "surface"
- }
- ],
- "table": [
- {
- "cells": {
- "fill": {
- "color": "#EBF0F8"
- },
- "line": {
- "color": "white"
- }
- },
- "header": {
- "fill": {
- "color": "#C8D4E3"
- },
- "line": {
- "color": "white"
- }
- },
- "type": "table"
- }
- ]
- },
- "layout": {
- "annotationdefaults": {
- "arrowcolor": "#2a3f5f",
- "arrowhead": 0,
- "arrowwidth": 1
- },
- "autotypenumbers": "strict",
- "coloraxis": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "colorscale": {
- "diverging": [
- [
- 0,
- "#8e0152"
- ],
- [
- 0.1,
- "#c51b7d"
- ],
- [
- 0.2,
- "#de77ae"
- ],
- [
- 0.3,
- "#f1b6da"
- ],
- [
- 0.4,
- "#fde0ef"
- ],
- [
- 0.5,
- "#f7f7f7"
- ],
- [
- 0.6,
- "#e6f5d0"
- ],
- [
- 0.7,
- "#b8e186"
- ],
- [
- 0.8,
- "#7fbc41"
- ],
- [
- 0.9,
- "#4d9221"
- ],
- [
- 1,
- "#276419"
- ]
- ],
- "sequential": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "sequentialminus": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ]
- },
- "colorway": [
- "#636efa",
- "#EF553B",
- "#00cc96",
- "#ab63fa",
- "#FFA15A",
- "#19d3f3",
- "#FF6692",
- "#B6E880",
- "#FF97FF",
- "#FECB52"
- ],
- "font": {
- "color": "#2a3f5f"
- },
- "geo": {
- "bgcolor": "white",
- "lakecolor": "white",
- "landcolor": "#E5ECF6",
- "showlakes": true,
- "showland": true,
- "subunitcolor": "white"
- },
- "hoverlabel": {
- "align": "left"
- },
- "hovermode": "closest",
- "mapbox": {
- "style": "light"
- },
- "paper_bgcolor": "white",
- "plot_bgcolor": "#E5ECF6",
- "polar": {
- "angularaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "radialaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "scene": {
- "xaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "yaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "zaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- }
- },
- "shapedefaults": {
- "line": {
- "color": "#2a3f5f"
- }
- },
- "ternary": {
- "aaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "baxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "caxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "title": {
- "x": 0.05
- },
- "xaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- },
- "yaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- }
- }
- },
- "title": {
- "text": "Correct rate of each KC(Default) (top 20) (total transactions of each KC are required to be more than 300)"
- },
- "xaxis": {
- "anchor": "y",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "correct rate"
- }
- },
- "yaxis": {
- "anchor": "x",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "KC(Default)"
- },
- "visible": false
- }
- }
- },
- "text/html": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.plotly.v1+json": {
- "config": {
- "plotlyServerURL": "https://plot.ly"
- },
- "data": [
- {
- "alignmentgroup": "True",
- "hovertemplate": "correct rate=%{x} KC(Default)=%{text} ",
- "legendgroup": "",
- "marker": {
- "color": "#636efa",
- "pattern": {
- "shape": ""
- }
- },
- "name": "",
- "offsetgroup": "",
- "orientation": "h",
- "showlegend": false,
- "text": [
- "unknown bug element-",
- "[SkillRule: Done?; {doneleft; doneright; doneleft, no menu; doneright, nomenu; done no solution; Done No Solution, domain exception; Done No Solution, range exception; done infinite solutions; done expr, standard form; done expr, standard form, no menu}]-",
- "unknown bug element~~CLT-ROW-1-COEFF-",
- "unknown bug element~~CLT-ROW-1-",
- "Plot terminating improper fractions-",
- "Plot decimal - thousandths-",
- "Plot terminating mixed number-",
- "Plot imperfect radical-",
- "Plot decimal - hundredths-",
- "Setting the slope-",
- "Plot terminating proper fraction-",
- "Plot percent-",
- "Plot non-terminating proper fraction-",
- "Plot non-terminating improper fraction-",
- "Entering slope, GLF-",
- "Placing coordinate point-",
- "Plot decimal - tenths-",
- "Finding the intersection, SIF-",
- "Finding the intersection, Mixed-",
- "[Rule: CLT nested, no LCD ([SolverOperation clt],)]-"
- ],
- "textposition": "auto",
- "type": "bar",
- "x": [
- 0,
- 0,
- 0.014072847682119206,
- 0.014477766287487074,
- 0.14357160282995854,
- 0.19899605593402653,
- 0.20798727690404664,
- 0.2222222222222222,
- 0.22783707253321717,
- 0.2335486778846154,
- 0.24001726742931145,
- 0.2440093512565751,
- 0.26698670605613,
- 0.29059485530546625,
- 0.3,
- 0.3083741984156922,
- 0.3113859585303747,
- 0.3129411764705882,
- 0.31905781584582443,
- 0.3395784543325527
- ],
- "xaxis": "x",
- "y": [
- "unknown bug element-",
- "[SkillRule: Done?; {doneleft; doneright; doneleft, no menu; doneright, nomenu; done no solution; Done No Solution, domain exception; Done No Solution, range exception; done infinite solutions; done expr, standard form; done expr, standard form, no menu}]-",
- "unknown bug element~~CLT-ROW-1-COEFF-",
- "unknown bug element~~CLT-ROW-1-",
- "Plot terminating improper fractions-",
- "Plot decimal - thousandths-",
- "Plot terminating mixed number-",
- "Plot imperfect radical-",
- "Plot decimal - hundredths-",
- "Setting the slope-",
- "Plot terminating proper fraction-",
- "Plot percent-",
- "Plot non-terminating proper fraction-",
- "Plot non-terminating improper fraction-",
- "Entering slope, GLF-",
- "Placing coordinate point-",
- "Plot decimal - tenths-",
- "Finding the intersection, SIF-",
- "Finding the intersection, Mixed-",
- "[Rule: CLT nested, no LCD ([SolverOperation clt],)]-"
- ],
- "yaxis": "y"
- }
- ],
- "layout": {
- "barmode": "relative",
- "legend": {
- "tracegroupgap": 0
- },
- "template": {
- "data": {
- "bar": [
- {
- "error_x": {
- "color": "#2a3f5f"
- },
- "error_y": {
- "color": "#2a3f5f"
- },
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "bar"
- }
- ],
- "barpolar": [
- {
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "barpolar"
- }
- ],
- "carpet": [
- {
- "aaxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "baxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "type": "carpet"
- }
- ],
- "choropleth": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "choropleth"
- }
- ],
- "contour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "contour"
- }
- ],
- "contourcarpet": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "contourcarpet"
- }
- ],
- "heatmap": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmap"
- }
- ],
- "heatmapgl": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmapgl"
- }
- ],
- "histogram": [
- {
- "marker": {
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "histogram"
- }
- ],
- "histogram2d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2d"
- }
- ],
- "histogram2dcontour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2dcontour"
- }
- ],
- "mesh3d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "mesh3d"
- }
- ],
- "parcoords": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "parcoords"
- }
- ],
- "pie": [
- {
- "automargin": true,
- "type": "pie"
- }
- ],
- "scatter": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter"
- }
- ],
- "scatter3d": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter3d"
- }
- ],
- "scattercarpet": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattercarpet"
- }
- ],
- "scattergeo": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergeo"
- }
- ],
- "scattergl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergl"
- }
- ],
- "scattermapbox": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattermapbox"
- }
- ],
- "scatterpolar": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolar"
- }
- ],
- "scatterpolargl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolargl"
- }
- ],
- "scatterternary": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterternary"
- }
- ],
- "surface": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "surface"
- }
- ],
- "table": [
- {
- "cells": {
- "fill": {
- "color": "#EBF0F8"
- },
- "line": {
- "color": "white"
- }
- },
- "header": {
- "fill": {
- "color": "#C8D4E3"
- },
- "line": {
- "color": "white"
- }
- },
- "type": "table"
- }
- ]
- },
- "layout": {
- "annotationdefaults": {
- "arrowcolor": "#2a3f5f",
- "arrowhead": 0,
- "arrowwidth": 1
- },
- "autotypenumbers": "strict",
- "coloraxis": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "colorscale": {
- "diverging": [
- [
- 0,
- "#8e0152"
- ],
- [
- 0.1,
- "#c51b7d"
- ],
- [
- 0.2,
- "#de77ae"
- ],
- [
- 0.3,
- "#f1b6da"
- ],
- [
- 0.4,
- "#fde0ef"
- ],
- [
- 0.5,
- "#f7f7f7"
- ],
- [
- 0.6,
- "#e6f5d0"
- ],
- [
- 0.7,
- "#b8e186"
- ],
- [
- 0.8,
- "#7fbc41"
- ],
- [
- 0.9,
- "#4d9221"
- ],
- [
- 1,
- "#276419"
- ]
- ],
- "sequential": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "sequentialminus": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ]
- },
- "colorway": [
- "#636efa",
- "#EF553B",
- "#00cc96",
- "#ab63fa",
- "#FFA15A",
- "#19d3f3",
- "#FF6692",
- "#B6E880",
- "#FF97FF",
- "#FECB52"
- ],
- "font": {
- "color": "#2a3f5f"
- },
- "geo": {
- "bgcolor": "white",
- "lakecolor": "white",
- "landcolor": "#E5ECF6",
- "showlakes": true,
- "showland": true,
- "subunitcolor": "white"
- },
- "hoverlabel": {
- "align": "left"
- },
- "hovermode": "closest",
- "mapbox": {
- "style": "light"
- },
- "paper_bgcolor": "white",
- "plot_bgcolor": "#E5ECF6",
- "polar": {
- "angularaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "radialaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "scene": {
- "xaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "yaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "zaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- }
- },
- "shapedefaults": {
- "line": {
- "color": "#2a3f5f"
- }
- },
- "ternary": {
- "aaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "baxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "caxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "title": {
- "x": 0.05
- },
- "xaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- },
- "yaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- }
- }
- },
- "title": {
- "font": {
- "size": 15
- },
- "text": "Correct rate of each KC(Default) (bottom 20) (total transactions of each KC are required to be more than 300)"
- },
- "xaxis": {
- "anchor": "y",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "correct rate"
- }
- },
- "yaxis": {
- "anchor": "x",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "KC(Default)"
- },
- "visible": false
- }
- }
- },
- "text/html": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "data.dropna(subset=['KC(Default)'], inplace=True)\n",
- "\n",
- "data['total transactions'] = data['Corrects'] + data['Hints'] + data['Incorrects']\n",
- "df1 = data.groupby('KC(Default)')['total transactions'].sum().reset_index()\n",
- "df2 = data.groupby('KC(Default)')['Corrects'].sum().reset_index()\n",
- "df1['Corrects'] = df2['Corrects']\n",
- "df1['correct rate'] = df1['Corrects'] / df1['total transactions']\n",
- "\n",
- "count = 0\n",
- "standard = 300\n",
- "for i in df1['total transactions']:\n",
- " if i > standard:\n",
- " count += 1\n",
- "df1 = df1.sort_values('total transactions').tail(count)\n",
- "\n",
- "df1 = df1.sort_values('correct rate')\n",
- "\n",
- "df1['KC(Default)'] = df1['KC(Default)'].astype(str) + '-'\n",
- "\n",
- "df_px = df1.tail(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='correct rate',\n",
- " y='KC(Default)',\n",
- " orientation='h',\n",
- " title='Correct rate of each KC(Default) (top 20) (total transactions of \\\n",
- "each KC are required to be more than 300)',\n",
- " text='KC(Default)'\n",
- ")\n",
- "fig.update_yaxes(visible=False)\n",
- "fig.show()\n",
- "\n",
- "df_px = df1.head(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='correct rate',\n",
- " y='KC(Default)',\n",
- " orientation='h',\n",
- " title='Correct rate of each KC(Default) (bottom 20) (total transactions of \\\n",
- "each KC are required to be more than 300)',\n",
- " text='KC(Default)'\n",
- ")\n",
- "fig.update_yaxes(visible=False)\n",
- "fig.update_layout(title_font_size=15)\n",
- "fig.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "0feef8a1",
- "metadata": {},
- "source": [
- "These two figures present the correct rate of KCs. KCs with low correct rate deserve more attention from teachers and students."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "22d99527",
- "metadata": {},
- "source": [
- "## Postscript"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "09bc0903",
- "metadata": {},
- "source": [
- "Given that the whole data package is composed of 5 data sets and data files in these 5 data sets that can be used to conduct data analysis share the same data format, the following analysis based on \"algebra_2006_2007_train\" is just an example of data analysis on KDD Cup, and the code can be used to analyse other data files with some small changes on the file path and column names.\n"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3 (ipykernel)",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.9.6"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
From 69d505e68b3c3571526253fff449c7bf364a3b74 Mon Sep 17 00:00:00 2001
From: gguu1314 <88154940+gguu1314@users.noreply.github.com>
Date: Sat, 14 Aug 2021 12:20:55 +0800
Subject: [PATCH 05/10] Add files via upload
---
docs/KDD Cup 2010.ipynb | 4925 +++++++++++++++++++++++++++++++++++++++
1 file changed, 4925 insertions(+)
create mode 100644 docs/KDD Cup 2010.ipynb
diff --git a/docs/KDD Cup 2010.ipynb b/docs/KDD Cup 2010.ipynb
new file mode 100644
index 0000000..a662588
--- /dev/null
+++ b/docs/KDD Cup 2010.ipynb
@@ -0,0 +1,4925 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "e002fdf8",
+ "metadata": {},
+ "source": [
+ "# KDD Cup 2010 —— Data Analysis on algebra_2006_2007_train"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "429152ff",
+ "metadata": {},
+ "source": [
+ "## Data Description"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2c89d116",
+ "metadata": {},
+ "source": [
+ "### Column Description"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f590eee5",
+ "metadata": {},
+ "source": [
+ "| Attribute | Annotation |\n",
+ "|:--:|---|\n",
+ "|Row|The row number|\n",
+ "| Anon Student Id | Unique, anonymous identifier for a student |\n",
+ "| Problem Hierarchy | The hierarchy of curriculum levels containing the problem |\n",
+ "| Problem Name | Unique identifier for a problem |\n",
+ "| Problem View | The total number of times the student encountered the problem so far |\n",
+ "| Step Name | Unique identifier for one of the steps in a problem |\n",
+ "| Step Start Time | The starting time of the step (Can be null) |\n",
+ "| First Transaction Time | The time of the first transaction toward the step |\n",
+ "| Correct Transaction Time | The time of the correct attempt toward the step, if there was one |\n",
+ "| Step End Time | The time of the last transaction toward the step |\n",
+ "| Step Duration (sec) | The elapsed time of the step in seconds, calculated by adding all of the durations for transactions that were attributed to the step (Can be null if step start time is null) |\n",
+ "| Correct Step Duration (sec) | The step duration if the first attempt for the step was correct |\n",
+ "| Error Step Duration (sec) | The step duration if the first attempt for the step was an error (incorrect attempt or hint request) |\n",
+ "| Correct First Attempt | The tutor's evaluation of the student's first attempt on the step—1 if correct, 0 if an error |\n",
+ "| Incorrects | Total number of incorrect attempts by the student on the step |\n",
+ "| Hints | Total number of hints requested by the student for the step |\n",
+ "| Corrects | Total correct attempts by the student for the step (only increases if the step is encountered more than once) |\n",
+ "| KC(KC Model Name) | The identified skills that are used in a problem, where available |\n",
+ "| Opportunity(KC Model Name) | A count that increases by one each time the student encounters a step with the listed knowledge component |\n",
+ "|| Additional KC models, which exist for the challenge data sets, will appear as additional pairs of columns (KC and Opportunity columns for each model) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2c2a2d3e",
+ "metadata": {},
+ "source": [
+ "For the test portion of the challenge data sets, values will not be provided for the following columns:"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f19eb949",
+ "metadata": {},
+ "source": [
+ "♦ Step Start Time\n",
+ "\n",
+ "♦ First Transaction Time\n",
+ "\n",
+ "♦ Correct Transaction Time\n",
+ "\n",
+ "♦ Step End Time\n",
+ "\n",
+ "♦ Step Duration (sec)\n",
+ "\n",
+ "♦ Correct Step Duration (sec)\n",
+ "\n",
+ "♦ Error Step Duration (sec)\n",
+ "\n",
+ "♦ Correct First Attempt\n",
+ "\n",
+ "♦ Incorrects\n",
+ "\n",
+ "♦ Hints\n",
+ "\n",
+ "♦ Corrects"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "123674b7",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "import plotly.express as px"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "efa6be16",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "path = \"algebra_2006_2007_train.txt\"\n",
+ "data = pd.read_table(path, encoding=\"ISO-8859-15\", low_memory=False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "993f1986",
+ "metadata": {},
+ "source": [
+ "## Record Examples"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "8b2af14e",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<table>\n",
+ "<thead><tr><th></th><th>Row</th><th>Anon Student Id</th><th>Problem Hierarchy</th><th>Problem Name</th><th>Problem View</th><th>Step Name</th><th>Step Start Time</th><th>First Transaction Time</th><th>Correct Transaction Time</th><th>Step End Time</th><th>Step Duration (sec)</th><th>Correct Step Duration (sec)</th><th>Error Step Duration (sec)</th><th>Correct First Attempt</th><th>Incorrects</th><th>Hints</th><th>Corrects</th><th>KC(Default)</th><th>Opportunity(Default)</th></tr></thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>1</td><td>JG4Tz</td><td>Unit CTA1_01, Section CTA1_01-1</td><td>LDEMO_WKST</td><td>1</td><td>R1C1</td><td>2006-10-26 09:51:58.0</td><td>2006-10-26 09:52:30.0</td><td>2006-10-26 09:53:30.0</td><td>2006-10-26 09:53:30.0</td><td>92.0</td><td>NaN</td><td>92.0</td><td>0</td><td>2</td><td>0</td><td>1</td><td>NaN</td><td>NaN</td></tr>\n",
+ "<tr><th>1</th><td>2</td><td>JG4Tz</td><td>Unit CTA1_01, Section CTA1_01-1</td><td>LDEMO_WKST</td><td>1</td><td>R1C2</td><td>2006-10-26 09:53:30.0</td><td>2006-10-26 09:53:41.0</td><td>2006-10-26 09:53:41.0</td><td>2006-10-26 09:53:41.0</td><td>11.0</td><td>11.0</td><td>NaN</td><td>1</td><td>0</td><td>0</td><td>1</td><td>NaN</td><td>NaN</td></tr>\n",
+ "<tr><th>2</th><td>3</td><td>JG4Tz</td><td>Unit CTA1_01, Section CTA1_01-1</td><td>LDEMO_WKST</td><td>1</td><td>R2C1</td><td>2006-10-26 09:53:41.0</td><td>2006-10-26 09:53:46.0</td><td>2006-10-26 09:53:46.0</td><td>2006-10-26 09:53:46.0</td><td>5.0</td><td>5.0</td><td>NaN</td><td>1</td><td>0</td><td>0</td><td>1</td><td>Identifying units</td><td>1</td></tr>\n",
+ "<tr><th>3</th><td>4</td><td>JG4Tz</td><td>Unit CTA1_01, Section CTA1_01-1</td><td>LDEMO_WKST</td><td>1</td><td>R2C2</td><td>2006-10-26 09:53:46.0</td><td>2006-10-26 09:53:50.0</td><td>2006-10-26 09:53:50.0</td><td>2006-10-26 09:53:50.0</td><td>4.0</td><td>4.0</td><td>NaN</td><td>1</td><td>0</td><td>0</td><td>1</td><td>Identifying units</td><td>2</td></tr>\n",
+ "<tr><th>4</th><td>5</td><td>JG4Tz</td><td>Unit CTA1_01, Section CTA1_01-1</td><td>LDEMO_WKST</td><td>1</td><td>R4C1</td><td>2006-10-26 09:53:50.0</td><td>2006-10-26 09:54:05.0</td><td>2006-10-26 09:54:05.0</td><td>2006-10-26 09:54:05.0</td><td>15.0</td><td>15.0</td><td>NaN</td><td>1</td><td>0</td><td>0</td><td>1</td><td>Entering a given</td><td>1</td></tr>\n",
+ "</tbody>\n",
+ "</table>"
+ ],
+ "text/plain": [
+ " Row Anon Student Id Problem Hierarchy Problem Name \\\n",
+ "0 1 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "1 2 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "2 3 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "3 4 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "4 5 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "\n",
+ " Problem View Step Name Step Start Time First Transaction Time \\\n",
+ "0 1 R1C1 2006-10-26 09:51:58.0 2006-10-26 09:52:30.0 \n",
+ "1 1 R1C2 2006-10-26 09:53:30.0 2006-10-26 09:53:41.0 \n",
+ "2 1 R2C1 2006-10-26 09:53:41.0 2006-10-26 09:53:46.0 \n",
+ "3 1 R2C2 2006-10-26 09:53:46.0 2006-10-26 09:53:50.0 \n",
+ "4 1 R4C1 2006-10-26 09:53:50.0 2006-10-26 09:54:05.0 \n",
+ "\n",
+ " Correct Transaction Time Step End Time Step Duration (sec) \\\n",
+ "0 2006-10-26 09:53:30.0 2006-10-26 09:53:30.0 92.0 \n",
+ "1 2006-10-26 09:53:41.0 2006-10-26 09:53:41.0 11.0 \n",
+ "2 2006-10-26 09:53:46.0 2006-10-26 09:53:46.0 5.0 \n",
+ "3 2006-10-26 09:53:50.0 2006-10-26 09:53:50.0 4.0 \n",
+ "4 2006-10-26 09:54:05.0 2006-10-26 09:54:05.0 15.0 \n",
+ "\n",
+ " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
+ "0 NaN 92.0 \n",
+ "1 11.0 NaN \n",
+ "2 5.0 NaN \n",
+ "3 4.0 NaN \n",
+ "4 15.0 NaN \n",
+ "\n",
+ " Correct First Attempt Incorrects Hints Corrects KC(Default) \\\n",
+ "0 0 2 0 1 NaN \n",
+ "1 1 0 0 1 NaN \n",
+ "2 1 0 0 1 Identifying units \n",
+ "3 1 0 0 1 Identifying units \n",
+ "4 1 0 0 1 Entering a given \n",
+ "\n",
+ " Opportunity(Default) \n",
+ "0 NaN \n",
+ "1 NaN \n",
+ "2 1 \n",
+ "3 2 \n",
+ "4 1 "
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "pd.set_option('display.max_columns', 500)\n",
+ "data.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "9d5e5859",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<table>\n",
+ "<thead><tr><th></th><th>Row</th><th>Problem View</th><th>Step Duration (sec)</th><th>Correct Step Duration (sec)</th><th>Error Step Duration (sec)</th><th>Correct First Attempt</th><th>Incorrects</th><th>Hints</th><th>Corrects</th></tr></thead>\n",
+ "<tbody>\n",
+ "<tr><th>count</th><td>2.270384e+06</td><td>2.270384e+06</td><td>2.267551e+06</td><td>1.751638e+06</td><td>515913.000000</td><td>2.270384e+06</td><td>2.270384e+06</td><td>2.270384e+06</td><td>2.270384e+06</td></tr>\n",
+ "<tr><th>mean</th><td>1.513120e+06</td><td>1.092910e+00</td><td>1.958364e+01</td><td>1.171716e+01</td><td>46.292087</td><td>7.722359e-01</td><td>4.455044e-01</td><td>1.184311e-01</td><td>1.062878e+00</td></tr>\n",
+ "<tr><th>std</th><td>8.736198e+05</td><td>3.448857e-01</td><td>4.768345e+01</td><td>2.645318e+01</td><td>81.817794</td><td>4.193897e-01</td><td>2.000914e+00</td><td>6.199071e-01</td><td>6.894285e-01</td></tr>\n",
+ "<tr><th>min</th><td>1.000000e+00</td><td>1.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td><td>0.000000</td><td>0.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td></tr>\n",
+ "<tr><th>25%</th><td>7.577408e+05</td><td>1.000000e+00</td><td>3.000000e+00</td><td>3.000000e+00</td><td>11.000000</td><td>1.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td><td>1.000000e+00</td></tr>\n",
+ "<tr><th>50%</th><td>1.511844e+06</td><td>1.000000e+00</td><td>7.000000e+00</td><td>5.000000e+00</td><td>22.000000</td><td>1.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td><td>1.000000e+00</td></tr>\n",
+ "<tr><th>75%</th><td>2.269432e+06</td><td>1.000000e+00</td><td>1.700000e+01</td><td>1.100000e+01</td><td>47.000000</td><td>1.000000e+00</td><td>0.000000e+00</td><td>0.000000e+00</td><td>1.000000e+00</td></tr>\n",
+ "<tr><th>max</th><td>3.025933e+06</td><td>1.000000e+01</td><td>3.208000e+03</td><td>1.204000e+03</td><td>3208.000000</td><td>1.000000e+00</td><td>3.600000e+02</td><td>1.020000e+02</td><td>9.200000e+01</td></tr>\n",
+ "</tbody>\n",
+ "</table>"
+ ],
+ "text/plain": [
+ " Row Problem View Step Duration (sec) \\\n",
+ "count 2.270384e+06 2.270384e+06 2.267551e+06 \n",
+ "mean 1.513120e+06 1.092910e+00 1.958364e+01 \n",
+ "std 8.736198e+05 3.448857e-01 4.768345e+01 \n",
+ "min 1.000000e+00 1.000000e+00 0.000000e+00 \n",
+ "25% 7.577408e+05 1.000000e+00 3.000000e+00 \n",
+ "50% 1.511844e+06 1.000000e+00 7.000000e+00 \n",
+ "75% 2.269432e+06 1.000000e+00 1.700000e+01 \n",
+ "max 3.025933e+06 1.000000e+01 3.208000e+03 \n",
+ "\n",
+ " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
+ "count 1.751638e+06 515913.000000 \n",
+ "mean 1.171716e+01 46.292087 \n",
+ "std 2.645318e+01 81.817794 \n",
+ "min 0.000000e+00 0.000000 \n",
+ "25% 3.000000e+00 11.000000 \n",
+ "50% 5.000000e+00 22.000000 \n",
+ "75% 1.100000e+01 47.000000 \n",
+ "max 1.204000e+03 3208.000000 \n",
+ "\n",
+ " Correct First Attempt Incorrects Hints Corrects \n",
+ "count 2.270384e+06 2.270384e+06 2.270384e+06 2.270384e+06 \n",
+ "mean 7.722359e-01 4.455044e-01 1.184311e-01 1.062878e+00 \n",
+ "std 4.193897e-01 2.000914e+00 6.199071e-01 6.894285e-01 \n",
+ "min 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 \n",
+ "25% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "50% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "75% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "max 1.000000e+00 3.600000e+02 1.020000e+02 9.200000e+01 "
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "data.describe()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "92cc0aab",
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Part of missing values for every column\n",
+ "Row 0.000000\n",
+ "Anon Student Id 0.000000\n",
+ "Problem Hierarchy 0.000000\n",
+ "Problem Name 0.000000\n",
+ "Problem View 0.000000\n",
+ "Step Name 0.000000\n",
+ "Step Start Time 0.001103\n",
+ "First Transaction Time 0.000000\n",
+ "Correct Transaction Time 0.034757\n",
+ "Step End Time 0.000000\n",
+ "Step Duration (sec) 0.001248\n",
+ "Correct Step Duration (sec) 0.228484\n",
+ "Error Step Duration (sec) 0.772764\n",
+ "Correct First Attempt 0.000000\n",
+ "Incorrects 0.000000\n",
+ "Hints 0.000000\n",
+ "Corrects 0.000000\n",
+ "KC(Default) 0.203407\n",
+ "Opportunity(Default) 0.203407\n",
+ "dtype: float64\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"Part of missing values for every column\")\n",
+ "print(data.isnull().sum() / len(data))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "0187b3b5",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "the number of records:\n",
+ "2270384\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"the number of records:\")\n",
+ "print(len(data))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "701b6633",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "how many students are there in the table:\n",
+ "1338\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"how many students are there in the table:\")\n",
+ "print(len(data['Anon Student Id'].unique()))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "bf7b246f",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "how many problems are there in the table:\n",
+ "91913\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"how many problems are there in the table:\")\n",
+ "print(len(data['Problem Name'].unique()))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e0602c47",
+ "metadata": {},
+ "source": [
+ "## Sort by Anon Student Id"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "id": "8051cc2b",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart, Top 40 students by number of steps they have done; x-axis: count; y-axis: Anon Student Id]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "ds = data['Anon Student Id'].value_counts().reset_index()\n",
+ "ds.columns = [\n",
+ " 'Anon Student Id',\n",
+ " 'count'\n",
+ "]\n",
+ "ds['Anon Student Id'] = ds['Anon Student Id'].astype(str) + '-'\n",
+ "ds = ds.sort_values('count').tail(40)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " ds,\n",
+ " x='count',\n",
+ " y='Anon Student Id',\n",
+ " orientation='h',\n",
+ " title='Top 40 students by number of steps they have done'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "66ef53ad",
+ "metadata": {},
+ "source": [
+ "## Percent of corrects, hints and incorrects"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "id": "c8f1539c",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: pie chart, Percent of corrects, hints and incorrects; corrects 65.3%, incorrects 27.4%, hints 7.28%]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "count_corrects = data['Corrects'].sum()\n",
+ "count_hints = data['Hints'].sum()\n",
+ "count_incorrects = data['Incorrects'].sum()\n",
+ "\n",
+ "total = count_corrects + count_hints + count_incorrects\n",
+ "\n",
+ "percent_corrects = count_corrects / total\n",
+ "percent_hints = count_hints / total\n",
+ "percent_incorrects = count_incorrects / total\n",
+ "\n",
+ "dfl = [['corrects', percent_corrects], ['hints', percent_hints], ['incorrects', percent_incorrects]]\n",
+ "\n",
+ "df = pd.DataFrame(dfl, columns=['transaction type', 'percent'])\n",
+ "\n",
+ "fig = px.pie(\n",
+ " df,\n",
+ " names='transaction type',\n",
+ " values='percent',\n",
+ " title='Percent of corrects, hints and incorrects'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3b097141",
+ "metadata": {},
+ "source": [
+ "## Sort by Problem Name"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "id": "6d668c43",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart, Top 40 useful problem; x-axis: count; y-axis: Problem Name]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "storeProblemCount = [1]\n",
+ "storeProblemName = [data['Problem Name'][0]]\n",
+ "currentProblemName = data['Problem Name'][0]\n",
+ "currentStepName = [data['Step Name'][0]]\n",
+ "lastIndex = 0\n",
+ "\n",
+ "for i in range(1, len(data)):\n",
+ "    pbNameI = data['Problem Name'][i]\n",
+ "    stNameI = data['Step Name'][i]\n",
+ "    if pbNameI != data['Problem Name'][lastIndex]:\n",
+ "        currentStepName = [stNameI]\n",
+ "        currentProblemName = pbNameI\n",
+ "        if pbNameI not in storeProblemName:\n",
+ "            storeProblemName.append(pbNameI)\n",
+ "            storeProblemCount.append(1)\n",
+ "        else:\n",
+ "            storeProblemCount[storeProblemName.index(pbNameI)] += 1\n",
+ "        lastIndex = i\n",
+ "    elif stNameI not in currentStepName:\n",
+ "        currentStepName.append(stNameI)\n",
+ "        lastIndex = i\n",
+ "    else:\n",
+ "        currentStepName = [stNameI]\n",
+ "        storeProblemCount[storeProblemName.index(pbNameI)] += 1\n",
+ "        lastIndex = i\n",
+ "\n",
+ "dfData = {\n",
+ "    'Problem Name': storeProblemName,\n",
+ "    'count': storeProblemCount\n",
+ "}\n",
+ "df = pd.DataFrame(dfData).sort_values('count').tail(40)\n",
+ "df[\"Problem Name\"] += '-'\n",
+ "\n",
+ "fig = px.bar(\n",
+ "    df,\n",
+ "    x='count',\n",
+ "    y='Problem Name',\n",
+ "    orientation='h',\n",
+ "    title='Top 40 useful problem'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "id": "1b965aa4",
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ " \n",
+ " "
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "application/vnd.plotly.v1+json": {
+ "config": {
+ "plotlyServerURL": "https://plot.ly"
+ },
+ "data": [
+ {
+ "alignmentgroup": "True",
+ "hovertemplate": "Correct rate=%{x} Problem Name=%{text} ",
+ "legendgroup": "",
+ "marker": {
+ "color": "#636efa",
+ "pattern": {
+ "shape": ""
+ }
+ },
+ "name": "",
+ "offsetgroup": "",
+ "orientation": "h",
+ "showlegend": false,
+ "text": [
+ "SYLT-2X&YGE-2X+9-",
+ "ROOTS1-001-",
+ "SY=2X&Y=-3X+5-",
+ "PROBABILITY1-006-",
+ "EG-RE-DISTRIB-05 (2x+4)/(3x+6)-",
+ "PROBABILITY1-070-",
+ "G3X-YLE5&3X-YGE15-",
+ "BUSES-",
+ "PEANUTS-CASHEWS-",
+ "EXPONENT2-012-",
+ "TVS3-",
+ "PROBABILITY5-001-",
+ "EXPONENT3-071-",
+ "EXPONENT2-046-",
+ "EXPONENT5-001-",
+ "PROBABILITY2-001-",
+ "GLFM-BUSES-",
+ "EG-EPS01-CONSTANT (4(x^2)*y^4)(3(x^3)*y^3)-",
+ "PROBABILITY6-002-",
+ "PROBABILITY6-001-"
+ ],
+ "textposition": "auto",
+ "type": "bar",
+ "x": [
+ 0.84,
+ 0.8442477876106195,
+ 0.848,
+ 0.8492462311557789,
+ 0.8505096262740657,
+ 0.8557142857142858,
+ 0.8559622195985832,
+ 0.8581843429960077,
+ 0.8608278344331134,
+ 0.8679504814305364,
+ 0.86878612716763,
+ 0.878392305049811,
+ 0.8799212598425197,
+ 0.8896797153024911,
+ 0.8966360856269113,
+ 0.9195046439628483,
+ 0.924791086350975,
+ 0.934416715031921,
+ 0.9473684210526315,
+ 0.9579413392363033
+ ],
+ "xaxis": "x",
+ "y": [
+ "SYLT-2X&YGE-2X+9-",
+ "ROOTS1-001-",
+ "SY=2X&Y=-3X+5-",
+ "PROBABILITY1-006-",
+ "EG-RE-DISTRIB-05 (2x+4)/(3x+6)-",
+ "PROBABILITY1-070-",
+ "G3X-YLE5&3X-YGE15-",
+ "BUSES-",
+ "PEANUTS-CASHEWS-",
+ "EXPONENT2-012-",
+ "TVS3-",
+ "PROBABILITY5-001-",
+ "EXPONENT3-071-",
+ "EXPONENT2-046-",
+ "EXPONENT5-001-",
+ "PROBABILITY2-001-",
+ "GLFM-BUSES-",
+ "EG-EPS01-CONSTANT (4(x^2)*y^4)(3(x^3)*y^3)-",
+ "PROBABILITY6-002-",
+ "PROBABILITY6-001-"
+ ],
+ "yaxis": "y"
+ }
+ ],
+ "layout": {
+ "barmode": "relative",
+ "legend": {
+ "tracegroupgap": 0
+ },
+ "template": {
+ "data": {
+ "bar": [
+ {
+ "error_x": {
+ "color": "#2a3f5f"
+ },
+ "error_y": {
+ "color": "#2a3f5f"
+ },
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "bar"
+ }
+ ],
+ "barpolar": [
+ {
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "barpolar"
+ }
+ ],
+ "carpet": [
+ {
+ "aaxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "baxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "type": "carpet"
+ }
+ ],
+ "choropleth": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "choropleth"
+ }
+ ],
+ "contour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "contour"
+ }
+ ],
+ "contourcarpet": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "contourcarpet"
+ }
+ ],
+ "heatmap": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmap"
+ }
+ ],
+ "heatmapgl": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmapgl"
+ }
+ ],
+ "histogram": [
+ {
+ "marker": {
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "histogram"
+ }
+ ],
+ "histogram2d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2d"
+ }
+ ],
+ "histogram2dcontour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2dcontour"
+ }
+ ],
+ "mesh3d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "mesh3d"
+ }
+ ],
+ "parcoords": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "parcoords"
+ }
+ ],
+ "pie": [
+ {
+ "automargin": true,
+ "type": "pie"
+ }
+ ],
+ "scatter": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter"
+ }
+ ],
+ "scatter3d": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter3d"
+ }
+ ],
+ "scattercarpet": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattercarpet"
+ }
+ ],
+ "scattergeo": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergeo"
+ }
+ ],
+ "scattergl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergl"
+ }
+ ],
+ "scattermapbox": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattermapbox"
+ }
+ ],
+ "scatterpolar": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolar"
+ }
+ ],
+ "scatterpolargl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolargl"
+ }
+ ],
+ "scatterternary": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterternary"
+ }
+ ],
+ "surface": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "surface"
+ }
+ ],
+ "table": [
+ {
+ "cells": {
+ "fill": {
+ "color": "#EBF0F8"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "header": {
+ "fill": {
+ "color": "#C8D4E3"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "type": "table"
+ }
+ ]
+ },
+ "layout": {
+ "annotationdefaults": {
+ "arrowcolor": "#2a3f5f",
+ "arrowhead": 0,
+ "arrowwidth": 1
+ },
+ "autotypenumbers": "strict",
+ "coloraxis": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "colorscale": {
+ "diverging": [
+ [
+ 0,
+ "#8e0152"
+ ],
+ [
+ 0.1,
+ "#c51b7d"
+ ],
+ [
+ 0.2,
+ "#de77ae"
+ ],
+ [
+ 0.3,
+ "#f1b6da"
+ ],
+ [
+ 0.4,
+ "#fde0ef"
+ ],
+ [
+ 0.5,
+ "#f7f7f7"
+ ],
+ [
+ 0.6,
+ "#e6f5d0"
+ ],
+ [
+ 0.7,
+ "#b8e186"
+ ],
+ [
+ 0.8,
+ "#7fbc41"
+ ],
+ [
+ 0.9,
+ "#4d9221"
+ ],
+ [
+ 1,
+ "#276419"
+ ]
+ ],
+ "sequential": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "sequentialminus": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ]
+ },
+ "colorway": [
+ "#636efa",
+ "#EF553B",
+ "#00cc96",
+ "#ab63fa",
+ "#FFA15A",
+ "#19d3f3",
+ "#FF6692",
+ "#B6E880",
+ "#FF97FF",
+ "#FECB52"
+ ],
+ "font": {
+ "color": "#2a3f5f"
+ },
+ "geo": {
+ "bgcolor": "white",
+ "lakecolor": "white",
+ "landcolor": "#E5ECF6",
+ "showlakes": true,
+ "showland": true,
+ "subunitcolor": "white"
+ },
+ "hoverlabel": {
+ "align": "left"
+ },
+ "hovermode": "closest",
+ "mapbox": {
+ "style": "light"
+ },
+ "paper_bgcolor": "white",
+ "plot_bgcolor": "#E5ECF6",
+ "polar": {
+ "angularaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "radialaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "scene": {
+ "xaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "yaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "zaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ }
+ },
+ "shapedefaults": {
+ "line": {
+ "color": "#2a3f5f"
+ }
+ },
+ "ternary": {
+ "aaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "baxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "caxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "title": {
+ "x": 0.05
+ },
+ "xaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ },
+ "yaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ }
+ }
+ },
+ "title": {
+ "text": "Correct rate of each problem (top 20) (total transactions of each problem are required to be more than 500)"
+ },
+ "xaxis": {
+ "anchor": "y",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "Correct rate"
+ }
+ },
+ "yaxis": {
+ "anchor": "x",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "Problem Name"
+ }
+ }
+ }
+ },
+ "text/html": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "application/vnd.plotly.v1+json": {
+ "config": {
+ "plotlyServerURL": "https://plot.ly"
+ },
+ "data": [
+ {
+ "alignmentgroup": "True",
+ "hovertemplate": "Correct rate=%{x} Problem Name=%{text} ",
+ "legendgroup": "",
+ "marker": {
+ "color": "#636efa",
+ "pattern": {
+ "shape": ""
+ }
+ },
+ "name": "",
+ "offsetgroup": "",
+ "orientation": "h",
+ "showlegend": false,
+ "text": [
+ "RATIONAL1-091-",
+ "RATIONAL1-165-",
+ "RATIONAL1-034-",
+ "RATIONAL1-058-",
+ "RATIONAL1-075-",
+ "RATIONAL1-035-",
+ "RATIONAL1-281-",
+ "RATIONAL1-261-",
+ "RATIONAL1-177-",
+ "RATIONAL1-064-",
+ "RATIONAL1-147-",
+ "RATIONAL1-021-",
+ "RATIONAL1-121-",
+ "RATIONAL1-008-",
+ "RATIONAL1-288-",
+ "RATIONAL1-109-",
+ "BH1T31B-",
+ "RXMX_3C-",
+ "EG-RE-CANCEL-04 ((x-9)^3)/(x-9)^4-",
+ "RATIONAL2-205-"
+ ],
+ "textposition": "auto",
+ "type": "bar",
+ "x": [
+ 0.07532467532467532,
+ 0.07955596669750231,
+ 0.0807799442896936,
+ 0.0975609756097561,
+ 0.09904153354632587,
+ 0.1016566265060241,
+ 0.1198501872659176,
+ 0.1217564870259481,
+ 0.1288888888888889,
+ 0.13678756476683937,
+ 0.14023732470334413,
+ 0.1404303510758777,
+ 0.14186851211072665,
+ 0.16796875,
+ 0.17764471057884232,
+ 0.24912280701754386,
+ 0.3157172271791352,
+ 0.329510366122629,
+ 0.3380703066566941,
+ 0.3424479166666667
+ ],
+ "xaxis": "x",
+ "y": [
+ "RATIONAL1-091-",
+ "RATIONAL1-165-",
+ "RATIONAL1-034-",
+ "RATIONAL1-058-",
+ "RATIONAL1-075-",
+ "RATIONAL1-035-",
+ "RATIONAL1-281-",
+ "RATIONAL1-261-",
+ "RATIONAL1-177-",
+ "RATIONAL1-064-",
+ "RATIONAL1-147-",
+ "RATIONAL1-021-",
+ "RATIONAL1-121-",
+ "RATIONAL1-008-",
+ "RATIONAL1-288-",
+ "RATIONAL1-109-",
+ "BH1T31B-",
+ "RXMX_3C-",
+ "EG-RE-CANCEL-04 ((x-9)^3)/(x-9)^4-",
+ "RATIONAL2-205-"
+ ],
+ "yaxis": "y"
+ }
+ ],
+ "layout": {
+ "barmode": "relative",
+ "legend": {
+ "tracegroupgap": 0
+ },
+ "template": {
+ "data": {
+ "bar": [
+ {
+ "error_x": {
+ "color": "#2a3f5f"
+ },
+ "error_y": {
+ "color": "#2a3f5f"
+ },
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "bar"
+ }
+ ],
+ "barpolar": [
+ {
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "barpolar"
+ }
+ ],
+ "carpet": [
+ {
+ "aaxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "baxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "type": "carpet"
+ }
+ ],
+ "choropleth": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "choropleth"
+ }
+ ],
+ "contour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "contour"
+ }
+ ],
+ "contourcarpet": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "contourcarpet"
+ }
+ ],
+ "heatmap": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmap"
+ }
+ ],
+ "heatmapgl": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmapgl"
+ }
+ ],
+ "histogram": [
+ {
+ "marker": {
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "histogram"
+ }
+ ],
+ "histogram2d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2d"
+ }
+ ],
+ "histogram2dcontour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2dcontour"
+ }
+ ],
+ "mesh3d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "mesh3d"
+ }
+ ],
+ "parcoords": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "parcoords"
+ }
+ ],
+ "pie": [
+ {
+ "automargin": true,
+ "type": "pie"
+ }
+ ],
+ "scatter": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter"
+ }
+ ],
+ "scatter3d": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter3d"
+ }
+ ],
+ "scattercarpet": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattercarpet"
+ }
+ ],
+ "scattergeo": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergeo"
+ }
+ ],
+ "scattergl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergl"
+ }
+ ],
+ "scattermapbox": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattermapbox"
+ }
+ ],
+ "scatterpolar": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolar"
+ }
+ ],
+ "scatterpolargl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolargl"
+ }
+ ],
+ "scatterternary": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterternary"
+ }
+ ],
+ "surface": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "surface"
+ }
+ ],
+ "table": [
+ {
+ "cells": {
+ "fill": {
+ "color": "#EBF0F8"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "header": {
+ "fill": {
+ "color": "#C8D4E3"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "type": "table"
+ }
+ ]
+ },
+ "layout": {
+ "annotationdefaults": {
+ "arrowcolor": "#2a3f5f",
+ "arrowhead": 0,
+ "arrowwidth": 1
+ },
+ "autotypenumbers": "strict",
+ "coloraxis": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "colorscale": {
+ "diverging": [
+ [
+ 0,
+ "#8e0152"
+ ],
+ [
+ 0.1,
+ "#c51b7d"
+ ],
+ [
+ 0.2,
+ "#de77ae"
+ ],
+ [
+ 0.3,
+ "#f1b6da"
+ ],
+ [
+ 0.4,
+ "#fde0ef"
+ ],
+ [
+ 0.5,
+ "#f7f7f7"
+ ],
+ [
+ 0.6,
+ "#e6f5d0"
+ ],
+ [
+ 0.7,
+ "#b8e186"
+ ],
+ [
+ 0.8,
+ "#7fbc41"
+ ],
+ [
+ 0.9,
+ "#4d9221"
+ ],
+ [
+ 1,
+ "#276419"
+ ]
+ ],
+ "sequential": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "sequentialminus": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ]
+ },
+ "colorway": [
+ "#636efa",
+ "#EF553B",
+ "#00cc96",
+ "#ab63fa",
+ "#FFA15A",
+ "#19d3f3",
+ "#FF6692",
+ "#B6E880",
+ "#FF97FF",
+ "#FECB52"
+ ],
+ "font": {
+ "color": "#2a3f5f"
+ },
+ "geo": {
+ "bgcolor": "white",
+ "lakecolor": "white",
+ "landcolor": "#E5ECF6",
+ "showlakes": true,
+ "showland": true,
+ "subunitcolor": "white"
+ },
+ "hoverlabel": {
+ "align": "left"
+ },
+ "hovermode": "closest",
+ "mapbox": {
+ "style": "light"
+ },
+ "paper_bgcolor": "white",
+ "plot_bgcolor": "#E5ECF6",
+ "polar": {
+ "angularaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "radialaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "scene": {
+ "xaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "yaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "zaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ }
+ },
+ "shapedefaults": {
+ "line": {
+ "color": "#2a3f5f"
+ }
+ },
+ "ternary": {
+ "aaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "baxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "caxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "title": {
+ "x": 0.05
+ },
+ "xaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ },
+ "yaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ }
+ }
+ },
+ "title": {
+ "font": {
+ "size": 15
+ },
+ "text": "Correct rate of each problem (bottom 20) (total transactions of each problem are required to be more than 500)"
+ },
+ "xaxis": {
+ "anchor": "y",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "Correct rate"
+ }
+ },
+ "yaxis": {
+ "anchor": "x",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "Problem Name"
+ }
+ }
+ }
+ },
+ "text/html": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Total number of transactions (incorrects + hints + corrects) per step.\n",
+ "data['total transactions'] = data['Incorrects'] + data['Hints'] + data['Corrects']\n",
+ "# Sum transactions and corrects per problem, then compute the correct rate.\n",
+ "df1 = data.groupby('Problem Name')[['total transactions', 'Corrects']].sum().reset_index()\n",
+ "df1['Correct rate'] = df1['Corrects'] / df1['total transactions']\n",
+ "\n",
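+ "# Keep only problems with more than `standard` total transactions.\n",
+ "# Correct rates computed from only a few transactions are noisy,\n",
+ "# so problems at or below the threshold are excluded.\n",
+ "standard = 500\n",
+ "df1 = df1[df1['total transactions'] > standard]\n",
+ "\n",
+ "# Sort ascending by correct rate: head(20) then holds the problems\n",
+ "# with the lowest rates and tail(20) those with the highest.\n",
+ "df1 = df1.sort_values('Correct rate')\n",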
+ "\n",
+ "df1['Problem Name'] = df1['Problem Name'].astype(str) + \"-\"\n",
+ "\n",
+ "df_px = df1.tail(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ "    df_px,\n",
+ "    x='Correct rate',\n",
+ "    y='Problem Name',\n",
+ "    orientation='h',\n",
+ "    title='Correct rate of each problem (top 20) (total transactions of \\\n",
+ "each problem are required to be more than 500)',\n",
+ "    text='Problem Name'\n",
+ ")\n",
+ "fig.show(\"svg\")\n",
+ "\n",
+ "df_px = df1.head(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ "    df_px,\n",
+ "    x='Correct rate',\n",
+ "    y='Problem Name',\n",
+ "    orientation='h',\n",
+ "    title='Correct rate of each problem (bottom 20) (total transactions of \\\n",
+ "each problem are required to be more than 500)',\n",
+ "    text='Problem Name'\n",
+ ")\n",
+ "fig.update_layout(title_font_size=15)\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2bee6e04",
+ "metadata": {},
+ "source": [
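+ "These two figures show the 20 problems with the highest and the 20 problems with the lowest correct rate, restricted to problems with more than 500 total transactions. Problems with a low correct rate deserve more attention from teachers and students."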
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0047e839",
+ "metadata": {},
+ "source": [
+ "## Sort by KC"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "c6e910e0",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.plotly.v1+json": {
+ "config": {
+ "plotlyServerURL": "https://plot.ly"
+ },
+ "data": [
+ {
+ "alignmentgroup": "True",
+ "hovertemplate": "correct rate=%{x} KC(Default)=%{text} ",
+ "legendgroup": "",
+ "marker": {
+ "color": "#636efa",
+ "pattern": {
+ "shape": ""
+ }
+ },
+ "name": "",
+ "offsetgroup": "",
+ "orientation": "h",
+ "showlegend": false,
+ "text": [
+ "[Rule: RF right ([SolverOperation rf],{components with property canReduceFractionsNoMultWhole of right})]-",
+ "Enter second extreme in equation-",
+ "Proportional-Constant-Expression-Gla:Student-Modeling-Analysis-",
+ "Enter Calculated value of rate-",
+ "[SkillRule: Combine like terms, no var; CLT]-",
+ "Enter square of leg label-",
+ "Compare medians - removed outlier-",
+ "Enter number of total outcomes in table-",
+ "Find square of given leg-",
+ "Enter ratio quantity to right of \"to\"-",
+ "Enter fractional probability of event-",
+ "[SkillRule: Select Multiply; {MT; MT no fraction coeff}]-",
+ "Write base of exponential from given whole number as product-",
+ "Write decimal multiplier from given scientific notation-",
+ "Enter ratio quantity to right of colon-",
+ "PM-ROW-1-",
+ "Changing axis intervals-",
+ "unspecified-",
+ "[Rule: unnec-elems ([SolverOperation unnec-elems],)]-",
+ "Select second event-"
+ ],
+ "textposition": "auto",
+ "type": "bar",
+ "x": [
+ 0.9094659771050032,
+ 0.9125055236411843,
+ 0.9135446685878963,
+ 0.9155947136563877,
+ 0.9156931229676021,
+ 0.9194915254237288,
+ 0.9218241042345277,
+ 0.9247135842880524,
+ 0.9323943661971831,
+ 0.9386369807222373,
+ 0.9389944134078212,
+ 0.9399566697616837,
+ 0.9405666897028334,
+ 0.940814757878555,
+ 0.9422934648581998,
+ 0.9431375603676585,
+ 0.9706148701992291,
+ 0.9769823066841415,
+ 0.9838308457711443,
+ 0.987603305785124
+ ],
+ "xaxis": "x",
+ "y": [
+ "[Rule: RF right ([SolverOperation rf],{components with property canReduceFractionsNoMultWhole of right})]-",
+ "Enter second extreme in equation-",
+ "Proportional-Constant-Expression-Gla:Student-Modeling-Analysis-",
+ "Enter Calculated value of rate-",
+ "[SkillRule: Combine like terms, no var; CLT]-",
+ "Enter square of leg label-",
+ "Compare medians - removed outlier-",
+ "Enter number of total outcomes in table-",
+ "Find square of given leg-",
+ "Enter ratio quantity to right of \"to\"-",
+ "Enter fractional probability of event-",
+ "[SkillRule: Select Multiply; {MT; MT no fraction coeff}]-",
+ "Write base of exponential from given whole number as product-",
+ "Write decimal multiplier from given scientific notation-",
+ "Enter ratio quantity to right of colon-",
+ "PM-ROW-1-",
+ "Changing axis intervals-",
+ "unspecified-",
+ "[Rule: unnec-elems ([SolverOperation unnec-elems],)]-",
+ "Select second event-"
+ ],
+ "yaxis": "y"
+ }
+ ],
+ "layout": {
+ "barmode": "relative",
+ "legend": {
+ "tracegroupgap": 0
+ },
+ "template": {
+ "data": {
+ "bar": [
+ {
+ "error_x": {
+ "color": "#2a3f5f"
+ },
+ "error_y": {
+ "color": "#2a3f5f"
+ },
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "bar"
+ }
+ ],
+ "barpolar": [
+ {
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "barpolar"
+ }
+ ],
+ "carpet": [
+ {
+ "aaxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "baxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "type": "carpet"
+ }
+ ],
+ "choropleth": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "choropleth"
+ }
+ ],
+ "contour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "contour"
+ }
+ ],
+ "contourcarpet": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "contourcarpet"
+ }
+ ],
+ "heatmap": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmap"
+ }
+ ],
+ "heatmapgl": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmapgl"
+ }
+ ],
+ "histogram": [
+ {
+ "marker": {
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "histogram"
+ }
+ ],
+ "histogram2d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2d"
+ }
+ ],
+ "histogram2dcontour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2dcontour"
+ }
+ ],
+ "mesh3d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "mesh3d"
+ }
+ ],
+ "parcoords": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "parcoords"
+ }
+ ],
+ "pie": [
+ {
+ "automargin": true,
+ "type": "pie"
+ }
+ ],
+ "scatter": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter"
+ }
+ ],
+ "scatter3d": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter3d"
+ }
+ ],
+ "scattercarpet": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattercarpet"
+ }
+ ],
+ "scattergeo": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergeo"
+ }
+ ],
+ "scattergl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergl"
+ }
+ ],
+ "scattermapbox": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattermapbox"
+ }
+ ],
+ "scatterpolar": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolar"
+ }
+ ],
+ "scatterpolargl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolargl"
+ }
+ ],
+ "scatterternary": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterternary"
+ }
+ ],
+ "surface": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "surface"
+ }
+ ],
+ "table": [
+ {
+ "cells": {
+ "fill": {
+ "color": "#EBF0F8"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "header": {
+ "fill": {
+ "color": "#C8D4E3"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "type": "table"
+ }
+ ]
+ },
+ "layout": {
+ "annotationdefaults": {
+ "arrowcolor": "#2a3f5f",
+ "arrowhead": 0,
+ "arrowwidth": 1
+ },
+ "autotypenumbers": "strict",
+ "coloraxis": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "colorscale": {
+ "diverging": [
+ [
+ 0,
+ "#8e0152"
+ ],
+ [
+ 0.1,
+ "#c51b7d"
+ ],
+ [
+ 0.2,
+ "#de77ae"
+ ],
+ [
+ 0.3,
+ "#f1b6da"
+ ],
+ [
+ 0.4,
+ "#fde0ef"
+ ],
+ [
+ 0.5,
+ "#f7f7f7"
+ ],
+ [
+ 0.6,
+ "#e6f5d0"
+ ],
+ [
+ 0.7,
+ "#b8e186"
+ ],
+ [
+ 0.8,
+ "#7fbc41"
+ ],
+ [
+ 0.9,
+ "#4d9221"
+ ],
+ [
+ 1,
+ "#276419"
+ ]
+ ],
+ "sequential": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "sequentialminus": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ]
+ },
+ "colorway": [
+ "#636efa",
+ "#EF553B",
+ "#00cc96",
+ "#ab63fa",
+ "#FFA15A",
+ "#19d3f3",
+ "#FF6692",
+ "#B6E880",
+ "#FF97FF",
+ "#FECB52"
+ ],
+ "font": {
+ "color": "#2a3f5f"
+ },
+ "geo": {
+ "bgcolor": "white",
+ "lakecolor": "white",
+ "landcolor": "#E5ECF6",
+ "showlakes": true,
+ "showland": true,
+ "subunitcolor": "white"
+ },
+ "hoverlabel": {
+ "align": "left"
+ },
+ "hovermode": "closest",
+ "mapbox": {
+ "style": "light"
+ },
+ "paper_bgcolor": "white",
+ "plot_bgcolor": "#E5ECF6",
+ "polar": {
+ "angularaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "radialaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "scene": {
+ "xaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "yaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "zaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ }
+ },
+ "shapedefaults": {
+ "line": {
+ "color": "#2a3f5f"
+ }
+ },
+ "ternary": {
+ "aaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "baxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "caxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "title": {
+ "x": 0.05
+ },
+ "xaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ },
+ "yaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ }
+ }
+ },
+ "title": {
+ "text": "Correct rate of each KC(Default) (top 20) (total transactions of each KC are required to be more than 300)"
+ },
+ "xaxis": {
+ "anchor": "y",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "correct rate"
+ }
+ },
+ "yaxis": {
+ "anchor": "x",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "KC(Default)"
+ },
+ "visible": false
+ }
+ }
+ },
+ "text/html": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "application/vnd.plotly.v1+json": {
+ "config": {
+ "plotlyServerURL": "https://plot.ly"
+ },
+ "data": [
+ {
+ "alignmentgroup": "True",
+ "hovertemplate": "correct rate=%{x} KC(Default)=%{text} ",
+ "legendgroup": "",
+ "marker": {
+ "color": "#636efa",
+ "pattern": {
+ "shape": ""
+ }
+ },
+ "name": "",
+ "offsetgroup": "",
+ "orientation": "h",
+ "showlegend": false,
+ "text": [
+ "unknown bug element-",
+ "[SkillRule: Done?; {doneleft; doneright; doneleft, no menu; doneright, nomenu; done no solution; Done No Solution, domain exception; Done No Solution, range exception; done infinite solutions; done expr, standard form; done expr, standard form, no menu}]-",
+ "unknown bug element~~CLT-ROW-1-COEFF-",
+ "unknown bug element~~CLT-ROW-1-",
+ "Plot terminating improper fractions-",
+ "Plot decimal - thousandths-",
+ "Plot terminating mixed number-",
+ "Plot imperfect radical-",
+ "Plot decimal - hundredths-",
+ "Setting the slope-",
+ "Plot terminating proper fraction-",
+ "Plot percent-",
+ "Plot non-terminating proper fraction-",
+ "Plot non-terminating improper fraction-",
+ "Entering slope, GLF-",
+ "Placing coordinate point-",
+ "Plot decimal - tenths-",
+ "Finding the intersection, SIF-",
+ "Finding the intersection, Mixed-",
+ "[Rule: CLT nested, no LCD ([SolverOperation clt],)]-"
+ ],
+ "textposition": "auto",
+ "type": "bar",
+ "x": [
+ 0,
+ 0,
+ 0.014072847682119206,
+ 0.014477766287487074,
+ 0.14357160282995854,
+ 0.19899605593402653,
+ 0.20798727690404664,
+ 0.2222222222222222,
+ 0.22783707253321717,
+ 0.2335486778846154,
+ 0.24001726742931145,
+ 0.2440093512565751,
+ 0.26698670605613,
+ 0.29059485530546625,
+ 0.3,
+ 0.3083741984156922,
+ 0.3113859585303747,
+ 0.3129411764705882,
+ 0.31905781584582443,
+ 0.3395784543325527
+ ],
+ "xaxis": "x",
+ "y": [
+ "unknown bug element-",
+ "[SkillRule: Done?; {doneleft; doneright; doneleft, no menu; doneright, nomenu; done no solution; Done No Solution, domain exception; Done No Solution, range exception; done infinite solutions; done expr, standard form; done expr, standard form, no menu}]-",
+ "unknown bug element~~CLT-ROW-1-COEFF-",
+ "unknown bug element~~CLT-ROW-1-",
+ "Plot terminating improper fractions-",
+ "Plot decimal - thousandths-",
+ "Plot terminating mixed number-",
+ "Plot imperfect radical-",
+ "Plot decimal - hundredths-",
+ "Setting the slope-",
+ "Plot terminating proper fraction-",
+ "Plot percent-",
+ "Plot non-terminating proper fraction-",
+ "Plot non-terminating improper fraction-",
+ "Entering slope, GLF-",
+ "Placing coordinate point-",
+ "Plot decimal - tenths-",
+ "Finding the intersection, SIF-",
+ "Finding the intersection, Mixed-",
+ "[Rule: CLT nested, no LCD ([SolverOperation clt],)]-"
+ ],
+ "yaxis": "y"
+ }
+ ],
+ "layout": {
+ "barmode": "relative",
+ "legend": {
+ "tracegroupgap": 0
+ },
+ "template": {
+ "data": {
+ "bar": [
+ {
+ "error_x": {
+ "color": "#2a3f5f"
+ },
+ "error_y": {
+ "color": "#2a3f5f"
+ },
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "bar"
+ }
+ ],
+ "barpolar": [
+ {
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "barpolar"
+ }
+ ],
+ "carpet": [
+ {
+ "aaxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "baxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "type": "carpet"
+ }
+ ],
+ "choropleth": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "choropleth"
+ }
+ ],
+ "contour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "contour"
+ }
+ ],
+ "contourcarpet": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "contourcarpet"
+ }
+ ],
+ "heatmap": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmap"
+ }
+ ],
+ "heatmapgl": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmapgl"
+ }
+ ],
+ "histogram": [
+ {
+ "marker": {
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "histogram"
+ }
+ ],
+ "histogram2d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2d"
+ }
+ ],
+ "histogram2dcontour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2dcontour"
+ }
+ ],
+ "mesh3d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "mesh3d"
+ }
+ ],
+ "parcoords": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "parcoords"
+ }
+ ],
+ "pie": [
+ {
+ "automargin": true,
+ "type": "pie"
+ }
+ ],
+ "scatter": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter"
+ }
+ ],
+ "scatter3d": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter3d"
+ }
+ ],
+ "scattercarpet": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattercarpet"
+ }
+ ],
+ "scattergeo": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergeo"
+ }
+ ],
+ "scattergl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergl"
+ }
+ ],
+ "scattermapbox": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattermapbox"
+ }
+ ],
+ "scatterpolar": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolar"
+ }
+ ],
+ "scatterpolargl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolargl"
+ }
+ ],
+ "scatterternary": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterternary"
+ }
+ ],
+ "surface": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "surface"
+ }
+ ],
+ "table": [
+ {
+ "cells": {
+ "fill": {
+ "color": "#EBF0F8"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "header": {
+ "fill": {
+ "color": "#C8D4E3"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "type": "table"
+ }
+ ]
+ },
+ "layout": {
+ "annotationdefaults": {
+ "arrowcolor": "#2a3f5f",
+ "arrowhead": 0,
+ "arrowwidth": 1
+ },
+ "autotypenumbers": "strict",
+ "coloraxis": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "colorscale": {
+ "diverging": [
+ [
+ 0,
+ "#8e0152"
+ ],
+ [
+ 0.1,
+ "#c51b7d"
+ ],
+ [
+ 0.2,
+ "#de77ae"
+ ],
+ [
+ 0.3,
+ "#f1b6da"
+ ],
+ [
+ 0.4,
+ "#fde0ef"
+ ],
+ [
+ 0.5,
+ "#f7f7f7"
+ ],
+ [
+ 0.6,
+ "#e6f5d0"
+ ],
+ [
+ 0.7,
+ "#b8e186"
+ ],
+ [
+ 0.8,
+ "#7fbc41"
+ ],
+ [
+ 0.9,
+ "#4d9221"
+ ],
+ [
+ 1,
+ "#276419"
+ ]
+ ],
+ "sequential": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "sequentialminus": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ]
+ },
+ "colorway": [
+ "#636efa",
+ "#EF553B",
+ "#00cc96",
+ "#ab63fa",
+ "#FFA15A",
+ "#19d3f3",
+ "#FF6692",
+ "#B6E880",
+ "#FF97FF",
+ "#FECB52"
+ ],
+ "font": {
+ "color": "#2a3f5f"
+ },
+ "geo": {
+ "bgcolor": "white",
+ "lakecolor": "white",
+ "landcolor": "#E5ECF6",
+ "showlakes": true,
+ "showland": true,
+ "subunitcolor": "white"
+ },
+ "hoverlabel": {
+ "align": "left"
+ },
+ "hovermode": "closest",
+ "mapbox": {
+ "style": "light"
+ },
+ "paper_bgcolor": "white",
+ "plot_bgcolor": "#E5ECF6",
+ "polar": {
+ "angularaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "radialaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "scene": {
+ "xaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "yaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "zaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ }
+ },
+ "shapedefaults": {
+ "line": {
+ "color": "#2a3f5f"
+ }
+ },
+ "ternary": {
+ "aaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "baxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "caxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "title": {
+ "x": 0.05
+ },
+ "xaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ },
+ "yaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ }
+ }
+ },
+ "title": {
+ "font": {
+ "size": 15
+ },
+ "text": "Correct rate of each KC(Default) (bottom 20) (total transactions of each KC are required to be more than 300)"
+ },
+ "xaxis": {
+ "anchor": "y",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "correct rate"
+ }
+ },
+ "yaxis": {
+ "anchor": "x",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "KC(Default)"
+ },
+ "visible": false
+ }
+ }
+ },
+ "text/html": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "data.dropna(subset=['KC(Default)'], inplace=True)\n",
+ "\n",
+ "data['total transactions'] = data['Corrects'] + data['Hints'] + data['Incorrects']\n",
+ "df1 = data.groupby('KC(Default)')['total transactions'].sum().reset_index()\n",
+ "df2 = data.groupby('KC(Default)')['Corrects'].sum().reset_index()\n",
+ "df1['Corrects'] = df2['Corrects']\n",
+ "df1['correct rate'] = df1['Corrects'] / df1['total transactions']\n",
+ "\n",
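+ "# Keep only KCs with more than `standard` total transactions.\n",
+ "# A boolean mask selects the same rows as counting the qualifying\n",
+ "# KCs and taking that many rows from the tail of a sort.\n",
+ "standard = 300\n",
+ "mask = df1['total transactions'] > standard\n",
+ "df1 = df1[mask]\n",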
+ "\n",
+ "df1 = df1.sort_values('correct rate')\n",
+ "\n",
+ "df1['KC(Default)'] = df1['KC(Default)'].astype(str) + '-'\n",
+ "\n",
+ "df_px = df1.tail(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " df_px,\n",
+ " x='correct rate',\n",
+ " y='KC(Default)',\n",
+ " orientation='h',\n",
+ " title='Correct rate of each KC(Default) (top 20) (total transactions of \\\n",
+ "each KC are required to be more than 300)',\n",
+ " text='KC(Default)'\n",
+ ")\n",
+ "fig.update_yaxes(visible=False)\n",
+ "fig.show()\n",
+ "\n",
+ "df_px = df1.head(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " df_px,\n",
+ " x='correct rate',\n",
+ " y='KC(Default)',\n",
+ " orientation='h',\n",
+ " title='Correct rate of each KC(Default) (bottom 20) (total transactions of \\\n",
+ "each KC are required to be more than 300)',\n",
+ " text='KC(Default)'\n",
+ ")\n",
+ "fig.update_yaxes(visible=False)\n",
+ "fig.update_layout(title_font_size=15)\n",
+ "fig.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0feef8a1",
+ "metadata": {},
+ "source": [
+ "These two figures show the correct rate of each KC: the 20 highest and the 20 lowest among KCs with more than 300 total transactions. KCs with a low correct rate deserve more attention from teachers and students."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "22d99527",
+ "metadata": {},
+ "source": [
+ "## Postscript"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "09bc0903",
+ "metadata": {},
+ "source": [
+ "The whole data package consists of 5 data sets, and the data files in them that can be used for analysis share the same format. The analysis above, based on \"algebra_2006_2007_train\", is therefore just one example of data analysis on KDD Cup 2010; the same code can be used to analyse the other data files after small changes to the file path and column names.\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.9.6"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
From b6e28c0cc398d58480710b86b733be3fc1ca170f Mon Sep 17 00:00:00 2001
From: gguu1314 <88154940+gguu1314@users.noreply.github.com>
Date: Sat, 14 Aug 2021 12:34:21 +0800
Subject: [PATCH 06/10] Delete KDD Cup 2010.ipynb
---
docs/KDD Cup 2010.ipynb | 4925 ---------------------------------------
1 file changed, 4925 deletions(-)
delete mode 100644 docs/KDD Cup 2010.ipynb
diff --git a/docs/KDD Cup 2010.ipynb b/docs/KDD Cup 2010.ipynb
deleted file mode 100644
index a662588..0000000
--- a/docs/KDD Cup 2010.ipynb
+++ /dev/null
@@ -1,4925 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "id": "e002fdf8",
- "metadata": {},
- "source": [
- "# KDD Cup 2010 —— Data Analysis on algebra_2006_2007_train"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "429152ff",
- "metadata": {},
- "source": [
- "## Data Description"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "2c89d116",
- "metadata": {},
- "source": [
- "### Column Description"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f590eee5",
- "metadata": {},
- "source": [
- "| Attribute | Annotaion |\n",
- "|:--:|---|\n",
- "|Row|The row number|\n",
- "| Anon Student Id | Unique, anonymous identifier for a student |\n",
- "| Problem Hierarchy | The hierarchy of curriculum levels containing the problem |\n",
- "| Problem Name | Unique identifier for a problem |\n",
- "| Problem View | The total number of times the student encountered the problem so far |\n",
- "| Step Name | Unique identifier for one of the steps in a problem |\n",
- "| Step Start Time | The starting time of the step (Can be null) |\n",
- "| First Transaction Time | The time of the first transaction toward the step |\n",
- "| Correct Transaction Time | The time of the correct attempt toward the step, if there was one |\n",
- "| Step End Time | The time of the last transaction toward the step |\n",
- "| Step Duration (sec) | The elapsed time of the step in seconds, calculated by adding all of the durations for transactions that were attributed to the step (Can be null if step start time is null) |\n",
- "| Correct Step Duration (sec) | The step duration if the first attempt for the step was correct |\n",
- "| Error Step Duration (sec) | The step duration if the first attempt for the step was an error (incorrect attempt or hint request) |\n",
- "| Correct First Attempt | The tutor's evaluation of the student's first attempt on the step—1 if correct, 0 if an error |\n",
- "| Incorrects | Total number of incorrect attempts by the student on the step |\n",
- "| Hints | Total number of hints requested by the student for the step |\n",
- "| Corrects | Total correct attempts by the student for the step (only increases if the step is encountered more than once) |\n",
- "| KC(KC Model Name) | The identified skills that are used in a problem, where available |\n",
- "| Opportunity(KC Model Name) | A count that increases by one each time the student encounters a step with the listed knowledge component |\n",
- "|| Additional KC models, which exist for the challenge data sets, will appear as additional pairs of columns (KC and Opportunity columns for each model) |"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "2c2a2d3e",
- "metadata": {},
- "source": [
- "For the test portion of the challenge data sets, values will not be provided for the following columns:"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f19eb949",
- "metadata": {},
- "source": [
- "♦ Step Start Time\n",
- "\n",
- "♦ First Transaction Time\n",
- "\n",
- "♦ Correct Transaction Time\n",
- "\n",
- "♦ Step End Time\n",
- "\n",
- "♦ Step Duration (sec)\n",
- "\n",
- "♦ Correct Step Duration (sec)\n",
- "\n",
- "♦ Error Step Duration (sec)\n",
- "\n",
- "♦ Correct First Attempt\n",
- "\n",
- "♦ Incorrects\n",
- "\n",
- "♦ Hints\n",
- "\n",
- "♦ Corrects"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "id": "123674b7",
- "metadata": {},
- "outputs": [],
- "source": [
- "import pandas as pd\n",
- "import plotly.express as px"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "id": "efa6be16",
- "metadata": {},
- "outputs": [],
- "source": [
- "path = \"algebra_2006_2007_train.txt\"\n",
- "data = pd.read_table(path, encoding=\"ISO-8859-15\", low_memory=False)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "993f1986",
- "metadata": {},
- "source": [
- "## Record Examples"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "id": "8b2af14e",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "\n",
- "\n",
- "\n",
- " \n",
- " \n",
- " \n",
- " Row \n",
- " Anon Student Id \n",
- " Problem Hierarchy \n",
- " Problem Name \n",
- " Problem View \n",
- " Step Name \n",
- " Step Start Time \n",
- " First Transaction Time \n",
- " Correct Transaction Time \n",
- " Step End Time \n",
- " Step Duration (sec) \n",
- " Correct Step Duration (sec) \n",
- " Error Step Duration (sec) \n",
- " Correct First Attempt \n",
- " Incorrects \n",
- " Hints \n",
- " Corrects \n",
- " KC(Default) \n",
- " Opportunity(Default) \n",
- " \n",
- " \n",
- " \n",
- " \n",
- " 0 \n",
- " 1 \n",
- " JG4Tz \n",
- " Unit CTA1_01, Section CTA1_01-1 \n",
- " LDEMO_WKST \n",
- " 1 \n",
- " R1C1 \n",
- " 2006-10-26 09:51:58.0 \n",
- " 2006-10-26 09:52:30.0 \n",
- " 2006-10-26 09:53:30.0 \n",
- " 2006-10-26 09:53:30.0 \n",
- " 92.0 \n",
- " NaN \n",
- " 92.0 \n",
- " 0 \n",
- " 2 \n",
- " 0 \n",
- " 1 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- " 1 \n",
- " 2 \n",
- " JG4Tz \n",
- " Unit CTA1_01, Section CTA1_01-1 \n",
- " LDEMO_WKST \n",
- " 1 \n",
- " R1C2 \n",
- " 2006-10-26 09:53:30.0 \n",
- " 2006-10-26 09:53:41.0 \n",
- " 2006-10-26 09:53:41.0 \n",
- " 2006-10-26 09:53:41.0 \n",
- " 11.0 \n",
- " 11.0 \n",
- " NaN \n",
- " 1 \n",
- " 0 \n",
- " 0 \n",
- " 1 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- " 2 \n",
- " 3 \n",
- " JG4Tz \n",
- " Unit CTA1_01, Section CTA1_01-1 \n",
- " LDEMO_WKST \n",
- " 1 \n",
- " R2C1 \n",
- " 2006-10-26 09:53:41.0 \n",
- " 2006-10-26 09:53:46.0 \n",
- " 2006-10-26 09:53:46.0 \n",
- " 2006-10-26 09:53:46.0 \n",
- " 5.0 \n",
- " 5.0 \n",
- " NaN \n",
- " 1 \n",
- " 0 \n",
- " 0 \n",
- " 1 \n",
- " Identifying units \n",
- " 1 \n",
- " \n",
- " \n",
- " 3 \n",
- " 4 \n",
- " JG4Tz \n",
- " Unit CTA1_01, Section CTA1_01-1 \n",
- " LDEMO_WKST \n",
- " 1 \n",
- " R2C2 \n",
- " 2006-10-26 09:53:46.0 \n",
- " 2006-10-26 09:53:50.0 \n",
- " 2006-10-26 09:53:50.0 \n",
- " 2006-10-26 09:53:50.0 \n",
- " 4.0 \n",
- " 4.0 \n",
- " NaN \n",
- " 1 \n",
- " 0 \n",
- " 0 \n",
- " 1 \n",
- " Identifying units \n",
- " 2 \n",
- " \n",
- " \n",
- " 4 \n",
- " 5 \n",
- " JG4Tz \n",
- " Unit CTA1_01, Section CTA1_01-1 \n",
- " LDEMO_WKST \n",
- " 1 \n",
- " R4C1 \n",
- " 2006-10-26 09:53:50.0 \n",
- " 2006-10-26 09:54:05.0 \n",
- " 2006-10-26 09:54:05.0 \n",
- " 2006-10-26 09:54:05.0 \n",
- " 15.0 \n",
- " 15.0 \n",
- " NaN \n",
- " 1 \n",
- " 0 \n",
- " 0 \n",
- " 1 \n",
- " Entering a given \n",
- " 1 \n",
- " \n",
- " \n",
- "\n",
- ""
- ],
- "text/plain": [
- " Row Anon Student Id Problem Hierarchy Problem Name \\\n",
- "0 1 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "1 2 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "2 3 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "3 4 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "4 5 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "\n",
- " Problem View Step Name Step Start Time First Transaction Time \\\n",
- "0 1 R1C1 2006-10-26 09:51:58.0 2006-10-26 09:52:30.0 \n",
- "1 1 R1C2 2006-10-26 09:53:30.0 2006-10-26 09:53:41.0 \n",
- "2 1 R2C1 2006-10-26 09:53:41.0 2006-10-26 09:53:46.0 \n",
- "3 1 R2C2 2006-10-26 09:53:46.0 2006-10-26 09:53:50.0 \n",
- "4 1 R4C1 2006-10-26 09:53:50.0 2006-10-26 09:54:05.0 \n",
- "\n",
- " Correct Transaction Time Step End Time Step Duration (sec) \\\n",
- "0 2006-10-26 09:53:30.0 2006-10-26 09:53:30.0 92.0 \n",
- "1 2006-10-26 09:53:41.0 2006-10-26 09:53:41.0 11.0 \n",
- "2 2006-10-26 09:53:46.0 2006-10-26 09:53:46.0 5.0 \n",
- "3 2006-10-26 09:53:50.0 2006-10-26 09:53:50.0 4.0 \n",
- "4 2006-10-26 09:54:05.0 2006-10-26 09:54:05.0 15.0 \n",
- "\n",
- " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
- "0 NaN 92.0 \n",
- "1 11.0 NaN \n",
- "2 5.0 NaN \n",
- "3 4.0 NaN \n",
- "4 15.0 NaN \n",
- "\n",
- " Correct First Attempt Incorrects Hints Corrects KC(Default) \\\n",
- "0 0 2 0 1 NaN \n",
- "1 1 0 0 1 NaN \n",
- "2 1 0 0 1 Identifying units \n",
- "3 1 0 0 1 Identifying units \n",
- "4 1 0 0 1 Entering a given \n",
- "\n",
- " Opportunity(Default) \n",
- "0 NaN \n",
- "1 NaN \n",
- "2 1 \n",
- "3 2 \n",
- "4 1 "
- ]
- },
- "execution_count": 3,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "pd.set_option('display.max_column', 500)\n",
- "data.head()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "id": "9d5e5859",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "[HTML rendering of the data.describe() summary table; markup lost in extraction, values identical to the text/plain output below]"
- ],
- "text/plain": [
- " Row Problem View Step Duration (sec) \\\n",
- "count 2.270384e+06 2.270384e+06 2.267551e+06 \n",
- "mean 1.513120e+06 1.092910e+00 1.958364e+01 \n",
- "std 8.736198e+05 3.448857e-01 4.768345e+01 \n",
- "min 1.000000e+00 1.000000e+00 0.000000e+00 \n",
- "25% 7.577408e+05 1.000000e+00 3.000000e+00 \n",
- "50% 1.511844e+06 1.000000e+00 7.000000e+00 \n",
- "75% 2.269432e+06 1.000000e+00 1.700000e+01 \n",
- "max 3.025933e+06 1.000000e+01 3.208000e+03 \n",
- "\n",
- " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
- "count 1.751638e+06 515913.000000 \n",
- "mean 1.171716e+01 46.292087 \n",
- "std 2.645318e+01 81.817794 \n",
- "min 0.000000e+00 0.000000 \n",
- "25% 3.000000e+00 11.000000 \n",
- "50% 5.000000e+00 22.000000 \n",
- "75% 1.100000e+01 47.000000 \n",
- "max 1.204000e+03 3208.000000 \n",
- "\n",
- " Correct First Attempt Incorrects Hints Corrects \n",
- "count 2.270384e+06 2.270384e+06 2.270384e+06 2.270384e+06 \n",
- "mean 7.722359e-01 4.455044e-01 1.184311e-01 1.062878e+00 \n",
- "std 4.193897e-01 2.000914e+00 6.199071e-01 6.894285e-01 \n",
- "min 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 \n",
- "25% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
- "50% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
- "75% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
- "max 1.000000e+00 3.600000e+02 1.020000e+02 9.200000e+01 "
- ]
- },
- "execution_count": 4,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "data.describe()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "id": "92cc0aab",
- "metadata": {
- "scrolled": false
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Part of missing values for every column\n",
- "Row 0.000000\n",
- "Anon Student Id 0.000000\n",
- "Problem Hierarchy 0.000000\n",
- "Problem Name 0.000000\n",
- "Problem View 0.000000\n",
- "Step Name 0.000000\n",
- "Step Start Time 0.001103\n",
- "First Transaction Time 0.000000\n",
- "Correct Transaction Time 0.034757\n",
- "Step End Time 0.000000\n",
- "Step Duration (sec) 0.001248\n",
- "Correct Step Duration (sec) 0.228484\n",
- "Error Step Duration (sec) 0.772764\n",
- "Correct First Attempt 0.000000\n",
- "Incorrects 0.000000\n",
- "Hints 0.000000\n",
- "Corrects 0.000000\n",
- "KC(Default) 0.203407\n",
- "Opportunity(Default) 0.203407\n",
- "dtype: float64\n"
- ]
- }
- ],
- "source": [
- "print(\"Part of missing values for every column\")\n",
- "print(data.isnull().sum() / len(data))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "id": "0187b3b5",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "the number of records:\n",
- "2270384\n"
- ]
- }
- ],
- "source": [
- "print(\"the number of records:\")\n",
- "print(len(data))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "id": "701b6633",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "how many students are there in the table:\n",
- "1338\n"
- ]
- }
- ],
- "source": [
- "print(\"how many students are there in the table:\")\n",
- "print(len(data['Anon Student Id'].unique()))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "id": "bf7b246f",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "how many problems are there in the table:\n",
- "91913\n"
- ]
- }
- ],
- "source": [
- "print(\"how many problems are there in the table:\")\n",
- "print(len(data['Problem Name'].unique()))"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "e0602c47",
- "metadata": {},
- "source": [
- "## Sort by Anon Student Id"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "id": "8051cc2b",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "image/svg+xml": [
- "[Figure: horizontal bar chart, 'Top 40 students by number of steps they have done'; x-axis: count, y-axis: Anon Student Id]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "ds = data['Anon Student Id'].value_counts().reset_index()\n",
- "ds.columns = [\n",
- " 'Anon Student Id',\n",
- " 'count'\n",
- "]\n",
- "ds['Anon Student Id'] = ds['Anon Student Id'].astype(str) + '-'\n",
- "ds = ds.sort_values('count').tail(40)\n",
- "\n",
- "fig = px.bar(\n",
- " ds,\n",
- " x='count',\n",
- " y='Anon Student Id',\n",
- " orientation='h',\n",
- " title='Top 40 students by number of steps they have done'\n",
- ")\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "66ef53ad",
- "metadata": {},
- "source": [
- "## Percent of corrects, hints and incorrects"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "id": "c8f1539c",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "image/svg+xml": [
- "[Figure: pie chart, 'Percent of corrects, hints and incorrects'; corrects 65.3%, incorrects 27.4%, hints 7.28%]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "count_corrects = data['Corrects'].sum()\n",
- "count_hints = data['Hints'].sum()\n",
- "count_incorrects = data['Incorrects'].sum()\n",
- "\n",
- "total = count_corrects + count_hints + count_incorrects\n",
- "\n",
- "percent_corrects = count_corrects / total\n",
- "percent_hints = count_hints / total\n",
- "percent_incorrects = count_incorrects / total\n",
- "\n",
- "dfl = [['corrects', percent_corrects], ['hints', percent_hints], ['incorrects', percent_incorrects]]\n",
- "\n",
- "df = pd.DataFrame(dfl, columns=['transaction type', 'percent'])\n",
- "\n",
- "fig = px.pie(\n",
- " df,\n",
- " names=['corrects', 'hints', 'incorrects'],\n",
- " values='percent',\n",
- " title='Percent of corrects, hints and incorrects'\n",
- ")\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "3b097141",
- "metadata": {},
- "source": [
- "## Sort by Problem Name"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "id": "6d668c43",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "image/svg+xml": [
- "[Figure: horizontal bar chart, 'Top 40 useful problem'; x-axis: count, y-axis: Problem Name]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "storeProblemCount = [1]\n",
- "storeProblemName = [data['Problem Name'][0]]\n",
- "currentProblemName = data['Problem Name'][0]\n",
- "currentStepName = [data['Step Name'][0]]\n",
- "lastIndex = 0\n",
- "\n",
- "for i in range(1, len(data), 1):\n",
- " pbNameI = data['Problem Name'][i]\n",
- " stNameI = data['Step Name'][i]\n",
- " if pbNameI != data['Problem Name'][lastIndex]:\n",
- " currentStepName = [stNameI]\n",
- " currentProblemName = pbNameI\n",
- " if pbNameI not in storeProblemName:\n",
- " storeProblemName.append(pbNameI)\n",
- " storeProblemCount.append(1)\n",
- " else:\n",
- " storeProblemCount[storeProblemName.index(pbNameI)] += 1\n",
- " lastIndex = i\n",
- " elif stNameI not in currentStepName:\n",
- " currentStepName.append(stNameI)\n",
- " lastIndex = i\n",
- " else:\n",
- " currentStepName = [stNameI]\n",
- " storeProblemCount[storeProblemName.index(pbNameI)] += 1\n",
- " lastIndex = i\n",
- "\n",
- "dfData = {\n",
- " 'Problem Name': storeProblemName,\n",
- " 'count': storeProblemCount\n",
- "}\n",
- "df = pd.DataFrame(dfData).sort_values('count').tail(40)\n",
- "df[\"Problem Name\"] += '-'\n",
- "\n",
- "fig = px.bar(\n",
- " df,\n",
- " x='count',\n",
- " y='Problem Name',\n",
- " orientation='h',\n",
- " title='Top 40 useful problem'\n",
- ")\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "id": "1b965aa4",
- "metadata": {
- "scrolled": false
- },
- "outputs": [
- {
- "data": {
- "text/html": [
- " \n",
- " "
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.plotly.v1+json": {
- "config": {
- "plotlyServerURL": "https://plot.ly"
- },
- "data": [
- {
- "alignmentgroup": "True",
- "hovertemplate": "Correct rate=%{x} Problem Name=%{text} ",
- "legendgroup": "",
- "marker": {
- "color": "#636efa",
- "pattern": {
- "shape": ""
- }
- },
- "name": "",
- "offsetgroup": "",
- "orientation": "h",
- "showlegend": false,
- "text": [
- "SYLT-2X&YGE-2X+9-",
- "ROOTS1-001-",
- "SY=2X&Y=-3X+5-",
- "PROBABILITY1-006-",
- "EG-RE-DISTRIB-05 (2x+4)/(3x+6)-",
- "PROBABILITY1-070-",
- "G3X-YLE5&3X-YGE15-",
- "BUSES-",
- "PEANUTS-CASHEWS-",
- "EXPONENT2-012-",
- "TVS3-",
- "PROBABILITY5-001-",
- "EXPONENT3-071-",
- "EXPONENT2-046-",
- "EXPONENT5-001-",
- "PROBABILITY2-001-",
- "GLFM-BUSES-",
- "EG-EPS01-CONSTANT (4(x^2)*y^4)(3(x^3)*y^3)-",
- "PROBABILITY6-002-",
- "PROBABILITY6-001-"
- ],
- "textposition": "auto",
- "type": "bar",
- "x": [
- 0.84,
- 0.8442477876106195,
- 0.848,
- 0.8492462311557789,
- 0.8505096262740657,
- 0.8557142857142858,
- 0.8559622195985832,
- 0.8581843429960077,
- 0.8608278344331134,
- 0.8679504814305364,
- 0.86878612716763,
- 0.878392305049811,
- 0.8799212598425197,
- 0.8896797153024911,
- 0.8966360856269113,
- 0.9195046439628483,
- 0.924791086350975,
- 0.934416715031921,
- 0.9473684210526315,
- 0.9579413392363033
- ],
- "xaxis": "x",
- "y": [
- "SYLT-2X&YGE-2X+9-",
- "ROOTS1-001-",
- "SY=2X&Y=-3X+5-",
- "PROBABILITY1-006-",
- "EG-RE-DISTRIB-05 (2x+4)/(3x+6)-",
- "PROBABILITY1-070-",
- "G3X-YLE5&3X-YGE15-",
- "BUSES-",
- "PEANUTS-CASHEWS-",
- "EXPONENT2-012-",
- "TVS3-",
- "PROBABILITY5-001-",
- "EXPONENT3-071-",
- "EXPONENT2-046-",
- "EXPONENT5-001-",
- "PROBABILITY2-001-",
- "GLFM-BUSES-",
- "EG-EPS01-CONSTANT (4(x^2)*y^4)(3(x^3)*y^3)-",
- "PROBABILITY6-002-",
- "PROBABILITY6-001-"
- ],
- "yaxis": "y"
- }
- ],
- "layout": {
- "barmode": "relative",
- "legend": {
- "tracegroupgap": 0
- },
- "template": {
- "data": {
- "bar": [
- {
- "error_x": {
- "color": "#2a3f5f"
- },
- "error_y": {
- "color": "#2a3f5f"
- },
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "bar"
- }
- ],
- "barpolar": [
- {
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "barpolar"
- }
- ],
- "carpet": [
- {
- "aaxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "baxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "type": "carpet"
- }
- ],
- "choropleth": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "choropleth"
- }
- ],
- "contour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "contour"
- }
- ],
- "contourcarpet": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "contourcarpet"
- }
- ],
- "heatmap": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmap"
- }
- ],
- "heatmapgl": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmapgl"
- }
- ],
- "histogram": [
- {
- "marker": {
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "histogram"
- }
- ],
- "histogram2d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2d"
- }
- ],
- "histogram2dcontour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2dcontour"
- }
- ],
- "mesh3d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "mesh3d"
- }
- ],
- "parcoords": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "parcoords"
- }
- ],
- "pie": [
- {
- "automargin": true,
- "type": "pie"
- }
- ],
- "scatter": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter"
- }
- ],
- "scatter3d": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter3d"
- }
- ],
- "scattercarpet": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattercarpet"
- }
- ],
- "scattergeo": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergeo"
- }
- ],
- "scattergl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergl"
- }
- ],
- "scattermapbox": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattermapbox"
- }
- ],
- "scatterpolar": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolar"
- }
- ],
- "scatterpolargl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolargl"
- }
- ],
- "scatterternary": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterternary"
- }
- ],
- "surface": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "surface"
- }
- ],
- "table": [
- {
- "cells": {
- "fill": {
- "color": "#EBF0F8"
- },
- "line": {
- "color": "white"
- }
- },
- "header": {
- "fill": {
- "color": "#C8D4E3"
- },
- "line": {
- "color": "white"
- }
- },
- "type": "table"
- }
- ]
- },
- "layout": {
- "annotationdefaults": {
- "arrowcolor": "#2a3f5f",
- "arrowhead": 0,
- "arrowwidth": 1
- },
- "autotypenumbers": "strict",
- "coloraxis": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "colorscale": {
- "diverging": [
- [
- 0,
- "#8e0152"
- ],
- [
- 0.1,
- "#c51b7d"
- ],
- [
- 0.2,
- "#de77ae"
- ],
- [
- 0.3,
- "#f1b6da"
- ],
- [
- 0.4,
- "#fde0ef"
- ],
- [
- 0.5,
- "#f7f7f7"
- ],
- [
- 0.6,
- "#e6f5d0"
- ],
- [
- 0.7,
- "#b8e186"
- ],
- [
- 0.8,
- "#7fbc41"
- ],
- [
- 0.9,
- "#4d9221"
- ],
- [
- 1,
- "#276419"
- ]
- ],
- "sequential": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "sequentialminus": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ]
- },
- "colorway": [
- "#636efa",
- "#EF553B",
- "#00cc96",
- "#ab63fa",
- "#FFA15A",
- "#19d3f3",
- "#FF6692",
- "#B6E880",
- "#FF97FF",
- "#FECB52"
- ],
- "font": {
- "color": "#2a3f5f"
- },
- "geo": {
- "bgcolor": "white",
- "lakecolor": "white",
- "landcolor": "#E5ECF6",
- "showlakes": true,
- "showland": true,
- "subunitcolor": "white"
- },
- "hoverlabel": {
- "align": "left"
- },
- "hovermode": "closest",
- "mapbox": {
- "style": "light"
- },
- "paper_bgcolor": "white",
- "plot_bgcolor": "#E5ECF6",
- "polar": {
- "angularaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "radialaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "scene": {
- "xaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "yaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "zaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- }
- },
- "shapedefaults": {
- "line": {
- "color": "#2a3f5f"
- }
- },
- "ternary": {
- "aaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "baxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "caxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "title": {
- "x": 0.05
- },
- "xaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- },
- "yaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- }
- }
- },
- "title": {
- "text": "Correct rate of each problem (top 20) (total transactions of each problem are required to be more than 500)"
- },
- "xaxis": {
- "anchor": "y",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "Correct rate"
- }
- },
- "yaxis": {
- "anchor": "x",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "Problem Name"
- }
- }
- }
- },
- "text/html": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.plotly.v1+json": {
- "config": {
- "plotlyServerURL": "https://plot.ly"
- },
- "data": [
- {
- "alignmentgroup": "True",
- "hovertemplate": "Correct rate=%{x} Problem Name=%{text} ",
- "legendgroup": "",
- "marker": {
- "color": "#636efa",
- "pattern": {
- "shape": ""
- }
- },
- "name": "",
- "offsetgroup": "",
- "orientation": "h",
- "showlegend": false,
- "text": [
- "RATIONAL1-091-",
- "RATIONAL1-165-",
- "RATIONAL1-034-",
- "RATIONAL1-058-",
- "RATIONAL1-075-",
- "RATIONAL1-035-",
- "RATIONAL1-281-",
- "RATIONAL1-261-",
- "RATIONAL1-177-",
- "RATIONAL1-064-",
- "RATIONAL1-147-",
- "RATIONAL1-021-",
- "RATIONAL1-121-",
- "RATIONAL1-008-",
- "RATIONAL1-288-",
- "RATIONAL1-109-",
- "BH1T31B-",
- "RXMX_3C-",
- "EG-RE-CANCEL-04 ((x-9)^3)/(x-9)^4-",
- "RATIONAL2-205-"
- ],
- "textposition": "auto",
- "type": "bar",
- "x": [
- 0.07532467532467532,
- 0.07955596669750231,
- 0.0807799442896936,
- 0.0975609756097561,
- 0.09904153354632587,
- 0.1016566265060241,
- 0.1198501872659176,
- 0.1217564870259481,
- 0.1288888888888889,
- 0.13678756476683937,
- 0.14023732470334413,
- 0.1404303510758777,
- 0.14186851211072665,
- 0.16796875,
- 0.17764471057884232,
- 0.24912280701754386,
- 0.3157172271791352,
- 0.329510366122629,
- 0.3380703066566941,
- 0.3424479166666667
- ],
- "xaxis": "x",
- "y": [
- "RATIONAL1-091-",
- "RATIONAL1-165-",
- "RATIONAL1-034-",
- "RATIONAL1-058-",
- "RATIONAL1-075-",
- "RATIONAL1-035-",
- "RATIONAL1-281-",
- "RATIONAL1-261-",
- "RATIONAL1-177-",
- "RATIONAL1-064-",
- "RATIONAL1-147-",
- "RATIONAL1-021-",
- "RATIONAL1-121-",
- "RATIONAL1-008-",
- "RATIONAL1-288-",
- "RATIONAL1-109-",
- "BH1T31B-",
- "RXMX_3C-",
- "EG-RE-CANCEL-04 ((x-9)^3)/(x-9)^4-",
- "RATIONAL2-205-"
- ],
- "yaxis": "y"
- }
- ],
- "layout": {
- "barmode": "relative",
- "legend": {
- "tracegroupgap": 0
- },
- "template": {
- "data": {
- "bar": [
- {
- "error_x": {
- "color": "#2a3f5f"
- },
- "error_y": {
- "color": "#2a3f5f"
- },
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "bar"
- }
- ],
- "barpolar": [
- {
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "barpolar"
- }
- ],
- "carpet": [
- {
- "aaxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "baxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "type": "carpet"
- }
- ],
- "choropleth": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "choropleth"
- }
- ],
- "contour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "contour"
- }
- ],
- "contourcarpet": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "contourcarpet"
- }
- ],
- "heatmap": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmap"
- }
- ],
- "heatmapgl": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmapgl"
- }
- ],
- "histogram": [
- {
- "marker": {
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "histogram"
- }
- ],
- "histogram2d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2d"
- }
- ],
- "histogram2dcontour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2dcontour"
- }
- ],
- "mesh3d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "mesh3d"
- }
- ],
- "parcoords": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "parcoords"
- }
- ],
- "pie": [
- {
- "automargin": true,
- "type": "pie"
- }
- ],
- "scatter": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter"
- }
- ],
- "scatter3d": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter3d"
- }
- ],
- "scattercarpet": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattercarpet"
- }
- ],
- "scattergeo": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergeo"
- }
- ],
- "scattergl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergl"
- }
- ],
- "scattermapbox": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattermapbox"
- }
- ],
- "scatterpolar": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolar"
- }
- ],
- "scatterpolargl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolargl"
- }
- ],
- "scatterternary": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterternary"
- }
- ],
- "surface": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "surface"
- }
- ],
- "table": [
- {
- "cells": {
- "fill": {
- "color": "#EBF0F8"
- },
- "line": {
- "color": "white"
- }
- },
- "header": {
- "fill": {
- "color": "#C8D4E3"
- },
- "line": {
- "color": "white"
- }
- },
- "type": "table"
- }
- ]
- },
- "layout": {
- "annotationdefaults": {
- "arrowcolor": "#2a3f5f",
- "arrowhead": 0,
- "arrowwidth": 1
- },
- "autotypenumbers": "strict",
- "coloraxis": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "colorscale": {
- "diverging": [
- [
- 0,
- "#8e0152"
- ],
- [
- 0.1,
- "#c51b7d"
- ],
- [
- 0.2,
- "#de77ae"
- ],
- [
- 0.3,
- "#f1b6da"
- ],
- [
- 0.4,
- "#fde0ef"
- ],
- [
- 0.5,
- "#f7f7f7"
- ],
- [
- 0.6,
- "#e6f5d0"
- ],
- [
- 0.7,
- "#b8e186"
- ],
- [
- 0.8,
- "#7fbc41"
- ],
- [
- 0.9,
- "#4d9221"
- ],
- [
- 1,
- "#276419"
- ]
- ],
- "sequential": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "sequentialminus": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ]
- },
- "colorway": [
- "#636efa",
- "#EF553B",
- "#00cc96",
- "#ab63fa",
- "#FFA15A",
- "#19d3f3",
- "#FF6692",
- "#B6E880",
- "#FF97FF",
- "#FECB52"
- ],
- "font": {
- "color": "#2a3f5f"
- },
- "geo": {
- "bgcolor": "white",
- "lakecolor": "white",
- "landcolor": "#E5ECF6",
- "showlakes": true,
- "showland": true,
- "subunitcolor": "white"
- },
- "hoverlabel": {
- "align": "left"
- },
- "hovermode": "closest",
- "mapbox": {
- "style": "light"
- },
- "paper_bgcolor": "white",
- "plot_bgcolor": "#E5ECF6",
- "polar": {
- "angularaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "radialaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "scene": {
- "xaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "yaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "zaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- }
- },
- "shapedefaults": {
- "line": {
- "color": "#2a3f5f"
- }
- },
- "ternary": {
- "aaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "baxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "caxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "title": {
- "x": 0.05
- },
- "xaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- },
- "yaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- }
- }
- },
- "title": {
- "font": {
- "size": 15
- },
- "text": "Correct rate of each problem (bottom 20) (total transactions of each problem are required to be more than 500)"
- },
- "xaxis": {
- "anchor": "y",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "Correct rate"
- }
- },
- "yaxis": {
- "anchor": "x",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "Problem Name"
- }
- }
- }
- },
- "text/html": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "# Total transactions per step: incorrect attempts + hint requests + correct attempts.\n",
- "data['total transactions'] = data['Incorrects'] + data['Hints'] + data['Corrects']\n",
- "# Sum transactions and correct attempts per problem in one groupby.\n",
- "df1 = data.groupby('Problem Name')[['total transactions', 'Corrects']].sum().reset_index()\n",
- "df1['Correct rate'] = df1['Corrects'] / df1['total transactions']\n",
- "\n",
- "# Keep only problems with more than `standard` total transactions.\n",
- "standard = 500\n",
- "df1 = df1[df1['total transactions'] > standard]\n",
- "\n",
- "df1 = df1.sort_values('Correct rate')\n",
- "\n",
- "df1['Problem Name'] = df1['Problem Name'].astype(str) + \"-\"\n",
- "\n",
- "df_px = df1.tail(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='Correct rate',\n",
- " y='Problem Name',\n",
- " orientation='h',\n",
- " title='Correct rate of each problem (top 20) (total transactions of \\\n",
- "each problem are required to be more than 500)',\n",
- " text='Problem Name'\n",
- ")\n",
- "fig.show(\"svg\")\n",
- "\n",
- "df_px = df1.head(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='Correct rate',\n",
- " y='Problem Name',\n",
- " orientation='h',\n",
- " title='Correct rate of each problem (bottom 20) (total transactions of \\\n",
- "each problem are required to be more than 500)',\n",
- " text='Problem Name'\n",
- ")\n",
- "fig.update_layout(title_font_size=15)\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "2bee6e04",
- "metadata": {},
- "source": [
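- "These two figures show the 20 problems with the highest correct rates and the 20 with the lowest, restricted to problems with more than 500 total transactions. Problems with a low correct rate deserve more attention from teachers and students."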
- ]
- },
- {
- "cell_type": "markdown",
- "id": "0047e839",
- "metadata": {},
- "source": [
- "## Sort by KC"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "id": "c6e910e0",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "application/vnd.plotly.v1+json": {
- "config": {
- "plotlyServerURL": "https://plot.ly"
- },
- "data": [
- {
- "alignmentgroup": "True",
- "hovertemplate": "correct rate=%{x} KC(Default)=%{text} ",
- "legendgroup": "",
- "marker": {
- "color": "#636efa",
- "pattern": {
- "shape": ""
- }
- },
- "name": "",
- "offsetgroup": "",
- "orientation": "h",
- "showlegend": false,
- "text": [
- "[Rule: RF right ([SolverOperation rf],{components with property canReduceFractionsNoMultWhole of right})]-",
- "Enter second extreme in equation-",
- "Proportional-Constant-Expression-Gla:Student-Modeling-Analysis-",
- "Enter Calculated value of rate-",
- "[SkillRule: Combine like terms, no var; CLT]-",
- "Enter square of leg label-",
- "Compare medians - removed outlier-",
- "Enter number of total outcomes in table-",
- "Find square of given leg-",
- "Enter ratio quantity to right of \"to\"-",
- "Enter fractional probability of event-",
- "[SkillRule: Select Multiply; {MT; MT no fraction coeff}]-",
- "Write base of exponential from given whole number as product-",
- "Write decimal multiplier from given scientific notation-",
- "Enter ratio quantity to right of colon-",
- "PM-ROW-1-",
- "Changing axis intervals-",
- "unspecified-",
- "[Rule: unnec-elems ([SolverOperation unnec-elems],)]-",
- "Select second event-"
- ],
- "textposition": "auto",
- "type": "bar",
- "x": [
- 0.9094659771050032,
- 0.9125055236411843,
- 0.9135446685878963,
- 0.9155947136563877,
- 0.9156931229676021,
- 0.9194915254237288,
- 0.9218241042345277,
- 0.9247135842880524,
- 0.9323943661971831,
- 0.9386369807222373,
- 0.9389944134078212,
- 0.9399566697616837,
- 0.9405666897028334,
- 0.940814757878555,
- 0.9422934648581998,
- 0.9431375603676585,
- 0.9706148701992291,
- 0.9769823066841415,
- 0.9838308457711443,
- 0.987603305785124
- ],
- "xaxis": "x",
- "y": [
- "[Rule: RF right ([SolverOperation rf],{components with property canReduceFractionsNoMultWhole of right})]-",
- "Enter second extreme in equation-",
- "Proportional-Constant-Expression-Gla:Student-Modeling-Analysis-",
- "Enter Calculated value of rate-",
- "[SkillRule: Combine like terms, no var; CLT]-",
- "Enter square of leg label-",
- "Compare medians - removed outlier-",
- "Enter number of total outcomes in table-",
- "Find square of given leg-",
- "Enter ratio quantity to right of \"to\"-",
- "Enter fractional probability of event-",
- "[SkillRule: Select Multiply; {MT; MT no fraction coeff}]-",
- "Write base of exponential from given whole number as product-",
- "Write decimal multiplier from given scientific notation-",
- "Enter ratio quantity to right of colon-",
- "PM-ROW-1-",
- "Changing axis intervals-",
- "unspecified-",
- "[Rule: unnec-elems ([SolverOperation unnec-elems],)]-",
- "Select second event-"
- ],
- "yaxis": "y"
- }
- ],
- "layout": {
- "barmode": "relative",
- "legend": {
- "tracegroupgap": 0
- },
- "template": {
- "data": {
- "bar": [
- {
- "error_x": {
- "color": "#2a3f5f"
- },
- "error_y": {
- "color": "#2a3f5f"
- },
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "bar"
- }
- ],
- "barpolar": [
- {
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "barpolar"
- }
- ],
- "carpet": [
- {
- "aaxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "baxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "type": "carpet"
- }
- ],
- "choropleth": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "choropleth"
- }
- ],
- "contour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "contour"
- }
- ],
- "contourcarpet": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "contourcarpet"
- }
- ],
- "heatmap": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmap"
- }
- ],
- "heatmapgl": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmapgl"
- }
- ],
- "histogram": [
- {
- "marker": {
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "histogram"
- }
- ],
- "histogram2d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2d"
- }
- ],
- "histogram2dcontour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2dcontour"
- }
- ],
- "mesh3d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "mesh3d"
- }
- ],
- "parcoords": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "parcoords"
- }
- ],
- "pie": [
- {
- "automargin": true,
- "type": "pie"
- }
- ],
- "scatter": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter"
- }
- ],
- "scatter3d": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter3d"
- }
- ],
- "scattercarpet": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattercarpet"
- }
- ],
- "scattergeo": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergeo"
- }
- ],
- "scattergl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergl"
- }
- ],
- "scattermapbox": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattermapbox"
- }
- ],
- "scatterpolar": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolar"
- }
- ],
- "scatterpolargl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolargl"
- }
- ],
- "scatterternary": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterternary"
- }
- ],
- "surface": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "surface"
- }
- ],
- "table": [
- {
- "cells": {
- "fill": {
- "color": "#EBF0F8"
- },
- "line": {
- "color": "white"
- }
- },
- "header": {
- "fill": {
- "color": "#C8D4E3"
- },
- "line": {
- "color": "white"
- }
- },
- "type": "table"
- }
- ]
- },
- "layout": {
- "annotationdefaults": {
- "arrowcolor": "#2a3f5f",
- "arrowhead": 0,
- "arrowwidth": 1
- },
- "autotypenumbers": "strict",
- "coloraxis": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "colorscale": {
- "diverging": [
- [
- 0,
- "#8e0152"
- ],
- [
- 0.1,
- "#c51b7d"
- ],
- [
- 0.2,
- "#de77ae"
- ],
- [
- 0.3,
- "#f1b6da"
- ],
- [
- 0.4,
- "#fde0ef"
- ],
- [
- 0.5,
- "#f7f7f7"
- ],
- [
- 0.6,
- "#e6f5d0"
- ],
- [
- 0.7,
- "#b8e186"
- ],
- [
- 0.8,
- "#7fbc41"
- ],
- [
- 0.9,
- "#4d9221"
- ],
- [
- 1,
- "#276419"
- ]
- ],
- "sequential": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "sequentialminus": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ]
- },
- "colorway": [
- "#636efa",
- "#EF553B",
- "#00cc96",
- "#ab63fa",
- "#FFA15A",
- "#19d3f3",
- "#FF6692",
- "#B6E880",
- "#FF97FF",
- "#FECB52"
- ],
- "font": {
- "color": "#2a3f5f"
- },
- "geo": {
- "bgcolor": "white",
- "lakecolor": "white",
- "landcolor": "#E5ECF6",
- "showlakes": true,
- "showland": true,
- "subunitcolor": "white"
- },
- "hoverlabel": {
- "align": "left"
- },
- "hovermode": "closest",
- "mapbox": {
- "style": "light"
- },
- "paper_bgcolor": "white",
- "plot_bgcolor": "#E5ECF6",
- "polar": {
- "angularaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "radialaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "scene": {
- "xaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "yaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "zaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- }
- },
- "shapedefaults": {
- "line": {
- "color": "#2a3f5f"
- }
- },
- "ternary": {
- "aaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "baxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "caxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "title": {
- "x": 0.05
- },
- "xaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- },
- "yaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- }
- }
- },
- "title": {
- "text": "Correct rate of each KC(Default) (top 20) (total transactions of each KC are required to be more than 300)"
- },
- "xaxis": {
- "anchor": "y",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "correct rate"
- }
- },
- "yaxis": {
- "anchor": "x",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "KC(Default)"
- },
- "visible": false
- }
- }
- },
- "text/html": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "application/vnd.plotly.v1+json": {
- "config": {
- "plotlyServerURL": "https://plot.ly"
- },
- "data": [
- {
- "alignmentgroup": "True",
- "hovertemplate": "correct rate=%{x} KC(Default)=%{text} ",
- "legendgroup": "",
- "marker": {
- "color": "#636efa",
- "pattern": {
- "shape": ""
- }
- },
- "name": "",
- "offsetgroup": "",
- "orientation": "h",
- "showlegend": false,
- "text": [
- "unknown bug element-",
- "[SkillRule: Done?; {doneleft; doneright; doneleft, no menu; doneright, nomenu; done no solution; Done No Solution, domain exception; Done No Solution, range exception; done infinite solutions; done expr, standard form; done expr, standard form, no menu}]-",
- "unknown bug element~~CLT-ROW-1-COEFF-",
- "unknown bug element~~CLT-ROW-1-",
- "Plot terminating improper fractions-",
- "Plot decimal - thousandths-",
- "Plot terminating mixed number-",
- "Plot imperfect radical-",
- "Plot decimal - hundredths-",
- "Setting the slope-",
- "Plot terminating proper fraction-",
- "Plot percent-",
- "Plot non-terminating proper fraction-",
- "Plot non-terminating improper fraction-",
- "Entering slope, GLF-",
- "Placing coordinate point-",
- "Plot decimal - tenths-",
- "Finding the intersection, SIF-",
- "Finding the intersection, Mixed-",
- "[Rule: CLT nested, no LCD ([SolverOperation clt],)]-"
- ],
- "textposition": "auto",
- "type": "bar",
- "x": [
- 0,
- 0,
- 0.014072847682119206,
- 0.014477766287487074,
- 0.14357160282995854,
- 0.19899605593402653,
- 0.20798727690404664,
- 0.2222222222222222,
- 0.22783707253321717,
- 0.2335486778846154,
- 0.24001726742931145,
- 0.2440093512565751,
- 0.26698670605613,
- 0.29059485530546625,
- 0.3,
- 0.3083741984156922,
- 0.3113859585303747,
- 0.3129411764705882,
- 0.31905781584582443,
- 0.3395784543325527
- ],
- "xaxis": "x",
- "y": [
- "unknown bug element-",
- "[SkillRule: Done?; {doneleft; doneright; doneleft, no menu; doneright, nomenu; done no solution; Done No Solution, domain exception; Done No Solution, range exception; done infinite solutions; done expr, standard form; done expr, standard form, no menu}]-",
- "unknown bug element~~CLT-ROW-1-COEFF-",
- "unknown bug element~~CLT-ROW-1-",
- "Plot terminating improper fractions-",
- "Plot decimal - thousandths-",
- "Plot terminating mixed number-",
- "Plot imperfect radical-",
- "Plot decimal - hundredths-",
- "Setting the slope-",
- "Plot terminating proper fraction-",
- "Plot percent-",
- "Plot non-terminating proper fraction-",
- "Plot non-terminating improper fraction-",
- "Entering slope, GLF-",
- "Placing coordinate point-",
- "Plot decimal - tenths-",
- "Finding the intersection, SIF-",
- "Finding the intersection, Mixed-",
- "[Rule: CLT nested, no LCD ([SolverOperation clt],)]-"
- ],
- "yaxis": "y"
- }
- ],
- "layout": {
- "barmode": "relative",
- "legend": {
- "tracegroupgap": 0
- },
- "template": {
- "data": {
- "bar": [
- {
- "error_x": {
- "color": "#2a3f5f"
- },
- "error_y": {
- "color": "#2a3f5f"
- },
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "bar"
- }
- ],
- "barpolar": [
- {
- "marker": {
- "line": {
- "color": "#E5ECF6",
- "width": 0.5
- },
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "barpolar"
- }
- ],
- "carpet": [
- {
- "aaxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "baxis": {
- "endlinecolor": "#2a3f5f",
- "gridcolor": "white",
- "linecolor": "white",
- "minorgridcolor": "white",
- "startlinecolor": "#2a3f5f"
- },
- "type": "carpet"
- }
- ],
- "choropleth": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "choropleth"
- }
- ],
- "contour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "contour"
- }
- ],
- "contourcarpet": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "contourcarpet"
- }
- ],
- "heatmap": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmap"
- }
- ],
- "heatmapgl": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "heatmapgl"
- }
- ],
- "histogram": [
- {
- "marker": {
- "pattern": {
- "fillmode": "overlay",
- "size": 10,
- "solidity": 0.2
- }
- },
- "type": "histogram"
- }
- ],
- "histogram2d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2d"
- }
- ],
- "histogram2dcontour": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "histogram2dcontour"
- }
- ],
- "mesh3d": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "type": "mesh3d"
- }
- ],
- "parcoords": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "parcoords"
- }
- ],
- "pie": [
- {
- "automargin": true,
- "type": "pie"
- }
- ],
- "scatter": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter"
- }
- ],
- "scatter3d": [
- {
- "line": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatter3d"
- }
- ],
- "scattercarpet": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattercarpet"
- }
- ],
- "scattergeo": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergeo"
- }
- ],
- "scattergl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattergl"
- }
- ],
- "scattermapbox": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scattermapbox"
- }
- ],
- "scatterpolar": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolar"
- }
- ],
- "scatterpolargl": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterpolargl"
- }
- ],
- "scatterternary": [
- {
- "marker": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "type": "scatterternary"
- }
- ],
- "surface": [
- {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- },
- "colorscale": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "type": "surface"
- }
- ],
- "table": [
- {
- "cells": {
- "fill": {
- "color": "#EBF0F8"
- },
- "line": {
- "color": "white"
- }
- },
- "header": {
- "fill": {
- "color": "#C8D4E3"
- },
- "line": {
- "color": "white"
- }
- },
- "type": "table"
- }
- ]
- },
- "layout": {
- "annotationdefaults": {
- "arrowcolor": "#2a3f5f",
- "arrowhead": 0,
- "arrowwidth": 1
- },
- "autotypenumbers": "strict",
- "coloraxis": {
- "colorbar": {
- "outlinewidth": 0,
- "ticks": ""
- }
- },
- "colorscale": {
- "diverging": [
- [
- 0,
- "#8e0152"
- ],
- [
- 0.1,
- "#c51b7d"
- ],
- [
- 0.2,
- "#de77ae"
- ],
- [
- 0.3,
- "#f1b6da"
- ],
- [
- 0.4,
- "#fde0ef"
- ],
- [
- 0.5,
- "#f7f7f7"
- ],
- [
- 0.6,
- "#e6f5d0"
- ],
- [
- 0.7,
- "#b8e186"
- ],
- [
- 0.8,
- "#7fbc41"
- ],
- [
- 0.9,
- "#4d9221"
- ],
- [
- 1,
- "#276419"
- ]
- ],
- "sequential": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ],
- "sequentialminus": [
- [
- 0,
- "#0d0887"
- ],
- [
- 0.1111111111111111,
- "#46039f"
- ],
- [
- 0.2222222222222222,
- "#7201a8"
- ],
- [
- 0.3333333333333333,
- "#9c179e"
- ],
- [
- 0.4444444444444444,
- "#bd3786"
- ],
- [
- 0.5555555555555556,
- "#d8576b"
- ],
- [
- 0.6666666666666666,
- "#ed7953"
- ],
- [
- 0.7777777777777778,
- "#fb9f3a"
- ],
- [
- 0.8888888888888888,
- "#fdca26"
- ],
- [
- 1,
- "#f0f921"
- ]
- ]
- },
- "colorway": [
- "#636efa",
- "#EF553B",
- "#00cc96",
- "#ab63fa",
- "#FFA15A",
- "#19d3f3",
- "#FF6692",
- "#B6E880",
- "#FF97FF",
- "#FECB52"
- ],
- "font": {
- "color": "#2a3f5f"
- },
- "geo": {
- "bgcolor": "white",
- "lakecolor": "white",
- "landcolor": "#E5ECF6",
- "showlakes": true,
- "showland": true,
- "subunitcolor": "white"
- },
- "hoverlabel": {
- "align": "left"
- },
- "hovermode": "closest",
- "mapbox": {
- "style": "light"
- },
- "paper_bgcolor": "white",
- "plot_bgcolor": "#E5ECF6",
- "polar": {
- "angularaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "radialaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "scene": {
- "xaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "yaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- },
- "zaxis": {
- "backgroundcolor": "#E5ECF6",
- "gridcolor": "white",
- "gridwidth": 2,
- "linecolor": "white",
- "showbackground": true,
- "ticks": "",
- "zerolinecolor": "white"
- }
- },
- "shapedefaults": {
- "line": {
- "color": "#2a3f5f"
- }
- },
- "ternary": {
- "aaxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "baxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- },
- "bgcolor": "#E5ECF6",
- "caxis": {
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": ""
- }
- },
- "title": {
- "x": 0.05
- },
- "xaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- },
- "yaxis": {
- "automargin": true,
- "gridcolor": "white",
- "linecolor": "white",
- "ticks": "",
- "title": {
- "standoff": 15
- },
- "zerolinecolor": "white",
- "zerolinewidth": 2
- }
- }
- },
- "title": {
- "font": {
- "size": 15
- },
- "text": "Correct rate of each KC(Default) (bottom 20) (total transactions of each KC are required to be more than 300)"
- },
- "xaxis": {
- "anchor": "y",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "correct rate"
- }
- },
- "yaxis": {
- "anchor": "x",
- "domain": [
- 0,
- 1
- ],
- "title": {
- "text": "KC(Default)"
- },
- "visible": false
- }
- }
- },
- "text/html": [
- ""
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "data.dropna(subset=['KC(Default)'], inplace=True)\n",
- "\n",
- "# Total transactions per step: correct attempts + hint requests + incorrect attempts.\n",
- "data['total transactions'] = data['Corrects'] + data['Hints'] + data['Incorrects']\n",
- "# Sum transactions and correct attempts per KC in one groupby.\n",
- "df1 = data.groupby('KC(Default)')[['total transactions', 'Corrects']].sum().reset_index()\n",
- "df1['correct rate'] = df1['Corrects'] / df1['total transactions']\n",
- "\n",
- "# Keep only KCs with more than `standard` total transactions.\n",
- "standard = 300\n",
- "df1 = df1[df1['total transactions'] > standard]\n",
- "\n",
- "df1 = df1.sort_values('correct rate')\n",
- "\n",
- "df1['KC(Default)'] = df1['KC(Default)'].astype(str) + '-'\n",
- "\n",
- "df_px = df1.tail(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='correct rate',\n",
- " y='KC(Default)',\n",
- " orientation='h',\n",
- " title='Correct rate of each KC(Default) (top 20) (total transactions of \\\n",
- "each KC are required to be more than 300)',\n",
- " text='KC(Default)'\n",
- ")\n",
- "fig.update_yaxes(visible=False)\n",
- "fig.show()\n",
- "\n",
- "df_px = df1.head(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='correct rate',\n",
- " y='KC(Default)',\n",
- " orientation='h',\n",
- " title='Correct rate of each KC(Default) (bottom 20) (total transactions of \\\n",
- "each KC are required to be more than 300)',\n",
- " text='KC(Default)'\n",
- ")\n",
- "fig.update_yaxes(visible=False)\n",
- "fig.update_layout(title_font_size=15)\n",
- "fig.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "0feef8a1",
- "metadata": {},
- "source": [
- "These two figures show the correct rate of each KC. KCs with a low correct rate deserve more attention from teachers and students."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "22d99527",
- "metadata": {},
- "source": [
- "## Postscript"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "09bc0903",
- "metadata": {},
- "source": [
- "The whole data package consists of 5 data sets, and the data files in them that are suitable for analysis share the same format. The analysis above, based on \"algebra_2006_2007_train\", is therefore just one example of data analysis on KDD Cup 2010 data; the same code can be used to analyse the other data files after small changes to the file path and column names.\n"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3 (ipykernel)",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.9.6"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
From 9c56477121c378ea9f05aefd8c1f17a9bd8085b6 Mon Sep 17 00:00:00 2001
From: gguu1314 <88154940+gguu1314@users.noreply.github.com>
Date: Sat, 14 Aug 2021 12:37:21 +0800
Subject: [PATCH 07/10] Add files via upload
---
docs/KDD Cup 2010.ipynb | 994 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 994 insertions(+)
create mode 100644 docs/KDD Cup 2010.ipynb
diff --git a/docs/KDD Cup 2010.ipynb b/docs/KDD Cup 2010.ipynb
new file mode 100644
index 0000000..19b2dfc
--- /dev/null
+++ b/docs/KDD Cup 2010.ipynb
@@ -0,0 +1,994 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "e002fdf8",
+ "metadata": {},
+ "source": [
+ "# KDD Cup 2010 —— Data Analysis on algebra_2006_2007_train"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "429152ff",
+ "metadata": {},
+ "source": [
+ "## Data Description"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2c89d116",
+ "metadata": {},
+ "source": [
+ "### Column Description"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f590eee5",
+ "metadata": {},
+ "source": [
+ "| Attribute | Annotation |\n",
+ "|:--:|---|\n",
+ "| Row | The row number |\n",
+ "| Anon Student Id | Unique, anonymous identifier for a student |\n",
+ "| Problem Hierarchy | The hierarchy of curriculum levels containing the problem |\n",
+ "| Problem Name | Unique identifier for a problem |\n",
+ "| Problem View | The total number of times the student encountered the problem so far |\n",
+ "| Step Name | Unique identifier for one of the steps in a problem |\n",
+ "| Step Start Time | The starting time of the step (Can be null) |\n",
+ "| First Transaction Time | The time of the first transaction toward the step |\n",
+ "| Correct Transaction Time | The time of the correct attempt toward the step, if there was one |\n",
+ "| Step End Time | The time of the last transaction toward the step |\n",
+ "| Step Duration (sec) | The elapsed time of the step in seconds, calculated by adding all of the durations for transactions that were attributed to the step (Can be null if step start time is null) |\n",
+ "| Correct Step Duration (sec) | The step duration if the first attempt for the step was correct |\n",
+ "| Error Step Duration (sec) | The step duration if the first attempt for the step was an error (incorrect attempt or hint request) |\n",
+ "| Correct First Attempt | The tutor's evaluation of the student's first attempt on the step—1 if correct, 0 if an error |\n",
+ "| Incorrects | Total number of incorrect attempts by the student on the step |\n",
+ "| Hints | Total number of hints requested by the student for the step |\n",
+ "| Corrects | Total correct attempts by the student for the step (only increases if the step is encountered more than once) |\n",
+ "| KC(KC Model Name) | The identified skills that are used in a problem, where available |\n",
+ "| Opportunity(KC Model Name) | A count that increases by one each time the student encounters a step with the listed knowledge component |\n",
+ "|| Additional KC models, which exist for the challenge data sets, will appear as additional pairs of columns (KC and Opportunity columns for each model) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2c2a2d3e",
+ "metadata": {},
+ "source": [
+ "For the test portion of the challenge data sets, values will not be provided for the following columns:"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f19eb949",
+ "metadata": {},
+ "source": [
+ "♦ Step Start Time\n",
+ "\n",
+ "♦ First Transaction Time\n",
+ "\n",
+ "♦ Correct Transaction Time\n",
+ "\n",
+ "♦ Step End Time\n",
+ "\n",
+ "♦ Step Duration (sec)\n",
+ "\n",
+ "♦ Correct Step Duration (sec)\n",
+ "\n",
+ "♦ Error Step Duration (sec)\n",
+ "\n",
+ "♦ Correct First Attempt\n",
+ "\n",
+ "♦ Incorrects\n",
+ "\n",
+ "♦ Hints\n",
+ "\n",
+ "♦ Corrects"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "123674b7",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "import plotly.express as px"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "efa6be16",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "path = \"algebra_2006_2007_train.txt\"\n",
+ "data = pd.read_table(path, encoding=\"ISO-8859-15\", low_memory=False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "993f1986",
+ "metadata": {},
+ "source": [
+ "## Record Examples"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "8b2af14e",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " Row Anon Student Id Problem Hierarchy Problem Name \\\n",
+ "0 1 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "1 2 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "2 3 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "3 4 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "4 5 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "\n",
+ " Problem View Step Name Step Start Time First Transaction Time \\\n",
+ "0 1 R1C1 2006-10-26 09:51:58.0 2006-10-26 09:52:30.0 \n",
+ "1 1 R1C2 2006-10-26 09:53:30.0 2006-10-26 09:53:41.0 \n",
+ "2 1 R2C1 2006-10-26 09:53:41.0 2006-10-26 09:53:46.0 \n",
+ "3 1 R2C2 2006-10-26 09:53:46.0 2006-10-26 09:53:50.0 \n",
+ "4 1 R4C1 2006-10-26 09:53:50.0 2006-10-26 09:54:05.0 \n",
+ "\n",
+ " Correct Transaction Time Step End Time Step Duration (sec) \\\n",
+ "0 2006-10-26 09:53:30.0 2006-10-26 09:53:30.0 92.0 \n",
+ "1 2006-10-26 09:53:41.0 2006-10-26 09:53:41.0 11.0 \n",
+ "2 2006-10-26 09:53:46.0 2006-10-26 09:53:46.0 5.0 \n",
+ "3 2006-10-26 09:53:50.0 2006-10-26 09:53:50.0 4.0 \n",
+ "4 2006-10-26 09:54:05.0 2006-10-26 09:54:05.0 15.0 \n",
+ "\n",
+ " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
+ "0 NaN 92.0 \n",
+ "1 11.0 NaN \n",
+ "2 5.0 NaN \n",
+ "3 4.0 NaN \n",
+ "4 15.0 NaN \n",
+ "\n",
+ " Correct First Attempt Incorrects Hints Corrects KC(Default) \\\n",
+ "0 0 2 0 1 NaN \n",
+ "1 1 0 0 1 NaN \n",
+ "2 1 0 0 1 Identifying units \n",
+ "3 1 0 0 1 Identifying units \n",
+ "4 1 0 0 1 Entering a given \n",
+ "\n",
+ " Opportunity(Default) \n",
+ "0 NaN \n",
+ "1 NaN \n",
+ "2 1 \n",
+ "3 2 \n",
+ "4 1 "
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "pd.set_option('display.max_columns', 500)\n",
+ "data.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "9d5e5859",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " Row Problem View Step Duration (sec) \\\n",
+ "count 2.270384e+06 2.270384e+06 2.267551e+06 \n",
+ "mean 1.513120e+06 1.092910e+00 1.958364e+01 \n",
+ "std 8.736198e+05 3.448857e-01 4.768345e+01 \n",
+ "min 1.000000e+00 1.000000e+00 0.000000e+00 \n",
+ "25% 7.577408e+05 1.000000e+00 3.000000e+00 \n",
+ "50% 1.511844e+06 1.000000e+00 7.000000e+00 \n",
+ "75% 2.269432e+06 1.000000e+00 1.700000e+01 \n",
+ "max 3.025933e+06 1.000000e+01 3.208000e+03 \n",
+ "\n",
+ " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
+ "count 1.751638e+06 515913.000000 \n",
+ "mean 1.171716e+01 46.292087 \n",
+ "std 2.645318e+01 81.817794 \n",
+ "min 0.000000e+00 0.000000 \n",
+ "25% 3.000000e+00 11.000000 \n",
+ "50% 5.000000e+00 22.000000 \n",
+ "75% 1.100000e+01 47.000000 \n",
+ "max 1.204000e+03 3208.000000 \n",
+ "\n",
+ " Correct First Attempt Incorrects Hints Corrects \n",
+ "count 2.270384e+06 2.270384e+06 2.270384e+06 2.270384e+06 \n",
+ "mean 7.722359e-01 4.455044e-01 1.184311e-01 1.062878e+00 \n",
+ "std 4.193897e-01 2.000914e+00 6.199071e-01 6.894285e-01 \n",
+ "min 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 \n",
+ "25% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "50% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "75% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "max 1.000000e+00 3.600000e+02 1.020000e+02 9.200000e+01 "
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "data.describe()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "92cc0aab",
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Fraction of missing values in each column\n",
+ "Row 0.000000\n",
+ "Anon Student Id 0.000000\n",
+ "Problem Hierarchy 0.000000\n",
+ "Problem Name 0.000000\n",
+ "Problem View 0.000000\n",
+ "Step Name 0.000000\n",
+ "Step Start Time 0.001103\n",
+ "First Transaction Time 0.000000\n",
+ "Correct Transaction Time 0.034757\n",
+ "Step End Time 0.000000\n",
+ "Step Duration (sec) 0.001248\n",
+ "Correct Step Duration (sec) 0.228484\n",
+ "Error Step Duration (sec) 0.772764\n",
+ "Correct First Attempt 0.000000\n",
+ "Incorrects 0.000000\n",
+ "Hints 0.000000\n",
+ "Corrects 0.000000\n",
+ "KC(Default) 0.203407\n",
+ "Opportunity(Default) 0.203407\n",
+ "dtype: float64\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"Fraction of missing values in each column\")\n",
+ "print(data.isnull().sum() / len(data))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "0187b3b5",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "the number of records:\n",
+ "2270384\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"the number of records:\")\n",
+ "print(len(data))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "701b6633",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "how many students are there in the table:\n",
+ "1338\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"how many students are there in the table:\")\n",
+ "print(len(data['Anon Student Id'].unique()))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "bf7b246f",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "how many problems are there in the table:\n",
+ "91913\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"how many problems are there in the table:\")\n",
+ "print(len(data['Problem Name'].unique()))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e0602c47",
+ "metadata": {},
+ "source": [
+ "## Sort by Anon Student Id"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "id": "8051cc2b",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart. Title: Top 40 students by number of steps they have done. x-axis: count (0 to 8000). y-axis: Anon Student Id. Visible tick labels: 271emtbxq8pa- | 3cjD21W- | 271zbdm1lcgj- | 271swzglvvxm- | 271jonvpgijj- | 24841uicq- | E9dzBix- | 7OalbuD- | a8YLu01- | 271sjweu45ee- | MYmjG5R- | ug982yk- | 271g8nye4tne- | 271zzbwqtqht- | 271k4y8incfb- | 271tt6j61n7d- | 271rvro73lce- | 7LZr10z- | F713eQN- | 271g7beuc4s1-]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "ds = data['Anon Student Id'].value_counts().reset_index()\n",
+ "ds.columns = [\n",
+ " 'Anon Student Id',\n",
+ " 'count'\n",
+ "]\n",
+ "ds['Anon Student Id'] = ds['Anon Student Id'].astype(str) + '-'\n",
+ "ds = ds.sort_values('count').tail(40)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " ds,\n",
+ " x='count',\n",
+ " y='Anon Student Id',\n",
+ " orientation='h',\n",
+ " title='Top 40 students by number of steps they have done'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "66ef53ad",
+ "metadata": {},
+ "source": [
+ "## Percent of corrects, hints and incorrects"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "id": "c8f1539c",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: pie chart. Title: Percent of corrects, hints and incorrects. Slices: corrects 65.3%, incorrects 27.4%, hints 7.28%.]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "count_corrects = data['Corrects'].sum()\n",
+ "count_hints = data['Hints'].sum()\n",
+ "count_incorrects = data['Incorrects'].sum()\n",
+ "\n",
+ "total = count_corrects + count_hints + count_incorrects\n",
+ "\n",
+ "percent_corrects = count_corrects / total\n",
+ "percent_hints = count_hints / total\n",
+ "percent_incorrects = count_incorrects / total\n",
+ "\n",
+ "dfl = [['corrects', percent_corrects], ['hints', percent_hints], ['incorrects', percent_incorrects]]\n",
+ "\n",
+ "df = pd.DataFrame(dfl, columns=['transaction type', 'percent'])\n",
+ "\n",
+ "fig = px.pie(\n",
+ " df,\n",
+ " names=['corrects', 'hints', 'incorrects'],\n",
+ " values='percent',\n",
+ " title='Percent of corrects, hints and incorrects'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3b097141",
+ "metadata": {},
+ "source": [
+ "## Sort by Problem Name"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "id": "6d668c43",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart. Title: Top 40 useful problem. x-axis: count (0 to 1000). y-axis: Problem Name. Visible tick labels: 1PTFB11- | DISTFB03_SP- | DIST11_SP- | PROP06- | DIST09_SP- | DIST10_SP- | EG4-CONSTANT 3(x+2) = 15- | JAN06- | FEB04- | NOV13- | PROP03- | PROP12- | FEB11- | RATIO2-001- | PROP05- | JAN09- | PROP04- | EG1-CONSTANT 7(8+2)- | LDEMO_WSLVR- | L5FB16-]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Count how many times each problem was attempted. A new attempt is counted\n",
+ "# when the problem name changes or when a step name repeats within the\n",
+ "# current attempt.\n",
+ "storeProblemCount = [1]\n",
+ "storeProblemName = [data['Problem Name'][0]]\n",
+ "currentProblemName = data['Problem Name'][0]\n",
+ "currentStepName = [data['Step Name'][0]]\n",
+ "lastIndex = 0\n",
+ "\n",
+ "for i in range(1, len(data), 1):\n",
+ "    pbNameI = data['Problem Name'][i]\n",
+ "    stNameI = data['Step Name'][i]\n",
+ "    if pbNameI != data['Problem Name'][lastIndex]:\n",
+ "        currentStepName = [stNameI]\n",
+ "        currentProblemName = pbNameI\n",
+ "        if pbNameI not in storeProblemName:\n",
+ "            storeProblemName.append(pbNameI)\n",
+ "            storeProblemCount.append(1)\n",
+ "        else:\n",
+ "            storeProblemCount[storeProblemName.index(pbNameI)] += 1\n",
+ "        lastIndex = i\n",
+ "    elif stNameI not in currentStepName:\n",
+ "        currentStepName.append(stNameI)\n",
+ "        lastIndex = i\n",
+ "    else:\n",
+ "        currentStepName = [stNameI]\n",
+ "        storeProblemCount[storeProblemName.index(pbNameI)] += 1\n",
+ "        lastIndex = i\n",
+ "\n",
+ "dfData = {\n",
+ " 'Problem Name': storeProblemName,\n",
+ " 'count': storeProblemCount\n",
+ "}\n",
+ "df = pd.DataFrame(dfData).sort_values('count').tail(40)\n",
+ "df[\"Problem Name\"] += '-'\n",
+ "\n",
+ "fig = px.bar(\n",
+ " df,\n",
+ " x='count',\n",
+ " y='Problem Name',\n",
+ " orientation='h',\n",
+ " title='Top 40 useful problem'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "id": "1b965aa4",
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart. Title: Correct rate of each problem (top 20) (total transactions of each problem are required to be more than 500). x-axis: Correct rate (0 to 1). y-axis: Problem Name. Bar labels: SYLT-2X&YGE-2X+9- | ROOTS1-001- | SY=2X&Y=-3X+5- | PROBABILITY1-006- | EG-RE-DISTRIB-05 (2x+4)/(3x+6)- | PROBABILITY1-070- | G3X-YLE5&3X-YGE15- | BUSES- | PEANUTS-CASHEWS- | EXPONENT2-012- | TVS3- | PROBABILITY5-001- | EXPONENT3-071- | EXPONENT2-046- | EXPONENT5-001- | PROBABILITY2-001- | GLFM-BUSES- | EG-EPS01-CONSTANT (4(x^2)*y^4)(3(x^3)*y^3)- | PROBABILITY6-002- | PROBABILITY6-001-]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart. Title: Correct rate of each problem (bottom 20) (total transactions of each problem are required to be more than 500). x-axis: Correct rate (0 to 0.3). y-axis: Problem Name. Bar labels: RATIONAL1-091- | RATIONAL1-165- | RATIONAL1-034- | RATIONAL1-058- | RATIONAL1-075- | RATIONAL1-035- | RATIONAL1-281- | RATIONAL1-261- | RATIONAL1-177- | RATIONAL1-064- | RATIONAL1-147- | RATIONAL1-021- | RATIONAL1-121- | RATIONAL1-008- | RATIONAL1-288- | RATIONAL1-109- | BH1T31B- | RXMX_3C- | EG-RE-CANCEL-04 ((x-9)^3)/(x-9)^4- | RATIONAL2-205-]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "data['total transactions'] = data['Incorrects'] + data['Hints'] + data['Corrects']\n",
+ "df1 = data.groupby('Problem Name')['total transactions'].sum().reset_index()\n",
+ "df2 = data.groupby('Problem Name')['Corrects'].sum().reset_index()\n",
+ "df1['Corrects'] = df2['Corrects']\n",
+ "df1['Correct rate'] = df1['Corrects'] / df1['total transactions']\n",
+ "\n",
+ "# Keep only problems with more than `standard` total transactions\n",
+ "standard = 500\n",
+ "df1 = df1[df1['total transactions'] > standard]\n",
+ "\n",
+ "df1 = df1.sort_values('Correct rate')\n",
+ "\n",
+ "df1['Problem Name'] = df1['Problem Name'].astype(str) + \"-\"\n",
+ "\n",
+ "df_px = df1.tail(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " df_px,\n",
+ " x='Correct rate',\n",
+ " y='Problem Name',\n",
+ " orientation='h',\n",
+ " title='Correct rate of each problem (top 20) (total transactions of \\\n",
+ "each problem are required to be more than 500)',\n",
+ " text='Problem Name'\n",
+ ")\n",
+ "fig.update_layout(title_font_size=10)\n",
+ "fig.show(\"svg\")\n",
+ "\n",
+ "df_px = df1.head(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " df_px,\n",
+ " x='Correct rate',\n",
+ " y='Problem Name',\n",
+ " orientation='h',\n",
+ " title='Correct rate of each problem (bottom 20) (total transactions of \\\n",
+ "each problem are required to be more than 500)',\n",
+ " text='Problem Name'\n",
+ ")\n",
+ "fig.update_layout(title_font_size=10)\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2bee6e04",
+ "metadata": {},
+ "source": [
+ "These two figures show the correct rate of each problem. Problems with a low correct rate deserve more attention from teachers and students."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0047e839",
+ "metadata": {},
+ "source": [
+ "## Sort by KC"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "c6e910e0",
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart. Title: Correct rate of each KC(Default) (top 20) (total transactions of each KC are required to be more than 300). x-axis: correct rate (0 to 1). y-axis hidden. Bar labels: [Rule: RF right ([SolverOperation rf],{components with property canReduceFractionsNoMultWhole of right})]- | Enter second extreme in equation- | Proportional-Constant-Expression-Gla:Student-Modeling-Analysis- | Enter Calculated value of rate- | [SkillRule: Combine like terms, no var; CLT]- | Enter square of leg label- | Compare medians - removed outlier- | Enter number of total outcomes in table- | Find square of given leg- | Enter ratio quantity to right of \"to\"- | Enter fractional probability of event- | [SkillRule: Select Multiply; {MT; MT no fraction coeff}]- | Write base of exponential from given whole number as product- | Write decimal multiplier from given scientific notation- | Enter ratio quantity to right of colon- | PM-ROW-1- | Changing axis intervals- | unspecified- | [Rule: unnec-elems ([SolverOperation unnec-elems],)]- | Select second event-]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart. Title: Correct rate of each KC(Default) (bottom 20) (total transactions of each KC are required to be more than 300). x-axis: correct rate (0 to 0.35). y-axis hidden. Bar labels: unknown bug element~~CLT-ROW-1-COEFF- | unknown bug element~~CLT-ROW-1- | Plot terminating improper fractions- | Plot decimal - thousandths- | Plot terminating mixed number- | Plot imperfect radical- | Plot decimal - hundredths- | Setting the slope- | Plot terminating proper fraction- | Plot percent- | Plot non-terminating proper fraction- | Plot non-terminating improper fraction- | Entering slope, GLF- | Placing coordinate point- | Plot decimal - tenths- | Finding the intersection, SIF- | Finding the intersection, Mixed- | [Rule: CLT nested, no LCD ([SolverOperation clt],)]-]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "data.dropna(subset=['KC(Default)'], inplace=True)\n",
+ "\n",
+ "data['total transactions'] = data['Corrects'] + data['Hints'] + data['Incorrects']\n",
+ "df1 = data.groupby('KC(Default)')['total transactions'].sum().reset_index()\n",
+ "df2 = data.groupby('KC(Default)')['Corrects'].sum().reset_index()\n",
+ "df1['Corrects'] = df2['Corrects']\n",
+ "df1['correct rate'] = df1['Corrects'] / df1['total transactions']\n",
+ "\n",
+ "# Keep only KCs with more than `standard` total transactions\n",
+ "standard = 300\n",
+ "df1 = df1[df1['total transactions'] > standard]\n",
+ "\n",
+ "df1 = df1.sort_values('correct rate')\n",
+ "\n",
+ "df1['KC(Default)'] = df1['KC(Default)'].astype(str) + '-'\n",
+ "\n",
+ "df_px = df1.tail(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " df_px,\n",
+ " x='correct rate',\n",
+ " y='KC(Default)',\n",
+ " orientation='h',\n",
+ " title='Correct rate of each KC(Default) (top 20) (total transactions of \\\n",
+ "each KC are required to be more than 300)',\n",
+ " text='KC(Default)'\n",
+ ")\n",
+ "fig.update_yaxes(visible=False)\n",
+ "fig.update_layout(title_font_size=10)\n",
+ "fig.show(\"svg\")\n",
+ "\n",
+ "df_px = df1.head(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " df_px,\n",
+ " x='correct rate',\n",
+ " y='KC(Default)',\n",
+ " orientation='h',\n",
+ " title='Correct rate of each KC(Default) (bottom 20) (total transactions of \\\n",
+ "each KC are required to be more than 300)',\n",
+ " text='KC(Default)'\n",
+ ")\n",
+ "fig.update_yaxes(visible=False)\n",
+ "fig.update_layout(title_font_size=10)\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0feef8a1",
+ "metadata": {},
+ "source": [
+ "These two figures show the correct rate of each KC. KCs with a low correct rate deserve more attention from teachers and students."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "22d99527",
+ "metadata": {},
+ "source": [
+ "## Postscript"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "09bc0903",
+ "metadata": {},
+ "source": [
+ "The whole data package consists of 5 data sets, and the data files in them that are suitable for analysis share the same format. The analysis above, based on \"algebra_2006_2007_train\", is therefore just one example of data analysis on KDD Cup 2010 data; the same code can be used to analyse the other data files after small changes to the file path and column names.\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.9.6"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
From 88f19b9266baa6f92a0a8ae8c8c511e39ed6e0f7 Mon Sep 17 00:00:00 2001
From: gguu1314 <88154940+gguu1314@users.noreply.github.com>
Date: Sat, 14 Aug 2021 13:51:28 +0800
Subject: [PATCH 08/10] Delete KDD Cup 2010.ipynb
---
docs/KDD Cup 2010.ipynb | 994 ----------------------------------------
1 file changed, 994 deletions(-)
delete mode 100644 docs/KDD Cup 2010.ipynb
diff --git a/docs/KDD Cup 2010.ipynb b/docs/KDD Cup 2010.ipynb
deleted file mode 100644
index 19b2dfc..0000000
--- a/docs/KDD Cup 2010.ipynb
+++ /dev/null
@@ -1,994 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "id": "e002fdf8",
- "metadata": {},
- "source": [
- "# KDD Cup 2010 —— Data Analysis on algebra_2006_2007_train"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "429152ff",
- "metadata": {},
- "source": [
- "## Data Description"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "2c89d116",
- "metadata": {},
- "source": [
- "### Column Description"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f590eee5",
- "metadata": {},
- "source": [
- "| Attribute | Annotation |\n",
- "|:--:|---|\n",
- "| Row | The row number |\n",
- "| Anon Student Id | Unique, anonymous identifier for a student |\n",
- "| Problem Hierarchy | The hierarchy of curriculum levels containing the problem |\n",
- "| Problem Name | Unique identifier for a problem |\n",
- "| Problem View | The total number of times the student encountered the problem so far |\n",
- "| Step Name | Unique identifier for one of the steps in a problem |\n",
- "| Step Start Time | The starting time of the step (Can be null) |\n",
- "| First Transaction Time | The time of the first transaction toward the step |\n",
- "| Correct Transaction Time | The time of the correct attempt toward the step, if there was one |\n",
- "| Step End Time | The time of the last transaction toward the step |\n",
- "| Step Duration (sec) | The elapsed time of the step in seconds, calculated by adding all of the durations for transactions that were attributed to the step (Can be null if step start time is null) |\n",
- "| Correct Step Duration (sec) | The step duration if the first attempt for the step was correct |\n",
- "| Error Step Duration (sec) | The step duration if the first attempt for the step was an error (incorrect attempt or hint request) |\n",
- "| Correct First Attempt | The tutor's evaluation of the student's first attempt on the step—1 if correct, 0 if an error |\n",
- "| Incorrects | Total number of incorrect attempts by the student on the step |\n",
- "| Hints | Total number of hints requested by the student for the step |\n",
- "| Corrects | Total correct attempts by the student for the step (only increases if the step is encountered more than once) |\n",
- "| KC(KC Model Name) | The identified skills that are used in a problem, where available |\n",
- "| Opportunity(KC Model Name) | A count that increases by one each time the student encounters a step with the listed knowledge component |\n",
- "|| Additional KC models, which exist for the challenge data sets, will appear as additional pairs of columns (KC and Opportunity columns for each model) |"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "2c2a2d3e",
- "metadata": {},
- "source": [
- "For the test portion of the challenge data sets, values will not be provided for the following columns:"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f19eb949",
- "metadata": {},
- "source": [
- "♦ Step Start Time\n",
- "\n",
- "♦ First Transaction Time\n",
- "\n",
- "♦ Correct Transaction Time\n",
- "\n",
- "♦ Step End Time\n",
- "\n",
- "♦ Step Duration (sec)\n",
- "\n",
- "♦ Correct Step Duration (sec)\n",
- "\n",
- "♦ Error Step Duration (sec)\n",
- "\n",
- "♦ Correct First Attempt\n",
- "\n",
- "♦ Incorrects\n",
- "\n",
- "♦ Hints\n",
- "\n",
- "♦ Corrects"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "id": "123674b7",
- "metadata": {},
- "outputs": [],
- "source": [
- "import pandas as pd\n",
- "import plotly.express as px"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "id": "efa6be16",
- "metadata": {},
- "outputs": [],
- "source": [
- "path = \"algebra_2006_2007_train.txt\"\n",
- "data = pd.read_table(path, encoding=\"ISO-8859-15\", low_memory=False)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "993f1986",
- "metadata": {},
- "source": [
- "## Record Examples"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "id": "8b2af14e",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- "  <thead>\n",
- "    <tr style=\"text-align: right;\">\n",
- "      <th></th>\n",
- "      <th>Row</th>\n",
- "      <th>Anon Student Id</th>\n",
- "      <th>Problem Hierarchy</th>\n",
- "      <th>Problem Name</th>\n",
- "      <th>Problem View</th>\n",
- "      <th>Step Name</th>\n",
- "      <th>Step Start Time</th>\n",
- "      <th>First Transaction Time</th>\n",
- "      <th>Correct Transaction Time</th>\n",
- "      <th>Step End Time</th>\n",
- "      <th>Step Duration (sec)</th>\n",
- "      <th>Correct Step Duration (sec)</th>\n",
- "      <th>Error Step Duration (sec)</th>\n",
- "      <th>Correct First Attempt</th>\n",
- "      <th>Incorrects</th>\n",
- "      <th>Hints</th>\n",
- "      <th>Corrects</th>\n",
- "      <th>KC(Default)</th>\n",
- "      <th>Opportunity(Default)</th>\n",
- "    </tr>\n",
- "  </thead>\n",
- "  <tbody>\n",
- "    <tr>\n",
- "      <th>0</th>\n",
- "      <td>1</td>\n",
- "      <td>JG4Tz</td>\n",
- "      <td>Unit CTA1_01, Section CTA1_01-1</td>\n",
- "      <td>LDEMO_WKST</td>\n",
- "      <td>1</td>\n",
- "      <td>R1C1</td>\n",
- "      <td>2006-10-26 09:51:58.0</td>\n",
- "      <td>2006-10-26 09:52:30.0</td>\n",
- "      <td>2006-10-26 09:53:30.0</td>\n",
- "      <td>2006-10-26 09:53:30.0</td>\n",
- "      <td>92.0</td>\n",
- "      <td>NaN</td>\n",
- "      <td>92.0</td>\n",
- "      <td>0</td>\n",
- "      <td>2</td>\n",
- "      <td>0</td>\n",
- "      <td>1</td>\n",
- "      <td>NaN</td>\n",
- "      <td>NaN</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>1</th>\n",
- "      <td>2</td>\n",
- "      <td>JG4Tz</td>\n",
- "      <td>Unit CTA1_01, Section CTA1_01-1</td>\n",
- "      <td>LDEMO_WKST</td>\n",
- "      <td>1</td>\n",
- "      <td>R1C2</td>\n",
- "      <td>2006-10-26 09:53:30.0</td>\n",
- "      <td>2006-10-26 09:53:41.0</td>\n",
- "      <td>2006-10-26 09:53:41.0</td>\n",
- "      <td>2006-10-26 09:53:41.0</td>\n",
- "      <td>11.0</td>\n",
- "      <td>11.0</td>\n",
- "      <td>NaN</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "      <td>0</td>\n",
- "      <td>1</td>\n",
- "      <td>NaN</td>\n",
- "      <td>NaN</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>2</th>\n",
- "      <td>3</td>\n",
- "      <td>JG4Tz</td>\n",
- "      <td>Unit CTA1_01, Section CTA1_01-1</td>\n",
- "      <td>LDEMO_WKST</td>\n",
- "      <td>1</td>\n",
- "      <td>R2C1</td>\n",
- "      <td>2006-10-26 09:53:41.0</td>\n",
- "      <td>2006-10-26 09:53:46.0</td>\n",
- "      <td>2006-10-26 09:53:46.0</td>\n",
- "      <td>2006-10-26 09:53:46.0</td>\n",
- "      <td>5.0</td>\n",
- "      <td>5.0</td>\n",
- "      <td>NaN</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "      <td>0</td>\n",
- "      <td>1</td>\n",
- "      <td>Identifying units</td>\n",
- "      <td>1</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>3</th>\n",
- "      <td>4</td>\n",
- "      <td>JG4Tz</td>\n",
- "      <td>Unit CTA1_01, Section CTA1_01-1</td>\n",
- "      <td>LDEMO_WKST</td>\n",
- "      <td>1</td>\n",
- "      <td>R2C2</td>\n",
- "      <td>2006-10-26 09:53:46.0</td>\n",
- "      <td>2006-10-26 09:53:50.0</td>\n",
- "      <td>2006-10-26 09:53:50.0</td>\n",
- "      <td>2006-10-26 09:53:50.0</td>\n",
- "      <td>4.0</td>\n",
- "      <td>4.0</td>\n",
- "      <td>NaN</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "      <td>0</td>\n",
- "      <td>1</td>\n",
- "      <td>Identifying units</td>\n",
- "      <td>2</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>4</th>\n",
- "      <td>5</td>\n",
- "      <td>JG4Tz</td>\n",
- "      <td>Unit CTA1_01, Section CTA1_01-1</td>\n",
- "      <td>LDEMO_WKST</td>\n",
- "      <td>1</td>\n",
- "      <td>R4C1</td>\n",
- "      <td>2006-10-26 09:53:50.0</td>\n",
- "      <td>2006-10-26 09:54:05.0</td>\n",
- "      <td>2006-10-26 09:54:05.0</td>\n",
- "      <td>2006-10-26 09:54:05.0</td>\n",
- "      <td>15.0</td>\n",
- "      <td>15.0</td>\n",
- "      <td>NaN</td>\n",
- "      <td>1</td>\n",
- "      <td>0</td>\n",
- "      <td>0</td>\n",
- "      <td>1</td>\n",
- "      <td>Entering a given</td>\n",
- "      <td>1</td>\n",
- "    </tr>\n",
- "  </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " Row Anon Student Id Problem Hierarchy Problem Name \\\n",
- "0 1 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "1 2 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "2 3 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "3 4 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "4 5 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
- "\n",
- " Problem View Step Name Step Start Time First Transaction Time \\\n",
- "0 1 R1C1 2006-10-26 09:51:58.0 2006-10-26 09:52:30.0 \n",
- "1 1 R1C2 2006-10-26 09:53:30.0 2006-10-26 09:53:41.0 \n",
- "2 1 R2C1 2006-10-26 09:53:41.0 2006-10-26 09:53:46.0 \n",
- "3 1 R2C2 2006-10-26 09:53:46.0 2006-10-26 09:53:50.0 \n",
- "4 1 R4C1 2006-10-26 09:53:50.0 2006-10-26 09:54:05.0 \n",
- "\n",
- " Correct Transaction Time Step End Time Step Duration (sec) \\\n",
- "0 2006-10-26 09:53:30.0 2006-10-26 09:53:30.0 92.0 \n",
- "1 2006-10-26 09:53:41.0 2006-10-26 09:53:41.0 11.0 \n",
- "2 2006-10-26 09:53:46.0 2006-10-26 09:53:46.0 5.0 \n",
- "3 2006-10-26 09:53:50.0 2006-10-26 09:53:50.0 4.0 \n",
- "4 2006-10-26 09:54:05.0 2006-10-26 09:54:05.0 15.0 \n",
- "\n",
- " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
- "0 NaN 92.0 \n",
- "1 11.0 NaN \n",
- "2 5.0 NaN \n",
- "3 4.0 NaN \n",
- "4 15.0 NaN \n",
- "\n",
- " Correct First Attempt Incorrects Hints Corrects KC(Default) \\\n",
- "0 0 2 0 1 NaN \n",
- "1 1 0 0 1 NaN \n",
- "2 1 0 0 1 Identifying units \n",
- "3 1 0 0 1 Identifying units \n",
- "4 1 0 0 1 Entering a given \n",
- "\n",
- " Opportunity(Default) \n",
- "0 NaN \n",
- "1 NaN \n",
- "2 1 \n",
- "3 2 \n",
- "4 1 "
- ]
- },
- "execution_count": 3,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "pd.set_option('display.max_column', 500)\n",
- "data.head()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "id": "9d5e5859",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<div>\n",
- "<table border=\"1\" class=\"dataframe\">\n",
- "  <thead>\n",
- "    <tr style=\"text-align: right;\">\n",
- "      <th></th>\n",
- "      <th>Row</th>\n",
- "      <th>Problem View</th>\n",
- "      <th>Step Duration (sec)</th>\n",
- "      <th>Correct Step Duration (sec)</th>\n",
- "      <th>Error Step Duration (sec)</th>\n",
- "      <th>Correct First Attempt</th>\n",
- "      <th>Incorrects</th>\n",
- "      <th>Hints</th>\n",
- "      <th>Corrects</th>\n",
- "    </tr>\n",
- "  </thead>\n",
- "  <tbody>\n",
- "    <tr>\n",
- "      <th>count</th>\n",
- "      <td>2.270384e+06</td>\n",
- "      <td>2.270384e+06</td>\n",
- "      <td>2.267551e+06</td>\n",
- "      <td>1.751638e+06</td>\n",
- "      <td>515913.000000</td>\n",
- "      <td>2.270384e+06</td>\n",
- "      <td>2.270384e+06</td>\n",
- "      <td>2.270384e+06</td>\n",
- "      <td>2.270384e+06</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>mean</th>\n",
- "      <td>1.513120e+06</td>\n",
- "      <td>1.092910e+00</td>\n",
- "      <td>1.958364e+01</td>\n",
- "      <td>1.171716e+01</td>\n",
- "      <td>46.292087</td>\n",
- "      <td>7.722359e-01</td>\n",
- "      <td>4.455044e-01</td>\n",
- "      <td>1.184311e-01</td>\n",
- "      <td>1.062878e+00</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>std</th>\n",
- "      <td>8.736198e+05</td>\n",
- "      <td>3.448857e-01</td>\n",
- "      <td>4.768345e+01</td>\n",
- "      <td>2.645318e+01</td>\n",
- "      <td>81.817794</td>\n",
- "      <td>4.193897e-01</td>\n",
- "      <td>2.000914e+00</td>\n",
- "      <td>6.199071e-01</td>\n",
- "      <td>6.894285e-01</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>min</th>\n",
- "      <td>1.000000e+00</td>\n",
- "      <td>1.000000e+00</td>\n",
- "      <td>0.000000e+00</td>\n",
- "      <td>0.000000e+00</td>\n",
- "      <td>0.000000</td>\n",
- "      <td>0.000000e+00</td>\n",
- "      <td>0.000000e+00</td>\n",
- "      <td>0.000000e+00</td>\n",
- "      <td>0.000000e+00</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>25%</th>\n",
- "      <td>7.577408e+05</td>\n",
- "      <td>1.000000e+00</td>\n",
- "      <td>3.000000e+00</td>\n",
- "      <td>3.000000e+00</td>\n",
- "      <td>11.000000</td>\n",
- "      <td>1.000000e+00</td>\n",
- "      <td>0.000000e+00</td>\n",
- "      <td>0.000000e+00</td>\n",
- "      <td>1.000000e+00</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>50%</th>\n",
- "      <td>1.511844e+06</td>\n",
- "      <td>1.000000e+00</td>\n",
- "      <td>7.000000e+00</td>\n",
- "      <td>5.000000e+00</td>\n",
- "      <td>22.000000</td>\n",
- "      <td>1.000000e+00</td>\n",
- "      <td>0.000000e+00</td>\n",
- "      <td>0.000000e+00</td>\n",
- "      <td>1.000000e+00</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>75%</th>\n",
- "      <td>2.269432e+06</td>\n",
- "      <td>1.000000e+00</td>\n",
- "      <td>1.700000e+01</td>\n",
- "      <td>1.100000e+01</td>\n",
- "      <td>47.000000</td>\n",
- "      <td>1.000000e+00</td>\n",
- "      <td>0.000000e+00</td>\n",
- "      <td>0.000000e+00</td>\n",
- "      <td>1.000000e+00</td>\n",
- "    </tr>\n",
- "    <tr>\n",
- "      <th>max</th>\n",
- "      <td>3.025933e+06</td>\n",
- "      <td>1.000000e+01</td>\n",
- "      <td>3.208000e+03</td>\n",
- "      <td>1.204000e+03</td>\n",
- "      <td>3208.000000</td>\n",
- "      <td>1.000000e+00</td>\n",
- "      <td>3.600000e+02</td>\n",
- "      <td>1.020000e+02</td>\n",
- "      <td>9.200000e+01</td>\n",
- "    </tr>\n",
- "  </tbody>\n",
- "</table>\n",
- "</div>"
- ],
- "text/plain": [
- " Row Problem View Step Duration (sec) \\\n",
- "count 2.270384e+06 2.270384e+06 2.267551e+06 \n",
- "mean 1.513120e+06 1.092910e+00 1.958364e+01 \n",
- "std 8.736198e+05 3.448857e-01 4.768345e+01 \n",
- "min 1.000000e+00 1.000000e+00 0.000000e+00 \n",
- "25% 7.577408e+05 1.000000e+00 3.000000e+00 \n",
- "50% 1.511844e+06 1.000000e+00 7.000000e+00 \n",
- "75% 2.269432e+06 1.000000e+00 1.700000e+01 \n",
- "max 3.025933e+06 1.000000e+01 3.208000e+03 \n",
- "\n",
- " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
- "count 1.751638e+06 515913.000000 \n",
- "mean 1.171716e+01 46.292087 \n",
- "std 2.645318e+01 81.817794 \n",
- "min 0.000000e+00 0.000000 \n",
- "25% 3.000000e+00 11.000000 \n",
- "50% 5.000000e+00 22.000000 \n",
- "75% 1.100000e+01 47.000000 \n",
- "max 1.204000e+03 3208.000000 \n",
- "\n",
- " Correct First Attempt Incorrects Hints Corrects \n",
- "count 2.270384e+06 2.270384e+06 2.270384e+06 2.270384e+06 \n",
- "mean 7.722359e-01 4.455044e-01 1.184311e-01 1.062878e+00 \n",
- "std 4.193897e-01 2.000914e+00 6.199071e-01 6.894285e-01 \n",
- "min 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 \n",
- "25% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
- "50% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
- "75% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
- "max 1.000000e+00 3.600000e+02 1.020000e+02 9.200000e+01 "
- ]
- },
- "execution_count": 4,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "data.describe()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "id": "92cc0aab",
- "metadata": {
- "scrolled": false
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Part of missing values for every column\n",
- "Row 0.000000\n",
- "Anon Student Id 0.000000\n",
- "Problem Hierarchy 0.000000\n",
- "Problem Name 0.000000\n",
- "Problem View 0.000000\n",
- "Step Name 0.000000\n",
- "Step Start Time 0.001103\n",
- "First Transaction Time 0.000000\n",
- "Correct Transaction Time 0.034757\n",
- "Step End Time 0.000000\n",
- "Step Duration (sec) 0.001248\n",
- "Correct Step Duration (sec) 0.228484\n",
- "Error Step Duration (sec) 0.772764\n",
- "Correct First Attempt 0.000000\n",
- "Incorrects 0.000000\n",
- "Hints 0.000000\n",
- "Corrects 0.000000\n",
- "KC(Default) 0.203407\n",
- "Opportunity(Default) 0.203407\n",
- "dtype: float64\n"
- ]
- }
- ],
- "source": [
- "print(\"Part of missing values for every column\")\n",
- "print(data.isnull().sum() / len(data))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "id": "0187b3b5",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "the number of records:\n",
- "2270384\n"
- ]
- }
- ],
- "source": [
- "print(\"the number of records:\")\n",
- "print(len(data))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "id": "701b6633",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "how many students are there in the table:\n",
- "1338\n"
- ]
- }
- ],
- "source": [
- "print(\"how many students are there in the table:\")\n",
- "print(len(data['Anon Student Id'].unique()))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "id": "bf7b246f",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "how many problems are there in the table:\n",
- "91913\n"
- ]
- }
- ],
- "source": [
- "print(\"how many problems are there in the table:\")\n",
- "print(len(data['Problem Name'].unique()))"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "e0602c47",
- "metadata": {},
- "source": [
- "## Sort by Anon Student Id"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "id": "8051cc2b",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "image/svg+xml": [
- "[Figure: horizontal bar chart, title: Top 40 students by number of steps they have done; x-axis: count; y-axis: Anon Student Id]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "ds = data['Anon Student Id'].value_counts().reset_index()\n",
- "ds.columns = [\n",
- " 'Anon Student Id',\n",
- " 'count'\n",
- "]\n",
- "ds['Anon Student Id'] = ds['Anon Student Id'].astype(str) + '-'\n",
- "ds = ds.sort_values('count').tail(40)\n",
- "\n",
- "fig = px.bar(\n",
- " ds,\n",
- " x='count',\n",
- " y='Anon Student Id',\n",
- " orientation='h',\n",
- " title='Top 40 students by number of steps they have done'\n",
- ")\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "66ef53ad",
- "metadata": {},
- "source": [
- "## Percent of corrects, hints and incorrects"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "id": "c8f1539c",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "image/svg+xml": [
- "[Figure: pie chart, title: Percent of corrects, hints and incorrects; corrects 65.3%, incorrects 27.4%, hints 7.28%]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "count_corrects = data['Corrects'].sum()\n",
- "count_hints = data['Hints'].sum()\n",
- "count_incorrects = data['Incorrects'].sum()\n",
- "\n",
- "total = count_corrects + count_hints + count_incorrects\n",
- "\n",
- "percent_corrects = count_corrects / total\n",
- "percent_hints = count_hints / total\n",
- "percent_incorrects = count_incorrects / total\n",
- "\n",
- "dfl = [['corrects', percent_corrects], ['hints', percent_hints], ['incorrects', percent_incorrects]]\n",
- "\n",
- "df = pd.DataFrame(dfl, columns=['transaction type', 'percent'])\n",
- "\n",
- "fig = px.pie(\n",
- " df,\n",
- " names=['corrects', 'hints', 'incorrects'],\n",
- " values='percent',\n",
- " title='Percent of corrects, hints and incorrects'\n",
- ")\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "3b097141",
- "metadata": {},
- "source": [
- "## Sort by Problem Name"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "id": "6d668c43",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "image/svg+xml": [
- "[Figure: horizontal bar chart, title: Top 40 useful problem; x-axis: count; y-axis: Problem Name]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "storeProblemCount = [1]\n",
- "storeProblemName = [data['Problem Name'][0]]\n",
- "currentProblemName = data['Problem Name'][0]\n",
- "currentStepName = [data['Step Name'][0]]\n",
- "lastIndex = 0\n",
- "\n",
- "for i in range(1, len(data), 1):\n",
- " pbNameI = data['Problem Name'][i]\n",
- " stNameI = data['Step Name'][i]\n",
- " if pbNameI != data['Problem Name'][lastIndex]:\n",
- " currentStepName = [stNameI]\n",
- " currentProblemName = pbNameI\n",
- " if pbNameI not in storeProblemName:\n",
- " storeProblemName.append(pbNameI)\n",
- " storeProblemCount.append(1)\n",
- " else:\n",
- " storeProblemCount[storeProblemName.index(pbNameI)] += 1\n",
- " lastIndex = i\n",
- " elif stNameI not in currentStepName:\n",
- " currentStepName.append(stNameI)\n",
- " lastIndex = i\n",
- " else:\n",
- " currentStepName = [stNameI]\n",
- " storeProblemCount[storeProblemName.index(pbNameI)] += 1\n",
- " lastIndex = i\n",
- "\n",
- "dfData = {\n",
- " 'Problem Name': storeProblemName,\n",
- " 'count': storeProblemCount\n",
- "}\n",
- "df = pd.DataFrame(dfData).sort_values('count').tail(40)\n",
- "df[\"Problem Name\"] += '-'\n",
- "\n",
- "fig = px.bar(\n",
- " df,\n",
- " x='count',\n",
- " y='Problem Name',\n",
- " orientation='h',\n",
- " title='Top 40 useful problem'\n",
- ")\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "id": "1b965aa4",
- "metadata": {
- "scrolled": false
- },
- "outputs": [
- {
- "data": {
- "image/svg+xml": [
- "[Figure: horizontal bar chart, title: Correct rate of each problem (top 20) (total transactions of each problem are required to be more than 500); x-axis: Correct rate; y-axis: Problem Name]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "image/svg+xml": [
- "[Figure: horizontal bar chart, title: Correct rate of each problem (bottom 20) (total transactions of each problem are required to be more than 500); x-axis: Correct rate; y-axis: Problem Name]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "data['total transactions'] = data['Incorrects'] + data['Hints'] + data['Corrects']\n",
- "df1 = data.groupby('Problem Name')['total transactions'].sum().reset_index()\n",
- "df2 = data.groupby('Problem Name')['Corrects'].sum().reset_index()\n",
- "df1['Corrects'] = df2['Corrects']\n",
- "df1['Correct rate'] = df1['Corrects'] / df1['total transactions']\n",
- "\n",
- "df1 = df1.sort_values('total transactions')\n",
- "count = 0\n",
- "standard = 500\n",
- "for i in df1['total transactions']:\n",
- " if i > standard:\n",
- " count += 1\n",
- "df1 = df1.tail(count)\n",
- "\n",
- "df1 = df1.sort_values('Correct rate')\n",
- "\n",
- "df1['Problem Name'] = df1['Problem Name'].astype(str) + \"-\"\n",
- "\n",
- "df_px = df1.tail(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='Correct rate',\n",
- " y='Problem Name',\n",
- " orientation='h',\n",
- " title='Correct rate of each problem (top 20) (total transactions of \\\n",
- "each problem are required to be more than 500)',\n",
- " text='Problem Name'\n",
- ")\n",
- "fig.update_layout(title_font_size=10)\n",
- "fig.show(\"svg\")\n",
- "\n",
- "df_px = df1.head(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='Correct rate',\n",
- " y='Problem Name',\n",
- " orientation='h',\n",
- " title='Correct rate of each problem (bottom 20) (total transactions of \\\n",
- "each problem are required to be more than 500)',\n",
- " text='Problem Name'\n",
- ")\n",
- "fig.update_layout(title_font_size=10)\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "2bee6e04",
- "metadata": {},
- "source": [
- "These two figures present the correct rate of problems. Problems with low correct rate deserve more attention from teachers and students."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "0047e839",
- "metadata": {},
- "source": [
- "## Sort by KC"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "id": "c6e910e0",
- "metadata": {
- "scrolled": false
- },
- "outputs": [
- {
- "data": {
- "image/svg+xml": [
- "[Figure: horizontal bar chart, title: Correct rate of each KC(Default) (top 20) (total transactions of each KC are required to be more than 300); x-axis: correct rate; bars labelled with KC(Default)]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- },
- {
- "data": {
- "image/svg+xml": [
- "[Figure: horizontal bar chart, title: Correct rate of each KC(Default) (bottom 20) (total transactions of each KC are required to be more than 300); x-axis: correct rate; bars labelled with KC(Default)]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "data.dropna(subset=['KC(Default)'], inplace=True)\n",
- "\n",
- "data['total transactions'] = data['Corrects'] + data['Hints'] + data['Incorrects']\n",
- "df1 = data.groupby('KC(Default)')['total transactions'].sum().reset_index()\n",
- "df2 = data.groupby('KC(Default)')['Corrects'].sum().reset_index()\n",
- "df1['Corrects'] = df2['Corrects']\n",
- "df1['correct rate'] = df1['Corrects'] / df1['total transactions']\n",
- "\n",
- "count = 0\n",
- "standard = 300\n",
- "for i in df1['total transactions']:\n",
- " if i > standard:\n",
- " count += 1\n",
- "df1 = df1.sort_values('total transactions').tail(count)\n",
- "\n",
- "df1 = df1.sort_values('correct rate')\n",
- "\n",
- "df1['KC(Default)'] = df1['KC(Default)'].astype(str) + '-'\n",
- "\n",
- "df_px = df1.tail(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='correct rate',\n",
- " y='KC(Default)',\n",
- " orientation='h',\n",
- " title='Correct rate of each KC(Default) (top 20) (total transactions of \\\n",
- "each KC are required to be more than 300)',\n",
- " text='KC(Default)'\n",
- ")\n",
- "fig.update_yaxes(visible=False)\n",
- "fig.update_layout(title_font_size=10)\n",
- "fig.show(\"svg\")\n",
- "\n",
- "df_px = df1.head(20)\n",
- "\n",
- "fig = px.bar(\n",
- " df_px,\n",
- " x='correct rate',\n",
- " y='KC(Default)',\n",
- " orientation='h',\n",
- " title='Correct rate of each KC(Default) (bottom 20) (total transactions of \\\n",
- "each KC are required to be more than 300)',\n",
- " text='KC(Default)'\n",
- ")\n",
- "fig.update_yaxes(visible=False)\n",
- "fig.update_layout(title_font_size=10)\n",
- "fig.show(\"svg\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "0feef8a1",
- "metadata": {},
- "source": [
- "These two figures present the correct rate of KCs. KCs with low correct rate deserve more attention from teachers and students."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "22d99527",
- "metadata": {},
- "source": [
- "## Postscript"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "09bc0903",
- "metadata": {},
- "source": [
- "Given that the whole data package is composed of 5 data sets and data files in these 5 data sets that can be used to conduct data analysis share the same data format, the following analysis based on \"algebra_2006_2007_train\" is just an example of data analysis on KDD Cup, and the code can be used to analyse other data files with some small changes on the file path and column names.\n"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3 (ipykernel)",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.9.6"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
From 41235aa986bb5592d6958a00dce6197aac50a46c Mon Sep 17 00:00:00 2001
From: gguu1314 <88154940+gguu1314@users.noreply.github.com>
Date: Sat, 14 Aug 2021 13:52:10 +0800
Subject: [PATCH 09/10] add data analysis to kdd cup 2010
---
docs/KDD Cup 2010.ipynb | 994 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 994 insertions(+)
create mode 100644 docs/KDD Cup 2010.ipynb
diff --git a/docs/KDD Cup 2010.ipynb b/docs/KDD Cup 2010.ipynb
new file mode 100644
index 0000000..376cac2
--- /dev/null
+++ b/docs/KDD Cup 2010.ipynb
@@ -0,0 +1,994 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "e002fdf8",
+ "metadata": {},
+ "source": [
+ "# KDD Cup 2010 —— Data Analysis on algebra_2006_2007_train"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "429152ff",
+ "metadata": {},
+ "source": [
+ "## Data Description"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2c89d116",
+ "metadata": {},
+ "source": [
+ "### Column Description"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f590eee5",
+ "metadata": {},
+ "source": [
+ "| Attribute | Annotation |\n",
+ "|:--:|---|\n",
+ "| Row | The row number |\n",
+ "| Anon Student Id | Unique, anonymous identifier for a student |\n",
+ "| Problem Hierarchy | The hierarchy of curriculum levels containing the problem |\n",
+ "| Problem Name | Unique identifier for a problem |\n",
+ "| Problem View | The total number of times the student encountered the problem so far |\n",
+ "| Step Name | Unique identifier for one of the steps in a problem |\n",
+ "| Step Start Time | The starting time of the step (Can be null) |\n",
+ "| First Transaction Time | The time of the first transaction toward the step |\n",
+ "| Correct Transaction Time | The time of the correct attempt toward the step, if there was one |\n",
+ "| Step End Time | The time of the last transaction toward the step |\n",
+ "| Step Duration (sec) | The elapsed time of the step in seconds, calculated by adding all of the durations for transactions that were attributed to the step (Can be null if step start time is null) |\n",
+ "| Correct Step Duration (sec) | The step duration if the first attempt for the step was correct |\n",
+ "| Error Step Duration (sec) | The step duration if the first attempt for the step was an error (incorrect attempt or hint request) |\n",
+ "| Correct First Attempt | The tutor's evaluation of the student's first attempt on the step—1 if correct, 0 if an error |\n",
+ "| Incorrects | Total number of incorrect attempts by the student on the step |\n",
+ "| Hints | Total number of hints requested by the student for the step |\n",
+ "| Corrects | Total correct attempts by the student for the step (only increases if the step is encountered more than once) |\n",
+ "| KC(KC Model Name) | The identified skills that are used in a problem, where available |\n",
+ "| Opportunity(KC Model Name) | A count that increases by one each time the student encounters a step with the listed knowledge component |\n",
+ "|| Additional KC models, which exist for the challenge data sets, will appear as additional pairs of columns (KC and Opportunity columns for each model) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2c2a2d3e",
+ "metadata": {},
+ "source": [
+ "For the test portion of the challenge data sets, values will not be provided for the following columns:"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f19eb949",
+ "metadata": {},
+ "source": [
+ "♦ Step Start Time\n",
+ "\n",
+ "♦ First Transaction Time\n",
+ "\n",
+ "♦ Correct Transaction Time\n",
+ "\n",
+ "♦ Step End Time\n",
+ "\n",
+ "♦ Step Duration (sec)\n",
+ "\n",
+ "♦ Correct Step Duration (sec)\n",
+ "\n",
+ "♦ Error Step Duration (sec)\n",
+ "\n",
+ "♦ Correct First Attempt\n",
+ "\n",
+ "♦ Incorrects\n",
+ "\n",
+ "♦ Hints\n",
+ "\n",
+ "♦ Corrects"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "123674b7",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "import plotly.express as px"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "efa6be16",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "path = \"algebra_2006_2007_train.txt\"\n",
+ "data = pd.read_table(path, encoding=\"ISO-8859-15\", low_memory=False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "993f1986",
+ "metadata": {},
+ "source": [
+ "## Record Examples"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "8b2af14e",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " Row \n",
+ " Anon Student Id \n",
+ " Problem Hierarchy \n",
+ " Problem Name \n",
+ " Problem View \n",
+ " Step Name \n",
+ " Step Start Time \n",
+ " First Transaction Time \n",
+ " Correct Transaction Time \n",
+ " Step End Time \n",
+ " Step Duration (sec) \n",
+ " Correct Step Duration (sec) \n",
+ " Error Step Duration (sec) \n",
+ " Correct First Attempt \n",
+ " Incorrects \n",
+ " Hints \n",
+ " Corrects \n",
+ " KC(Default) \n",
+ " Opportunity(Default) \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " 0 \n",
+ " 1 \n",
+ " JG4Tz \n",
+ " Unit CTA1_01, Section CTA1_01-1 \n",
+ " LDEMO_WKST \n",
+ " 1 \n",
+ " R1C1 \n",
+ " 2006-10-26 09:51:58.0 \n",
+ " 2006-10-26 09:52:30.0 \n",
+ " 2006-10-26 09:53:30.0 \n",
+ " 2006-10-26 09:53:30.0 \n",
+ " 92.0 \n",
+ " NaN \n",
+ " 92.0 \n",
+ " 0 \n",
+ " 2 \n",
+ " 0 \n",
+ " 1 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ " 1 \n",
+ " 2 \n",
+ " JG4Tz \n",
+ " Unit CTA1_01, Section CTA1_01-1 \n",
+ " LDEMO_WKST \n",
+ " 1 \n",
+ " R1C2 \n",
+ " 2006-10-26 09:53:30.0 \n",
+ " 2006-10-26 09:53:41.0 \n",
+ " 2006-10-26 09:53:41.0 \n",
+ " 2006-10-26 09:53:41.0 \n",
+ " 11.0 \n",
+ " 11.0 \n",
+ " NaN \n",
+ " 1 \n",
+ " 0 \n",
+ " 0 \n",
+ " 1 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ " 2 \n",
+ " 3 \n",
+ " JG4Tz \n",
+ " Unit CTA1_01, Section CTA1_01-1 \n",
+ " LDEMO_WKST \n",
+ " 1 \n",
+ " R2C1 \n",
+ " 2006-10-26 09:53:41.0 \n",
+ " 2006-10-26 09:53:46.0 \n",
+ " 2006-10-26 09:53:46.0 \n",
+ " 2006-10-26 09:53:46.0 \n",
+ " 5.0 \n",
+ " 5.0 \n",
+ " NaN \n",
+ " 1 \n",
+ " 0 \n",
+ " 0 \n",
+ " 1 \n",
+ " Identifying units \n",
+ " 1 \n",
+ " \n",
+ " \n",
+ " 3 \n",
+ " 4 \n",
+ " JG4Tz \n",
+ " Unit CTA1_01, Section CTA1_01-1 \n",
+ " LDEMO_WKST \n",
+ " 1 \n",
+ " R2C2 \n",
+ " 2006-10-26 09:53:46.0 \n",
+ " 2006-10-26 09:53:50.0 \n",
+ " 2006-10-26 09:53:50.0 \n",
+ " 2006-10-26 09:53:50.0 \n",
+ " 4.0 \n",
+ " 4.0 \n",
+ " NaN \n",
+ " 1 \n",
+ " 0 \n",
+ " 0 \n",
+ " 1 \n",
+ " Identifying units \n",
+ " 2 \n",
+ " \n",
+ " \n",
+ " 4 \n",
+ " 5 \n",
+ " JG4Tz \n",
+ " Unit CTA1_01, Section CTA1_01-1 \n",
+ " LDEMO_WKST \n",
+ " 1 \n",
+ " R4C1 \n",
+ " 2006-10-26 09:53:50.0 \n",
+ " 2006-10-26 09:54:05.0 \n",
+ " 2006-10-26 09:54:05.0 \n",
+ " 2006-10-26 09:54:05.0 \n",
+ " 15.0 \n",
+ " 15.0 \n",
+ " NaN \n",
+ " 1 \n",
+ " 0 \n",
+ " 0 \n",
+ " 1 \n",
+ " Entering a given \n",
+ " 1 \n",
+ " \n",
+ " \n",
+ "\n",
+ ""
+ ],
+ "text/plain": [
+ " Row Anon Student Id Problem Hierarchy Problem Name \\\n",
+ "0 1 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "1 2 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "2 3 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "3 4 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "4 5 JG4Tz Unit CTA1_01, Section CTA1_01-1 LDEMO_WKST \n",
+ "\n",
+ " Problem View Step Name Step Start Time First Transaction Time \\\n",
+ "0 1 R1C1 2006-10-26 09:51:58.0 2006-10-26 09:52:30.0 \n",
+ "1 1 R1C2 2006-10-26 09:53:30.0 2006-10-26 09:53:41.0 \n",
+ "2 1 R2C1 2006-10-26 09:53:41.0 2006-10-26 09:53:46.0 \n",
+ "3 1 R2C2 2006-10-26 09:53:46.0 2006-10-26 09:53:50.0 \n",
+ "4 1 R4C1 2006-10-26 09:53:50.0 2006-10-26 09:54:05.0 \n",
+ "\n",
+ " Correct Transaction Time Step End Time Step Duration (sec) \\\n",
+ "0 2006-10-26 09:53:30.0 2006-10-26 09:53:30.0 92.0 \n",
+ "1 2006-10-26 09:53:41.0 2006-10-26 09:53:41.0 11.0 \n",
+ "2 2006-10-26 09:53:46.0 2006-10-26 09:53:46.0 5.0 \n",
+ "3 2006-10-26 09:53:50.0 2006-10-26 09:53:50.0 4.0 \n",
+ "4 2006-10-26 09:54:05.0 2006-10-26 09:54:05.0 15.0 \n",
+ "\n",
+ " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
+ "0 NaN 92.0 \n",
+ "1 11.0 NaN \n",
+ "2 5.0 NaN \n",
+ "3 4.0 NaN \n",
+ "4 15.0 NaN \n",
+ "\n",
+ " Correct First Attempt Incorrects Hints Corrects KC(Default) \\\n",
+ "0 0 2 0 1 NaN \n",
+ "1 1 0 0 1 NaN \n",
+ "2 1 0 0 1 Identifying units \n",
+ "3 1 0 0 1 Identifying units \n",
+ "4 1 0 0 1 Entering a given \n",
+ "\n",
+ " Opportunity(Default) \n",
+ "0 NaN \n",
+ "1 NaN \n",
+ "2 1 \n",
+ "3 2 \n",
+ "4 1 "
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "pd.set_option('display.max_columns', 500)\n",
+ "data.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "9d5e5859",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " Row Problem View Step Duration (sec) \\\n",
+ "count 2.270384e+06 2.270384e+06 2.267551e+06 \n",
+ "mean 1.513120e+06 1.092910e+00 1.958364e+01 \n",
+ "std 8.736198e+05 3.448857e-01 4.768345e+01 \n",
+ "min 1.000000e+00 1.000000e+00 0.000000e+00 \n",
+ "25% 7.577408e+05 1.000000e+00 3.000000e+00 \n",
+ "50% 1.511844e+06 1.000000e+00 7.000000e+00 \n",
+ "75% 2.269432e+06 1.000000e+00 1.700000e+01 \n",
+ "max 3.025933e+06 1.000000e+01 3.208000e+03 \n",
+ "\n",
+ " Correct Step Duration (sec) Error Step Duration (sec) \\\n",
+ "count 1.751638e+06 515913.000000 \n",
+ "mean 1.171716e+01 46.292087 \n",
+ "std 2.645318e+01 81.817794 \n",
+ "min 0.000000e+00 0.000000 \n",
+ "25% 3.000000e+00 11.000000 \n",
+ "50% 5.000000e+00 22.000000 \n",
+ "75% 1.100000e+01 47.000000 \n",
+ "max 1.204000e+03 3208.000000 \n",
+ "\n",
+ " Correct First Attempt Incorrects Hints Corrects \n",
+ "count 2.270384e+06 2.270384e+06 2.270384e+06 2.270384e+06 \n",
+ "mean 7.722359e-01 4.455044e-01 1.184311e-01 1.062878e+00 \n",
+ "std 4.193897e-01 2.000914e+00 6.199071e-01 6.894285e-01 \n",
+ "min 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 \n",
+ "25% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "50% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "75% 1.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
+ "max 1.000000e+00 3.600000e+02 1.020000e+02 9.200000e+01 "
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "data.describe()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "92cc0aab",
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Fraction of missing values for each column\n",
+ "Row 0.000000\n",
+ "Anon Student Id 0.000000\n",
+ "Problem Hierarchy 0.000000\n",
+ "Problem Name 0.000000\n",
+ "Problem View 0.000000\n",
+ "Step Name 0.000000\n",
+ "Step Start Time 0.001103\n",
+ "First Transaction Time 0.000000\n",
+ "Correct Transaction Time 0.034757\n",
+ "Step End Time 0.000000\n",
+ "Step Duration (sec) 0.001248\n",
+ "Correct Step Duration (sec) 0.228484\n",
+ "Error Step Duration (sec) 0.772764\n",
+ "Correct First Attempt 0.000000\n",
+ "Incorrects 0.000000\n",
+ "Hints 0.000000\n",
+ "Corrects 0.000000\n",
+ "KC(Default) 0.203407\n",
+ "Opportunity(Default) 0.203407\n",
+ "dtype: float64\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"Fraction of missing values for each column\")\n",
+ "print(data.isnull().sum() / len(data))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "0187b3b5",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "the number of records:\n",
+ "2270384\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"the number of records:\")\n",
+ "print(len(data))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "701b6633",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "how many students are there in the table:\n",
+ "1338\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"how many students are there in the table:\")\n",
+ "print(data['Anon Student Id'].nunique())"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "bf7b246f",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "how many problems are there in the table:\n",
+ "91913\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"how many problems are there in the table:\")\n",
+ "print(data['Problem Name'].nunique())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e0602c47",
+ "metadata": {},
+ "source": [
+ "## Sort by Anon Student Id"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "id": "8051cc2b",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart, 'Top 40 students by number of steps they have done'; x-axis: count; y-axis: Anon Student Id]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "ds = data['Anon Student Id'].value_counts().reset_index()\n",
+ "ds.columns = [\n",
+ " 'Anon Student Id',\n",
+ " 'count'\n",
+ "]\n",
+ "ds['Anon Student Id'] = ds['Anon Student Id'].astype(str) + '-'\n",
+ "ds = ds.sort_values('count').tail(40)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " ds,\n",
+ " x='count',\n",
+ " y='Anon Student Id',\n",
+ " orientation='h',\n",
+ " title='Top 40 students by number of steps they have done'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "66ef53ad",
+ "metadata": {},
+ "source": [
+ "## Percent of corrects, hints and incorrects"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "id": "c8f1539c",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: pie chart, 'Percent of corrects, hints and incorrects'; corrects 65.3%, incorrects 27.4%, hints 7.28%]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "count_corrects = data['Corrects'].sum()\n",
+ "count_hints = data['Hints'].sum()\n",
+ "count_incorrects = data['Incorrects'].sum()\n",
+ "\n",
+ "total = count_corrects + count_hints + count_incorrects\n",
+ "\n",
+ "percent_corrects = count_corrects / total\n",
+ "percent_hints = count_hints / total\n",
+ "percent_incorrects = count_incorrects / total\n",
+ "\n",
+ "dfl = [['corrects', percent_corrects], ['hints', percent_hints], ['incorrects', percent_incorrects]]\n",
+ "\n",
+ "df = pd.DataFrame(dfl, columns=['transaction type', 'percent'])\n",
+ "\n",
+ "fig = px.pie(\n",
+ " df,\n",
+ "    names='transaction type',\n",
+ " values='percent',\n",
+ " title='Percent of corrects, hints and incorrects'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3b097141",
+ "metadata": {},
+ "source": [
+ "## Sort by Problem Name"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "id": "6d668c43",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart, 'Top 40 most frequently attempted problems'; x-axis: count; y-axis: Problem Name]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Count how many times each problem was attempted. A new attempt starts when the\n",
+ "# problem name changes or when a step name repeats within the current problem.\n",
+ "problem_names = data['Problem Name'].to_numpy()\n",
+ "step_names = data['Step Name'].to_numpy()\n",
+ "\n",
+ "problem_counts = {problem_names[0]: 1}\n",
+ "current_steps = {step_names[0]}\n",
+ "\n",
+ "for i in range(1, len(data)):\n",
+ "    pb_name, st_name = problem_names[i], step_names[i]\n",
+ "    if pb_name != problem_names[i - 1]:\n",
+ "        # A different problem begins\n",
+ "        current_steps = {st_name}\n",
+ "        problem_counts[pb_name] = problem_counts.get(pb_name, 0) + 1\n",
+ "    elif st_name not in current_steps:\n",
+ "        current_steps.add(st_name)\n",
+ "    else:\n",
+ "        # A repeated step marks a new attempt at the same problem\n",
+ "        current_steps = {st_name}\n",
+ "        problem_counts[pb_name] += 1\n",
+ "\n",
+ "df = pd.DataFrame({\n",
+ "    'Problem Name': list(problem_counts.keys()),\n",
+ "    'count': list(problem_counts.values())\n",
+ "})\n",
+ "df = df.sort_values('count').tail(40)\n",
+ "df[\"Problem Name\"] += '-'\n",
+ "\n",
+ "fig = px.bar(\n",
+ "    df,\n",
+ "    x='count',\n",
+ "    y='Problem Name',\n",
+ "    orientation='h',\n",
+ "    title='Top 40 most frequently attempted problems'\n",
+ ")\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "id": "1b965aa4",
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart, 'Correct rate of each problem (top 20) (total transactions of each problem are required to be more than 500)'; x-axis: Correct rate; y-axis: Problem Name. Labels shown: SYLT-2X&YGE-2X+9- | ROOTS1-001- | SY=2X&Y=-3X+5- | PROBABILITY1-006- | EG-RE-DISTRIB-05 (2x+4)/(3x+6)- | PROBABILITY1-070- | G3X-YLE5&3X-YGE15- | BUSES- | PEANUTS-CASHEWS- | EXPONENT2-012- | TVS3- | PROBABILITY5-001- | EXPONENT3-071- | EXPONENT2-046- | EXPONENT5-001- | PROBABILITY2-001- | GLFM-BUSES- | EG-EPS01-CONSTANT (4(x^2)*y^4)(3(x^3)*y^3)- | PROBABILITY6-002- | PROBABILITY6-001-]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart, 'Correct rate of each problem (bottom 20) (total transactions of each problem are required to be more than 500)'; x-axis: Correct rate; y-axis: Problem Name. Labels shown: RATIONAL1-091- | RATIONAL1-165- | RATIONAL1-034- | RATIONAL1-058- | RATIONAL1-075- | RATIONAL1-035- | RATIONAL1-281- | RATIONAL1-261- | RATIONAL1-177- | RATIONAL1-064- | RATIONAL1-147- | RATIONAL1-021- | RATIONAL1-121- | RATIONAL1-008- | RATIONAL1-288- | RATIONAL1-109- | BH1T31B- | RXMX_3C- | EG-RE-CANCEL-04 ((x-9)^3)/(x-9)^4- | RATIONAL2-205-]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "data['total transactions'] = data['Incorrects'] + data['Hints'] + data['Corrects']\n",
+ "# Sum total transactions and correct attempts per problem in a single groupby\n",
+ "df1 = data.groupby('Problem Name')[['total transactions', 'Corrects']].sum().reset_index()\n",
+ "df1['Correct rate'] = df1['Corrects'] / df1['total transactions']\n",
+ "\n",
+ "# Keep only problems with more than `standard` total transactions\n",
+ "standard = 500\n",
+ "df1 = df1[df1['total transactions'] > standard]\n",
+ "\n",
+ "df1 = df1.sort_values('Correct rate')\n",
+ "\n",
+ "df1['Problem Name'] = df1['Problem Name'].astype(str) + \"-\"\n",
+ "\n",
+ "df_px = df1.tail(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " df_px,\n",
+ " x='Correct rate',\n",
+ " y='Problem Name',\n",
+ " orientation='h',\n",
+ " title='Correct rate of each problem (top 20) (total transactions of \\\n",
+ "each problem are required to be more than 500)',\n",
+ " text='Problem Name'\n",
+ ")\n",
+ "fig.update_layout(title_font_size=10)\n",
+ "fig.show(\"svg\")\n",
+ "\n",
+ "df_px = df1.head(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " df_px,\n",
+ " x='Correct rate',\n",
+ " y='Problem Name',\n",
+ " orientation='h',\n",
+ " title='Correct rate of each problem (bottom 20) (total transactions of \\\n",
+ "each problem are required to be more than 500)',\n",
+ " text='Problem Name'\n",
+ ")\n",
+ "fig.update_layout(title_font_size=10)\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2bee6e04",
+ "metadata": {},
+ "source": [
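+ "These two figures show the 20 problems with the highest and the 20 with the lowest correct rate (corrects divided by total transactions), among problems with more than 500 total transactions. Problems with a low correct rate deserve more attention from teachers and students."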
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0047e839",
+ "metadata": {},
+ "source": [
+ "## Sort by KC"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "c6e910e0",
+ "metadata": {
+ "scrolled": false
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart, 'Correct rate of each KC(Default) (top 20) (total transactions of each KC are required to be more than 300)'; x-axis: correct rate; y-axis hidden. Labels shown: [Rule: RF right ([SolverOperation rf],{components with property canReduceFractionsNoMultWhole of right})]- | Enter second extreme in equation- | Proportional-Constant-Expression-Gla:Student-Modeling-Analysis- | Enter Calculated value of rate- | [SkillRule: Combine like terms, no var; CLT]- | Enter square of leg label- | Compare medians - removed outlier- | Enter number of total outcomes in table- | Find square of given leg- | Enter ratio quantity to right of \"to\"- | Enter fractional probability of event- | [SkillRule: Select Multiply; {MT; MT no fraction coeff}]- | Write base of exponential from given whole number as product- | Write decimal multiplier from given scientific notation- | Enter ratio quantity to right of colon- | PM-ROW-1- | Changing axis intervals- | unspecified- | [Rule: unnec-elems ([SolverOperation unnec-elems],)]- | Select second event-]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "image/svg+xml": [
+ "[Figure: horizontal bar chart, 'Correct rate of each KC(Default) (bottom 20) (total transactions of each KC are required to be more than 300)'; x-axis: correct rate; y-axis hidden. Labels shown: unknown bug element~~CLT-ROW-1-COEFF- | unknown bug element~~CLT-ROW-1- | Plot terminating improper fractions- | Plot decimal - thousandths- | Plot terminating mixed number- | Plot imperfect radical- | Plot decimal - hundredths- | Setting the slope- | Plot terminating proper fraction- | Plot percent- | Plot non-terminating proper fraction- | Plot non-terminating improper fraction- | Entering slope, GLF- | Placing coordinate point- | Plot decimal - tenths- | Finding the intersection, SIF- | Finding the intersection, Mixed- | [Rule: CLT nested, no LCD ([SolverOperation clt],)]-]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "data.dropna(subset=['KC(Default)'], inplace=True)\n",
+ "\n",
+ "data['total transactions'] = data['Corrects'] + data['Hints'] + data['Incorrects']\n",
+ "# Sum total transactions and correct attempts per KC in a single groupby\n",
+ "df1 = data.groupby('KC(Default)')[['total transactions', 'Corrects']].sum().reset_index()\n",
+ "df1['correct rate'] = df1['Corrects'] / df1['total transactions']\n",
+ "\n",
+ "# Keep only KCs with more than `standard` total transactions\n",
+ "standard = 300\n",
+ "df1 = df1[df1['total transactions'] > standard]\n",
+ "\n",
+ "df1 = df1.sort_values('correct rate')\n",
+ "\n",
+ "df1['KC(Default)'] = df1['KC(Default)'].astype(str) + '-'\n",
+ "\n",
+ "df_px = df1.tail(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " df_px,\n",
+ " x='correct rate',\n",
+ " y='KC(Default)',\n",
+ " orientation='h',\n",
+ " title='Correct rate of each KC(Default) (top 20) (total transactions of \\\n",
+ "each KC are required to be more than 300)',\n",
+ " text='KC(Default)'\n",
+ ")\n",
+ "fig.update_yaxes(visible=False)\n",
+ "fig.update_layout(title_font_size=10)\n",
+ "fig.show(\"svg\")\n",
+ "\n",
+ "df_px = df1.head(20)\n",
+ "\n",
+ "fig = px.bar(\n",
+ " df_px,\n",
+ " x='correct rate',\n",
+ " y='KC(Default)',\n",
+ " orientation='h',\n",
+ " title='Correct rate of each KC(Default) (bottom 20) (total transactions of \\\n",
+ "each KC are required to be more than 300)',\n",
+ " text='KC(Default)'\n",
+ ")\n",
+ "fig.update_yaxes(visible=False)\n",
+ "fig.update_layout(title_font_size=10)\n",
+ "fig.show(\"svg\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0feef8a1",
+ "metadata": {},
+ "source": [
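+ "These two figures show the KCs with the highest and the lowest correct rate (corrects divided by total transactions), among KCs with more than 300 total transactions. KCs with a low correct rate deserve more attention from teachers and students."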
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "22d99527",
+ "metadata": {},
+ "source": [
+ "## Postscript"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "09bc0903",
+ "metadata": {},
+ "source": [
+ "The whole data package is composed of 5 data sets, and the data files in these sets that can be used for data analysis share the same format. The analysis above, based on \"algebra_2006_2007_train\", is therefore only an example of data analysis on KDD Cup 2010: the same code can be used to analyse the other data files after small changes to the file path and column names.\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.9.6"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
From e278a839f575b2b5f2738c35affbec41089fcf69 Mon Sep 17 00:00:00 2001
From: gguu1314 <88154940+gguu1314@users.noreply.github.com>
Date: Sat, 14 Aug 2021 22:41:05 +0800
Subject: [PATCH 10/10] Update AUTHORS.md
---
AUTHORS.md | 2 ++
1 file changed, 2 insertions(+)
diff --git a/AUTHORS.md b/AUTHORS.md
index 7bd89a0..0006a5d 100644
--- a/AUTHORS.md
+++ b/AUTHORS.md
@@ -22,5 +22,7 @@
[Weizhe Huang](https://github.com/weizhehuang0827)
+[Haoxiang Guan](https://github.com/gguu1314)
+
The stared contributors are the main authors.