#######################################################################################
# README FILE
# Introduction to Machine Learning
# Assignment 1
#
# After completing the assignment, fill in this README file. It asks you to duplicate
# your answers to several questions. Be certain that your answers in this README file
# (which will be used for auto-grading) match the answers in your PDF writeup.
#
# This README file is formatted in YAML to allow it to be machine readable. Please
# be very careful with how you edit it, taking care to follow the formatting.
# Make certain not to change any text in all UPPERCASE. After editing, you can make
# certain that your README file follows proper YAML syntax by running it through an
# online YAML checker, such as http://yaml-online-parser.appspot.com/
#
# The most frequent error is multi-line strings, such as "SOURCESCONSULTED" and
# "FEEDBACK_ERRORS". Make certain that the previous line ends with a | followed by a
# newline. Then, make certain that the subsequent lines are indented one level in
# (4 spaces). This sounds complex, but just follow the existing format of this file.
# If you run into any problems that you can't fix easily in listing your sources or
# providing feedback on the assignment, just include your multi-line string answers to
# these parts as comments. Everything else (all simple one-word or numeric answers) must
# be properly formatted YAML.
#######################################################################################
# Personal information
FIRSTNAME: Eric
LASTNAME: Eaton
PENNKEY: eeaton
PENNID: 123456
# Which course are you enrolled in? (enter 419 or 519)
COURSE: 419
# List all sources of help that you consulted while completing this assignment
# (other students, colleagues, textbooks, websites, etc.). This includes anyone you
# briefly discussed the homework with. If you received help from the following sources,
# you do not need to cite it: course instructor, course teaching assistants, course
# lecture notes, course textbooks or other readings.
#
# If you didn't receive help from anyone, write "none".
SOURCESCONSULTED: |
    While completing the assignment, I consulted the following sources:
    - scikit-learn.org (documentation on DecisionTreeClassifier)
    - scipy.org (documentation on np.vstack and np.std)
    - docs.python.org (documentation on lists, dictionaries, and classes)
    - Andrew Ng’s CS229 lecture notes on matrix derivatives
    - Bayesian Reasoning and Machine Learning by David Barber (linear algebra section)
#######################################################################################
# Answers to Problem 1: Decision Tree Learning
#
# Please list your final answers below for auto-grading.
#######################################################################################
# Problem 1a: information gain for the outlook attribute (only your final answer)
DT_INFO_GAIN_OUTLOOK: 0.24675
# Problem 1a: information gain for the humidity attribute (only your final answer)
DT_INFO_GAIN_HUMIDITY: 0.04533
# Problem 1b: gain ratio for the outlook attribute (only your final answer)
DT_GAIN_RATIO_OUTLOOK: 0.15643
# Problem 1b: gain ratio for the humidity attribute (only your final answer)
DT_GAIN_RATIO_HUMIDITY: 0.04821
#######################################################################################
# Answers to Implementation Exercise 1.3: Decision Trees
#
# Please list your final answers below for auto-grading.
#######################################################################################
# What was the mean of the unpruned decision tree's accuracies? (only your final answer)
SHOWDOWN_DT_TEST_ACCURACY_MEAN: 0.7556
# What was the standard deviation of the unpruned decision tree's accuracies? (only your final answer)
SHOWDOWN_DT_TEST_ACCURACY_STDDEV: 0.08651
# What was the mean of the decision stump's accuracies? (only your final answer)
SHOWDOWN_DSTUMP_TEST_ACCURACY_MEAN: 0.8056
# What was the standard deviation of the decision stump's accuracies? (only your final answer)
SHOWDOWN_DSTUMP_TEST_ACCURACY_STDDEV: 0.0792
# What was the mean of the 3-level decision tree's accuracies? (only your final answer)
SHOWDOWN_DT3_TEST_ACCURACY_MEAN: 0.7744
# What was the standard deviation of the 3-level decision tree's accuracies? (only your final answer)
SHOWDOWN_DT3_TEST_ACCURACY_STDDEV: 0.08589
#######################################################################################
# Answer to Implementation Exercise 2.5: Closed form of linear regression
#######################################################################################
# Copy and paste your one-line matrix equation for computing the closed form solution
# to linear regression (e.g., Line 171 of test_linreg_univariate.py or Line 51 of
# test_linreg_multivariate.py)
LINREGIMPLEMENTATION_CLOSEDFORM: thetaClosedForm = np.dot(pinv(np.dot(X.T,X)),np.dot(X.T,y))
#######################################################################################
# Feedback on the Assignment
#
# The following information will help us improve future versions of this assignment.
# It is completely optional, but highly appreciated. Please be honest.
#######################################################################################
# Approximately how many hours did it take you to complete this assignment?
FEEDBACK_NUM_HOURS: 40
# Please list any typos / errors you noticed in the assignment description or skeleton code
FEEDBACK_ERRORS: |
    None
# Please describe any problems you encountered while completing this assignment
FEEDBACK_PROBLEMS: |
    I will admit that I do not come from a strong CS background. I am aware that Professor Eaton mentioned on the first day of class that having taken CIS121 was almost essential for taking this course. Having now done this first problem set, I understand what he meant. Most of the 40 hours I spent on this assignment went to researching and reading up on computer science basics, especially data structures. I feel like I got lucky on this assignment because the Python documentation is very thorough, so I was able to understand the basic concepts quickly. I know that if I want to make it through this class, I will have to put in an extraordinary amount of effort. Luckily, I enjoy the material and the lectures, so it’s all quite fun! I also want to apologize for any sloppily written code… I promise I will get better!