CME 250: Introduction to Machine Learning

Winter 2019

Mon, Wed 4:30-5:50pm
Bishop Auditorium

Course Description: A four-week short course presenting the principles behind when, why, and how to apply modern machine learning algorithms. We will discuss a framework for reasoning about when to apply various machine learning techniques, emphasizing questions of overfitting/underfitting, regularization, interpretability, supervised/unsupervised methods, and handling of missing data. The principles behind various algorithms (the why and how of using them) will be discussed, while some of the mathematical detail underlying the algorithms, including proofs, will not. Unsupervised machine learning algorithms presented will include k-means clustering, principal component analysis (PCA), and independent component analysis (ICA). Supervised machine learning algorithms presented will include support vector machines (SVM), neural nets, classification and regression trees (CART), boosting, bagging, and random forests. Imputation, the lasso, and cross-validation concepts will also be covered.

Announcements

  • Feb 15: Homework 4 has been posted and will be due Friday, March 1 at 5pm.
  • Feb 6: Homework 3 has been posted and will be due Friday, February 15 at 5pm.
  • Jan 23: There will be no lecture next week! Lectures resume Feb 4. Homework 2 has been posted and will be due Friday, February 1 at 5pm.
  • Jan 16: Homework 1 has been posted and will be due Friday, January 25 at 5pm.
  • Jan 7: Welcome to CME 250! The first lecture will be next Monday, January 14, at 4:30pm in Bishop Auditorium.

Course Info

Instructor
  • Sherrie Wang
  • Email: sherwang [at] stanford [dot] edu (Please post questions on course content to Piazza)
  • Office Hours: Tue 6-7pm Y2E2 362
Requirements
  • Course is 1 unit, graded Satisfactory / No Credit.
  • To receive credit, students must complete the 4 homework assignments at a passing level (70+%).
Prerequisites

The course assumes no prior background in machine learning. Previous exposure to undergraduate-level mathematics (calculus, linear algebra, statistics) and basic programming (R/Matlab/Python) is helpful.


Schedule

Week | Monday | Wednesday | Assignment
1 | No lecture | No lecture | -
2 | Lecture 1: Overview of Machine Learning | Lecture 2: Linear and Logistic Regression | -
3 | No lecture (MLK Day) | Lecture 3: Regularization and Sparsity | HW1 due Friday 5:00pm
4 | No lecture | No lecture | HW2 due Friday 5:00pm
5 | Lecture 4: Cross-validation and Imputation | Lecture 5: Support Vector Machines | -
6 | Lecture 6: Classification and Regression Trees (CART) | Lecture 7: Unsupervised Methods | HW3 due Friday 5:00pm
7 | No lecture (President's Day) | Lecture 8: Neural Networks | -
8 | No lecture | No lecture | HW4 due Friday 5:00pm


Homework

Homework assignments will appear here as they are assigned. Solutions will be posted after the due date.

Submission

Assignments will be submitted through Google Forms and Gradescope. Part 1 of each assignment is a set of conceptual multiple-choice questions submitted via Google Forms. Parts 2 and 3 contain exercises that will be submitted via Gradescope. You should have received an invite to Gradescope for CME 250. Log in via the invite and submit assignments on time. If you have not received an invite, please email me.

Collaboration Policy and Honor Code

You are encouraged to form study groups and discuss homework with other students. Submitted homework assignments must, however, be written and coded from scratch independently, and must not refer to notes from group discussion. In other words, each student should understand the solutions deeply enough to reproduce them on their own. If homework exercises from the reference text have solutions available online or via other sources, do not copy or refer to them when doing the assignments for this course.


References

Textbooks
  • Main reference text: An Introduction to Statistical Learning with Applications in R by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (pdf freely available online). This book is a shorter and simpler version of the comprehensive reference listed below.
  • Comprehensive reference: The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. This comprehensive reference presents more material, and at a higher mathematical level, than the preceding text. The full pdf is freely available from the authors at the above link.
Websites
  • Course Piazza for discussion on methods covered, homework assignments, and course logistics.
  • Winter 2016 iteration of this course, covering similar material using different assignments.