NYU CS-UY 4563
Introduction to Machine Learning
A broad introduction to the exciting field of machine learning through a mixture of hands-on experience and theoretical foundations.
Course Team:
Administrative Information
Lectures: Mon./Wed. 9:00-10:20am. Zoom link on NYU Classes.
Syllabus: Available here. Please review carefully!
Zige's Office hours: Thurs., 11am-11pm, Permanent Zoom link.
Prathamesh's Office hours: Thurs., 11am-1pm, Permanent Zoom link.
Raphael's Office hours: Tues., 3-5pm, Permanent Zoom link.
Professor Office hours: Wed., 1-3pm, Permanent Zoom link.
Git repository: All assignments and labs should be downloaded from our GitHub repository. To stay organized, we suggest students access material from this repository using a git client. Instructions can be found here.
Piazza: Course announcements will be made over Piazza, so please create an account and join our site. Unless asked in person, all questions should be posted to Piazza (not sent via email). We prefer that questions about lectures or homework be asked publicly, since they will often help your classmates, but Piazza supports private questions for things relevant only to you.
Homework: Homework (both written problems and coding labs) must be turned in to NYU Classes by the specified due date. Labs should be turned in as evaluated Jupyter notebooks. Do not clear the output before turning in. While not required, I encourage students to prepare problem sets in LaTeX or Markdown (with math support). You can use this template for LaTeX. While there is a learning curve for LaTeX (less for Markdown), it typically saves students time by the end of the semester! If you write problems by hand, please scan and upload them as a PDF (using a scanner app on your phone is fine).
For homework problems, collaboration is allowed, but solutions and any code must be written independently. Students must list collaborators on their problem sets (for each problem separately). See the syllabus for full details.
Course Summary
Coursework: Weekly programming labs and written problem sets (30% of grade). Two in-class midterm exams on March 9th and April 20th (20% of grade each). Final open-ended project with a partner (20% of grade). Class participation is the remaining 10% of the grade. Please consult the formal syllabus for more information. Different students contribute in different ways, so you don't need to be vocal in class to get a good class participation grade. We also count participation in office hours, on Piazza, etc.
Project: Please refer to the information sheet for requirements and deadlines.
Python and Jupyter: Demos and labs in this class use Python, run through Jupyter notebooks. Jupyter lets you create and edit documents with live Python code and rich comments and images. You can install Python and Jupyter notebooks on any personal computer, or run them in the cloud via Google Colaboratory. Instructions can be found here.
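After a local install, a quick sanity check like the sketch below can confirm that Python can find the packages you expect. The package names listed (`notebook`, `numpy`, `matplotlib`) are assumptions chosen for illustration, not an official list of course requirements; adjust them to whatever the setup instructions specify.

```python
import importlib.util
import sys


def check_environment(packages=("notebook", "numpy", "matplotlib")):
    """Return a dict mapping each package name to True if Python can locate it.

    The default package names are illustrative assumptions, not course requirements.
    """
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # Report the interpreter version and which packages were found.
    print("Python version:", sys.version.split()[0])
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Running this in a terminal or in a notebook cell prints one line per package. Anything reported as missing can usually be installed with your package manager before the first lab.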
Prerequisites: Modern machine learning uses a lot of math, probably more than any other subject in computer science that you will study as an undergraduate. You can get pretty far with an understanding of just probability and linear algebra, but that understanding needs to be solid for you to succeed in this course. Formally, we require a prior course in Algorithms and Data Structures, a course in Probability or Statistics, and a course in Linear Algebra.
Resources: There is no textbook to purchase. I may post readings, some of which will come from the following books which are available as free PDFs:
- An Introduction to Statistical Learning by James, Witten, Hastie, and Tibshirani.
- Python Machine Learning by Raschka.
Lecture # | Topic | Reading | Homework |
---|---|---|---|
Regression and Function Fitting | | | |
1. 1/27 | Introduction to Machine Learning | | |
2. 1/29 | Simple Linear Regression, Loss Functions | | |
3. 2/3 | Multiple Linear Regression, Data Transformations | | |
4. 2/5 | Finish Multiple Linear Regression, Model Selection, Cross Validation | | |
5. 2/10 | Model selection, regularization. | | |
Classification | | | |
6. 2/12 | Naive Bayes, the Bayesian Perspective | | |
2/17 | Presidents Day, No Class | | |
7. 2/19 | The Bayesian Perspective cont. | | |
8. 2/24 | Bayesian Regression, Linear Classifiers | | |
9. 2/26 | Logistic Regression | | |
10. 3/2 | Optimization, Gradient Descent | | |
11. 3/4 | Finish Gradient Descent, Midterm Review | | |
3/9 | Midterm 1 | | Will cover lectures 1 - 10. Two-sided cheat-sheet allowed. |
12. 3/11 | k-Nearest Neighbors, Kernel Methods | | |
Beyond Linear Methods: Neural Networks | | | |
3/16 | Spring break, no class. | | |
3/18 | Spring break, no class. | | |
13. 3/23 | Kernel Methods | | |
3/25 | CANCELED | | |
14. 3/30 | Support Vector Machines | | |
15. 4/1 | Neural Networks 1: Introduction | | |
16. 4/6 | Neural Networks 2: History + Backprop | | |
17. 4/8 | Neural Networks 3: Finish Backprop, Stochastic Gradient Descent | | |
18. 4/13 | Convolution, Feature Extraction, Edge Detection, Feature Transfer | | |
19. 4/15 | Convolutional networks, deep learning | | |
Unsupervised Learning | | | |
4/20 | Auto-encoders, Dimensionality Reduction | | |
20. 4/22 | Auto-encoder applications, Principal Component Analysis | | |
21. 4/27 | Finish PCA | | |
22. 4/29 | Semantic Embeddings, Clustering | | |
Selected Topics | | | |
23. 5/4 | Introduction to Reinforcement Learning | | |
5/6 | Project Presentations | | |
5/11 | Project Presentations | | |