NYU CSGY 6923
Machine Learning
A broad introduction to the exciting field of machine learning through a mixture of hands-on experience and theoretical foundations.
Course Team:
Lectures: 215 Rogers Hall. Virtually via Zoom (links on Brightspace).
Professor office hours: Weekly on Mondays 11am-1pm. Zoom link.
Thomas office hours: Weekly on Wednesdays 12-1pm. Zoom link.
Siddharth office hours: Weekly on Tuesdays 3-4pm. Zoom link.
Ozlem office hours: Weekly on Tuesdays 12-2pm. Zoom link.
Syllabus: here.
Grading breakdown: Written Problem Sets 25%, Programming Labs (including mini-project) 25%, Midterm 20%, Final Exam 20%, Participation 10%.
Ed Stem: All course communication will be via Ed, so please create an account and join our site. All questions should also be posted to Ed (not sent via email). We prefer that questions about lectures or homework be asked publicly, since they will often help your classmates, but Ed supports private questions for things relevant only to you.
Python and Jupyter: Demos and labs in this class use Python, run through Jupyter notebooks. Jupyter lets you create and edit documents with live Python code and rich comments and images. We suggest that students run their Jupyter notebooks via Google Colaboratory, and we will share them via Colab. You also have the option of installing and running everything on your personal computer. Instructions can be found here.
Prerequisites: Modern machine learning uses a lot of math! Probably more than any other subject in computer science outside theoretical computer science. You can get pretty far with an understanding of just calculus, probability, and linear algebra, but that understanding needs to be solid for you to succeed in this course. Formally we require a prior course in probability or statistics. If you need to freshen up on linear algebra, this quick reference from Stanford is helpful.
Homework: Homework (both written problems and coding labs) must be turned in to Gradescope by the specified deadline. Use the code P5D5BP to join the class on Gradescope. We do not accept late work without prior permission.
Labs should be turned in as evaluated Jupyter notebooks. Do not clear the output before turning in.
While not required, I encourage students to prepare written problem sets in LaTeX or Markdown (with math support).
You can use this template for LaTeX. While there is a learning curve, these tools typically save students time in the end! If you do write problems by hand, scan and upload as a PDF.
Discussion is allowed on homework, but solutions and code must be written independently. See the syllabus for details. We have a zero-tolerance policy for copied code or solutions: any students with duplicate or very similar material will receive a zero on the offending assignment. My advice is to never share code or solutions with other students.
Resources: There is no textbook to purchase. I may post readings, some of which will come from the following book, which is available free online via the NYU library:
 An Introduction to Statistical Learning by James, Witten, Hastie, and Tibshirani.
Final Project: See guidelines for the final project here.
Lecture #  Topic  Reading  Homework 

Regression and Function Fitting  
1. 1/27  Introduction to Machine Learning, Simple Linear Regression, Loss Functions 


2. 2/3  Multiple Linear Regression, Data Transformations, Model Selection, Regularization 


3. 2/10  Finish model selection, Regularization, Start Bayesian Perspective 


4. 2/17  Naive Bayes, the Bayesian Perspective 


Classification  
5. 2/24  Linear Logistic Regression, Optimization, Gradient Descent 


6. 3/3  Optimization, Gradient Descent, Stochastic Gradient Descent 


7. 3/10  Midterm Exam (first half of class). Learning Theory, the PAC model 


3/17  Spring break, no class.  
Beyond Linear Methods  
8. 3/24  k-Nearest Neighbors, Kernel Methods 


9. 3/31  Support Vector Machines, Neural Networks 1: Introduction, History 


10. 4/7  Neural Networks 2: Backpropagation, Convolution 


11. 4/14  Convolution, Feature Extraction, Transfer Learning 


Unsupervised Learning  
12. 4/21  Autoencoders, Principal Component Analysis 


13. 4/28  Semantic Embeddings, Beyond Autoencoders  
Selected Topics  
14. 5/5  Introduction to Reinforcement Learning 

