NYU CS-UY 4563
Introduction to Machine Learning
A broad introduction to the exciting field of machine learning through a mixture of hands-on experience and theoretical foundations.
Course Team:
Administrative Information
Lectures: Mon./Wed. 9:00-10:20am. Zoom link on NYU Classes.
Syllabus: Available here. Please review carefully!
Zige's Office hours: Thurs., 11am-11pm, Permanent Zoom link.
Prathamesh's Office hours: Thurs., 11am-1pm, Permanent Zoom link.
Raphael's Office hours: Tues., 3-5pm, Permanent Zoom link.
Professor Office hours: Wed., 1-3pm, Permanent Zoom link.
Git repository: All assignments and labs should be downloaded from our GitHub repository. To stay organized, we suggest students access material from this repository using a git client. Instructions can be found here.
Piazza: Course announcements will be made over Piazza, so please create an account and join our site. Unless asked in person, all questions should be posted to Piazza (not sent via email). We prefer that questions about lectures or homework be asked publicly, since they will often help your classmates, but Piazza supports private questions for things relevant only to you.
Homework: Homework (both written problems and coding labs) must be turned in to NYU Classes by the specified due date. Labs should be turned in as evaluated Jupyter notebooks. Do not clear the output before turning in. While not required, I encourage students to prepare problem sets in LaTeX or Markdown (with math support). You can use this template for LaTeX. While there is a learning curve for LaTeX (less for Markdown), it typically saves students time by the end of the semester! If you write problems by hand, please scan and upload them as a PDF (using a scanner app on your phone is fine).
For homework problems, collaboration is allowed, but solutions and any code must be written independently. Students must list collaborators on their problem sets (for each problem separately). See the syllabus for full details.
Course Summary
Coursework: Weekly programming labs and written problem sets (30% of grade). Two in-class midterm exams on March 9th and April 20th (20% of grade each). Final open-ended project with a partner (20% of grade). Class participation is the remaining 10% of the grade. Please consult the formal syllabus for more information. Different students contribute in different ways, so you don't need to be vocal in class to get a good class participation grade. We also count participation in office hours, on Piazza, etc.
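As a rough illustration of how the weights above combine, here is a minimal Python sketch. The weights come from the Coursework paragraph; the function name, the 0-100 score scale, and the example scores are assumptions made for illustration, and the formal syllabus remains the authority on how grades are actually computed.

```python
# Weights from the Coursework paragraph (they sum to 1.0).
WEIGHTS = {
    "homework": 0.30,       # weekly labs and written problem sets
    "midterm_1": 0.20,
    "midterm_2": 0.20,
    "project": 0.20,
    "participation": 0.10,
}


def course_grade(scores):
    """Return the weighted average of component scores.

    `scores` maps each component name in WEIGHTS to a score on an
    assumed 0-100 scale (the scale is an illustration, not course policy).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, purely for illustration.
example = {
    "homework": 90,
    "midterm_1": 80,
    "midterm_2": 85,
    "project": 95,
    "participation": 100,
}
print(round(course_grade(example), 2))
```

Because homework carries 30% of the weight, a 10-point change in the homework average moves the overall grade by 3 points under this scheme.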
Project: Please refer to the information sheet for requirements and deadlines.
Python and Jupyter: Demos and labs in this class use Python, run through Jupyter notebooks. Jupyter lets you create and edit documents with live Python code and rich comments and images. You can install Python and Jupyter notebooks on any personal computer, or run them in the cloud via Google Colaboratory. Instructions can be found here.
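After installing, a quick sanity check like the sketch below can confirm that Python can find the packages you expect. The package list is an assumption made for illustration (the linked instructions define what the course actually requires), and `missing_packages` is a hypothetical helper name.

```python
import importlib.util
import sys

# Assumed package list for illustration only; follow the course
# setup instructions for the real requirements.
ASSUMED_PACKAGES = ["notebook", "numpy", "matplotlib"]


def missing_packages(names):
    """Return the top-level package names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python version:", sys.version.split()[0])
missing = missing_packages(ASSUMED_PACKAGES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

The same cell can be run in Google Colaboratory to see which of the listed packages are already available there.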
Prerequisites: Modern machine learning uses a lot of math! Probably more than any other subject in computer science that you will study as an undergraduate. You can get pretty far with an understanding of just probability and linear algebra, but that understanding needs to be solid for you to succeed in this course. Formally we require a prior course in Algorithms and Data Structures, a course in Probability or Statistics, and a course in Linear Algebra.
Resources: There is no textbook to purchase. I may post readings, some of which will come from the following books which are available as free PDFs:
 An Introduction to Statistical Learning by James, Witten, Hastie, and Tibshirani.
 Python Machine Learning by Raschka.
Lecture #  Topic  Reading  Homework 

Regression and Function Fitting  
1. 1/27  Introduction to Machine Learning 


2. 1/29  Simple Linear Regression, Loss Functions 


3. 2/3  Multiple Linear Regression, Data Transformations 


4. 2/5  Finish Multiple Linear Regression, Model Selection, Cross Validation 


5. 2/10  Model selection, regularization. 


Classification  
6. 2/12  Naive Bayes, the Bayesian Perspective  
2/17  Presidents Day, No Class  
7. 2/19  The Bayesian Perspective cont. 


8. 2/24  Bayesian Regression, Linear Classifiers 


9. 2/26  Logistic Regression 


10. 3/2  Optimization, Gradient Descent 


11. 3/4  Finish Gradient Descent, Midterm Review  
3/9  Midterm 1 

Will cover lectures 1-10. Two-sided cheat sheet allowed. 
12. 3/11  k-Nearest Neighbors, Kernel Methods  
Beyond Linear Methods: Neural Networks  
3/16  Spring break, no class.  
3/18  Spring break, no class.  
13. 3/23  Kernel Methods 


3/25  CANCELED  
14. 3/30  Support Vector Machines 


15. 4/1  Neural Networks 1: Introduction 


16. 4/6  Neural Networks 2: History + Backprop  
17. 4/8  Neural Networks 3: Finish Backprop, Stochastic Gradient Descent 


18. 4/13  Convolution, Feature Extraction, Edge Detection, Feature Transfer 


19. 4/15  Convolutional networks, deep learning 


Unsupervised Learning  
4/20  Autoencoders, Dimensionality Reduction  
20. 4/22  Autoencoder applications, Principal Component Analysis 


21. 4/27  Finish PCA  
22. 4/29  Semantic Embeddings, Clustering  
Selected Topics  
23. 5/4  Introduction to Reinforcement Learning  
5/6  Project Presentations  
5/11  Project Presentations 
