NYU CS-GY 9223D (3943B)
Algorithmic Machine Learning
and Data Science
Advanced theory course exploring contemporary algorithmic techniques and recent research on computational methods that enable machine learning and data science at scale.
Course Team:
Administrative Information
Lectures: 370 Jay St., Room 1201 and via Zoom (link on NYU Classes).
Lecture component: Wed. 11:00am-12:15pm.
Flipped component: Wed. 12:30pm-1:30pm.
Recorded component: Posted Thurs. by EOD.
Syllabus: here. Please see for information on COVID-19 changes.
Final project guidelines: here.
Professor Office hours: Thurs. 10am-12pm, Permanent Zoom link.
Raphael's Office hours (undergrads only): Mon. 12pm-2pm, Permanent Zoom link.
Paper Reading Group: TBA, Permanent Zoom link.
Piazza: Signup Link.
Student-run Slack: Link.
Quizzes: Weekly check-in quizzes will be administered via Google Forms. The link will be posted on this site. Each quiz must be completed by 11:00am ET on the Wed. after it is posted.
Homework: Homework must be turned in to NYU Classes by the specified due date. While not required, I encourage students to prepare problem sets in LaTeX or Markdown (with math support). You can use this template for LaTeX. While there is a learning curve for LaTeX (less for Markdown), it typically saves students time by the end of the semester! If you write problems by hand, please scan and upload as a PDF.
Collaboration is allowed on homework, but solutions and any code must be written up independently. Writing should not be done in parallel with others. Students must list collaborators on their problem sets (for each problem separately). See the syllabus for full details.
Course Summary
Prerequisites: This course is mathematically rigorous and is intended for graduate students and advanced undergraduates. Formally, we require previous courses in machine learning, algorithm design and analysis, and linear algebra. Experience with probability and random variables is necessary. See the syllabus for more details and email Prof. Musco if you have questions about your preparation for the course!
Coursework: One meeting per week. Short weekly quiz due before next class (10% of grade). Problem sets every two weeks involving analysis and application of algorithmic methods learned in class, with some programming exercises (40% of grade). At-home midterm exam (15% of grade). Final project to be completed in groups of two (25% of grade). Class participation is the remaining 10% of the grade. Please consult the formal syllabus for more information.
Resources: There is no textbook to purchase. Course material will consist of my written lecture notes, as well as assorted online resources, including papers, notes from other courses, and publicly available surveys. Please refer to the course webpage before and after lectures to keep up-to-date as new resources are posted.
Optional Reading Group: It's an exciting time for research at the intersection of algorithm design and the data sciences. Most of the topics covered in this course are still the subject of active research. We will hold a reading group where students will present and discuss recent papers (time TBA). This is a great opportunity to learn extra material, learn to read papers, and find a project topic.
Week #  Topic  Reading  Homework  

The Power of Randomness  
1. 9/2 
In class: Concentration of random variables, applications to hashing
Supplemental: Load balancing, the union bound. Link.



9/9  No Class, Monday Schedule for Labor Day  
2. 9/16 
In class: Sketching and streaming algorithms, MinHash.
Supplemental: Exponential tail bounds (Chernoff + Bernstein) 



3. 9/23 
In class: High-dimensional geometry
Supplemental: The Johnson-Lindenstrauss lemma + applications



4. 9/30 
In class: Randomized near neighbor search
Supplemental: Analyzing locality sensitive hash functions


First-Order Optimization  
5. 10/7 
In class: Analyzing gradient descent for convex and nonconvex problems
Supplemental: Stochastic and online gradient descent

6. 10/14 
In class: Smoothness, strong convexity, conditioning, preconditioning.
Supplemental: Acceleration, coordinate descent

7. 10/21 
In class: Learning from experts, multiplicative weights
Supplemental:

8. 10/28 
In class: Constrained optimization, linear programming
Supplemental: LP relaxations

Spectral Methods and Linear Algebra  
9. 11/04 
In class: Singular value decomposition, Power Method
Supplemental: Krylov subspace methods and a taste of approximation theory

10. 11/11 
In class: Spectral graph theory and spectral clustering
Supplemental: Generative models for networks, stochastic block model

11. 11/18 
In class: Randomized numerical linear algebra, sketching for linear regression, ε-net arguments
Supplemental: Fast and Sparse Johnson-Lindenstrauss methods

12. 11/25 
In class: Matrix functions and their computation
Supplemental: Spectral Density Estimation

Fourier Methods  
13. 12/2 
In class: Compressed sensing, the restricted isometry property, basis pursuit
Supplemental: Iterative methods for faster sparse recovery

14. 12/9 
In class: Kernel methods in machine learning, Bochner's theorem, random Fourier features
Supplemental: Matrix subsampling, Nyström approximation