NYU CS-GY 9223D (3943B)
Algorithmic Machine Learning
and Data Science

Advanced theory course exploring contemporary algorithmic techniques and recent research on computational methods that enable machine learning and data science at scale.

Course Team:

Professor
Christopher Musco

Project Mentor
Aline Besa

Undergrad Mentor
Raphael Meyer

Administrative Information

Lectures: 370 Jay St., Room 1201 and via Zoom (link on NYU Classes).
Lecture component: Wed. 11:00am-12:15pm.
Flipped component: Wed. 12:30pm-1:30pm.
Recorded component: Posted Thurs. by EOD.

Syllabus: here. Please see for information on COVID-19 changes.
Final project guidelines: here.
Midterm information and practice: here.

Professor Office hours: Thurs. 10am-12pm, Zoom link.
Raphael's Undergrad. Office hours: Mon. 12-2pm, Zoom link.
Paper Reading Group: Mon. 1-3pm via Zoom (link on NYU Classes).
Piazza: Sign-up Link.
Student run Slack: Link.

Quizzes: Weekly check-in quizzes will be administered via Google Forms. Link will be posted on this site. They must be completed by 11:00am ET the Wed. after they are posted.

Homework: Homework must be turned in to NYU Classes by the specified due date. While not required, I encourage students to prepare problem sets in LaTeX or Markdown (with math support.) You can use this template for LaTeX. While there is a learning curve for LaTeX (less for Markdown), it typically saves students time by the end of the semester! If you write problems by hand, please scan and upload as a PDF.

Collaboration is allowed on homework, but solutions and any code must be written-up independently. Writing should not be done in parallel with others. Students must list collaborators on their problem sets (for each problem separately). See the syllabus for full details.

Course Summary

Prerequisites: This course is mathematically rigorous, and is intended for graduate students and advanced undergraduates. Formally we require previous courses in machine learning, algorithm design and analysis, and linear algebra. Experience with probability and random variables is necessary. See the syllabus for more details and email Prof. Musco if you have questions about your preparation for the course!

Coursework: One meeting per week. Short weekly quiz due before next class (10% of grade). Problem sets every two weeks involving analysis and application of algorithmic methods learned in class, with some programming exercises (40% of grade). At-home midterm exam (15% of grade). Final project to be completed in groups of two (25% of grade). Class participation is the remaining 10% of the grade. Please consult the formal syllabus for more information.

Resources: There is no textbook to purchase. Course material will consist of my written lecture notes, as well as assorted online resources, including papers, notes from other courses, and publicly available surveys. Please refer to the course webpage before and after lectures to keep up-to-date as new resources are posted.

Optional Reading Group: It's an exciting time for research at the intersection of algorithm design and the data sciences. Most of the topics covered in this course are still the subject of active research. We will hold a reading group where students will present and discuss recent papers (time TBA). This is a great opportunity to learn extra material, learn to read papers, and find a project topic.

Week #	Topic	Reading	Homework
The Power of Randomness
1. 9/2	In class: Concentration of random variables, applications to hashing Supplemental: Load balancing, the union bound. Link.	Lecture 1 notes (annotated). Lecture 1 Supplemental notes. Probability review! None of this should be new, but you might want to brush up if you haven't taken prob/stat in a while. Typed lecture notes covering hashing topics discussed in class, and proving 2-universality of random linear hash function.	Week 1 Check-in quiz, due Wed. 9/9. Problem Set 1, due Fri. 9/18.
9/9	No Class, Monday Schedule for Labor Day
2. 9/16	In Class: Sketching and streaming algorithms, MinHash. Supplemental: Exponential tail bounds (Chernoff + Bernstein)	Lecture 2 notes (annotated). Lecture 2 Supplemental notes (annotated). Book chapter on Jaccard similarity, MinHash, etc. Original paper giving a loglog(n) space algorithm for the distinct elements problem. Follow-up work on state-of-the-art Hyperloglog algorithm. Cool paper on an application of MinHash to detecting seismic events. Additional reading on concentration bounds can be found in Terry Tao's notes. For a proof of the "power of two choices'' result, see Section 2 in this survey.	Week 2 Check-in quiz, due Wed. 9/23.
3. 9/23	In class: High-dimensional geometry Supplemental: The Johnson-Lindenstrauss lemma + applications	Lecture 3 notes (annotated). Lecture 3 Supplemental notes (annotated). Aline's Lecture 3 notes. Much of the lecture content was adapted from Chapter 2 of Foundations of Data Science. It's a great read! Typed lecture notes on the Johnson-Lindenstrauss lemma. Ignore Section 4.1 -- we will cover that in a future lecture. Helpful notes on the JL Lemma by Anupam Gupta.	Problem Set 2, due Fri. 10/9. Week 3 Check-in quiz, due Wed. 9/30.
4. 9/30	In class: Randomized near neighbor search, analyzing locality sensitive hash functions Supplemental: None	Lecture 4 notes (annotated). Good overview of similarity estimation and locality sensitive hashing in Chapter 3 here. These lecture notes are also helpful. These resources use slightly different language (and a slightly different version of MinHash) than I used in class. If you want to learn more about worst-case runtime guarantees (like the Indyk-Motwani result mentioned in class) take a look at my typed lecture notes.	Week 4 Check-in quiz, due Wed. 10/7. Optional Reading Group paper: LSH-Sampling Breaks the Computational Chicken-and-egg Loop in Adaptive Stochastic Gradient Descent Select project partner by 10/14. See project guideline document for details.
First-Order Optimization
5. 10/7	In class: Analyzing gradient descent and project gradient descent for convex problems	Lecture 5 notes (annotated). If you need to freshen up on linear algebra, now is good time! This quick reference from Stanford mostly covers what we need. Good book on optimization which is freely available online through NYU libraries. Excellent lecture notes from Aleksander Mądry for more reading on analyzing gradient descent.	Week 5 Check-in quiz, due by class on Wed. 10/14. Choose project partner and email me by Wed. 10/14. No Reading Group Monday 10/12 due to Columbus Day.
6. 10/14	In class: Online and stochastic gradient descent	Lecture 6 notes (annotated). Elad Hazan's book on online convex optimization is a great reference if you are interested in this topic.	Optional Reading Group paper: Analyzing Graph Structure via Linear Measurements
7. 10/21	In class: Smoothness, strong convexity, conditioning.	Lecture 7 notes (annotated). Moritz Hardt's lecture notes with proofs of gradient descent convergence in all the regimes discussed in class. Sébastien Bubeck's convex optimization book mentioned in class. Fairly technical, but great reference.
8. 10/28	In class: Preconditioning, acceleration, coordinate descent, non-convex optimization	Lecture 8 notes (annotated). Proof of accelerated gradient descent for general convex functions. Diverse, theory-leaning blog which discusses lots of problems and approaches in understanding the optimization of non-convex problems. Useful document for linear algebra review. Section 3 is especially important.	Complete midterm by midnight ET on Fri. 10/30. One page project proposal due Wed. 11/04.
Spectral Methods and Linear Algebra
9. 11/04	In class: Singular value decomposition, Power Method Supplemental: Krylov subspace methods and a taste of approximation theory	Lecture 9 notes (annotated). Supplemental Lanczos method notes Detailed notes on power method analysis. This should be easier to follow than my lecture proof! See Section 3.7 in Foundations of Data Science for an analysis of power method that does not involve singular value gaps. Student recommended video lectures that give nice intuitive introductions to linear algebra concepts. Read Foundations of Data Science, Chapter 3 for more on singular value decomposition and low-rank approximation. Other good notes here and here.	Week 9 Check-in quiz, due Thurs. 11/12. Problem Set 3, due Wed. 11/25. Required data file UScities.txt. Required readingfor Problem 1. You will likely want to use Claim 4.
10. 11/11	In class: Spectral graph theory and spectral clustering, generative models for networks, stochastic block model	Lecture 10 notes. (annotated). For more details, consult written notes on the stochastic block model. Wayne Zachary's Karate Club paper. Stanford lecture notes introducing spectral graph theory.	Optional Reading Group paper: Finding Frequent Items in Data Streams
11. 11/18	In class: Randomized numerical linear algebra, sketching for linear regression, ε-nets arguments	Lecture 11 notes. (annotated). My written notes on sketched regression and ε-nets. Jelani Nelson's course notes from Harvard with a lot more on randomized linear algebra, including methods for sparse JL sketching and randomized low-rank approximation. ε-net arguments are used all over the place in learning theory, algorithm design, and high dimensional probability. Here's an example of how they appear in a different context.
Fourier Methods
12. 11/25	In class: Sampling from vectors, Fast Johnson-Lindenstrauss	Lecture 12 notes. (annotated). Original FJLT paper.	OPTIONAL Problem Set 4, due Wed. 12/16.
13. 12/2	In class: Sparse recovery and compresses sensing, restricted isometry property, basis pursuit	Lecture 13 notes. (annotated). Written notes on compressed sensing. Lots of great resources under weeks 8 and 9 of Stanford's Algorithmic Toolbox course.
14. 12/9	In class: Leverage scores, spectral sparsifiers, and a taste of my research.	Lecture 14 notes. (annotated).

NYU CS-GY 9223D (3943B)Algorithmic Machine Learning and Data Science

Course Team: Professor Christopher Musco Project Mentor Aline Besa Undergrad Mentor Raphael Meyer

Administrative Information

Course Summary

NYU CS-GY 9223D (3943B)
Algorithmic Machine Learning
and Data Science

Course Team:

Professor
Christopher Musco

Project Mentor
Aline Besa

Undergrad Mentor
Raphael Meyer