NYU CS-GY 6763 (3943)
Algorithmic Machine Learning
and Data Science

Advanced theory course exploring contemporary computational methods that enable machine learning and data science at scale.

Course Team:

Professor
Christopher Musco

Course Assistant
Aarshvi Gajjar

Course Assistant
Atsushi Shimizu

Course Assistant
Lucas Rosenblatt

Lectures: Wednesday 11:00am-1:30pm, Rogers Hall, Room 325. Recordings available through Brightspace.
Reading group: Calendar here. Sign up sheet here.
Prof. office hours (general): Mondays 11:00pm-12:30pm, Zoom link.
Lucas office hours (general): Tuesdays 10:30am-11:30pm, 8th Floor common area, 370 Jay
Atsushi office hours (general): Fridays 4:00pm-5:00pm, Zoom link
Aarshvi office hours (undergrad only): Mondays 12:30-1:30pm, Zoom link.

Syllabus: here.
Gradescope Entry Code: RZ56X4
Link to Join Ed Discussion: https://edstem.org/us/join/eYUQ22 Grading breakdown: Quizzes 0%, Problem Sets 50%, Midterm 20%, Final Project OR Final Exam 20%, Partipation 10%
Final project guidelines: here.

Quizzes: Optional weekly check-in quizzes will be administered via GradeScope. They can be completed either after each lecture to check your understanding, or whenver you feel would be helpful (e.g. before an exam).

Problem Sets: Problem sets must be turned in via GradeScope. While not required, I encourage students to prepare problem sets in LaTeX or Markdown (with math support.) You can use this template for LaTeX. While there is a learning curve, these tools typically save students time in the end! If you do write problems by hand, scan and upload as a PDF. Collaboration is allowed on homework, but solutions and code must be written independently. Writing should not be done in parallel, and students must list collaborators for each problem separately.

Prerequisites: This course is mathematically rigorous, and is intended for graduate students and advanced undergraduates. Formally we require previous courses in machine learning, algorithms, and linear algebra. Experience with probability and random variables is necessary. See the syllabus for more details and email Prof. Musco if you have questions about your preparation for the course!

Resources: There is no textbook to purchase. Course material will consist of my written lecture notes, as well as assorted online resources, including papers, notes from other courses, and publicly available surveys. Please refer to the course webpage before and after lectures to keep up-to-date as new resources are posted.

Reading Group: It's an exciting time for research at the intersection of algorithm design and the data sciences. Most of the topics covered in this course are still the subject of active research. Starting midway through the semester we will be holding a reading group for students working on final projects (and any others who wish) to discuss and workshop papers.

Problem Sets:
Problem Set 1 (due Tuesday, Sept. 27th by 11:59pm ET).
Problem Set 2 (due Tuesday, Oct. 18th by 11:59pm ET).
Midterm Information (exam on Wednesday, Oct. 26th).
Problem Set 3, covid_data.csv (due Tues, Nov. 22th by 11:59pm ET).
Problem Set 4 (due Thursday, Dec. 15th by 11:59pm ET).
Final Exam Information (exam on Wednesday, Dec. 21st).

Week #	Topic	Reading	Homework
The Power of Randomness
1. 9/7	Random variables and concentration, Markov's inequality, applications	Lecture 1 notes (annotated). Probability review! None of this should be new, but you might want to brush up if you haven't taken prob/stat in a while. A student recommended this resource for review. Typed lecture notes covering another application of Markov's inequality to analyzing hashing. Interesting paper on applications of mark-and-recapture to network size estimation, and some cool improved methods.	Problem Set 1, due Tues. 9/27.
2. 9/14	Chebyshev inequality, exponential tail bounds (Chernoff + Bernstein), and applications	Lecture 2 notes (annotated). Typed lecture notes proving universality of random linear hash function. Book chapter on MinHash, Jaccard similarity (covered next class), etc. Original paper giving a loglog(n) space algorithm for the distinct elements problem. Follow-up work on state-of-the-art Hyperloglog algorithm. Cool paper on an application of MinHash to detecting seismic events. Additional reading on concentration bounds can be found in Terry Tao's notes. For a proof of the "power of two choices'' result, see Section 2 in this survey
3. 9/21	High-dimensional geometry	Lecture 3 notes (annotated). Much of the lecture content was adapted from Chapter 2 of Foundations of Data Science. It's a great read!
4. 9/28	Johnson-Lindenstrauss Lemma and Dimensionality Reduction	Lecture 4 notes (annotated). Typed lecture notes on the Johnson-Lindenstrauss lemma. Ignore Section 4.1 -- we will cover that in a future lecture. Helpful notes on the JL Lemma by Anupam Gupta. Good overview of similarity estimation and locality sensitive hashing in Chapter 3 here.	Problem Set 2, due Tues. 10/18.
5. 10/5	Locality Sensitive Hashing, Introduction to Optimization	Lecture 5 notes (annotated). Helpful lecture notes on locality sensitive hashing. These resources use slightly different language (and a slightly different version of MinHash) than I used in class. If you want to learn more about worst-case runtime guarantees (like the Indyk-Motwani result mentioned in class) take a look at my typed lecture notes.
Optimization
6. 10/12	Gradient descent and projected gradient descent	Lecture 6 notes (annotated). If you need to freshen up on linear algebra, now is good time! This quick reference from Stanford mostly covers what we need. Good book on optimization which is freely available online through NYU libraries. Good lecture notes from Aleksander Mądry for more reading on analyzing gradient descent. Moritz Hardt's lecture notes with proofs of gradient descent convergence in all the regimes discussed in class. Sébastien Bubeck's convex optimization book mentioned in class. Fairly technical, but great reference.
7. 10/19	Second order conditions, Online and stochastic gradient descent, preconditioning	Lecture 7 notes (annotated). Elad Hazan's book on online convex optimization is a great reference if you are interested in this topic. Useful document for linear algebra review. Section 3 is especially important. Proof of accelerated gradient descent for general convex functions.
8. 10/26	Midterm Exam (first half of class) Differential Privacy	Lecture 8 notes (annotated).
9. 11/2	Constrained optimiziation, center of gravity method, linear programming, LP relaxation.	Lecture 9 notes (annotated). My written lecture notes the ellipsoid method and interior point method, if you are interested. Nisheeth Vishnoi's very complete book, which covers ellipsoid in full detail. More information on proving Grünbaum's theorem can be found in Jonathan Kelner's lecture notes.	Project proposal due Wednesday, 11/9. See final project guidelines for more information. Problem Set 3, due Tuesday 11/22
Spectral Methods and Linear Algebra
10. 11/9	Singular value decomposition, Krylov methods	Lecture 10 notes (annotated). Supplemental notes on the Lanczos method Detailed notes on power method analysis. See Section 3.7 in Foundations of Data Science for an analysis of power method that does not involve singular value gaps. Read Foundations of Data Science, Chapter 3 for more on singular value decomposition and low-rank approximation. Other good notes here and here.
11. 11/16	Spectral graph theory, spectral clustering, generative models for networks	Lecture 11 notes. (annotated). For more details, consult written notes on the stochastic block model. Wayne Zachary's Karate Club paper. Stanford lecture notes introducing spectral graph theory.
11/23	NO CLASS -- THANKSGIVING BREAK
12. 11/30	Stochastic Block Model, Randomized numerical linear algebra, sketching for linear regression, ε-nets	Lecture 12 notes. (annotated). My written notes on sketched regression and ε-nets. Jelani Nelson's course notes from Harvard with a lot more on randomized linear algebra, including methods for sparse JL sketching and randomized low-rank approximation.
Fourier Methods
13. 12/7	Fast Johnson-Lindenstrauss Lemma, introduction to sparse recovery and compressed sensing	Lecture 13 notes. (annotated). Original FJLT paper. Written notes on compressed sensing. Lots of great resources under weeks 8 and 9 of Stanford's Algorithmic Toolbox course.
14. 12/14	Restricted Isometry Property, finish up compressed sensing, introduction to importance sampling and leverage scores.	Lecture 14 notes. (annotated).
15. 12/21	Final Exam (during regular class time)

NYU CS-GY 6763 (3943)Algorithmic Machine Learning and Data Science

Course Team: Professor Christopher Musco Course Assistant Aarshvi Gajjar Course Assistant Atsushi Shimizu Course Assistant ​​​​Lucas Rosenblatt

NYU CS-GY 6763 (3943)
Algorithmic Machine Learning
and Data Science

Course Team:

Professor
Christopher Musco

Course Assistant
Aarshvi Gajjar

Course Assistant
Atsushi Shimizu

Course Assistant
Lucas Rosenblatt