Math 490, Mathematics of Machine Learning, Fall 2019



Professor: Dr. Kay Kirkpatrick
Contact: 231 Illini Hall, kkirkpat(at)illinois.edu
TA: Sophie Phuong Le, phuong2(at)illinois.edu
Course site: https://faculty.math.illinois.edu/~kkirkpat/490fall2019.html
Lectures: MWF 2:00-2:50pm in 245 Altgeld Hall through August 30; then 341 Altgeld Hall starting on September 4.
Office hours: Mondays and Fridays, 3:00-3:50, or by appointment. I would be happy to answer your questions in my office anytime as long as I'm not otherwise engaged, and before and after class are good times to catch me either in my office or in the classroom.
Textbook: The main text will be Understanding Machine Learning: From Theory to Algorithms, 1st Edition by Shai Shalev-Shwartz and Shai Ben-David: https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf
Grading policy: Homework: 40% of the course grade
Two Midterms: 15% each, one on October 11, and one on December 11. Please let me know as soon as possible if you need any accommodation.
Final Project: 30%, on a topic of your choice related to the course. See HW#3 for options. Our assigned final exam time is 8:00-11:00am, Monday, December 16.
Inclusion and Justice: I am committed to affirming the identities, realities and voices of all students, especially students from historically marginalized or under-represented backgrounds. I value the use of self-specified gender pronouns, and I require respect for everybody. Please contact me to receive disability accommodations. You should also know that I'm a mandatory reporter. My pronouns are she/her/hers.


Homework (due Fridays in class or to my email address or my mailbox in AH 250 by the end of class): You are encouraged to work together on the homework, but I ask that you write up your own solutions and turn them in separately.

Late homework will not be graded; to compensate, I will drop your two lowest homework scores.

HW #1 is due the second Friday by 5pm: please send me an email introducing yourself: for instance, what name you prefer to be called, your major and hobbies, why you're interested in probability, or anything else you want to share or have questions about. It would also be helpful to attach a photo of yourself to help me connect your face with your name. Please also answer the questions on the info sheet here.

HW #2 is due Fri 9/13 in class: available here, but excluding the book problem 2.1.

HW #3 is due Fri 9/20:
Part 1) Textbook problems: p. 20 #2.1, p. 30 #3.6, p. 35 #4.2, p. 41 #5.1, due by email (with copy to the TA) or to my mailbox in AH 250 by 2:50pm, because class on Friday 9/20 is cancelled.
Part 2) Email me by 5pm (with copy to the TA) about your final project, indicating (a) whether you'll pick a talk, paper, poster or webpage, (b) which topics you are considering, (c) what is your motivation, and (d) 2 or 3 references you will work from--books, articles, and technical websites are all fine. Talks will be 15 minutes each, subject to time constraints. Papers will be 3-10 pages. Webpage will be 2-5 screens long, possibly with an interactive simulation that you program.

ANNOUNCEMENT: I will be giving grace period(s) for emailed homework parts related to the final project (but not problem-set HWs): each of you has 48 hours total of automatic grace period(s), which can be used as two 24-hour grace periods or one 48-hour one, etc. In order to use some grace hours, please email me by the time that the HW is due, letting me know how many hours of grace period you are taking, i.e., what time I can expect your completed HW to be sent.

The first exam will be October 11, covering all of the course material (lectures and HWs) up to and including the Friday before the exam, Oct 4.




All lecture videos are available here.

Week 1: Introduction: probability background, ML set-up, defining error/loss/risk, the Empirical Risk Minimization algorithm, Sections 2.2-2.3 (including Appendix B as needed); a small ERM sketch appears after this week's notes.
Monday: lecture notes and example of how to use the CLT
Wednesday slides
Friday lecture notes
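
As a supplement (not part of the lecture notes), here is a minimal sketch of the ERM idea from Sections 2.2-2.3, assuming a made-up toy data set and a finite class of 1-D threshold classifiers; it is only an illustration, not the algorithm exactly as presented in lecture.

import numpy as np

# Empirical Risk Minimization (ERM) over a finite hypothesis class:
# hypotheses are threshold classifiers h_t(x) = 1[x >= t], and ERM
# returns the threshold with the smallest empirical (0-1) training error.

rng = np.random.default_rng(0)

# Toy training sample (made up for illustration): x ~ Uniform[0, 1],
# true label 1[x >= 0.6], with 10% of the labels flipped as noise.
m = 200
x = rng.uniform(0, 1, size=m)
y = (x >= 0.6).astype(int)
noise = rng.uniform(size=m) < 0.1
y[noise] = 1 - y[noise]

# A finite hypothesis class: 101 candidate thresholds in [0, 1].
thresholds = np.linspace(0, 1, 101)

def empirical_risk(t):
    """Fraction of training points misclassified by h_t(x) = 1[x >= t]."""
    return np.mean((x >= t).astype(int) != y)

# ERM rule: pick the hypothesis minimizing the empirical risk.
risks = np.array([empirical_risk(t) for t in thresholds])
t_erm = thresholds[np.argmin(risks)]
print(f"ERM threshold: {t_erm:.2f}, empirical risk: {risks.min():.3f}")

With enough samples the ERM threshold lands near the true cutoff 0.6, which is the kind of guarantee made precise later in the course.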

Week 2: Defining learnability, Sections 2.3-3.2
Wednesday lecture notes
Friday lecture notes

Week 3: An example not in the book; Sections 3.2-5.1, exercise 4.1, and the beginning of the proof of the No Free Lunch Theorem.
Monday lecture notes
Wednesday lecture notes
Friday lecture notes

Week 4: Sections 5.1-6.3
Monday lecture notes, which contain the probability lemma and proof that we used on 9/13.
Wednesday lecture notes

NO CLASS Friday 9/20.

Week 5: Sections 6.3-
Monday lecture notes
Wednesday lecture notes
Friday lecture notes


All final projects are due via email by the final exam time, Monday, Dec. 16 at 9am (for speakers, this means emailing final slides the night before). Attendance/participation will be graded during the final exam period, estimated to run from 9am to 11am on Dec. 16, in the usual classroom.

Departures from "traditional" machine learning will be encouraged for the final project, and you have a choice of paper (3-10 pages), talk (15 minutes), poster (regular size), or webpage (3-6 screens). Here's an example of a project webpage: https://faculty.math.illinois.edu/~kkirkpat/percolation.html

Final project ideas: summary/analysis of anything by Alan Turing, Sarah M. Brown (http://sarahmbrown.org/research/), Timnit Gebru (http://ai.stanford.edu/~tgebru/), Sanmi Koyejo (http://sanmi.cs.illinois.edu/publications.html), Giuseppe Longo (https://www.di.ens.fr/users/longo/download.html).
Bengio's The Consciousness Prior (https://arxiv.org/pdf/1709.08568.pdf).
Neuronal Synchrony in Complex-Valued Deep Networks (https://arxiv.org/abs/1312.6115v5).
Quantum computing and quantum information: Peter Shor's famous paper, Deutsch and Marletto.
Biological phenomena that don't fit the DNA=code metaphor, e.g., Denis Noble's work.
Ethics and algorithms: Cathy O'Neil's book Weapons of Math Destruction; Safiya Umoja Noble's book Algorithms of Oppression.
Reproducibility crises in psychology and ML/AI.
Spurious correlations in Big Data: Calude and Longo, 2016.
Backpropagation vs. neural back-feed of information to pre-synaptic neuron.
Causality and causal inference: Judea Pearl's work.


Syllabus

This is an advanced course on the mathematics of machine learning, and some probability/statistics and programming are prerequisites for this course. Machine learning is a growing field at the intersection of probability, statistics, optimization, and computer science, which aims to develop algorithms for making predictions based on data. This course will cover foundational models and mathematics for machine learning, including statistical learning theory and neural networks, with a project component.

Topics: We will be covering most of the topics in Chapters 2, 3, 4, 5, 6, 9, 10, 12, 13, 14, 15, 16, 19, and 20. We will skip most of Chapters 7, 8, 11, 17, and 18. At the end of the semester, time permitting, we will cover some of my recent research. The final exam period will probably be spent on talks by your classmates, so please plan to be here then.


Why work on your communication skills?

"It usually takes me more than three weeks to prepare a good impromptu speech." --Mark Twain

I think that success in your career (any career) depends in part on how well you communicate your ideas and persuade other people, so I am giving you a chance to learn and practice good writing or presenting skills. Some of the homework assignments will lead up to the final project, for which you will have a choice of topic (related to the course) and of communication format (paper or talk). The homework will be graded partly on clarity, brevity, and coherence. This is a great opportunity to improve your writing or presenting skills, in order to make your ideas more clear and persuasive--and to succeed.

"I am sorry I have had to write you such a long letter, but I did not have time to write you a short one." --Blaise Pascal

Emergency information link and the new one.


Some more resources for writing and speaking:

Su: Good Math Writing
Halmos: How to Write Mathematics
Gopen and Swan: The Science of Scientific Writing
Williams: Style: The Basics of Clarity and Grace (book, any edition), Longman.


Bruce Reznick's list of resources
Gallian: How to Give a Good Talk
Shaw: Making Good Talks Into Great Ones
Gallo: Public-Speaking Lessons from TED Talks
Lerman: Math job talk advice
Steele: Speaking tips organized in categories, including this great but little-known tip about graphs on slides






FAQ here: Course info: https://courses.illinois.edu/schedule/DEFAULT/DEFAULT/MATH/490

FAQ0 Announcement: the first few lectures (Monday, August 26th, and Wednesday, the 28th, and possibly more) will be held in a larger room, AH 245, instead of the previously scheduled room. Everyone interested in the course is welcome to attend, because of the abundance of seats. Once the number of attendees gets small enough, we will switch from AH 245 to the previously scheduled room, AH 341. The first day of class will have some lecturing on the general set-up for machine learning, in addition to an in-class demonstration centered on the question: How can one learn whether coins are fair or not, and how long does it take to learn that? The second day of class will be an introductory-level talk about my research based on my Beckman talk slides titled BIO-LOGIC, about biological computation. The third class meeting will be a traditional lecture introducing some more fundamental concepts about the machine learning framework and empirical risk minimization.
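
If you are curious about the coin question before class, here is a rough back-of-the-envelope illustration (an assumption-laden sketch, not the in-class demonstration itself): by Hoeffding's inequality, the empirical frequency of heads after n independent flips is within eps of the coin's true bias with probability at least 1 - 2*exp(-2*n*eps^2), so roughly n = ln(2/delta)/(2*eps^2) flips suffice for accuracy eps with confidence 1 - delta.

import math
import numpy as np

# How many flips to estimate a coin's bias to accuracy eps with confidence 1 - delta?
# Hoeffding: P(|empirical freq - true bias| > eps) <= 2 * exp(-2 * n * eps^2),
# so n >= ln(2/delta) / (2 * eps^2) flips suffice.  (Numbers below are illustrative.)
eps, delta = 0.05, 0.05
n = math.ceil(math.log(2 / delta) / (2 * eps**2))
print(f"Flips needed for accuracy {eps} with confidence {1 - delta}: {n}")

# Quick simulation: flip a (made-up) coin with true bias 0.55 that many times.
rng = np.random.default_rng(1)
flips = rng.uniform(size=n) < 0.55
print(f"Empirical frequency of heads after {n} flips: {flips.mean():.3f}")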

FAQ1: How much time and work does this course require?
A1: That depends on your preparation as well as how much you want to get out of the course (more work usually correlates with more benefit, at least before the point of diminishing returns). An average of 5 hours of work per week outside the classroom is expected, and the range of actual work might vary between, say, 1 hour and 10 or more. If you believe that the course is taking you significantly more time than it should, please let me know with data.

FAQ2: I would like to register for your course but it is full. Will I be able to get in?
A2: It is very common for people to drop the course during the first few weeks. I would advise checking seating availability every day and being ready to pounce once a seat opens up. In the meantime, you are allowed to attend lecture, but you should make sure that every student who is registered gets to sit down in a seat. So you might need to stand or to sit on the floor until you are registered. I may request a larger room for the first lecture or two, and I may have the lectures video-recorded. Please watch the course website for such updates.

FAQ3: Will Math 490 be offered again after Fall 2019?
A3: Yes, Math 490 will be offered in Spring 2020 too.

FAQ4: I have taken some probability/statistics but not Math 461 or Stat 410. Will I be prepared for Math 490?
A4: If your previous probability class included a lot of advanced concepts, such as laws of large numbers, Markov and Chebyshev inequalities, central limit theorem, etc., then you may be sufficiently prepared for Math 490.

FAQ5: I have not taken any probability/statistics courses. Can I still take Math 490?
A5: One possibility for highly motivated students (who are ready for a lot of extra studying) is to take a probability class (M461 or S410) at the same time as M490. But I would recommend taking these courses in the prerequisite order that I have determined, because probability is fundamental to ML theory.