DS 5220: Supervised Machine Learning and Learning Theory
GENERAL INFORMATION 
Instructor: Prof. Ehsan Elhamifar
Instructor Office Hours: Mondays, 4:30–5:30pm, 310E WVH
Class: Mondays and Wednesdays, 14:50–16:30, Behrakis Health Sciences Center 315
TAs: Shantam Gupta (gupta.sha [at] husky.neu.edu), Office Hours: Fridays, 10–11am, 462 WVH
Discussions, lectures, and homeworks are posted on Piazza.

DESCRIPTION 
This course covers practical algorithms and theory for supervised machine learning from a variety of perspectives. Topics include generative/discriminative learning, parametric/nonparametric learning, deep neural networks, support vector machines, and decision trees and forests, as well as learning theory (bias/variance tradeoffs, VC theory). The course will also discuss recent applications of machine learning in areas such as computer vision, data mining, natural language processing, speech recognition, and robotics.

PREREQUISITES 
Introduction to Probability and Statistics, Linear Algebra, Algorithms.

SYLLABUS 
Linear regression, Overfitting, Regularization, Sparsity
Maximum likelihood estimation
Bayesian learning, MAP estimation
Logistic regression
Naive Bayes
Perceptron
Convex optimization, Lagrangian function, Optimality conditions
SVM and kernels
Neural networks and deep learning: DNNs, CNNs
Decision trees and Ensemble methods
Hidden Markov Models
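
To give a flavor of the first topics above (an illustrative sketch, not course material; all names and data here are made up for the example), ridge-regularized linear regression has the closed-form solution w = (XᵀX + λI)⁻¹Xᵀy:

```python
import numpy as np

# Illustrative sketch: ridge-regularized linear regression,
# w = (X^T X + lam*I)^{-1} X^T y, covered under
# "Linear regression, Overfitting, Regularization".

def ridge_fit(X, y, lam=1e-2):
    """Return ridge-regression weights for design matrix X and targets y."""
    d = X.shape[1]
    # Solve the regularized normal equations instead of inverting explicitly.
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=100)

w_hat = ridge_fit(X, y, lam=1e-3)  # close to w_true for small noise/lambda
```

The regularizer λ trades a small amount of bias for lower variance, which is the bias/variance tradeoff discussed in the learning-theory portion of the course.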

GRADING 
Homeworks are due at the beginning of class on the specified dates. No late homeworks or projects will be accepted.
Homeworks: 4 HWs (40%)
Project (30%)
Final Exam (30%)
Homeworks consist of both analytical questions and programming assignments. Programming assignments must be completed in Python. Both the code and the results of running it on the data must be submitted.
The exam consists of analytical questions on topics covered in class. Students may bring a single cheat sheet to the exam.

TEXTBOOKS 
[CB] Christopher Bishop, Pattern recognition and machine learning. [Required]
[KM] Kevin P. Murphy, Machine Learning: A Probabilistic Perspective. [Optional]
[KF] Daphne Koller and Nir Friedman, Probabilistic Graphical Models. [Optional]

READINGS 
Lecture 1: Introduction to ML, Linear Algebra Review
Lecture 2: Introduction to Regression
Lecture 3: Linear Regression: Convexity, Closed-Form Solution, Gradient Descent
Lecture 4: Robust Regression, Overfitting, Regularization
Lecture 5: Basis Function Expansion, Hyperparameter Tuning, Cross Validation, Probability Review
Lecture 6: Maximum Likelihood Estimation
Lecture 7: Bayesian Learning, Maximum A Posteriori (MAP) Estimation, Classification
 Chapter 3 and 4.3 from CB book.
Lecture 8: Logistic Regression, Parameter Learning via Maximum Likelihood, Overfitting
 Chapter 4.3 from CB book.
Lecture 9: Softmax Regression, Discriminative vs. Generative Modeling, Generative Classification
 Chapter 4.2 from CB book.
Lecture 10: Generative Classification, Naive Bayes
 Chapter 4.2 from CB book.
Lecture 11: Generative Classification, Naive Bayes
 Chapter 4.2 from CB book.
Lecture 12: Convex Optimization, Lagrangian Function, KKT Conditions
 See lecture notes on Piazza.
Lecture 13: Project pitch
Lecture 14: Support Vector Machines
Lecture 15: Support Vector Machines: Vanilla SVM, Dual SVM
Lecture 16: Support Vector Machines: Soft-Margin SVM, Kernel SVM, Multi-Class SVM
Lecture 17: Neural Networks
Lecture 18: Neural Networks: Training, Forward and Back Propagation
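
Tying together the gradient-descent and logistic-regression lectures above, here is an illustrative sketch (not course material; the data and function names are made up for the example) of binary logistic regression trained by batch gradient descent:

```python
import numpy as np

# Illustrative sketch: binary logistic regression fit by batch gradient
# descent on the negative log-likelihood (cf. Lectures 3 and 8).

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logreg_fit(X, y, lr=0.1, n_iters=2000):
    """Fit logistic-regression weights by gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        # Gradient of the average negative log-likelihood.
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # linearly separable labels

w = logreg_fit(X, y)
acc = np.mean((sigmoid(X @ w) > 0.5) == (y == 1))  # high on separable data
```

On linearly separable data the maximum-likelihood weights are unbounded, which is one motivation for the regularization and MAP-estimation topics covered earlier in the course.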

ADDITIONAL RESOURCES 
Probability Review
Linear Algebra Review

ETHICS 
All students in the course are subject to Northeastern University's Academic Integrity Policy. Any report, homework, or project submitted by a student in this course for academic credit must be the student's own work. Collaboration is allowed only when explicitly permitted. Per CCIS policy, violations of the rules, including cheating, fabrication, and plagiarism, will be reported to the Office of Student Conduct and Conflict Resolution (OSCCR). This may result in deferred suspension, suspension, or expulsion from the university.

