ATML: Optimisation in machine learning

Large neural networks trained on suitable data often work remarkably well and produce highly accurate models. But why do they work? What is going on inside neural networks? What is it about the stochastic gradient descent training algorithm that makes it so successful?

In this track, we will try to understand what makes a good model and good training.

Tutorials 

All tutorials take place in Appleton Tower M2, starting in Week 3.

You can find the following people at these times for discussion.

  • Mondays 13:10 -- 14:00: Rik Sarkar
  • Mondays 14:10 -- 15:00: Sahel Torkamani
  • Wednesdays 13:10 -- 14:00: Sahel Torkamani

Lectures 

  • Mondays 17:10 -- 18:00
  • Lecture Theatre B, 40 George Square

We will start with some basic concepts and definitions in machine learning and move toward more advanced topics. Slides, notes, and exercises will be uploaded as we proceed.

 Week     Topic                                                  Resources
 Week 1   Introduction and basics
 Week 2   Generalisation, linear classifiers, and optimisation
 Week 3   Gradient descent, convex optimisation, stability

Topics

  • ML Basics — definitions and notations
    • The elements of a general ML system -- domains, hypothesis classes and loss functions
    • How to define and describe an ML system
    • Generalisation vs. memorisation
  • Linear classifiers and convex optimization
    • Convex optimisation
    • Gradient descent and stochastic gradient descent (a minimal code sketch appears after this topic list)
  • Neural nets
    • ReLU vs other activations — what makes ReLU successful?
    • Why are deep networks better than shallow networks?
    • Cross entropy loss and loss landscapes — why neural networks overfit
  • Why SGD works — the various factors that affect optimisation and generalisation
    • Randomness
    • Sharp and flat minima
    • Fractal dimension and generalisation
  • What we know about neural networks
    • Neural collapse — what happens inside NN classifiers
    • Overparameterisation and pruning — why we need large networks, and how much of a large network we really need
  • Additional topics
    • Fairness -- definitions and impossibility -- why fairness is hard
    • Explainability -- what does it mean to explain the model behaviour? How do we do it? 
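
As a rough illustration of the gradient descent and stochastic gradient descent topic above, here is a minimal sketch in Python with NumPy, contrasting full-batch gradient descent with stochastic gradient descent on a small least-squares problem. The data, step sizes, and iteration counts are illustrative assumptions only, not part of the course materials.

# Illustrative sketch: full-batch gradient descent vs. stochastic gradient
# descent on least-squares linear regression. Data and hyperparameters are
# arbitrary choices for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def loss(w):
    # Mean squared error over the whole dataset.
    return np.mean((X @ w - y) ** 2)

def grad(w, idx):
    # Gradient of the mean squared error restricted to the examples in idx.
    Xi, yi = X[idx], y[idx]
    return 2.0 * Xi.T @ (Xi @ w - yi) / len(idx)

# Full-batch gradient descent: the exact gradient at every step.
w_gd = np.zeros(d)
for _ in range(200):
    w_gd -= 0.1 * grad(w_gd, np.arange(n))

# Stochastic gradient descent: a noisy gradient from one random example per step.
w_sgd = np.zeros(d)
for _ in range(2000):
    i = rng.integers(n)
    w_sgd -= 0.01 * grad(w_sgd, np.array([i]))

print(f"GD loss:  {loss(w_gd):.4f}")
print(f"SGD loss: {loss(w_sgd):.4f}")

Both variants minimise the same objective; SGD trades exact gradients for cheap, noisy updates, and the role of that noise in optimisation and generalisation is one of the themes of this track.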

 

Sample exam

A sample set of questions is available here: Sample exam (Track 1). These questions are only indicative and are based on the first few weeks of lectures. The actual exam will contain more diverse types of questions covering more topics.

License
All rights reserved, The University of Edinburgh.