452SM - DEEP LEARNING 2020
Section outline
-
Part I (roughly the first month) is devoted to the foundations of Deep Learning.
We will start from the basics, covering mainly:
Fully connected architectures
The mechanics of training
Regularization
Optimization
We will also develop practical skills that will allow you to deepen your knowledge of FC networks; in particular, you will learn to (see the sketch after this list):
Monitor learning dynamics: accuracy, loss, parameters and their gradients
Create custom layers and loss functions
Modify learning rules, e.g. by:
  introducing regularization
  explicitly constraining the dynamics to particular regions of the parameter space, e.g. low-dimensional hyperplanes
  masking a chosen subset of parameters
Extract representations in hidden layers for further study
Study basic aspects of representations including their PCA decomposition and linear decoding of latent features
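To give a concrete flavour of these skills, here is a minimal PyTorch sketch (the network, the layer sizes and the random mask are illustrative assumptions, not the course material) that extracts a hidden representation with a forward hook and modifies the learning rule by masking the gradients of a chosen subset of parameters:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical two-layer fully connected network, for illustration only.
model = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 10),
)

# Extract hidden representations with a forward hook on the first Linear layer.
hidden_acts = []
def save_activation(module, inputs, output):
    hidden_acts.append(output.detach().clone())
model[0].register_forward_hook(save_activation)

# Modify the learning rule by masking a chosen subset of parameters:
# gradients of masked entries are zeroed after backward(), so those
# weights receive no update from the optimizer.
mask = (torch.rand_like(model[0].weight) > 0.5).float()  # arbitrary example mask

x = torch.randn(32, 784)
loss = F.cross_entropy(model(x), torch.randint(0, 10, (32,)))
loss.backward()
with torch.no_grad():
    model[0].weight.grad *= mask
```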
We will conclude the first part with a guided exercise on network pruning by masking.
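As a rough idea of what pruning by masking can look like, the sketch below applies simple magnitude-based pruning to a single linear layer (the layer, the 80% pruning fraction and the thresholding rule are illustrative choices, not the exercise itself):

```python
import torch
import torch.nn as nn

# Hypothetical layer to prune; the pruning fraction is an arbitrary choice.
layer = nn.Linear(128, 10)

with torch.no_grad():
    magnitudes = layer.weight.abs()
    threshold = torch.quantile(magnitudes.flatten(), 0.8)  # keep the top 20% of weights
    mask = (magnitudes >= threshold).float()
    layer.weight *= mask  # zero out the pruned weights

# During any further training, the mask is re-applied after each optimizer step
# (or the corresponding gradients are zeroed) so that pruned weights stay at zero.
```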
Please refer to MS Teams for the recordings of the lectures
-
While the first Part was devoted to the mechanics of Deep Learning, this second Part focuses on two architecture families that incorporate useful inductive biases for specific domains:
CNNs for images
RNNs for sequences
We will review the basics of these architecture families and develop the practical skills to code, in PyTorch, a few selected models; a minimal example is sketched below.
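For instance, a small convolutional classifier in PyTorch might look like the following (the class name SmallCNN, the assumption of 1-channel 28x28 inputs and all channel counts are illustrative, not the models that will be covered):

```python
import torch.nn as nn

# A minimal convolutional classifier, assuming 1-channel 28x28 inputs (e.g. MNIST).
class SmallCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, n_classes)

    def forward(self, x):
        h = self.features(x)            # (batch, 32, 7, 7)
        return self.classifier(h.flatten(1))
```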
Only with this practice will students be able to work independently on deep learning problems.
Towards the end of this Part we will start looking at several aspects of “attention” in deep networks, culminating in the implementation of a simple sequence-to-sequence model with explicit attention.
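To fix ideas, a single step of dot-product attention for a seq2seq decoder could be sketched as follows (the function name and tensor shapes are illustrative assumptions; the model built in class may differ):

```python
import torch
import torch.nn.functional as F

def dot_product_attention(query, keys, values):
    """One attention step over encoder states (illustrative only).

    query:  (batch, d)      current decoder state
    keys:   (batch, T, d)   encoder states
    values: (batch, T, d)   encoder states (often identical to keys)
    """
    scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)        # (batch, T)
    weights = F.softmax(scores / keys.size(-1) ** 0.5, dim=1)      # attention weights
    context = torch.bmm(weights.unsqueeze(1), values).squeeze(1)   # (batch, d)
    return context, weights
```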
If time allows, we will also briefly describe models endowed with attention and memory: Neural Turing Machines and their variants.
The keynote lecture of this Part will address density estimation of representations: a powerful unsupervised technique for analyzing general datasets and, in particular, the representations learned by deep networks.
-
We have already used RNNs to generate text: this is an example of a generative model, specifically an autoregressive one.
In this Part we will introduce generative models, focusing in particular on a few families:
Autoregressive models: RNNs for generation
Latent variable models: Variational Autoencoders (VAE); a minimal sketch follows this list
Generative Adversarial Networks (GANs)
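As an example of the latent-variable family, a minimal VAE and its ELBO loss might be sketched as below (the layer sizes, the Bernoulli reconstruction term for inputs in [0, 1], and the class name VAE are assumptions for illustration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# A minimal VAE on flattened inputs; sizes are illustrative assumptions.
class VAE(nn.Module):
    def __init__(self, d_in=784, d_latent=16):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_latent)   # outputs mean and log-variance
        self.dec = nn.Linear(d_latent, d_in)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Negative ELBO = reconstruction term + KL divergence to the standard normal prior.
    rec = F.binary_cross_entropy_with_logits(x_hat, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl
```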
We will explain the last of these families, module by module, with a full review of the original paper; we will then implement it in PyTorch as a guided exercise and make it work on a simple problem.
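Assuming the model in question is the GAN (the last family listed above), one training step might be sketched as follows (G, D, their optimizers and z_dim are hypothetical and assumed to be defined elsewhere, with D returning one logit per sample):

```python
import torch
import torch.nn.functional as F

def gan_step(G, D, opt_G, opt_D, real, z_dim=64):
    batch = real.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Discriminator step: push D(real) -> 1 and D(G(z)) -> 0.
    fake = G(torch.randn(batch, z_dim))
    d_loss = (F.binary_cross_entropy_with_logits(D(real), ones) +
              F.binary_cross_entropy_with_logits(D(fake.detach()), zeros))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # Generator step: push D(G(z)) -> 1 (the non-saturating objective).
    g_loss = F.binary_cross_entropy_with_logits(D(fake), ones)
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
    return d_loss.item(), g_loss.item()
```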
This will prepare us for the final guest lecture, in which modern applications of these models to sequence-to-sequence problems will be illustrated.