Section outline

  • Part I (roughly the first month) is devoted to the foundations of Deep Learning.

    We will start from the basics, covering mainly:


    • Fully connected architectures

    • The mechanics of training

    • Regularization

    • Optimization


    We will also develop practical skills that will allow you to deepen your knowledge of fully connected (FC) networks and, furthermore, to:


    • Monitor learning dynamics: accuracy, loss, parameters and their gradients

    • Create custom layers and loss functions

    • Modify learning rules, for example by:

      • introducing regularization

      • explicitly constraining the dynamics to particular regions of the parameter space, e.g. low-dimensional hyperplanes

      • masking a chosen subset of parameters

    • Extract representations in hidden layers for further study

    • Study basic aspects of these representations, including their PCA decomposition and the linear decoding of latent features (see the sketch after this list)
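
    As an illustration of the last two points, here is a minimal sketch (assuming a toy fully connected model and dummy data, neither taken from the course material) of how hidden-layer activations could be collected with a forward hook and inspected through a PCA decomposition:

      import torch
      import torch.nn as nn

      # Placeholder two-layer fully connected network; the course models may differ.
      model = nn.Sequential(
          nn.Linear(784, 256), nn.ReLU(),
          nn.Linear(256, 10),
      )

      hidden = []  # activations of the first hidden layer will be stored here

      def save_hidden(module, inputs, output):
          # Forward hook: called right after the hooked module's forward pass.
          hidden.append(output.detach())

      handle = model[1].register_forward_hook(save_hidden)  # hook the ReLU output

      x = torch.randn(512, 784)      # dummy batch standing in for real data
      _ = model(x)
      handle.remove()

      reps = torch.cat(hidden)       # (512, 256) matrix of hidden representations
      # PCA via randomized low-rank SVD; pca_lowrank centers the data by default.
      U, S, V = torch.pca_lowrank(reps, q=20)
      print(S**2 / (S**2).sum())     # relative variance among the leading 20 directions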



    We will conclude the first Part with a guided exercise on network pruning by masking.
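
    As a preview of that exercise, here is a minimal sketch of one possible masking-based recipe (magnitude pruning of a single placeholder layer); the actual guided exercise may follow a different procedure:

      import torch
      import torch.nn as nn

      layer = nn.Linear(256, 256)    # placeholder layer to be pruned

      # Binary mask that keeps only the 50% largest-magnitude weights.
      with torch.no_grad():
          threshold = layer.weight.abs().median()
          mask = (layer.weight.abs() >= threshold).float()
          layer.weight.mul_(mask)    # zero out the pruned weights

      # Keep pruned weights at zero during training by masking their gradients.
      layer.weight.register_hook(lambda grad: grad * mask)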


    Please refer to MS Teams for the recordings of the lectures.

  • While the first Part was devoted to the mechanics of Deep Learning, this second Part will focus on two architecture families that incorporate useful inductive biases in specific domains:

    • CNNs for images

    • RNNs for sequences

    We will review the basics of these architecture families and develop the practical skills needed to implement a few selected models in PyTorch.
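
    To give a flavour of what "coding a model in PyTorch" means, here is a minimal convolutional classifier sketch; the architecture and input size are illustrative placeholders, not the models selected for the course:

      import torch
      import torch.nn as nn

      class SmallCNN(nn.Module):
          """Tiny convolutional classifier for 28x28 grayscale images (placeholder)."""
          def __init__(self, num_classes=10):
              super().__init__()
              self.features = nn.Sequential(
                  nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                  nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
              )
              self.classifier = nn.Linear(32 * 7 * 7, num_classes)

          def forward(self, x):
              x = self.features(x)              # (N, 32, 7, 7)
              return self.classifier(x.flatten(1))

      logits = SmallCNN()(torch.randn(8, 1, 28, 28))  # dummy batch of 8 images
      print(logits.shape)                             # torch.Size([8, 10])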

    Only with this practice will students be able to work independently on deep learning problems.

    Towards the end of this Part we will start looking at several aspects of “attention” in deep networks, which will culminate in the implementation of a simple sequence-to-sequence model with explicit attention.
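
    To fix ideas, here is a minimal sketch of an explicit attention step (additive, Bahdanau-style) as it might appear inside such a sequence-to-sequence model; the class name and dimensions are illustrative only:

      import torch
      import torch.nn as nn

      class AdditiveAttention(nn.Module):
          """Scores each encoder state against the current decoder state (sketch)."""
          def __init__(self, hidden_dim):
              super().__init__()
              self.W_enc = nn.Linear(hidden_dim, hidden_dim, bias=False)
              self.W_dec = nn.Linear(hidden_dim, hidden_dim, bias=False)
              self.v = nn.Linear(hidden_dim, 1, bias=False)

          def forward(self, decoder_state, encoder_states):
              # decoder_state: (batch, hidden); encoder_states: (batch, seq_len, hidden)
              scores = self.v(torch.tanh(
                  self.W_enc(encoder_states) + self.W_dec(decoder_state).unsqueeze(1)
              )).squeeze(-1)                           # (batch, seq_len)
              weights = torch.softmax(scores, dim=-1)  # attention distribution over inputs
              context = (weights.unsqueeze(-1) * encoder_states).sum(dim=1)
              return context, weights                  # context: (batch, hidden)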

    If we have time, we will also briefly describe models endowed with attention and memory: Neural Turing Machines and their variations.

    The Keynote lecture of this Part will address density estimation of representations: a powerful unsupervised technique for analyzing general datasets and, in particular, the representations learned by deep networks.


  • We have already used RNNs to generate text: this is an example of a generative model, specifically an autoregressive one.

    In this Part we will introduce generative models, focusing in particular on the following families:

    • Autoregressive models: RNNs for generation

    • Latent variable models: Variational Autoencoders (VAEs) (see the sketch after this list)

    • Generative Adversarial Networks (GANs)
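
    As a small preview of the latent-variable family, here is a sketch of two ingredients a VAE implementation typically needs, the reparameterization trick and an ELBO-style loss; the function names and the MSE reconstruction term are illustrative choices, not the course's reference implementation:

      import torch
      import torch.nn.functional as F

      def reparameterize(mu, logvar):
          # Sample z = mu + sigma * eps so that gradients flow through mu and logvar.
          std = torch.exp(0.5 * logvar)
          return mu + std * torch.randn_like(std)

      def vae_loss(x, x_recon, mu, logvar):
          # Negative ELBO: reconstruction error plus KL divergence to a standard normal.
          recon = F.mse_loss(x_recon, x, reduction='sum')
          kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
          return recon + kl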

    In the second half of this Part we will shift our attention to the Transformer, a fully attentional model, given its importance for modern applications.

    We will explain its modules in detail through a full review of the original paper, implement it in PyTorch as a guided exercise, and apply it to a simple problem.
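
    For orientation, here is a minimal sketch of the scaled dot-product attention at the core of the Transformer, following the formula in the original paper; this is not the guided-exercise implementation itself:

      import math
      import torch

      def attention(Q, K, V, mask=None):
          # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, written from scratch.
          d_k = Q.size(-1)
          scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)   # (..., len_q, len_k)
          if mask is not None:
              scores = scores.masked_fill(mask == 0, float('-inf'))
          weights = torch.softmax(scores, dim=-1)
          return weights @ V, weights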

    This guided exercise will prepare us for the final guest lecture, in which modern applications of these models to sequence-to-sequence problems will be illustrated.