Course: 481SM - INFORMATION RETRIEVAL AND DATA VISUALIZATION 2021

Introduction
- Announcements (Forum)
Information Retrieval
The Information Retrieval part of the course will be taught through in-person lectures, which will be recorded.
There will be two lectures each week:
Monday, from 11:00 to 13:00. Aula Morin, H2bis building.
Thursday, ~~from 10:00 to 12:00~~ from 9:00 to 11:00. Room 4C, H2bis building.

The first lecture will be on October 4, 2021.
The access code for the MS Teams team of the course is: t0i8w8d
Exam
The exam for the Information Retrieval part of the course consists of a project plus an oral exam.
The exam dates for December and January are:
December 21, 2021. From 9:30 am in room 5B, H2bis building.
January 14, 2022. From 9:30 am in room ~~5B~~ 4D, H2bis building. Check the MS Teams of the course for the schedule!
Please submit your project (code or report) by email at least 5 days before the exam.
If you need to take the exam remotely, please tell me when you submit your project, so that I can update the schedule.
If you need to take the exam earlier for scheduling reasons (e.g., in the previous week), please write to me so that, if enough people ask for it, I can set an additional date in December.
After January the exam will be by appointment.
- Projects (PDF)
- Doodle for the selection of the project (URL)
Lecture 01
- Slides (PDF)
Lecture 02
- Slides (PDF)
- Boolean Retrieval System (Notebook) (file)
Lecture 03
- Slides (PDF)
Lecture 04
- Slides (PDF)
Lecture 05
- Slides (PDF)
- Boolean Retrieval System with spelling correction (Notebook) [to fill] (file)
- Boolean Retrieval System with spelling correction (Notebook) (file)
Lecture 06
- Slides (PDF)
Lecture 07
- Slides (PDF)
- Vector space model (Notebook) (IPYNB)
- Dataset (ALL)
Lecture 08
- Slides (PDF)
Lecture 09
- Slides (PDF)
Lecture 10
- Slides (PDF)
Lecture 11
- Slides (PDF)
Lecture 12
- PageRank (Notebook) (IPYNB)
- PageRank (Json data) (JSON)
Data Visualization
The lectures on Data Visualization will start on Wednesday, November 17. The lectures will take place in building H2bis, room 4C for three weeks as follows:
- on Wednesdays from 14:00 to 17:00
- on Thursdays from 9:00 to 13:00
The lectures will be recorded and available on Teams, but I encourage attendance in person whenever possible.
Note that the lessons on November 17 and 18 will be held online.
- Data sources (PDF)
  
  A collection of various freely available data sources
Lecture 1
- Introduction (PDF)
  
  Introduction to the course
- Foundations (PDF)
  
  Foundations of data visualization: definition, motivation and historical visualizations. The purposes of data visualization. The three principles of good visualization design: trustworthiness, accessibility and elegance. Differences between truth and trust and various ways to lie with visualizations. Guidelines for creating accessible visualizations. The importance of data-ink ratio and the reasons against uncommon charts. Inspiration for elegant visualizations.
- Data abstraction (PDF)
  
  Data abstraction: motivation, types of datasets, attribute types and semantics.
- Task abstraction (PDF)
  
  Task abstraction: motivation, goals and tasks, actions and targets, examples of efficient designs for particular visual tasks
Lecture 2
- Visual perception (PDF)
  
  How vision works, the importance of attention and its implications for design. How memory works and its implications for presentations. Visual encoding and the expressiveness of the visual channels. Channel accuracy and its implications for visualization design. Channel discriminability, salience and separability, and their implications for visualization design. The Gestalt laws of grouping (proximity, similarity, connection, enclosure, closure and figure/ground), their hierarchy and their use in data visualization. How to obtain visual order.
- First assignment (PDF)
  
  List issues with the given visualization and improve it
- First assignment (data) (CSV)
Lecture 3
- First assignment (results) (PDF)
  
  The results of the first assignment
- Required Python libraries (PDF)
  
  An installation guide for the Python libraries required for the hands-on lesson
  Note that these instructions have been updated because the original ones did not work for some students.
- Test Plotly (IPYNB)
  
  A test of the Plotly Python library
- Test Dash (IPYNB)
  
  A test of the Dash Python library
- Color (PDF)
  
  The importance of color. Color perception: the anatomy and physiology of the human eye, the trichromatic theory and the opponent processing theory. Color specification with the color spaces RGB, HSL, CIE Lab and HCL. Demonstrations using various color pickers. Intuitiveness and perceptual uniformity of each space. Use of color. Sequential color maps. Issues with the rainbow color map. Alternative sequential color maps (including cubehelix). Diverging and categorical color maps. The desired properties of univariate color maps. Bivariate color maps. Using established color maps (colorbrewer) or constructing new ones. The semantics of color. Considerations for colorblind people. The importance of size and contrast. The importance of background and surrounding colors. Advice on choosing colors.
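The remark on perceptual uniformity can be illustrated with a minimal sketch that uses only Python's standard `colorsys` module. It is not taken from the slides: the two sample colors are chosen for illustration, and the sRGB relative-luminance formula is assumed here as a rough proxy for perceived brightness. Pure yellow and pure blue get the same HSL lightness, yet their luminance differs by more than a factor of ten, which suggests that HSL is not perceptually uniform.

```python
import colorsys


def srgb_to_linear(c):
    """Undo the sRGB gamma encoding for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(r, g, b):
    """Relative luminance with the sRGB / Rec. 709 weights.

    Assumption: used here only as a rough proxy for perceived brightness.
    """
    rl, gl, bl = (srgb_to_linear(c) for c in (r, g, b))
    return 0.2126 * rl + 0.7152 * gl + 0.0722 * bl


# Illustrative sample colors (RGB components in [0, 1]).
samples = {"yellow": (1.0, 1.0, 0.0), "blue": (0.0, 0.0, 1.0)}

for name, rgb in samples.items():
    # colorsys returns the components in H, L, S order.
    h, l, s = colorsys.rgb_to_hls(*rgb)
    lum = relative_luminance(*rgb)
    print(f"{name}: HSL lightness = {l:.2f}, relative luminance = {lum:.3f}")
# yellow: HSL lightness = 0.50, relative luminance = 0.928
# blue: HSL lightness = 0.50, relative luminance = 0.072
```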
Lecture 4
- Visualization design (PDF)
  
  The seven steps of visualization design. Different ways to acquire data. The importance of parsing and filtering data. Mining data for exploratory data analysis. Choosing the right representation for the given data and the given task. Online repositories of charts for various purposes. Refining the visualization and supporting interactivity.
  The dos and don’ts of basic charts (line charts, bar charts). Stacked bar charts and pie charts. Visualizing geographical data with dot distribution maps and choropleth maps. Visualizing geographical data with tile maps. Visualizing networks and trees with node-link diagrams and adjacency matrices. Visualizing multidimensional data with Chernoff faces, bubble plots, the scatter plot matrix, parallel coordinates, radar charts, radial histograms, small multiples and horizon charts. Using principal component analysis and multidimensional scaling for visual exploratory data analysis.
  
  Visualizing uncertain and missing data. Advantages and disadvantages of interactivity. Using interactivity for data adjustments (framing, navigating, animating, sequencing and contributing) and presentation adjustments (focusing, annotating and orientating). Examples of interaction, animation and storytelling. Examples of available visualization tools (D3, Observable, Tableau and Processing).
- Second assignment (PDF)
  
  Instructions for the second assignment
- Second assignment (data) (CSV)
- Exam (PDF)
  
  Information about the exam (in project form)
Lecture 5
- Second assignment (results) (PDF)
  
  The results of the second assignment
- Examples of (un)trustworthy visualizations (PDF)
  
  Visualizations that lie by using dubious data, such as unrepresentative data and missing data. Using non-comparable data in comparisons. Using absolute instead of cumulative data (and vice versa). Using absolute instead of relative data on maps. Examples of ignoring conventions (unequal intervals, pie charts that do not add up to 100%) and abusing scales (bar charts with truncated axis, aspect ratio bias, dual axes, improper scaling of areas and pictograms). Examples of misrepresenting data by using unnecessary 3-D visualizations. Examples of improper categorization and oversimplification. Examples of cherry-picking data in order to hide (unfavorable) data or conceal existing patterns. Examples of visualizations suggesting patterns that are not there. Examples of misrepresenting or concealing uncertainty. Examples of erroneous interpretation of visualizations due to confirmation bias.
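The effect of a truncated axis on a bar chart can be quantified with a small sketch. It uses Tufte's "lie factor" (the relative change shown in the graphic divided by the relative change in the data), which is an assumption here: the slides may use a different measure or none at all. The function names and the bar values 50 and 55 are hypothetical.

```python
def bar_height(value, axis_min):
    """Drawn height of a bar when the value axis starts at axis_min."""
    return value - axis_min


def lie_factor(v1, v2, axis_min):
    """Relative change shown by the bars divided by the relative change in the data."""
    h1 = bar_height(v1, axis_min)
    h2 = bar_height(v2, axis_min)
    shown = (h2 - h1) / h1
    actual = (v2 - v1) / v1
    return shown / actual


# Hypothetical data: two bars with values 50 and 55 (a 10% increase).
honest = lie_factor(50, 55, axis_min=0)      # axis starts at zero
truncated = lie_factor(50, 55, axis_min=45)  # axis truncated at 45

print(f"axis from 0:  lie factor = {honest:.1f}")
print(f"axis from 45: lie factor = {truncated:.1f}")
```

With the axis starting at zero the bars show the same 10% change as the data (lie factor 1). With the axis starting at 45 the second bar is drawn twice as tall as the first, so the chart exaggerates the change tenfold.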
Lecture 6
- Example of accessible visualizations (PDF)
  
  Example of accessible visualizations: redesign of diversity of aging, plots in Excel, slope graphs, connected scatter plots, smoothed line charts, the importance of notations. Guidelines for creating accessible visualizations.
- Gapminder-data (CSV)
  
  Entire Gapminder data
- Gapminder-info (CSV)
  
  Additional information about the Gapminder data
- Gapminder-mini (CSV)
  
  A small snippet of the Gapminder data containing only the data for Italy and South Africa between 2016 and 2020
- Plotly library basics (IPYNB)
  
  A Jupyter notebook that helps explain how the Plotly library constructs figures
- Gapminder (IPYNB)
  
  A Jupyter notebook visualizing Gapminder data in an interactive way (using the Plotly and Dash libraries):
  
  - Creating a basic scatter plot and styling it to match the Gapminder style
  - Adding animation to the Gapminder bubble chart and choosing chart axes with Dash dropdowns