481SM - INFORMATION RETRIEVAL AND DATA VISUALIZATION 2022
-
The Data Visualization part of the course will take place in October 2022 (four weeks, two days per week):
- on Thursdays from 14:00 to 17:00, room 5C (not 5B as erroneously noted on the schedule) in the H2bis building
- on Fridays from 9:00 to 12:00, room 5A in the H2bis building
The first lecture is on October 6. Lectures will be recorded and made available on Teams, but I encourage attending in person whenever possible.
Join the Teams channel with the code cibx2mz.
-
A collection of various freely available data sources
-
Introduction to the course
-
Foundations (PDF file)
Foundations of data visualization: definition, motivation and historical visualizations. The purposes of data visualization. The three principles of good visualization design: trustworthiness, accessibility and elegance. Differences between truth and trust and various ways to lie with visualizations. Guidelines for creating accessible visualizations. The importance of data-ink ratio and the reasons against uncommon charts. Inspiration for elegant visualizations.
-
Data abstraction: motivation, types of datasets, attribute types and semantics.
-
Task abstraction: motivation, goals and tasks, actions and targets, examples of efficient designs for particular visual tasks.
-
How vision works, the importance of attention and its implications for design. How memory works and its implications for presentations. The visual encoding and the expressiveness of the visual channels. Channel accuracy and its implications for visualization design. Channel discriminability, salience and separability, and their implications for visualization design. The Gestalt laws of grouping (proximity, similarity, connection, enclosure, closure and figure/ground), their hierarchy and their use in data visualization. How to obtain visual order.
-
List issues with the given visualization and improve it
-
The results of the first assignment
-
Color (PDF file)
The importance of color. Color perception: the anatomy and physiology of the human eye, the trichromatic theory and the opponent processing theory. Color specification with the RGB, HSL, CIE Lab and HCL color spaces. Demonstrations using various color pickers. Intuitiveness and perceptual uniformity of each space. Use of color. Sequential color maps. Issues with the rainbow color map. Alternative sequential color maps (including cubehelix). Diverging and categorical color maps. The desired properties of univariate color maps. Bivariate color maps. Using established color maps (colorbrewer) or constructing new ones. The semantics of color. Considerations for colorblind people. The importance of size and contrast. The importance of background and surrounding colors. Advice on choosing colors.
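As a small illustration of the difference between an intuitive color space and a perceptually uniform one, the sketch below converts two RGB colors to HSL using Python's standard `colorsys` module. It is not taken from the lecture material; the helper name `rgb255_to_hsl` is made up for this example. Pure yellow and pure blue both come out with a lightness of 50%, even though yellow looks much brighter to the eye, which is one way to see that HSL lightness is not perceptually uniform.

```python
import colorsys


def rgb255_to_hsl(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation %, lightness %).

    Hypothetical helper for illustration only.
    """
    # colorsys works on floats in [0, 1] and returns (h, l, s), in that order.
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return round(h * 360), round(s * 100), round(l * 100)


pure_yellow = rgb255_to_hsl(255, 255, 0)
pure_blue = rgb255_to_hsl(0, 0, 255)

# Both colors report the same HSL lightness despite looking very different.
print("yellow:", pure_yellow)
print("blue:  ", pure_blue)
```

Spaces such as CIE Lab and HCL are designed so that equal numeric steps correspond more closely to equal perceived steps; converting to them requires a third-party library, so it is not shown here.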
-
Instructions for the second assignment
-
The seven steps of visualization design. Different ways to acquire data. The importance of parsing and filtering data. Mining data for exploratory data analysis. Choosing the right representation for the given data and the given task. Online repositories of charts for various purposes. Refining the visualization and supporting interactivity.
The dos and don’ts of basic charts (line charts, bar charts). Stacked bar charts and pie charts. Visualizing geographical data with dot distribution maps and choropleth maps. Visualizing geographical data with tile maps. Visualizing networks and trees with node-link diagrams and adjacency matrices. Visualizing multidimensional data with Chernoff faces, bubble plots, the scatter plot matrix, parallel coordinates, radar charts, radial histograms, small multiples and horizon charts. Using principal component analysis and multidimensional scaling for visual exploratory data analysis.
Visualizing uncertain data. Visualizing missing data. Advantages and disadvantages of interactivity. Using interactivity for data adjustments (framing, navigating, animating, sequencing and contributing). Using interactivity for presentation adjustments (focusing, annotating and orientating). Examples of interaction, animation and storytelling. A look at available visualization tools, in particular D3, RAW Graphs, Observable, Tableau and Processing.
-
The results of the second assignment
-
Use visualization to answer two questions
-
An installation guide for the Python libraries required for the hands-on lesson
-
Test Plotly (IPYNB file)
A notebook with a simple test of the Plotly library
-
Test Dash (IPYNB file)
A notebook with a simple test of the Dash library
-
Visualizations that lie by using dubious data, such as unrepresentative data and missing data. Using non-comparable data in comparisons. Using absolute instead of cumulative data (and vice versa). Using absolute instead of relative data on maps. Examples of ignoring conventions (unequal intervals, pie charts that do not add up to 100%) and abusing scales (bar charts with a truncated axis, aspect ratio bias, dual axes, improper scaling of areas and pictograms). Examples of misrepresenting data by using unnecessary 3-D visualizations. Examples of improper categorization and oversimplification. Examples of cherry-picking data in order to hide (unfavorable) data or conceal existing patterns. Examples of visualizations suggesting patterns that are not there. Examples of misrepresenting or concealing uncertainty. Examples of erroneous interpretation of visualizations due to confirmation bias.
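Two of the scale abuses listed above come down to simple arithmetic, sketched below. The function names and the example values (52 and 50, with an axis starting at 48) are invented for illustration and do not come from the lecture slides. The first function shows how a truncated axis inflates the apparent ratio between two bars; the second shows how to scale a circle or pictogram so that its area, rather than its radius, is proportional to the value.

```python
import math


def drawn_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar lengths when the value axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)


def radius_for_value(value, reference_value, reference_radius=1.0):
    """Radius that makes a circle's AREA proportional to `value`.

    Scaling the radius linearly with the value would exaggerate differences,
    because area grows with the square of the radius.
    """
    return reference_radius * math.sqrt(value / reference_value)


# Hypothetical values: 52 versus 50.
honest = drawn_ratio(52.0, 50.0)                    # axis starts at 0
truncated = drawn_ratio(52.0, 50.0, baseline=48.0)  # axis starts at 48

print(f"ratio with zero baseline:  {honest:.2f}")
print(f"ratio with truncated axis: {truncated:.2f}")

# A value four times larger needs only twice the radius.
print(f"radius for a 4x value: {radius_for_value(4.0, 1.0):.1f}")
```

With a zero baseline the bars differ by 4%; with the axis starting at 48, one bar is drawn twice as long as the other, although the data are unchanged.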
-
The results of the third assignment
-
Redesign of diversity of aging, plots in Excel, slope graphs, connected scatter plots, smoothed line charts, the importance of notations. Adding alt text to plots. Guidelines for creating accessible visualizations.
-
Exam (PDF file)
Information about the exam (group projects)
-
Entire Gapminder data
-
Additional information about the Gapminder data
-
A small snippet of the Gapminder data containing only the data for Italy and South Africa between 2016 and 2020
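A snippet like this one could be cut from a larger table with a simple filter, sketched below using only the Python standard library. The column names `country` and `year` and the inline sample rows are assumptions made for this example; the actual Gapminder files may use a different layout, and the sample contains no real indicator values.

```python
import csv
import io

# Placeholder table with assumed column names; not real Gapminder content.
SAMPLE = """country,year
Italy,2015
Italy,2016
South Africa,2020
South Africa,2021
France,2018
"""


def select_rows(rows, countries, first_year, last_year):
    """Keep rows for the given countries within [first_year, last_year]."""
    return [
        row
        for row in rows
        if row["country"] in countries
        and first_year <= int(row["year"]) <= last_year
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
snippet = select_rows(rows, {"Italy", "South Africa"}, 2016, 2020)

for row in snippet:
    print(row["country"], row["year"])
```

To apply the same filter to a file on disk, the `io.StringIO(SAMPLE)` argument would be replaced by a file object opened with `open(path, newline="")`.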
-
A Jupyter notebook that can help understand how the Plotly library constructs figures
-
Gapminder notebook (IPYNB file)
A Jupyter notebook visualizing Gapminder data using the Plotly and Dash libraries:
- A basic bubble chart styled to match the Gapminder style
- An animated bubble chart showing data from different years
- An interactive bubble chart where the data shown on each axis is chosen using Dash dropdowns
- An interactive bubble chart where data about the selected points is shown in a text area
-
A Jupyter notebook with information on how to use templates and colors in Plotly
-
Exam dates
- December 21, 2022, from 11:00 in room 4D, H2bis building
- January 20, 2023, from 10:00 in room 4D, H2bis building
- February 20, 2023, from 10:00 in room 4D, H2bis building
Please submit your project by email at least 3 to 4 days before the exam. After February, all exams will be by appointment.
Note: the datasets can be downloaded only via the link on the slides or via Teams (they are too large for the 20 MB maximum allowed by Moodle).
-
Lecture 4 - code (IPYNB file)
-
Projects (PDF file)