481SM - INFORMATION RETRIEVAL AND DATA VISUALIZATION 2023
-
The Data Visualization part of the course consists of six 4-hour lectures, held on Tuesdays from 14:00 to 18:00 in room 5C of the H2bis building.
The first lecture takes place on the 26th of September (we start at 14:00 sharp) and the last on the 7th of November; there is no lecture on the 24th of October. The lectures will be recorded and made available on Teams, but I encourage you to attend in person whenever possible.
You can join the Teams channel with the code d1cjj0u.
-
A collection of various freely available data sources
-
Challenges (PDF)
Information about various data visualization challenges
-
-
-
Introduction to the course
-
Foundations (PDF)
Foundations of data visualization: definition, motivation and historical visualizations. The purposes of data visualization. The three principles of good visualization design: trustworthiness, accessibility and elegance. Differences between truth and trust and various ways to lie with visualizations. Guidelines for creating accessible visualizations. The importance of data-ink ratio and the reasons against uncommon charts. Definition of elegance and resources to inspire elegant visualizations.
-
Data abstraction: motivation, types of datasets, attribute types and semantics.
-
Task abstraction: motivation, goals and tasks, actions and targets, examples of efficient designs for particular visual tasks.
-
First assignment: analyze two visualizations about the peak time of day for sports and leisure.
-
-
-
Results of the first assignment
-
Second assignment: make your own visualization
-
Data for the second assignment
-
Visual perception: introduction. Motivation to study visual perception. How vision works. The importance of attention and its implications for design. How memory works and its implications for presentations. The visual encoding and the expressiveness of the visual channels. Channel accuracy and its implications for visualization design. Channel discriminability, salience and separability, and their implications for visualization design. The Gestalt laws of grouping (proximity, similarity, connection, enclosure, closure and figure/ground), their hierarchy and their use in data visualization. How to obtain visual order.
-
-
-
Color (PDF)
Color: the importance of color. Color perception: the anatomy and physiology of the human eye, the trichromatic theory and the color opponent processing theory. Color specification with RGB, HSL, CIE Lab and HCL color spaces. Demonstrations using various color pickers. Intuitiveness and perceptual uniformity of each space. Use of color. The desired properties and examples of sequential color maps. Issues with the rainbow color map, alternative sequential color maps. The desired properties and examples of diverging, cyclic and categorical color maps. Bivariate color maps. Using established color maps (ColorBrewer) and constructing new ones. The semantics of color. Considerations for colorblind people. The importance of size. Relative perception, importance of contrast, background and surrounding colors. Advice on choosing colors.
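As a small illustration of why an intuitive color space is not necessarily a perceptually uniform one, the sketch below compares two fully saturated colors using only Python's standard `colorsys` module. The helper names `hsl_of` and `relative_luminance` are made up for this example and are not part of the course material; the luminance weights are the standard sRGB ones.

```python
import colorsys


def hsl_of(rgb):
    """Return (hue in degrees, saturation, lightness) for an sRGB triple in [0, 1]."""
    # colorsys orders its result as hue, lightness, saturation (HLS).
    hue, lightness, saturation = colorsys.rgb_to_hls(*rgb)
    return hue * 360.0, saturation, lightness


def relative_luminance(rgb):
    """Relative luminance of an sRGB triple in [0, 1] (0 = black, 1 = white)."""
    def linearize(channel):
        # Undo the sRGB transfer curve before weighting the channels.
        if channel <= 0.04045:
            return channel / 12.92
        return ((channel + 0.055) / 1.055) ** 2.4

    red, green, blue = (linearize(channel) for channel in rgb)
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


YELLOW = (1.0, 1.0, 0.0)
BLUE = (0.0, 0.0, 1.0)

for name, rgb in (("yellow", YELLOW), ("blue", BLUE)):
    hue, saturation, lightness = hsl_of(rgb)
    print(f"{name}: hue={hue:.0f}, HSL lightness={lightness:.2f}, "
          f"relative luminance={relative_luminance(rgb):.4f}")
```

Both colors have an HSL lightness of 0.5, yet their relative luminance differs by more than a factor of ten. This is one reason to prefer spaces such as CIE Lab or HCL when lightness has to encode data.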
-
The results of the second assignment
-
Third assignment: list issues with the given visualization and improve it
-
Data for the third assignment
-
-
-
Updated results of the second assignment
-
Results of the third assignment
-
Visualization design: the dos and don’ts of basic charts (line charts, bar charts). Stacked bar charts and pie charts. Visualizing geographical data with dot distribution maps and choropleth maps. Visualizing geographical data with tile maps. Visualizing networks and trees with node-link diagrams and adjacency matrices. Visualizing multidimensional data with Chernoff faces, bubble plots, the scatter plot matrix, parallel coordinates, radar charts, radial histograms, small multiples and horizon charts. Using principal component analysis and multidimensional scaling for visual exploratory data analysis. Visualizing uncertain data. Visualizing missing data. Advantages and disadvantages of interactivity. Using interactivity for data adjustments (framing, navigating, animating, sequencing and contributing). Using interactivity for presentation adjustments (focusing, annotating and orientating). Storytelling definition and usage. A look at available visualization tools, in particular D3, RAW Graphs, Observable, Tableau and Processing.
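Node-link diagrams and adjacency matrices are two views of the same network data. The following minimal sketch, assuming an undirected graph given as an edge list, builds the matrix that an adjacency-matrix view would draw. The toy network and the `adjacency_matrix` helper are hypothetical and only serve as an illustration.

```python
def adjacency_matrix(edges):
    """Return (sorted node labels, symmetric 0/1 matrix) for an undirected edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    index = {node: position for position, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
        matrix[index[target]][index[source]] = 1  # undirected: mirror the entry
    return nodes, matrix


# Hypothetical toy network, not taken from the course data.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
NODES, MATRIX = adjacency_matrix(EDGES)

# Print one labelled row per node.
for label, row in zip(NODES, MATRIX):
    print(label, " ".join(str(cell) for cell in row))
```

Each filled cell corresponds to one link of the node-link diagram, and the matrix is symmetric because the links have no direction.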
-
Fourth assignment: visualize survey results
-
-
-
Results of the fourth assignment
-
Fifth assignment: create a single visualization for the given data
-
The data for the fifth assignment
-
Information about the required Python libraries for the hands-on lessons
-
Examples of (un)trustworthy visualizations. Visualizations that lie by using dubious data, such as unrepresentative data and missing data. Using non-comparable data in comparisons. Using absolute instead of cumulative data (and vice versa). Using absolute instead of relative data on maps. Examples of ignoring conventions (unequal intervals, pie charts that do not add up to 100%) and abusing scales (bar charts with truncated axes, aspect ratio bias, dual axes, improper scaling of areas and pictograms). Examples of misrepresenting data by using unnecessary 3-D visualizations. Examples of improper categorization and oversimplification. Examples of cherry-picking data in order to hide (unfavorable) data or conceal existing patterns. Examples of visualizations suggesting patterns that are not there. Examples of misrepresenting or concealing uncertainty. Examples of erroneous interpretation of visualizations due to confirmation bias.
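To make the effect of a truncated bar-chart axis concrete, the sketch below compares the ratio of two values with the ratio of the bar heights a reader actually sees once the axis no longer starts at zero. The values and the `drawn_height_ratio` helper are hypothetical, chosen only to illustrate the arithmetic.

```python
def drawn_height_ratio(value_a, value_b, baseline=0.0):
    """Ratio of the bar heights a reader sees when the axis starts at `baseline`."""
    if baseline >= min(value_a, value_b):
        raise ValueError("baseline must lie below both values")
    return (value_a - baseline) / (value_b - baseline)


# Hypothetical values chosen only to illustrate the effect.
VALUE_A, VALUE_B = 52.0, 50.0

honest = drawn_height_ratio(VALUE_A, VALUE_B)           # axis starts at zero
truncated = drawn_height_ratio(VALUE_A, VALUE_B, 48.0)  # axis starts at 48

print(f"zero baseline: bar A is {honest:.2f}x as tall as bar B")
print(f"baseline 48:   bar A is {truncated:.2f}x as tall as bar B")
```

With a zero baseline the bars differ by 4%, matching the data. Starting the axis at 48 makes one bar look twice as tall as the other.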
-
Examples of (in)accessible visualizations: redesign of diversity of aging, plots in Excel, smoothed line charts, slope graphs, connected scatter plots, the importance of notations. Adding alt text to plots. Guidelines for creating accessible visualizations.
-
Test Plotly (IPYNB)
A notebook with a simple test of the Plotly library
-
Test Dash (IPYNB)
A notebook with a simple test of the Dash library
-
-
-
Results of the fifth assignment
-
Exam (PDF)
Information about the exam
-
All the notebooks and data needed for the hands-on lecture in a single zipped file.
-
Entire Gapminder data
-
Additional information about the Gapminder data
-
A small snippet of the Gapminder data containing only the data for Italy and South Africa between 2016 and 2020
-
Plotly basics (IPYNB)
Plotly basics: the interfaces to figures of the Plotly library
-
Gapminder start (IPYNB)
The starting notebook for the hands-on lecture on recreating Gapminder using the Plotly and Dash libraries.
-
A snapshot of Gapminder to use as inspiration (needed by the Gapminder start notebook)
-
A notebook with some code snippets to help in the hands-on lecture
-
A notebook with a demonstration of how to use templates and colors in the Plotly library
-
A notebook with a demonstration of how to use styles and colors in the Matplotlib library
-
Gapminder final (IPYNB)
The final state of the notebook used for the hands-on lecture on recreating Gapminder using the Plotly and Dash libraries
-
-
-
Projects (PDF)