Course: 340SM - ALGORITHMIC DATA MINING 2023

Introduction
Dear students, please find below a SAMPLE FOR THE WRITTEN PART of the exam.
- Announcements (forum)
- Sample exam (PDF file)
Detailed program
Amortized analysis: aggregate analysis and accounting method. Chapters 17.1 and 17.2 of Cormen's book.
Red-black trees: definition and properties. Rotations and insertion procedure. Chapters 13.1, 13.2 and 13.3 of Cormen's book.
Hash tables: definition and properties of hash tables and hash functions. Collision resolution by chaining. Analysis of hashing with chaining. Dynamic hash tables and their analysis. Chapters 11.1, 11.2, 11.3 (for the latter, only the paragraph "What makes a good hash function?") and 17.4 of Cormen's book.
Exact pattern matching on strings: definition of string matching. The naive algorithm. The Knuth-Morris-Pratt algorithm and its analysis. The Boyer-Moore-Galil algorithm. Chapter 32 of Cormen's book, excluding 32.2 and 32.3; Chapter 2.2 of Gusfield's book (available in Teams under "Materiale del corso").
Multiple exact pattern matching: definition of suffix tree and suffix links. Representation of the branches of the suffix tree and implied time and space complexities. Use of the suffix tree for (multiple) exact pattern matching. Definition of suffix array and its use for exact pattern matching. Chapters 5, 6.1 (for the latter, only the definition of suffix links), 6.5, 7.1, 7.14 (until 7.14.4) of Gusfield's book; see also Ben Langmead's slides on tries, suffix trees and suffix arrays and the "survey_suffix_tree" paper.
Approximate pattern matching: definition of Hamming distance. Definition of the k-mismatch problem. Definition of longest common extensions. Solution to k-mismatch via longest common extensions. Definition of edit distance. Dynamic programming algorithm for computing the edit distance. Chapters 9.1, 9.4, 11.2 and 11.3 of Gusfield's book.
Frequent pattern mining: definition of support, maximal frequent itemset, and the frequent itemset mining problem. The support monotonicity and the downward closure property (aka Apriori property). The Apriori algorithm. Definition and use of the enumeration tree. TreeProjection and DepthProject methods. Chapters 4.1, 4.2, 4.3, 4.4.1, 4.4.2 (excluding 4.4.2.1), 4.4.3 (until p.110 excluding p.109) of C. Aggarwal, Data Mining - The Textbook.
Graph mining: definition of graph mining. Algorithm based on the enumeration tree. Chapter 11 of Mohammed J. Zaki, Wagner Meira, Jr., Data Mining and Machine Learning: Fundamental Concepts and Algorithms.
Social network analysis: properties of social networks. The betweenness measure. The community detection problem. The Girvan–Newman algorithm for community detection. Chapters 19.1, 19.2.1, 19.2.2, 19.2.3, 19.2.5.3, 19.3 (intro), 19.3.2 of C. Aggarwal, Data Mining - The Textbook.
Data stream model: The membership problem: Bloom filter and its analysis. The counting problem: Count-min sketch and its analysis. The cardinality problem: the k-bottom algorithm and its analysis. Finding similar items: definition of Jaccard distance between sets. K-shingling for text documents. Similarity-Preserving Summaries of Sets. Minhashing and Minhash signatures. Locality-sensitive hashing and its analysis. Chapters 12.2.2 (intro), 12.2.2.1, 12.2.2.2 of C. Aggarwal, Data Mining - The Textbook; Chapters 3.1.1, 3.2.1, 3.2.3, 3.3, 3.4 of J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets. See also Ben Langmead's slides.
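As a rough illustration of the membership problem listed in the program, here is a minimal Bloom filter sketch in Python. It is not taken from the course material: the class name `BloomFilter`, the helper `expected_false_positive_rate`, and the use of double hashing over a SHA-256 digest to simulate k hash functions are all assumptions made for this example. The false-positive estimate uses the standard approximation (1 - e^(-kn/m))^k for m bits, k hash functions and n inserted items.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter: m bits, k hash positions per item."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch readable (not space-optimal).
        self.bits = bytearray(num_bits)

    def _positions(self, item: str):
        # Assumption: derive k positions from two 64-bit values (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


def expected_false_positive_rate(num_bits: int, num_hashes: int, num_items: int) -> float:
    """Standard approximation (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


if __name__ == "__main__":
    bf = BloomFilter(num_bits=1000, num_hashes=7)
    for word in ["suffix", "tree", "array"]:
        bf.add(word)
    print("suffix" in bf)  # inserted items are always reported as present
    print(expected_false_positive_rate(1000, 7, 100))
```

A query can return a false positive but never a false negative, which is why the filter suits streams where a small, tunable error rate is acceptable in exchange for memory savings.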
Red-black trees, Hashing, Amortised analysis
- 07.1 Red-black-trees (KEY file)
- 07 Amortization Hashing (KEY file)
- Exercises lecture10 (PDF file)
Exact pattern matching
- 08 Exact Pattern Matching (PDF file)
- 09 Exact Pattern Matching2 (PDF file)
- Exercises lecture12 (PDF file)
- 10 SuffixTree (PDF file)
- Boyer-Moore (PDF file)
Frequent pattern mining
- 01-Frequent pattern mining (PDF file)
Filters and sketches
- 02 Bloom filters Langmead (PDF file)
- 05 Similarity Langmead (PDF file)
- 04 Cardinality Langmead (PDF file)
- 03 Countmin Langmead (PDF file)
- Ullman LSH (PDF file)
Exam papers
Dear students,
Please find below a set of papers from which you can pick the one to present at the exam. Once you have chosen a paper, please write its title and your name in the file Paper_choice.xlsx to reserve it for the date on which you intend to take the exam.
- 14.Streaming Pattern-Matching (PDF file)
- 16.FM Index (PDF file)
- 12.Exact and Approximate Pattern Matching in the Streaming Model (PDF file)
- 13.LCA revisited (PDF file)
- 15.Weight Ancestors Suffix Tree (PDF file)
- 11.Antiperiods (PDF file)
- 17.FM index revisited (PDF file)
- 34.Property matching (PDF file)
- 33.Suffix Trays (PDF file)
- affix array (PDF file)
- sparse suffix trees (PDF file)
- CDAWGS (PDF file)
- Suffix cactus (PDF file)
- 5.Set Sketching (PDF file)
- 2.Cuckoo Filter (PDF file)
- 3.Simplified Cuckoo Filter (PDF file)
- 7.Resizable arrays until page12 (PDF file)
- Myers 2014 6106 (PDF file)
- 20.Safely Filling Gaps (PDF file)
- 39.RNA assembly (PDF file)
- 38.WhatsHap (PDF file)
- 18.DP RNA folding (PDF file)
- 19.Omnitigs (PDF file)
- Counting quotient filter (PDF file)
- Streaming quotient filter (PDF file)
- Stream statistics sliding window (until page 13) (PDF file)
- Improving the Sensitivity of MinHash Through Hash-Value Analysis (PDF file)
- Count-min Sketch with Variable Number of Hash Functions: an Experimental Study (PDF file)