Topic outline
- General
- Slides
- Additional course materials
Weighted density, as defined in Wasserman and Faust (1994), Social Network Analysis: Methods and Applications.
v_k is the value of the k-th edge and g is the number of nodes in the network.
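The formula itself is not reproduced in the text above. As a reminder, and assuming the usual form for a valued graph (the sum of the edge values divided by the number of possible edges), it can be written as below. The symbols \Delta (the density) and L (the number of edges) are introduced here for illustration; the denominator depends on whether the network is treated as directed or undirected.

```latex
\Delta_{\text{directed}} = \frac{\sum_{k=1}^{L} v_k}{g(g-1)},
\qquad
\Delta_{\text{undirected}} = \frac{\sum_{k=1}^{L} v_k}{g(g-1)/2}
```

When every v_k equals 1, both expressions reduce to the ordinary density of a binary network.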
- Homeworks
Readings (see additional materials):
Barabási A.-L. (2012) The network takeover, Nature Physics, 8, pp. 14-16.
Brandes U., Robins G., McCranie A., Wasserman S. (2013) What is network science? Network Science, 1, pp. 1-15.
Please deliver a file with your comments on the two papers by the morning of Monday 14th.
2nd Assignment (by April 4th)
1) Reading: Kolaczyk E.D. (2009) Statistical Analysis of Network Data. Methods and Models, Springer, New York.
Section 2.1, pp.15-22 (see file)
2) Solve exercise 2.1, pp. 45-46 in the book by Kolaczyk (2009)
Please add a text with the solutions and some discussion. Here is the screenshot from the textbook:
Consider the following two datasets:
data(aidsblog) and data(yeast) (see the R Lab of April 28th)
According to the nature of the data:
1) compute and comment on some global network statistics (including size, number of edges, and diameter; see, for instance, the co-authorship example in the data collection slides)
2) if needed, decompose the network into distinct components
3) compute some centrality scores for the individual vertices
4) extract the ego-network of the most central vertex according to at least 2 centrality scores (e.g., degree and closeness), then compute and comment on the local clustering coefficient
5) count the cliques of size k (you choose k)
Please deliver an R script containing both your code and comments.
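As a reminder of what the quantities in points 1, 3 and 4 measure, here is a minimal sketch in plain Python. It is only an illustration of the definitions, not a substitute for the R/igraph code requested. It runs on a hypothetical five-vertex undirected toy graph (not one of the course datasets), and all function names are made up for this example. Returning 0 as the local clustering coefficient of a vertex with fewer than two neighbours is a convention assumed here; software packages may return a missing value instead.

```python
from collections import deque

# Hypothetical toy graph (undirected), NOT one of the course datasets.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]


def build_adjacency(edge_list):
    """Store an undirected graph as a dict: vertex -> set of neighbours."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter(adj):
    """Longest shortest path among reachable pairs of vertices."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


def closeness(adj, v):
    """(number of reached vertices) / (sum of distances); assumes a connected graph."""
    dist = bfs_distances(adj, v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


def ego_network(adj, v):
    """Edges of the subgraph induced by v and its neighbours."""
    nodes = {v} | adj[v]
    return {(u, w) for u in nodes for w in adj[u] if w in nodes and u < w}


def local_clustering(adj, v):
    """Share of pairs of neighbours of v that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention for vertices with fewer than 2 neighbours
    links = sum(1 for u in nbrs for w in nbrs if u < w and w in adj[u])
    return 2 * links / (k * (k - 1))


adj = build_adjacency(edges)
size = len(adj)                                        # number of vertices
n_edges = sum(len(n) for n in adj.values()) // 2       # number of edges
diam = diameter(adj)
top_degree = max(adj, key=lambda v: len(adj[v]))       # most central by degree
top_closeness = max(adj, key=lambda v: closeness(adj, v))
ego = ego_network(adj, top_degree)
cc = local_clustering(adj, top_degree)
```

In this toy graph the same vertex is the most central by both degree and closeness. On real data the two scores can disagree, which is why the exercise asks for at least two of them.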
Consider the fblog data (available in the sand library), a snapshot of the network of interactions among political blogs in France on a single day in October 2006. Produce an R file containing the code for the following tasks:
1) Produce a graphical representation of the fblog network, coloring the nodes according to political party affiliation.
2) Using the same layout as in point 1 (saved as coordinate vectors), produce a graphical representation of the network in which the size of each vertex is proportional to a given centrality measure. Add some comments in which you define and explain the chosen centrality measure.
3) Plot and comment on the degree distribution of the data.
4) Plot the network consisting only of the blogs belonging to the most represented political party (the most frequent in the dataset).
5) Choose 2 political parties (say P1 and P2) and plot the network of blogs belonging to the chosen parties, using different colors for intra-party links (P1 <--> P1 and P2 <--> P2) and inter-party links (P1 <--> P2).
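The logic behind points 3 to 5 (tabulating degrees, restricting the network to selected parties, and telling intra-party links from inter-party links) can be sketched as follows. This is a plain-Python illustration on made-up data: the blog names, the party labels and the function names are all hypothetical and do not come from the fblog dataset. In the R script the same steps would act on the vertex attributes of the graph object.

```python
from collections import Counter

# Hypothetical data: six blogs with made-up party labels and links.
party = {"b1": "P1", "b2": "P1", "b3": "P2", "b4": "P2", "b5": "P3", "b6": "P2"}
edges = [("b1", "b2"), ("b1", "b3"), ("b3", "b4"),
         ("b2", "b4"), ("b4", "b5"), ("b3", "b6")]


def degree_distribution(edge_list):
    """Map each degree value to the number of vertices having that degree.

    Vertices without any edge do not appear in edge_list and are not counted.
    """
    deg = Counter()
    for u, v in edge_list:
        deg[u] += 1
        deg[v] += 1
    return Counter(deg.values())


def party_subnetwork(edge_list, party_of, keep):
    """Keep only the edges whose two endpoints both belong to a party in keep."""
    return [(u, v) for u, v in edge_list
            if party_of[u] in keep and party_of[v] in keep]


def edge_colors(edge_list, party_of, intra="grey", inter="red"):
    """One color per edge: intra if both endpoints share a party, inter otherwise."""
    return [intra if party_of[u] == party_of[v] else inter for u, v in edge_list]


dist = degree_distribution(edges)
most_represented = Counter(party.values()).most_common(1)[0][0]
single_party = party_subnetwork(edges, party, {most_represented})
two_parties = party_subnetwork(edges, party, {"P1", "P2"})
colors = edge_colors(two_parties, party)
```

The vector of colors has one entry per retained edge, in the same order as the edge list, so it can be passed directly as an edge attribute when plotting.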
For this assignment it is not necessary to write a function. Just try to work it out using the instructions from the previous scripts.
1) Try to use some igraph functions to produce a graph like the one reproduced in the attached picture.
Using the graph from point 1, do the following exercises:
2) Choose the value k and use the pruning algorithm to detect the cliques of size k (see the slides community detection #1, from slide 10 on).
3) Perform a k-core decomposition and produce the hierarchical plot using the appropriate plotting function.
4) Obtain the MDS coordinates following the approach from the slides (see community detection #2, from slide 7 on).
5) Compute the initial value of the modularity if all nodes were placed in distinct communities.
6) Run the Louvain algorithm with the built-in igraph function.
7) Implement Newman-Girvan on the same graph and check whether you obtain the same solution as with the Louvain algorithm.
Please deliver only the commented R script (no need to upload pictures).
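Two of the quantities above have compact definitions that are easy to check by hand on a small graph. The sketch below, in plain Python on a hypothetical toy graph, is meant only as such a check and is not the implementation from the slides. It computes the k-core by repeatedly removing vertices of degree lower than k. It also computes the modularity of the partition in which every vertex is its own community, which for a graph without self-loops is Q = -sum_i (k_i / 2m)^2, where k_i is the degree of vertex i and m is the number of edges. The function names are made up for this example.

```python
# Hypothetical toy graph (undirected, no self-loops), stored as vertex -> neighbours.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}


def k_core(adj, k):
    """Vertex set of the k-core, found by iteratively pruning low-degree vertices."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            # Degree counted only among the vertices not yet removed.
            if sum(1 for w in adj[v] if w in alive) < k:
                alive.discard(v)
                changed = True
    return alive


def coreness(adj):
    """For each vertex, the largest k such that the vertex belongs to the k-core."""
    core = {v: 0 for v in adj}
    k = 1
    while True:
        members = k_core(adj, k)
        if not members:
            break
        for v in members:
            core[v] = k
        k += 1
    return core


def singleton_modularity(adj):
    """Modularity when every vertex forms its own community (no self-loops assumed)."""
    two_m = sum(len(nbrs) for nbrs in adj.values())  # sum of degrees = 2m
    return -sum((len(nbrs) / two_m) ** 2 for nbrs in adj.values())


core = coreness(adj)
q0 = singleton_modularity(adj)
```

The starting modularity is always negative for a graph with at least one edge. Louvain begins from this all-singletons partition and merges communities as long as the modularity increases.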
- Lab
- Final Exam assignment
The exam will consist of the presentation and discussion of the analysis (description and model fitting) of a real network dataset, explaining the working steps and the results obtained. A short report, including the commented R code, is also required.
During the presentations, a few questions will be asked to assess your preparation on the course topics.
Each student is asked to carry out the following tasks individually:
1) Descriptive analysis (including visualization) of the assigned network dataset(s) and attributes, using the indices and tools learnt in the course.
2) Exploratory analysis by finding communities with an appropriate algorithm. If possible, characterize the communities using any additional covariates that are available.
3) Specification and estimation of an ERG model on a binary network, choosing an appropriate threshold to dichotomise the weighted relations. Since the networks are large, please use a meaningful sub-network so that the ERGM can be fitted in a reasonable amount of time.
4) Since you are dealing with more than one network, please perform the analysis for each network and compare the results for points 1-3.
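The dichotomisation step in point 3 can be illustrated with a minimal plain-Python sketch. The weight matrix is hypothetical and the function names are made up for this example. Using a strict inequality (a tie is kept only if its weight exceeds the threshold) is one possible convention assumed here; using "greater than or equal" is equally legitimate as long as the choice is stated in the report.

```python
# Hypothetical symmetric weight matrix for four vertices (zero diagonal).
W = [
    [0, 3, 1, 0],
    [3, 0, 2, 5],
    [1, 2, 0, 1],
    [0, 5, 1, 0],
]


def dichotomise(weights, threshold):
    """Binary adjacency matrix: 1 where the weight exceeds the threshold, else 0."""
    n = len(weights)
    return [
        [1 if i != j and weights[i][j] > threshold else 0 for j in range(n)]
        for i in range(n)
    ]


def density(binary):
    """Share of ordered off-diagonal pairs (i, j) that are connected."""
    n = len(binary)
    return sum(map(sum, binary)) / (n * (n - 1))


# Compare how much of the network survives under different thresholds.
densities = {t: density(dichotomise(W, t)) for t in (0, 1, 2)}
```

Tabulating the density (or the number of isolated vertices) for a few candidate thresholds is one simple way to justify the value finally used.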
Deadline for delivering the short report and the commented R code:
28th July, 11.59 p.m.
For questions, contact me by e-mail or arrange an appointment on Teams.
Students enrolled for the July 29th session have received the datasets by e-mail.