Practical Exercise 4 – Network Modeling
Due 11/12, 3:59 PM on Canvas
This exercise is completed in teams of two or three. The exercise is worth 50 points, with an opportunity for 3 extra credit points.
Input Resources
This exercise makes use of two files in the Practical Exercises folder on Canvas:
1. blogs_edgelist.csv contains the edges in a network of blogs covering United States politics in 2004.
· Each row should be read as “<Blog in the Source column> has a hyperlink to <Blog in the Target column>.” For example, blog 267 has a hyperlink to blog 1394. The screenshot below is of the blog theleftcoaster.com; the links to the blogs it follows are in blue on the right:
2. blogs_nodes.csv contains one attribute of the blogs in the edge list: the political ideology (conservative or liberal) of the blog.
Steps
Carry out the following steps in R and answer the accompanying questions.
Required packages: statnet, igraph, sna, and intergraph.
Optional package: RColorBrewer
1. Import blogs_edgelist.csv and blogs_nodes.csv into variables named blogs_edges and blogs_nodes, respectively, in R. [4]
2. Create a network from blogs_edges using the package network from statnet and name it blogs_net. The network is directed. [3]
3. Assign the Ideology attribute from blogs_nodes to blogs_net. [4]
Be sure to check that the right nodes are assigned the right attribute. Run the following code before assigning the attributes:
network_names <- blogs_net %v% 'vertex.names' # get the vertex names from the network
blogs_nodes <- blogs_nodes[which(blogs_nodes$Name %in% network_names), ] # keep only the blogs that appear in the network
blogs_nodes <- blogs_nodes[order(factor(blogs_nodes$Name, levels = network_names)), ] # reorder the rows to match the network's vertex order
stopifnot(network_names == blogs_nodes$Name) # throws an error if the names or their order do not match
The code to assign the Ideology attribute is as follows:
blogs_net %v% "Ideology" <- blogs_nodes$Ideology
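To see what the filtering and reordering code above does, here is a minimal sketch on made-up data that uses base R only. The names toy_names and toy_nodes are hypothetical stand-ins for network_names and blogs_nodes; the values are invented for illustration.

```r
# Hypothetical vertex names, in the order the network stores them
toy_names <- c("b", "c", "a")

# Hypothetical attribute table: rows in a different order, plus one extra row ("z")
toy_nodes <- data.frame(
  Name = c("a", "z", "b", "c"),
  Ideology = c("liberal", "liberal", "conservative", "liberal"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose Name is a vertex in the network
toy_nodes <- toy_nodes[toy_nodes$Name %in% toy_names, ]

# Reorder the rows so they follow the network's vertex order
toy_nodes <- toy_nodes[order(factor(toy_nodes$Name, levels = toy_names)), ]

# Passes only if the names and their order now agree
stopifnot(toy_nodes$Name == toy_names)
print(toy_nodes)
```

Without the reordering step, the attribute values would be attached to vertices by position, so a blog could silently receive another blog's ideology.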
4. What percentage of the network is conservative? [2]
5. Plot blogs_net with nodes colored by Ideology and attach the plot below. Include a legend in your plot. [6]
6. Describe the imported network by filling in the value and interpretation of value columns of Table 1. [10]
a) To get the number of isolates use the following code: length(isolates(blogs_net)). The function isolates is found in the sna package.
| Term | Value | Interpretation of value |
| Size | | |
| Density | | |
| Reciprocity | | |
| Number of isolates | | |
Table 1: Summary of the blogs network
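As a reminder of what the terms in Table 1 measure, the following is a minimal sketch on a made-up four-node directed network, written in base R. The matrix adj and every value in it are hypothetical and unrelated to blogs_net. Package functions may define reciprocity differently (for example, by counting dyads rather than ties), so check the help page of whichever function you use.

```r
# Hypothetical directed network on 4 nodes; adj[i, j] = 1 means node i links to node j
adj <- matrix(0, nrow = 4, ncol = 4)
adj[1, 2] <- 1
adj[2, 1] <- 1  # nodes 1 and 2 link to each other (a mutual dyad)
adj[1, 3] <- 1  # node 1 links to node 3, and the link is not returned
# node 4 sends and receives no ties, so it is an isolate

n <- nrow(adj)                      # size: number of nodes
n_edges <- sum(adj)                 # number of directed ties
density <- n_edges / (n * (n - 1))  # share of possible directed ties that are present

# Edgewise reciprocity: share of ties that are returned
reciprocated <- sum(adj * t(adj))
edge_reciprocity <- reciprocated / n_edges

# Isolates: nodes with no incoming and no outgoing ties
n_isolates <- sum(rowSums(adj) == 0 & colSums(adj) == 0)

print(c(size = n, density = density,
        reciprocity = edge_reciprocity, isolates = n_isolates))
```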
Follow the material in Chapter 10 of Luke (2015) to complete the remaining steps.
7. Run a null model of blogs_net using the package ergm and record the AIC in Table 2. The code is as follows: [2]
null_model <- ergm(blogs_net ~ edges)
summary(null_model)
8. Run a reciprocity model on blogs_net as follows and record the AIC in Table 2. [2]
reciprocity_model <- ergm(blogs_net ~ edges + mutual)
summary(reciprocity_model)
9. Run a homophily and reciprocity model on blogs_net as follows and record the AIC in Table 2. [2]
homophily_reciprocity_model <- ergm(blogs_net ~ edges + mutual + nodematch('Ideology') + nodefactor('Ideology'))
summary(homophily_reciprocity_model)
| Model | AIC |
| Null_model | |
| Reciprocity_model | |
| Homophily_reciprocity_model | |
Table 2: ERGMs and their AICs
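ERGM coefficients are reported on the log-odds scale, which can make the summary output hard to read. The sketch below shows the arithmetic for converting log-odds to probabilities in base R. All numbers are made up for illustration and are not estimates from blogs_net. The second part assumes a model with only edges and nodematch terms; in a model that also includes mutual, the resulting probability is conditional on the rest of the network.

```r
# Made-up density, for illustration only (not from blogs_net)
toy_density <- 0.25

# In an edges-only ERGM, the edges coefficient is the log-odds of a tie
toy_edges_coef <- qlogis(toy_density)  # same as log(0.25 / 0.75)

# The inverse logit recovers the tie probability, which equals the density
toy_prob <- plogis(toy_edges_coef)

# With more terms, add the coefficients of the statistics that change
# when the tie is added. Hypothetical coefficients, chosen only to show the arithmetic:
coef_edges <- -3.0
coef_match <- 1.0
p_different <- plogis(coef_edges)               # tie between nodes with different attribute values
p_same      <- plogis(coef_edges + coef_match)  # tie between nodes that share an attribute value

print(c(p_different = p_different, p_same = p_same))
```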
10. What is the best model of the 3 models you’ve run? Justify your response. [3]
11. Interpret the results of running the best ERGM you chose in (10.) and explain whether blogs_net exhibits homophily or not. [4]
12. Consider the meaning of your answer in (11.), i.e., whether blogs_net exhibits homophily. Does that bode well for political discourse in the US or not? [4]
13. Check the goodness of fit of your chosen model based on pages 181–182 of Luke (2015). Look specifically at the last section of the output, titled "Goodness-of-fit for model statistics." Does the output suggest that the terms in your model are being adequately captured? Justify your response. [4]
Extra credit:
14. Create a better model than the best ERGM you identified in (10.) and interpret the results. [3]