8 Network Topography in statnet
8.1 Setup
Find and open the RStudio Project associated with this class. Begin by opening a new script. It’s generally a good idea to place a header at the top of each script that tells you what the script does, its name, etc.
#################################################
# What: Network Topography in R
# Created: 02.28.14
# Revised: 01.18.22
#################################################
If you have not set up your RStudio Project to clear the workspace on exit, your environment will contain the objects and functions from your prior session. To clear these before beginning, use the following command.
rm(list = ls())
Next, place the data required for this lab (Anabaptist Leaders.csv and Anabaptist Attributes.csv) inside your R Project folder. We have placed them in a subfolder titled data for organizational purposes; however, this is not necessary.
For this exercise, we’ll use the Anabaptist Leadership network and its related attribute data, both of which can be found in the file we shared with you. The data set includes 67 actors: 55 who were sixteenth-century Anabaptist leaders and 12 who were prominent Protestant Reformation leaders who had contact with and influenced some of the Anabaptist leaders included in this data set. These network data build upon a smaller data set (Matthews et al. 2013) that did not include some prominent Anabaptist leaders, such as Menno Simons, who is generally seen as the “founder” of the Amish and Mennonites.
8.2 Load Libraries
Load the statnet library.
library(statnet)
It is not currently possible to calculate the E-I index in statnet or igraph, but a package, isnar, has been developed to do just that. Its functionality is demonstrated at the end of this lab. We’ve included the scripts in both the statnet and igraph versions of this lab, but you need to do this section only once.
In addition to statnet, we will be introducing and using isnar. Since this may be the first time you are using this tool, please make sure you install it before loading it. You will need to install remotes in order to use the function install_github() to download and set up isnar, as it is not published on CRAN.
install.packages("remotes")
Now install isnar.
remotes::install_github("mbojan/isnar")
Before moving forward, let’s load the isnar package:
library(isnar)
8.3 Import Data
Let’s import the data, which we’ve stored as a matrix, using the read.csv() function nested within as.matrix() in order to return a matrix class object, which is one format required by the as.network() function to generate a network object.
# First, read in the matrix of relations
anabaptist_mat <- as.matrix(
  read.csv("data/Anabaptist Leaders.csv",
           header = TRUE,
           row.names = 1,
           check.names = FALSE)
)
Now transform the matrix to a network object.
anabaptist_net <- as.network(anabaptist_mat)
Take a look at the newly created object.
anabaptist_net
Network attributes:
vertices = 67
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 366
missing edges= 0
non-missing edges= 366
Vertex attribute names:
vertex.names
No edge attributes
8.4 Network Size and Interconnectedness
8.4.1 Network Size
Network size is a basic descriptive statistic that is important to know because many of the subsequent measures are sensitive to it. Network size is easy to get with the network.size() function.
network.size(anabaptist_net)
[1] 67
8.4.2 Density and Average Degree
Network density equals actual ties divided by all possible ties. However, density tends to decrease as social networks get larger because the number of possible ties grows quadratically with the number of actors (n(n - 1) in a directed network), whereas the number of ties that each actor can maintain tends to be limited. Consequently, we can only use it to compare networks of the same size. An alternative to network density is average degree centrality, which is not sensitive to network size and thus can be used to compare different sized networks. Let’s see how we can get these two measures in statnet.
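To make the size sensitivity concrete, here is a minimal base R sketch. It assumes hypothetical directed networks of 10, 100, and 1,000 actors in which every actor sends exactly five ties; the variable names are ours and none of these numbers come from the Anabaptist data.

```r
# Hypothetical network sizes (illustrative only)
n <- c(10, 100, 1000)
# Possible ties in a directed network without self-loops
possible_ties <- n * (n - 1)
# Assume each actor sends exactly 5 ties
ties <- 5 * n
# Density falls as n grows, even though average out-degree stays at 5
density <- ties / possible_ties
data.frame(n, possible_ties, density)
```

Average degree is identical across the three hypothetical networks, while density shrinks roughly in proportion to 1/n.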
First, calculate density using the gden() function.
gden(anabaptist_net)
[1] 0.08276798
To calculate the average degree centrality, first calculate the degree of each vertex and then take the mean of the resulting vector of scores.
mean(
degree(anabaptist_net,
# Indicate the type of graph evaluated as undirected
gmode = "graph")
)
[1] 5.462687
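As a sanity check, both values can be reproduced from the counts printed for anabaptist_net (67 vertices, 366 edges). This is a minimal base R sketch with variable names of our own choosing; it only confirms that the reported figures match these simple ratios and does not replace gden() or degree().

```r
# Counts taken from the printed network object
n_vertices <- 67
n_edges <- 366
# Density: observed edges over possible directed edges, no self-loops
density_by_hand <- n_edges / (n_vertices * (n_vertices - 1))
# Average degree: matches the mean reported above
avg_degree_by_hand <- n_edges / n_vertices
density_by_hand
avg_degree_by_hand
```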
8.4.3 Cohesion and Fragmentation
In statnet, the connectedness() function takes a graph and returns the Krackhardt connectedness score (Krackhardt 1994), which other programs, such as UCINET, call cohesion. Fragmentation is simply the complement of cohesion, that is, 1 minus cohesion.
First take a look at how to calculate connectedness.
connectedness(anabaptist_net)
[1] 1
Now calculate fragmentation.
1 - connectedness(anabaptist_net)
[1] 0
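Because the Anabaptist network is fully connected, these scores are not very informative here. The following base R sketch shows what the measure captures on a hypothetical four-node undirected graph with one isolate. It treats connectedness as the share of node pairs joined by some path, which is our reading of the Krackhardt score; the toy matrix and variable names are assumptions for illustration.

```r
# Toy undirected graph: path 1 - 2 - 3, node 4 is an isolate
adj <- matrix(0, 4, 4)
adj[1, 2] <- adj[2, 1] <- 1
adj[2, 3] <- adj[3, 2] <- 1
n <- nrow(adj)
# Warshall's algorithm: i reaches j if i reaches k and k reaches j
reach <- adj > 0
for (k in seq_len(n)) {
  reach <- reach | outer(reach[, k], reach[k, ], "&")
}
# Share of unordered pairs that can reach each other
toy_connectedness <- sum(reach[upper.tri(reach)]) / choose(n, 2)
toy_fragmentation <- 1 - toy_connectedness
toy_connectedness
```

Three of the six pairs are mutually reachable, so both scores equal 0.5 for this toy graph.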
8.4.4 Compactness and Breadth
Because the network is connected, cohesion is 1.00 and fragmentation is 0.00. However, with a little manipulation, we can also compute distance-weighted cohesion and fragmentation, which other programs, such as UCINET, call compactness and breadth.
Calculating compactness requires calculating the geodesic distances between all nodes in the network. Then take the inverse of these scores, which gives the reciprocal geodesic distances. Remove the self-loops (the diagonal). Finally, replace the infinite scores, which arise for pairs with no connecting path in disconnected graphs, with 0.
First, let’s begin by calculating the distances:
distance <- geodist(anabaptist_net,
                    # Replace the Inf values with 0s
                    inf.replace = 0)
Take a look at the matrix of distances, here only the first four rows and columns:
distance$gdist[1:4, 1:4]
[,1] [,2] [,3] [,4]
[1,] 0 2 1 2
[2,] 2 0 2 3
[3,] 1 2 0 1
[4,] 2 3 1 0
We can read these distances as steps between nodes. So node one is two steps away from node two.
Proceed with the remaining steps outlined above to calculate the desired measure.
# Calculate reciprocal distances
reciprocal_distances <- 1/distance$gdist
# Modify the reciprocal_distances matrix
diag(reciprocal_distances) <- NA
reciprocal_distances[reciprocal_distances == Inf] <- 0
# Calculate compactness
compactness <- mean(reciprocal_distances, na.rm = TRUE)
compactness
[1] 0.3800372
For breadth, we could, of course, just take the complement of compactness (1 minus compactness).
breadth <- 1 - compactness
breadth
[1] 0.6199628
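To see how unreachable pairs pull compactness down, here is a small base R sketch on a hand-written distance matrix. The toy graph (a path 1 - 2 - 3 plus an isolated node 4) and the variable names are hypothetical. Unlike the steps above, unreachable pairs are stored directly as Inf rather than 0, so their reciprocal is already 0.

```r
# Geodesic distances for path 1 - 2 - 3 with node 4 isolated
toy_gdist <- matrix(c(0,   1,   2,   Inf,
                      1,   0,   1,   Inf,
                      2,   1,   0,   Inf,
                      Inf, Inf, Inf, 0),
                    nrow = 4, byrow = TRUE)
# Reciprocal distances: 1/Inf is 0, and 1/0 on the diagonal is Inf
toy_recip <- 1 / toy_gdist
# Drop the diagonal and zero out any remaining infinite values
diag(toy_recip) <- NA
toy_recip[is.infinite(toy_recip)] <- 0
# Mean over the 12 off-diagonal cells: (1 + 1 + 0.5) * 2 / 12
toy_compactness <- mean(toy_recip, na.rm = TRUE)
toy_breadth <- 1 - toy_compactness
toy_compactness
```

Without the isolate, the same path would score 5/6; adding one unreachable node halves compactness to 5/12.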
We can automate the process of calculating compactness by turning the process into a function.
my_compactness <- function(dat, na_rm = TRUE) {
  # Validate the inputs (named stopifnot() messages require R >= 4.0.0)
  stopifnot("dat must be a network object." = is.network(dat))
  stopifnot("na_rm must be a logical." = is.logical(na_rm))
  # Get reciprocal distances for the network passed in as dat:
  reciprocal_distances <- 1/geodist(dat,
                                    inf.replace = 0)$gdist
  # Clean up the matrix:
  diag(reciprocal_distances) <- NA
  reciprocal_distances[reciprocal_distances == Inf] <- 0
  # Calculate compactness
  mean(reciprocal_distances, na.rm = na_rm)
}
Run the function.
my_compactness(anabaptist_net)
[1] 0.3800372
8.4.5 Table of Interconnectedness Scores
We can create a table of interconnectedness scores and save it to a CSV file. You can check your working directory to see the results in the interconnectedness.csv file.
# Create a data.frame with the desired measures
interconnectedness <- data.frame(
  "Size" = network.size(anabaptist_net),
  "Density" = gden(anabaptist_net),
  "Average Degree" = mean(degree(anabaptist_net, gmode = "graph")),
  "Cohesion" = connectedness(anabaptist_net),
  "Fragmentation" = 1 - connectedness(anabaptist_net),
  "Compactness" = my_compactness(anabaptist_net),
  "Breadth" = 1 - my_compactness(anabaptist_net)
)
# Take a look
str(interconnectedness)
'data.frame': 1 obs. of 7 variables:
$ Size : num 67
$ Density : num 0.0828
$ Average.Degree: num 5.46
$ Cohesion : num 1
$ Fragmentation : num 0
$ Compactness : num 0.38
$ Breadth : num 0.62
Now write it to a CSV:
write.csv(interconnectedness,
file = "interconnectedness.csv",
row.names = FALSE)
8.6 Calculating the E-I Index with isnar
This section is in both statnet and igraph versions of this lab. You only need to do this section one time.
The E-I Index compares the ties a group has to nongroup members (external ties) with the ties among its own members (internal ties). The index equals 1.0 for groups that have all external ties, while a group with a score of -1.0 has all internal ties. If the numbers of internal and external ties are equal, the index equals 0.0.
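The index is commonly written as (E - I) / (E + I), where E is the number of external ties and I the number of internal ties. The base R sketch below applies that formula to a hypothetical four-node undirected network with two groups. The matrix, group labels, and variable names are ours; this is an illustration of the formula, not of how isnar implements it.

```r
# Toy undirected network: ties 1 - 2, 2 - 3, and 3 - 4
adj <- matrix(0, 4, 4)
adj[1, 2] <- adj[2, 1] <- 1
adj[2, 3] <- adj[3, 2] <- 1
adj[3, 4] <- adj[4, 3] <- 1
# Hypothetical group membership for the four nodes
group <- c("a", "a", "b", "b")
# TRUE where two nodes belong to the same group
same_group <- outer(group, group, "==")
# Count each undirected tie once via the upper triangle
ut <- upper.tri(adj)
internal <- sum(adj[ut] == 1 & same_group[ut])
external <- sum(adj[ut] == 1 & !same_group[ut])
toy_ei <- (external - internal) / (external + internal)
toy_ei
```

Two of the three ties stay within a group and one crosses groups, so the toy index is (1 - 2) / 3, which is negative and indicates mostly internal ties.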
The E-I Index is not built into many R packages, and it is less simple to program than one might expect. However, the isnar package does calculate it (Bojanowski 2021). It is written and maintained by Michal Bojanowski (m.bojanowski@icm.edu.pl) as a supplement to igraph. Note that isnar is available only through GitHub, a hosting platform for open-source software, such as R packages in development.
To estimate the E-I index, we need an attribute vector. Here, we’ll use the Melchiorite attribute included in the attribute file.
attributes <- read.csv("data/Anabaptist Attributes.csv",
                       header = TRUE)
Take a look at the vector names.
names(attributes)
[1] "ï..Names" "Believers.Baptism" "Violence"
[4] "Munster.Rebellion" "Apocalyptic" "Anabaptist"
[7] "Melchiorite" "Swiss.Brethren" "Denck"
[10] "Hut" "Hutterite" "Other.Anabaptist"
[13] "Lutheran" "Reformed" "Other.Protestant"
[16] "Tradition" "Origin.." "Operate.."
The Melchiorite vector can be accessed using the [[ accessor. Now, use the ei() function to get the E-I index.
We’ve found that calculating the E-I index works best with igraph objects. If you start with statnet and you would like to run the E-I index, then we recommend using intergraph to convert your network object into an igraph object using asIgraph().
anabaptist_ig <- intergraph::asIgraph(anabaptist_net)

ei(anabaptist_ig, attributes[["Melchiorite"]],
   loops = FALSE, directed = FALSE)
[1] -0.9344262
That’s all for statnet now.