Network analysis, or network science, is a framework for mathematically representing and understanding connections between things. In it, entities (or things), known in network-speak as nodes, are connected by what are known as edges. When several of these nodes and edges come together we have what is known as a network: this is a mathematical object which can then be analysed. We can look, for example, for the most-connected or most central nodes, or for separate communities within the network.
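To make the idea of nodes and edges concrete, here is a minimal sketch in plain Python (standard library only). The names and connections are made up for illustration and are not taken from the tutorial data. It counts each node's connections (its degree, one simple measure of being "most-connected") and finds groups of nodes that are not linked to each other at all. Note that these groups are connected components, a much cruder notion than the community detection that dedicated network tools offer.

```python
from collections import defaultdict

# Hypothetical edges: each pair is one connection between two nodes.
edges = [
    ("Ana", "Ben"),
    ("Ana", "Cai"),
    ("Ben", "Cai"),
    ("Cai", "Dee"),
    ("Eli", "Fay"),
]

def build_adjacency(edges):
    """Map each node to the set of nodes it is directly connected to."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)

def degrees(adjacency):
    """Degree = number of distinct neighbours a node has."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}

def connected_components(adjacency):
    """Group nodes that can reach each other by following edges."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            # Visit neighbours not yet placed in this component.
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

adjacency = build_adjacency(edges)
node_degrees = degrees(adjacency)
most_connected = max(node_degrees, key=node_degrees.get)
components = connected_components(adjacency)

print(most_connected, node_degrees[most_connected])
print(components)
```

In this toy network, one node has three neighbours while the others have one or two, and the last pair of names forms a separate group with no link to the rest.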
There are a number of options for you to work with, from here. The aim is to give you an idea of the tools that are out there and the kind of data they work with, so don’t worry too much about finding interesting results! The options are:
1. Upload some sample network data to Palladio, a free online application which will allow you to make some network visualisations very quickly.
2. Upload the data to a tool called Network Navigator: just as easy to use, this tool gives you a range of network statistics, which can be used to understand the nodes within the network.
3. Try out network analysis in a programming language, using a very simple interactive document.
The instructions for options 1 and 2 are contained within this page, but to use the interactive notebook, you need to load an interactive environment called MyBinder. If you don’t have any coding experience or would prefer to get straight to the data, skip this step and move on to the first section.
First, open the following in a new window: https://mybinder.org/v2/gh/yann-ryan/dh_intro_gates/main?urlpath=rstudio
This will start loading a new Binder instance: an interactive coding environment. It might take a minute or two (so it might be worth going through some of the tutorials below while you’re waiting).
Once it has finished, you should see this screen:
This is called RStudio: an application designed for writing and running code. We’re going to open a pre-made ‘notebook’. The bottom-right pane contains a list of files. Look for one called ‘network_analysis_r.Rmd’ and click on it. It will open the notebook in the top-left pane.
From here, follow the instructions in the notebook text.
There are a number of other tools which are very useful for doing network analysis, including Gephi, Nodegoat, and the Vistorian.
Palladio is a simple web application which allows you to upload your own network data and make some basic network visualisations.
To make a network visualisation, Palladio needs a simple piece of data called an edge list. This is a spreadsheet listing all the connections between entities in your network. For this tutorial, we’ll work with a very small sample of letters. Letters are sent between two people, so in order to create a network, we can represent the people as nodes, and the connections between them, made through the action of sending a letter, as the edges.
An edge list is simply a list of the letters sent and received.
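As a rough illustration of what an edge list looks like as data, the sketch below reads a tiny comma-separated sample in plain Python. The rows are invented, and the `author` and `recipient` column names are an assumption based on the options Palladio shows later in this tutorial; check the header of the real file before relying on them.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the shape of an edge list: one row per letter.
sample_csv = """author,recipient
Ana,Ben
Ana,Ben
Ben,Cai
Cai,Ana
"""

def read_edge_list(text):
    """Return one (author, recipient) pair per row of the edge list."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["author"], row["recipient"]) for row in reader]

letters = read_edge_list(sample_csv)

# The nodes are every person who appears as an author or a recipient.
nodes = sorted({person for pair in letters for person in pair})

# Repeated rows mean several letters between the same pair of people.
edge_weights = Counter(letters)

print(nodes)
print(edge_weights)
```

Each row becomes one edge, and counting repeated rows gives a simple measure of how many letters passed from one person to another.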
First, go to http://hdlab.stanford.edu/palladio/ in your browser, and click on the ‘start’ button. You should now have this screen:
The next page will present you with a few options for uploading your data to Palladio. We’re going to copy and paste a list of authors and recipients of letters, separated by a comma.
In a second tab, go to https://raw.githubusercontent.com/yann-ryan/dh_intro_gates/main/sample_letter_data_network.csv
You should see a simple text file. Select all the text in the window (click and drag or use control + a or command + a) and copy it to the clipboard.
Return to the tab with Palladio open and paste the text into the text box under ‘Load .csv or spreadsheet’, and press the ‘load’ button underneath.
The Palladio app will load with your data. There are a few options here, but we’re going to focus on the ‘graph’. Click on the ‘graph’ tab at the top of the screen:
Click on the ‘source’ dropdown, select the ‘author’ option in the popup box, and close it:
Do the same for target: click on the ‘target’ dropdown and select the ‘recipient’ option in the popup.
You should now have a basic network visualisation, with the senders and recipients of letters visualised as circles, and the links between them as connecting lines. The algorithm to create the visualisation places highly-connected nodes at the centre, and less-well-connected ones at the periphery. It also groups connected nodes together.
Try out some of the other options, and see if you can figure out what they are doing to the visualisation.