Recall that a graph consists of a collection of nodes and a collection of edges between them. Graphs typically come in two flavors: undirected and directed, each having their own purpose.

As a sort of look ahead, let's think of a few model problems where each type of graph may occur.

Our first problem deals with an internet service provider who would like to expand its services and provide fiber optic internet to a town. Of course, they want to do this at the lowest cost.

In order to figure out the best way to do this, we'll model the locations we must connect as nodes in a graph and potential lines of cable to run as (undirected) edges. In addition, these edges are weighted by a cost due to things such as the cost of cable or the labor that goes into digging trenches for the cable. How should we solve this problem?

Of course, this is a rather simplified version of a more realistic problem, as we're focused on a narrow sense of connectivity. A more reasonable requirement would build in some redundency. For example, if a cable were damaged, the entire network wouldn't go down.

Our second problem deals with scheduling. What we've got is a bunch a tasks which must be completed; the catch is that some of the tasks can't be started until other ones are complete. For example, if I'm building a house, I must excavate the ground before laying the foundation. On the other hand, determining the kind of lights or wallpaper I'd like is completely independent and someone else can do this simultaniously.

We'll model this problem as a directed graph where the tasks are the nodes and the directed edges encode dependencies between tasks. So, maybe I have an edge "Task A -> Task B" if "Task B" requires "Task A" to be completed first. We'll see that there is a nice algorithm called topological sorting which gives us an ordered list of tasks which ensures that all dependencies as we complete the list.

First, we'll look at the `Graph` and `DiGraph` in NetworkX to create some simple graphs. We'll start by importing NetworkX and building a simple cyclic graph on three nodes.

In [219]:

```
%matplotlib inline
import networkx as nx
```

In [236]:

```
G = nx.Graph()
G.add_node('A')
G.add_node('B')
G.add_node('C')
G.add_edge('A', 'B')
G.add_edge('B', 'C')
G.add_edge('A', 'C')
plt.figure()
nx.draw(G, with_labels=True)
plt.show()
```

In [235]:

```
G = nx.Graph()
G.add_nodes_from(['A', 'B', 'C'])
G.add_edges_from([('A', 'B'), ('A', 'C'), ('B', 'C')])
plt.figure()
nx.draw(G, with_labels=True)
plt.show()
```

Now, what if we want an undirected graph? We just change `Graph` to `DiGraph`.

In [234]:

```
G = nx.DiGraph()
G.add_nodes_from(['A', 'B', 'C'])
G.add_edges_from([('A', 'B'), ('B', 'C'), ('C', 'A')])
plt.figure()
nx.draw(G, with_labels=True)
plt.show()
```

Notice the difference in how the edges are rendered! Now the order matters!

In [233]:

```
G = nx.DiGraph()
G.add_edges_from([('A', 'B')])
plt.figure()
nx.draw(G, with_labels=True)
plt.show()
G = nx.DiGraph()
G.add_edges_from([('B', 'A')])
plt.figure()
nx.draw(G, with_labels=True)
plt.show()
```

Try creating a directed "circle" graph with $n$ nodes "by hand". The graph should have nodes $1, 2, \ldots, n$ with edges between $(1,2), (2, 3), \ldots, (n-1,n), (n,1)$.

Try creating a complete graph on 5 nodes. (A complete graph is an undirected graph with an edge between every pair of nodes.)

*We'll see that there some helper functions to do this kind of thing in a moment, but consider this a warm-up.*

NetworkX provides a few constructors for commonly occuring graphs. For example:

In [239]:

```
G = nx.complete_graph(6)
plt.figure()
nx.draw_circular(G)
plt.show()
```

In [242]:

```
G = nx.cycle_graph(13)
plt.figure()
nx.draw_circular(G)
plt.show()
```

There's even some rather exotic ones!

In [251]:

```
G = nx.lollipop_graph(5, 2)
plt.figure()
nx.draw(G)
plt.show()
```

Graph traversal provides us with a way to explore a graph some some starting node. One example you can think about is, if we're stuck in a maze, graph traversal would allow us to explore until we found an exit. (But in a principled way...not just randomly wandering around.)

In some sense, we saw this much earlier as the challenge problem during the second day in our implementation of depth-first search. Let's see how we can leverage NetworkX to do this for us. We'll also see how to easily change to an alternative search strategy known as breadth-first search.

We'll start with a "double diamond" graph this time:

In [291]:

```
G = nx.DiGraph()
G.add_nodes_from([0, 1, 2, 3, 4, 5, 6])
G.add_edges_from([
(0, 1),
(0, 2),
(1, 3),
(2, 3),
(3, 4),
(3, 5),
(4, 6),
(5, 6),
])
plt.figure()
nx.draw(G, with_labels=True)
plt.show()
```

Now we'll use the built-in NetworkX algorithms.

In [295]:

```
T = nx.depth_first_search.dfs_tree(G, 0)
plt.figure()
nx.draw(T, with_labels=True)
plt.show()
```

In [296]:

```
T = nx.depth_first_search.dfs_tree(G, 1)
plt.figure()
nx.draw(T, with_labels=True)
plt.show()
```

In [297]:

```
T = nx.depth_first_search.dfs_tree(G, 3)
plt.figure()
nx.draw(T, with_labels=True)
plt.show()
```

By changing one line of code, we can also employ a different search method:

In [298]:

```
T = nx.breadth_first_search.bfs_tree(G, 0)
plt.figure()
nx.draw(T, with_labels=True)
plt.show()
```

This one speaks for itself. How many times have you wanted to know the shortest route to get from point A to point B?

We'll model this problem on either an undirected or directed graph with weighted edges.

Shortest path also tends to come in two flavors:

The first is single-source shortest path. This computes the shortest paths to all other nodes from a particular starting node. For example, compute the best route from a starting location to and ending location is typically done this way. The classic algorithm for this is Dijkstra's algorithm. This also tends to be a good problem to see why "greedy" algorithms don't work. That is, just repeatedly choosing the shortest edge you can see.

The second flavor is the all-pairs shortest path problem. This time, we want to compute the shortest path between any two pairs of nodes on the graph. This is typically done with the Floyd-Warshall algorithm.

With the background out of the way, let's see how we can represent a weighted graph and compute the shortest path. I'll be using the example from the Dijkstra's Wikipedia page.

In [398]:

```
G = nx.Graph()
G.add_nodes_from([1, 2, 3, 4, 5, 6])
G.add_weighted_edges_from([
(1, 2, 7),
(2, 4, 15),
(4, 5, 6),
(5, 6, 9),
(6, 1, 14),
(1, 3, 9),
(2, 3, 10),
(3, 4, 11),
(3, 6, 2),
], weight='distance') # call the additional weight values 'distance'
```

Out[398]:

In [399]:

```
S = nx.shortest_path_length(G, 1, weight='distance') # use those 'distance' values here
S
```

Out[399]:

In [403]:

```
P = nx.shortest_path(G, 2)
P
```

Out[403]:

So, we can see that the shortest path from 1 to 5 goes through 1 -> 2 -> 4 -> 5.

One of our motivating examples regarded an internet service provider who was laying down fiber optic cable to expand its services. We'll look a litte more carefully at how they may solve this problem.

Let's construct a simple "crossed-box" graph with weighted edges and try to compute a spanning tree of minimum weight in order to connect the network.

First we'll suppose that the weights come purely from the geometry of the box.

In [346]:

```
G = nx.Graph()
G.add_nodes_from([0, 1, 2, 3])
G.add_weighted_edges_from([
(0, 1, 1.0),
(1, 2, 1.0),
(2, 3, 1.0),
(3, 0, 1.0),
(0, 2, 2.0**.5),
(1, 3, 2.0**.5),
])
T = nx.minimum_spanning_tree(G)
plt.figure()
nx.draw(G, with_labels=True)
plt.show()
plt.figure()
nx.draw(T, with_labels=True)
plt.show()
```

In [348]:

```
G.edge[0][2]['weight'] = 0.4 # update the weight to take into account availible infrastructure
T = nx.minimum_spanning_tree(G) # recompute minimal spanning tree
plt.figure()
nx.draw(T, with_labels=True)
plt.show()
```

This time, we'll change terminology a little bit. We'll consider a graph which has certain *capacities* on its edges. For example, a road network may only be able to support a maximum amount of traffic. Or, a compute network may have limited bandwidth.

The goal is, we have two specials nodes called the source and the sink and we want to compute the maximum amount of "stuff" we can transport from the source to the sink. The constraint is, we cannot exceed the capacities of the edges and any node we use along the way must have the same amount of incoming "stuff" and outgoing "stuff".

If you'd like a more formal reference, take a look at this.

Let's try this on the simplest example: Just a single path!

In [379]:

```
G = nx.DiGraph()
G.add_nodes_from([0, 1, 2, 3])
G.add_weighted_edges_from([
(0, 1, 4),
(1, 2, 3),
(2, 3, 6),
], weight='capacity')
plt.figure()
nx.draw(G, with_labels=True)
plt.show()
```

In [380]:

```
nx.maximum_flow_value(G, 0, 3)
```

Out[380]: