# Graph Algorithms : Network Flow

Definition of a Flow Network:
A directed graph (V,E) with the following properties:
• Associated with each edge e is a capacity c(e) , c(e)>0.
• There is a single source node s belonging to V
• There is a single sink node t belonging to V
Nodes other than s and t are called internal nodes.
Assumptions:
• All capacities are integers.
• There must be at least one edge leaving source
• There must be at least one edge entering the sink
Defining Flow: Basically carrying traffic or flow. More formally, flow is a function mapping each edge to a value f(e) satisfying the following three constraints:
• {Capacity Constraint} For each edge e,  f(e)<=c(e)
• {Skew Symmetry} For each pair of vertices u,v ,  f(u,v) = -f(v,u)
• {Flow Conservation} For each node v other than s and t,  sum of all flow going out of v must be zero. In other words, total +ve flow coming into v (or fin(v)) equals total +ve flow going out of v (or fout(v)).
Hence f can be both negative or positive, also called the net flow of that edge. The value of network flow is thence summation of flow going out of source.
Note: Generally flow for an edge corresponds to the positive flow.
Defining Flow between sets:
f(X,Y) = sum of f(x,y) , x belongs to X, y belongs to Y.
Some Basic Identities:
• f(X,X) = 0
• f(X,Y) = -f(Y,X)
• f(XUY,Z) =  f(X,Z) + f(Y,Z) if X and Y are disjoint.
• f(s,V) = f(V,t)
Proof:
f(s,V)
= f(V,V) – f(V-s,V)
= f(V,V-s)
= f(V,t) + f(V,V-s-t)
= f(V,t) – f(V-s-t,V)
= f(V,t)

The Maximum-Flow Problem

Natural Goal – Maximizing the flow from source to sink without violating capacity constraints.
Note: Problem for multiple sources and sinks can be reduced making a global source and a global sink and connecting with infinite capacities.

Example: Formulate maximum flow as linear programming problem.

Designing The Algorithm:
• DP won’t work here.
• If start with a greedy approach then we can counter that by compensating some flows and directing it to other edges.
• Hence the concept of augmenting paths

Residual Graph: G’ defined w.r.t. some flow f. • Node set of G’ same as G
• Consists of edges that can admit more net flow.
• For every edge having flow less than capacity (assuming forward to be the direction of the capacity)
• A forward edge with residual capacity
• A backward edge describing the flow for that edge
• Mathematically, cf(u,v)=c(u,v)-f(u,v) {c(u,v) will be zero for the reverse edge in nearly all cases}

Note: If f is a flow in G and  f’ is a flow in Gf, then f+f’ is also a flow in G. Proof is trivial using the definition of flow network.
Augmenting Flow using the residual graph:
For any residual graph G’ , a simple path P with the bottleneck edge b as the minimum cost edge in P is called an augmenting path.

fp(u,v): b if (u,v) belongs to P, -b if (v,u) belongs to P, 0 otherwise.

Augmenting flow is defined as f’(u,v) = f(u,v)+fp(u,v)
Proof of f’ being a flow:

If e is a forward edge on P ->  0 <= f(e) <=  f’(e) = f(e) + b <=  f(e) + (ce – f(e)) = ce;
If e is a backward edge on P -> ce >= f(e) >= f’(e) = f(e) – b >= f(e) – f(e) = 0;
Proof for conservation of each node:
If node not in P, no effect. If node is in P then one outgoing flow increases and one incoming flow decreases by the same amount.
Flows and Cuts:
Notion of Cut:  partition the nodes into two sets S & T, s belongs to S, t belongs to T. Then (S,T) form a cut.
Identity: Flow of a cut ( f(S,T) ) equals f(s,V).
Proof:
• f(S,T)
• = f(S,V-S)
• = f(S,V)
• = f(s,V) + f(V-s,V)
• = f(s,V)
• = |f|
Capacity of the Cut: c(S,T) sum of capacities of all edges from S to T.
Theorem: Max Flow = Min Cut for a flow network.
Proof:
• Any Flow <= Capacity of any Cut
• Trivial
• If f is maximum,
• Gf has no augmenting path
• Proof:
• |f’| = |f| + |fp| hence flow increasing with each augmentation.
• |f| = c(S,T) for some S,T
• Proof:
• Take S as all nodes reachable from s in the Residual Graph.
• As t is not reachable from s, S must not contain t.
• Take T as V-S.
• As there are no outgoing edges from S to T in Gf , f(S,T) must be equal to c(S,T).
Ford-Fulkerson Algorithm for finding Max-Flow:
Max-Flow
Initially f(e)=0 for all e in G
While there is an s-t path in the residual graph Gf
• Let P be any (not necessarily simple, simple makes it the generic flow algorithm) s-t path in Gf
• f’= augment(f, P)
• Update f to be f’
• Update the residual graph Gf to be Gf
Endwhile
Return
Points to remember:
• After every augmentation, the flow values and residual capacities are integers
• While augmenting the flow is taken along the direction of the path.
• The total value of flow increases by the bottleneck edge of s-t path cost as all other points conserved
• At most |F| iterations where |F| is maximum flow.
• Hence running time at most O(m|F|) as BFS in O(m+n) and n<=2*m as all nodes have at least one incident edge.
Edmond-Karp Algorithm for finding Max-Flow:
Same as the Generic flow algorithm, with augmentation done only with shortest BFS paths (edge weights as 1).
Running Time With This Modification: O(m2n).
Proof that running time is O(m2n):
• Let δ(s,v) be the level of v in BFS tree.
• After an augmentation, for each edge:
• Either a backward edge is added (if existing was forward)
• Or a forward edge is added (if existing was backward)
• Which means, in a BFS tree, if Li to Li+1 was removed, Li+1 to Li would be added.
• Hence, δ(s,v) must increase at every iteration.
• If an edge starting at u is chosen as the bottleneck edge, u’s level increases by 2.
• Hence a total of 2m * n/2 edges can be chosen as bottleneck.
• Total no. of iterations <= mn.
• Hence O(m2n).
Final Points:
• The flow returned by Ford-Fulkersen Algorithm is a maximum flow.
• Our time analysis assumes that the capacities are integers.
• Given a flow of maximum value, we can compute s-t cut of minimum capacity in O(m) time.
• In every flow network, the maximum value of  s-t flow is equal to the minimum capacity of s-t cut.
• Dinic’s algorithm computes Max Flow in O(n2m).

# Graph Algorithms – a revision of all the basic ones.

Hello friends! This post will be a short review of the the most common and basic algorithms for undirected graphs.

#### Note: This is not a post intended for folks who are not at all familiar with graphs. For beginners a better starting point would be a book like this.

We will start by discussing the basic problems and then we will have a look at some common applications.

## Basic Problem 1 : Traversal

Problem Statement: graph traversal is the problem of visiting all the nodes in a graph in a particular manner, updating and/or checking their values along the way.

#### Solution 1 : Breadth First Traversal/Search (BFS) TL;DR: search is limited to essentially two operations: (a) visit and inspect a node of a graph, (b) gain access to visit the nodes that neighbour the currently visited node.

Points to remember:

• Layer-wise search forming layers L,L,…,L[i] of nodes
• L[i] are neighbours to L[i-1] ,  and are not present in any of the earlier layers.
• L[j] is exactly j distant from S. (assuming non-weighted edges , each with cost 1)
• BFS tree can be obtained after running BFS.
• Any edge can be present between nodes not differing by more than one layer
• Any BFS routine produces the connected component that contains the root.
• O(m+n) running time and space.
• Efficient implementation using a queue.

Steps:

1. Explore all neighbours of root. Call them level 1.
2. Then explore all neighbours of nodes of level 1. Call them level 2.
3. Keep Exploring like this 🙂

Code:

Without Queue:

 #define nodecount 10000 bool discovered[nodecount]; vector > L;//Layers void BFS(int s, vector > & adjList)//BFS for arrays, without queue, non-recursive, by @arkanathpathak { fill(discovered, discovered+nodecount, false); discovered[s] = true; for(int v=0;v)); L.push_back(s); while(L[i].size()!=0) { L.push_back(*(new vector)); for(int j=0;j < (int)L[i].size(); j++) { int u = L[i][j]; for(int k=0;k<(int)adjList[u].size();k++) { int v = adjList[u][k]; if(discovered[v]==false) { discovered[v] = true; L[i+1].push_back(v); } } } i++; } }

view raw
BFS, with layers
hosted with ❤ by GitHub

With Queue:

 vector adjList[N]; int visited[N]; void bfs(int u) { visited[u] = 1;//mapping the visited nodes, by default 0 queue nq;//queue used nq.push(u);//push initial node while(!nq.empty()) { int tt = nq.front(); visited[tt] = 1; nq.pop(); for(int i=0;i

view raw
BFS with Queue
hosted with ❤ by GitHub

#### Solution 2 : Depth First Traversal/Search (DFS) TL;DR: Greedy search, where search retreats only when it encounters a dead-end of the branch,i.e. all neighbours.

Points to remember:

• Like the BFS tree, a DFS tree is obtained.
• O(m+n) running time and space.
• If an edge (x,y) is not present in the DFS tree, then one is a descendant of the other
• Can be used to produce Topological Sorting.
• Can be used to produce SCCs(Strongly connected components) using Kosaraju’s two-pass algorithm.
• Simple recursive implementation possible that makes use of recursion stacks
• Non-recursively efficiently implemented using Stacks.
• Discovery times of every node follows parenthesis structure (Source: CLRS)
• For finding all paths, we can unmark the node  as visitied when we finish the loop.
• Levels are processed in reverse order for non recursive with respect to that of recursive implementation

Steps:

1. Explore first neighbour of root.
2. Then explore the first unexplored neighbour of the first neighbour
3. Keep exploring until you reach a node with no unexplored neighbours.
4. Backtrack at that node until you reach a node with unexplored neighbours.

Code:

With Recursion:

 vector adjList[N]; int visited[N]; void dfsrecursive(int u) { visited[u] = 1; for(int i=0;i

view raw
DFS with Recursion
hosted with ❤ by GitHub

Without Recursion, with Stack:

 vector adjList[N]; int visited[N]; void dfsstack(int u) { stack nstack;//stack used in place of recursion stack visited[u] = 1; nstack.push(u); while(!nstack.empty()) { int tp = nstack.top(); nstack.pop(); for(int i=0;i

view raw
DFS with Stack
hosted with ❤ by GitHub

#### Notes on Traversal

• BFS and DFS have same running time and space efficiency
• BFS and DFS can be used to find connected components.
• BFS and DFS can be used to find connected cycles (DFS is more efficient and can be used in directed graphs also, as backward edge produces cycle).
• Classification Of Edges(directed): Tree Edge, Forward Edge(connecting to descendant), Backward Edge(connecting to ancestor), Cross Edge(all other). In DFS, only tree edge or backward edge is possible.
• Classification Of Edges(un-directed): Tree Edge, Forward Edge(discovered by an ancestor), Backward Edge(discovered by a successor), Cross Edge. In DFS, only backward or forward or tree edges are possible.In BFS, only cross edges are possible.

## Basic Problem 2 : Shortest Paths

Single Source Problem Statement:  Given a undirected weighted graph G and a source node u, find the shortest cost paths from u  to all other nodes.

#### Solution: Dijkstra’s Algorithm TL;DR: Algorithm starts at the source vertex, u, it grows a tree, T, that ultimately spans all vertices reachable from u. Vertices are added to T in order of distance i.e., first u, then the vertex closest to u, then the next closest, and so on.

Steps:

• Initialise a set P and a set Q, with P initially storing u with cost 0, and all other in Q with cost infinity.
• Change the priorities of the nodes in Q having an edge with u as their edge costs.
• Dequeue the minimum key node and add it to P.
• Adjust the priorities of other nodes in Q so that the cost of addition gets minimised.
• At the end, the keys are the shortest path costs.

Cost of addition is given by Code:

 int shortestdistance[N]; int explored[N]; vector > adjList[N]; // stores edges with cost void dijkstra(int node)// Does not use priority queue, O(M+N2) implementation { shortestdistance[node] = 0; explored[node] = 1; int discovered = 1; int lastadded = node; while(discovered

view raw
Dijkstra's Algorithm
hosted with ❤ by GitHub

Proof of Correctness:
Proof by Induction on set P’s size: Lets assume that the paths are correct for all the previous nodes in P for some node v. Then if there exists some path with smaller cost it must cross P some time, let’s say it crosses with edge passing from x to y. Then algorithm would have selected y before v as all edge lengths are positive, which is a contradiction.

Running Time Analysis:
Total deleteMin – O(nlogn) with Priority Queue, O(n2) without Priority Queue
Total change priorities – O(mlogn) with Priority Queue, O(m) without Priority Queue

Hence total running time = O((n+m)logn) with Priority Queue, O(m+n2) without Priority Queue

Points to remember:

• Uses Adjacency List Representation of Graph
• Much Similar to Prim’s Algorithm
• Requires all edge costs to be non-negative

Okay, now let’s look at a problem where we are not required to find the shorted distances from only one source, but from each node.

All Pairs Shortest Paths Problem Statement:  Given a undirected weighted graph G and, find the shortest cost paths between all pairs.

#### Naive Solution: Run Dijkstra’s algorithm on all nodes. O(mnlogn) = O(n3logn).

Solution 1: Floyd-Warshall Algorithm

TL;DR: Dynamic Programming approach with subproblem as shortest distance between node i and node j using only the first k vertices in between.

Points to remember:

• O(n3) running time
• No negative cost cycles should be present.
• In place implementation possible as the (k+1)th row, and (k+1)th column don’t change in one iteration.

Steps:

• Maintain n 2d arrays D[n][][], such that D[k][i][j] gives the shortest distance between i and j using only the first k vertices in between.
• Final result will be D[n-1]
• Initiate D[-1][i][j] as 0 if i==j, cost of (i,j) if edge exists, infinity otherwise.
• Using dynamic programming, find D[k+1][i][j] = min(D[k][i][j] , D[k+1][i][k+1] + D[k+1][k+1][j]).

Steps for finding the shortest paths:

1. Same kind of implementation, use a matrix N[][][]
2. N[-1][i][j] = j if (i,j) exists, undefined otherwise.
3. N[k+1][i][j] = N[k][i,k+1] if k+1 path is shorter, else N[k][i][j]
4. Now for finding the path between i,j keep following the array N and printing the nodes.

## Basic Problem 3 : Minimum Spanning Trees

Definition: A spanning tree of a graph is a tree that covers all the nodes of the original graph, e.g. BFS/DFS trees. A minimum spanning tree is a spanning tree with the minimum cost.

The MST Lemma (Cut Property): Let P and Q be two disjoint node sets that together give the union of the original graph nodes, then among all the edges between P and Q, the node with minimum edge cost must be present in an MST(there might be more than one MST)

Proof: Let T be an MST that doesn’t contain e{u,v} (the minimum edge cost), then adding e to it will produce a cycle. Traverse along that cycle , and whenever it crosses from P to Q, replace that edge with e. This produces an MST again.

The MST Cycle Lemma (Cycle Property): The costliest edge of a connected undirected graph must not be in an(any if edge costs are distinct in that cycle) MST.

Proof: Let T be an MST that contains the costliest edge e {u,v}. T-e gives two connected disjoint components, traverse along the cycle to get a cheaper edge that can be replaced with e.

Problem Statement: Find a minimum spanning tree for an undirected weighted graph.

Solution 1: Prim’s Algorithm

Point to remember: Nearly direct implementation of the MST lemma. Steps:

• Start with a set P with some random root node u and a priority queue Q that contains all the remaining nodes of graph with key as infinity.
• Set priority of the nodes that share an edge with u as their edge cost.
• In the priority queue Q maintain a priority of the shortest edge cost from the node to P and the respective closest neighbour
• In a loop keep choosing elements from Q, s.t. d[u] is as small as possible. Add u to P.
• Update the priority of all the nodes in Q w.r.t u. i.e.,  compare their current priority with the edge cost with u and update if the edge cost is lesser.

Running Time Analysis:

• Make heap and other inititalizations – O(n)
• One iteration O(1) in look up and deletion – total of O(nlogn).
• Total iterations of ChangePriority O(m) and each individual costs O(log(n)), hence a total of O(mlogn).

Solution 2: Kruskal’s Algorithm Points to remember:

• Uses MST lemma.
• Uses MFset( also called Union-Find) Data Structure
• O(mlogn) using MFSet
• Runtime dramatically reduces near to Ackermann Function after using path compression, goes to O(n * ackermann(m)), where ackermann(m)<= 5 for all practical purposes.

Steps:

1. Sort the edges in increasing order of cost.
2. Keep adding edges to MST, and whenever a cycle is formed, just don’t add it.

Running Time Analysis: O(mlogn) in worst case without path compression

Solution 3: Reverse- Delete Algorithm

Point to remember: Uses the cycle MST lemma

Steps:

1. Sort the edges in decreasing order of costs.
2. Keep deleting the costliest edges as long as the tree stays connected

Running Time Analysis: O(n2) with usual implementation.

Notes on MSTs

• All of them are Greedy Solutions
• It turns out that any greedy approach using either of the lemmas will work.

## Interesting Applications

#### Cycle Detection in a Graph: Detect if a directed graph contains a cycle

Points to remember:

• Direct extension of DFS.
• Cycle in a graph only if there is a back edge present in the graph.

Code 🙂

 vector adjList[N]; int visited[N]; bool dfsrecursiveandcycle(int u, int parent) { cout << "visited " << u << endl;// 🙂 visited[u] = 1; bool ans = false; for(int i=0;i

Running Time Analysis:

• Seriously?

Testing Bipartiteness

Bipartite Graph Definition: A graph whose vertices can be divided into two disjoint sets X and Y such that every edge has one end in X and other in Y. Consequently, the sets X and Y are independent sets.

Points to remember:

• Sufficient condition is the non existence of any odd cycles.
• Can use both BFS and DFS to test bipartiteness
• For BFS, check whether there is an odd cycle which can be checked by confirming if there exists an edge between two consecutive layers.
• For DFS, while performing DFS assign colors to the node with negative of parent, hence if you encounter a contradictory case, then not bipartite.

Extending these algorithms For Directed Graphs:

Points to remember:

• BFS and DFS remain the same.
• Strong Connectivity can be checked by running both forward and backward BFS/DFS and comparing the two components if they are the same.
• Strong Components will either be equal or disjoint because if s & u are mutually reachable, s & v are mutually reachable then u & v will also be mutually reachable.

Topological Ordering/Sorting in a Directed Graph:

Topological Ordering of nodes: If every directed edge is forward pointed then the ordering is a topological ordering.

Points to remember:

• A topological ordering of a graph exists iff the graph is a DAG(Directed Acyclic Graph)
• Useful in finding precedence relations
• For every DAG there must be a node with no incoming edges

Algortithms Possible:

• Recursively follow the node with no incoming edges, delete that and then recursively find the next such vertex,
• A variation of DFS, whenever a node is finished searching, add it to front of a linked list. This list will be a DAG. Note to make repeated invocations of DFS as the DFS might stop before traversing to each edge. For more information see this.

Analysis:

• O(n2) running time for raw algorithm1, and O(n+m) for DFS implementation
• To obtain O(m+n) time optimization, use the following technique for Algorithm-1:
• Store two Data Structures:
• for each w, no of active nodes that have incoming edge w.r.t. w.
• the set of all active nodes that have no incoming edges from other active nodes.

So that would be all for this post. BFS, DFS, Dijkstra, Floyd-Warshall, MST, and some interesting applications. The notes are highly influenced from the book by Kleinberg, Tardos and the book by Cormen et.al. . If you are not satisfied have a read in the books, or just post a comment and I can care to explain 🙂 .

Make sure to point out any mistakes or suggestions in the comments.

See you in the next post!