Graph Algorithms – a revision of all the basic ones.

Hello friends! This post will be a short review of the the most common and basic algorithms for undirected graphs.

Note: This is not a post intended for folks who are not at all familiar with graphs. For beginners a better starting point would be a book like this.

We will start by discussing the basic problems and then we will have a look at some common applications.

Basic Problem 1 : Traversal

Problem Statement: graph traversal is the problem of visiting all the nodes in a graph in a particular manner, updating and/or checking their values along the way.

Solution 1 : Breadth First Traversal/Search (BFS)

TL;DR: search is limited to essentially two operations: (a) visit and inspect a node of a graph, (b) gain access to visit the nodes that neighbour the currently visited node.

Points to remember:

• Layer-wise search forming layers L[0],L[1],…,L[i] of nodes
• L[i] are neighbours to L[i-1] ,  and are not present in any of the earlier layers.
• L[j] is exactly j distant from S. (assuming non-weighted edges , each with cost 1)
• BFS tree can be obtained after running BFS.
• Any edge can be present between nodes not differing by more than one layer
• Any BFS routine produces the connected component that contains the root.
• O(m+n) running time and space.
• Efficient implementation using a queue.

Steps:

1. Explore all neighbours of root. Call them level 1.
2. Then explore all neighbours of nodes of level 1. Call them level 2.
3. Keep Exploring like this 🙂

Code:

Without Queue:

With Queue:

Solution 2 : Depth First Traversal/Search (DFS)

TL;DR: Greedy search, where search retreats only when it encounters a dead-end of the branch,i.e. all neighbours.

Points to remember:

• Like the BFS tree, a DFS tree is obtained.
• O(m+n) running time and space.
• If an edge (x,y) is not present in the DFS tree, then one is a descendant of the other
• Can be used to produce Topological Sorting.
• Can be used to produce SCCs(Strongly connected components) using Kosaraju’s two-pass algorithm.
• Simple recursive implementation possible that makes use of recursion stacks
• Non-recursively efficiently implemented using Stacks.
• Discovery times of every node follows parenthesis structure (Source: CLRS)
• For finding all paths, we can unmark the node  as visitied when we finish the loop.
• Levels are processed in reverse order for non recursive with respect to that of recursive implementation

Steps:

1. Explore first neighbour of root.
2. Then explore the first unexplored neighbour of the first neighbour
3. Keep exploring until you reach a node with no unexplored neighbours.
4. Backtrack at that node until you reach a node with unexplored neighbours.

Code:

With Recursion:

Without Recursion, with Stack:

Notes on Traversal

• BFS and DFS have same running time and space efficiency
• BFS and DFS can be used to find connected components.
• BFS and DFS can be used to find connected cycles (DFS is more efficient and can be used in directed graphs also, as backward edge produces cycle).
• Classification Of Edges(directed): Tree Edge, Forward Edge(connecting to descendant), Backward Edge(connecting to ancestor), Cross Edge(all other). In DFS, only tree edge or backward edge is possible.
• Classification Of Edges(un-directed): Tree Edge, Forward Edge(discovered by an ancestor), Backward Edge(discovered by a successor), Cross Edge. In DFS, only backward or forward or tree edges are possible.In BFS, only cross edges are possible.

Basic Problem 2 : Shortest Paths

Single Source Problem Statement:  Given a undirected weighted graph G and a source node u, find the shortest cost paths from u  to all other nodes.

Solution: Dijkstra’s Algorithm

TL;DR: Algorithm starts at the source vertex, u, it grows a tree, T, that ultimately spans all vertices reachable from u. Vertices are added to T in order of distance i.e., first u, then the vertex closest to u, then the next closest, and so on.

Steps:

• Initialise a set P and a set Q, with P initially storing u with cost 0, and all other in Q with cost infinity.
• Change the priorities of the nodes in Q having an edge with u as their edge costs.
• Dequeue the minimum key node and add it to P.
• Adjust the priorities of other nodes in Q so that the cost of addition gets minimised.
• At the end, the keys are the shortest path costs.

Cost of addition is given by

Code:

Proof of Correctness:
Proof by Induction on set P’s size: Lets assume that the paths are correct for all the previous nodes in P for some node v. Then if there exists some path with smaller cost it must cross P some time, let’s say it crosses with edge passing from x to y. Then algorithm would have selected y before v as all edge lengths are positive, which is a contradiction.

Running Time Analysis:
Total deleteMin – O(nlogn) with Priority Queue, O(n2) without Priority Queue
Total change priorities – O(mlogn) with Priority Queue, O(m) without Priority Queue

Hence total running time = O((n+m)logn) with Priority Queue, O(m+n2) without Priority Queue

Points to remember:

• Uses Adjacency List Representation of Graph
• Much Similar to Prim’s Algorithm
• Requires all edge costs to be non-negative

Okay, now let’s look at a problem where we are not required to find the shorted distances from only one source, but from each node.

All Pairs Shortest Paths Problem Statement:  Given a undirected weighted graph G and, find the shortest cost paths between all pairs.

Naive Solution: Run Dijkstra’s algorithm on all nodes. O(mnlogn) = O(n3logn).

Solution 1: Floyd-Warshall Algorithm

TL;DR: Dynamic Programming approach with subproblem as shortest distance between node i and node j using only the first k vertices in between.

Points to remember:

• O(n3) running time
• No negative cost cycles should be present.
• In place implementation possible as the (k+1)th row, and (k+1)th column don’t change in one iteration.

Steps:

• Maintain n 2d arrays D[n][][], such that D[k][i][j] gives the shortest distance between i and j using only the first k vertices in between.
• Final result will be D[n-1]
• Initiate D[-1][i][j] as 0 if i==j, cost of (i,j) if edge exists, infinity otherwise.
• Using dynamic programming, find D[k+1][i][j] = min(D[k][i][j] , D[k+1][i][k+1] + D[k+1][k+1][j]).

Steps for finding the shortest paths:

1. Same kind of implementation, use a matrix N[][][]
2. N[-1][i][j] = j if (i,j) exists, undefined otherwise.
3. N[k+1][i][j] = N[k][i,k+1] if k+1 path is shorter, else N[k][i][j]
4. Now for finding the path between i,j keep following the array N and printing the nodes.

Basic Problem 3 : Minimum Spanning Trees

Definition: A spanning tree of a graph is a tree that covers all the nodes of the original graph, e.g. BFS/DFS trees. A minimum spanning tree is a spanning tree with the minimum cost.

The MST Lemma (Cut Property): Let P and Q be two disjoint node sets that together give the union of the original graph nodes, then among all the edges between P and Q, the node with minimum edge cost must be present in an MST(there might be more than one MST)

Proof: Let T be an MST that doesn’t contain e{u,v} (the minimum edge cost), then adding e to it will produce a cycle. Traverse along that cycle , and whenever it crosses from P to Q, replace that edge with e. This produces an MST again.

The MST Cycle Lemma (Cycle Property): The costliest edge of a connected undirected graph must not be in an(any if edge costs are distinct in that cycle) MST.

Proof: Let T be an MST that contains the costliest edge e {u,v}. T-e gives two connected disjoint components, traverse along the cycle to get a cheaper edge that can be replaced with e.

Problem Statement: Find a minimum spanning tree for an undirected weighted graph.

Solution 1: Prim’s Algorithm

Point to remember: Nearly direct implementation of the MST lemma.

Steps:

• Start with a set P with some random root node u and a priority queue Q that contains all the remaining nodes of graph with key as infinity.
• Set priority of the nodes that share an edge with u as their edge cost.
• In the priority queue Q maintain a priority of the shortest edge cost from the node to P and the respective closest neighbour
• In a loop keep choosing elements from Q, s.t. d[u] is as small as possible. Add u to P.
• Update the priority of all the nodes in Q w.r.t u. i.e.,  compare their current priority with the edge cost with u and update if the edge cost is lesser.

Running Time Analysis:

• Make heap and other inititalizations – O(n)
• One iteration O(1) in look up and deletion – total of O(nlogn).
• Total iterations of ChangePriority O(m) and each individual costs O(log(n)), hence a total of O(mlogn).

Solution 2: Kruskal’s Algorithm

Points to remember:

• Uses MST lemma.
• Uses MFset( also called Union-Find) Data Structure
• O(mlogn) using MFSet
• Runtime dramatically reduces near to Ackermann Function after using path compression, goes to O(n * ackermann(m)), where ackermann(m)<= 5 for all practical purposes.

Steps:

1. Sort the edges in increasing order of cost.
2. Keep adding edges to MST, and whenever a cycle is formed, just don’t add it.

Running Time Analysis: O(mlogn) in worst case without path compression

Solution 3: Reverse- Delete Algorithm

Point to remember: Uses the cycle MST lemma

Steps:

1. Sort the edges in decreasing order of costs.
2. Keep deleting the costliest edges as long as the tree stays connected

Running Time Analysis: O(n2) with usual implementation.

Notes on MSTs

• All of them are Greedy Solutions
• It turns out that any greedy approach using either of the lemmas will work.

Interesting Applications

Cycle Detection in a Graph: Detect if a directed graph contains a cycle

Points to remember:

• Direct extension of DFS.
• Cycle in a graph only if there is a back edge present in the graph.

Code 🙂

Running Time Analysis:

• Seriously?

Testing Bipartiteness

Bipartite Graph Definition: A graph whose vertices can be divided into two disjoint sets X and Y such that every edge has one end in X and other in Y. Consequently, the sets X and Y are independent sets.

Points to remember:

• Sufficient condition is the non existence of any odd cycles.
• Can use both BFS and DFS to test bipartiteness
• For BFS, check whether there is an odd cycle which can be checked by confirming if there exists an edge between two consecutive layers.
• For DFS, while performing DFS assign colors to the node with negative of parent, hence if you encounter a contradictory case, then not bipartite.

Extending these algorithms For Directed Graphs:

Points to remember:

• BFS and DFS remain the same.
• Strong Connectivity can be checked by running both forward and backward BFS/DFS and comparing the two components if they are the same.
• Strong Components will either be equal or disjoint because if s & u are mutually reachable, s & v are mutually reachable then u & v will also be mutually reachable.

Topological Ordering/Sorting in a Directed Graph:

Topological Ordering of nodes: If every directed edge is forward pointed then the ordering is a topological ordering.

Points to remember:

• A topological ordering of a graph exists iff the graph is a DAG(Directed Acyclic Graph)
• Useful in finding precedence relations
• For every DAG there must be a node with no incoming edges

Algortithms Possible:

• Recursively follow the node with no incoming edges, delete that and then recursively find the next such vertex,
• A variation of DFS, whenever a node is finished searching, add it to front of a linked list. This list will be a DAG. Note to make repeated invocations of DFS as the DFS might stop before traversing to each edge. For more information see this.

Analysis:

• O(n2) running time for raw algorithm1, and O(n+m) for DFS implementation
• To obtain O(m+n) time optimization, use the following technique for Algorithm-1:
• Store two Data Structures:
• for each w, no of active nodes that have incoming edge w.r.t. w.
• the set of all active nodes that have no incoming edges from other active nodes.

So that would be all for this post. BFS, DFS, Dijkstra, Floyd-Warshall, MST, and some interesting applications. The notes are highly influenced from the book by Kleinberg, Tardos and the book by Cormen et.al. . If you are not satisfied have a read in the books, or just post a comment and I can care to explain 🙂 .

Make sure to point out any mistakes or suggestions in the comments.

See you in the next post!