Hello friends! This post will be a short review of the the most common and basic algorithms for undirected graphs.
Note: This is not a post intended for folks who are not at all familiar with graphs. For beginners a better starting point would be a book like this.
We will start by discussing the basic problems and then we will have a look at some common applications.
Basic Problem 1 : Traversal
Problem Statement: graph traversal is the problem of visiting all the nodes in a graph in a particular manner, updating and/or checking their values along the way.
Solution 1 : Breadth First Traversal/Search (BFS)
TL;DR: search is limited to essentially two operations: (a) visit and inspect a node of a graph, (b) gain access to visit the nodes that neighbour the currently visited node.
Points to remember:
- Layer-wise search forming layers L[0],L[1],…,L[i] of nodes
- L[i] are neighbours to L[i-1] , and are not present in any of the earlier layers.
- L[j] is exactly j distant from S. (assuming non-weighted edges , each with cost 1)
- BFS tree can be obtained after running BFS.
- Any edge can be present between nodes not differing by more than one layer
- Any BFS routine produces the connected component that contains the root.
- O(m+n) running time and space.
- Efficient implementation using a queue.
Steps:
- Explore all neighbours of root. Call them level 1.
- Then explore all neighbours of nodes of level 1. Call them level 2.
- Keep Exploring like this 🙂
Code:
Without Queue:
With Queue:
Solution 2 : Depth First Traversal/Search (DFS)
TL;DR: Greedy search, where search retreats only when it encounters a dead-end of the branch,i.e. all neighbours.
Points to remember:
- Like the BFS tree, a DFS tree is obtained.
- O(m+n) running time and space.
- If an edge (x,y) is not present in the DFS tree, then one is a descendant of the other
- Can be used to produce Topological Sorting.
- Can be used to produce SCCs(Strongly connected components) using Kosaraju’s two-pass algorithm.
- Simple recursive implementation possible that makes use of recursion stacks
- Non-recursively efficiently implemented using Stacks.
- Discovery times of every node follows parenthesis structure (Source: CLRS)
- For finding all paths, we can unmark the node as visitied when we finish the loop.
- Levels are processed in reverse order for non recursive with respect to that of recursive implementation
Steps:
- Explore first neighbour of root.
- Then explore the first unexplored neighbour of the first neighbour
- Keep exploring until you reach a node with no unexplored neighbours.
- Backtrack at that node until you reach a node with unexplored neighbours.
Code:
With Recursion:
Without Recursion, with Stack:
Notes on Traversal
- BFS and DFS have same running time and space efficiency
- BFS and DFS can be used to find connected components.
- BFS and DFS can be used to find connected cycles (DFS is more efficient and can be used in directed graphs also, as backward edge produces cycle).
- Classification Of Edges(directed): Tree Edge, Forward Edge(connecting to descendant), Backward Edge(connecting to ancestor), Cross Edge(all other). In DFS, only tree edge or backward edge is possible.
- Classification Of Edges(un-directed): Tree Edge, Forward Edge(discovered by an ancestor), Backward Edge(discovered by a successor), Cross Edge. In DFS, only backward or forward or tree edges are possible.In BFS, only cross edges are possible.
Basic Problem 2 : Shortest Paths
Single Source Problem Statement: Given a undirected weighted graph G and a source node u, find the shortest cost paths from u to all other nodes.
Solution: Dijkstra’s Algorithm
TL;DR: Algorithm starts at the source vertex, u, it grows a tree, T, that ultimately spans all vertices reachable from u. Vertices are added to T in order of distance i.e., first u, then the vertex closest to u, then the next closest, and so on.
Steps:
- Initialise a set P and a set Q, with P initially storing u with cost 0, and all other in Q with cost infinity.
- Change the priorities of the nodes in Q having an edge with u as their edge costs.
- Dequeue the minimum key node and add it to P.
- Adjust the priorities of other nodes in Q so that the cost of addition gets minimised.
- At the end, the keys are the shortest path costs.
Cost of addition is given by
Code:
Proof of Correctness:
Proof by Induction on set P’s size: Lets assume that the paths are correct for all the previous nodes in P for some node v. Then if there exists some path with smaller cost it must cross P some time, let’s say it crosses with edge passing from x to y. Then algorithm would have selected y before v as all edge lengths are positive, which is a contradiction.
Running Time Analysis:
Total deleteMin – O(nlogn) with Priority Queue, O(n^{2}) without Priority Queue
Total change priorities – O(mlogn) with Priority Queue, O(m) without Priority Queue
Hence total running time = O((n+m)logn) with Priority Queue, O(m+n^{2}) without Priority Queue
Points to remember:
- Uses Adjacency List Representation of Graph
- Much Similar to Prim’s Algorithm
- Requires all edge costs to be non-negative
Okay, now let’s look at a problem where we are not required to find the shorted distances from only one source, but from each node.
All Pairs Shortest Paths Problem Statement: Given a undirected weighted graph G and, find the shortest cost paths between all pairs.
Naive Solution: Run Dijkstra’s algorithm on all nodes. O(mnlogn) = O(n^{3}logn).
Solution 1: Floyd-Warshall Algorithm
TL;DR: Dynamic Programming approach with subproblem as shortest distance between node i and node j using only the first k vertices in between.
Points to remember:
- O(n^{3}) running time
- No negative cost cycles should be present.
- In place implementation possible as the (k+1)th row, and (k+1)th column don’t change in one iteration.
Steps:
- Maintain n 2d arrays D[n][][], such that D[k][i][j] gives the shortest distance between i and j using only the first k vertices in between.
- Final result will be D[n-1]
- Initiate D[-1][i][j] as 0 if i==j, cost of (i,j) if edge exists, infinity otherwise.
- Using dynamic programming, find D[k+1][i][j] = min(D[k][i][j] , D[k+1][i][k+1] + D[k+1][k+1][j]).
Steps for finding the shortest paths:
- Same kind of implementation, use a matrix N[][][]
- N[-1][i][j] = j if (i,j) exists, undefined otherwise.
- N[k+1][i][j] = N[k][i,k+1] if k+1 path is shorter, else N[k][i][j]
- Now for finding the path between i,j keep following the array N and printing the nodes.
Basic Problem 3 : Minimum Spanning Trees
Definition: A spanning tree of a graph is a tree that covers all the nodes of the original graph, e.g. BFS/DFS trees. A minimum spanning tree is a spanning tree with the minimum cost.
The MST Lemma (Cut Property): Let P and Q be two disjoint node sets that together give the union of the original graph nodes, then among all the edges between P and Q, the node with minimum edge cost must be present in an MST(there might be more than one MST)
Proof: Let T be an MST that doesn’t contain e{u,v} (the minimum edge cost), then adding e to it will produce a cycle. Traverse along that cycle , and whenever it crosses from P to Q, replace that edge with e. This produces an MST again.
The MST Cycle Lemma (Cycle Property): The costliest edge of a connected undirected graph must not be in an(any if edge costs are distinct in that cycle) MST.
Proof: Let T be an MST that contains the costliest edge e {u,v}. T-e gives two connected disjoint components, traverse along the cycle to get a cheaper edge that can be replaced with e.
Problem Statement: Find a minimum spanning tree for an undirected weighted graph.
Solution 1: Prim’s Algorithm
Point to remember: Nearly direct implementation of the MST lemma.
Steps:
- Start with a set P with some random root node u and a priority queue Q that contains all the remaining nodes of graph with key as infinity.
- Set priority of the nodes that share an edge with u as their edge cost.
- In the priority queue Q maintain a priority of the shortest edge cost from the node to P and the respective closest neighbour
- In a loop keep choosing elements from Q, s.t. d[u] is as small as possible. Add u to P.
- Update the priority of all the nodes in Q w.r.t u. i.e., compare their current priority with the edge cost with u and update if the edge cost is lesser.
Running Time Analysis:
- Adjacency lists are preferred.
- Make heap and other inititalizations – O(n)
- One iteration O(1) in look up and deletion – total of O(nlogn).
- Total iterations of ChangePriority O(m) and each individual costs O(log(n)), hence a total of O(mlogn).
Solution 2: Kruskal’s Algorithm
Points to remember:
- Uses MST lemma.
- Uses MFset( also called Union-Find) Data Structure
- O(mlogn) using MFSet
- Runtime dramatically reduces near to Ackermann Function after using path compression, goes to O(n * ackermann(m)), where ackermann(m)<= 5 for all practical purposes.
Steps:
- Sort the edges in increasing order of cost.
- Keep adding edges to MST, and whenever a cycle is formed, just don’t add it.
Running Time Analysis: O(mlogn) in worst case without path compression
Solution 3: Reverse- Delete Algorithm
Point to remember: Uses the cycle MST lemma
Steps:
- Sort the edges in decreasing order of costs.
- Keep deleting the costliest edges as long as the tree stays connected
Running Time Analysis: O(n^{2}) with usual implementation.
Notes on MSTs
- All of them are Greedy Solutions
- It turns out that any greedy approach using either of the lemmas will work.
Interesting Applications
Cycle Detection in a Graph: Detect if a directed graph contains a cycle
Points to remember:
- Direct extension of DFS.
- Cycle in a graph only if there is a back edge present in the graph.
Code 🙂
Running Time Analysis:
Testing Bipartiteness
Bipartite Graph Definition: A graph whose vertices can be divided into two disjoint sets X and Y such that every edge has one end in X and other in Y. Consequently, the sets X and Y are independent sets.
Points to remember:
- Sufficient condition is the non existence of any odd cycles.
- Can use both BFS and DFS to test bipartiteness
- For BFS, check whether there is an odd cycle which can be checked by confirming if there exists an edge between two consecutive layers.
- For DFS, while performing DFS assign colors to the node with negative of parent, hence if you encounter a contradictory case, then not bipartite.
Extending these algorithms For Directed Graphs:
Points to remember:
- BFS and DFS remain the same.
- Strong Connectivity can be checked by running both forward and backward BFS/DFS and comparing the two components if they are the same.
- Strong Components will either be equal or disjoint because if s & u are mutually reachable, s & v are mutually reachable then u & v will also be mutually reachable.
Topological Ordering/Sorting in a Directed Graph:
Topological Ordering of nodes: If every directed edge is forward pointed then the ordering is a topological ordering.
Points to remember:
- A topological ordering of a graph exists iff the graph is a DAG(Directed Acyclic Graph)
- Useful in finding precedence relations
- For every DAG there must be a node with no incoming edges
Algortithms Possible:
- Recursively follow the node with no incoming edges, delete that and then recursively find the next such vertex,
- A variation of DFS, whenever a node is finished searching, add it to front of a linked list. This list will be a DAG. Note to make repeated invocations of DFS as the DFS might stop before traversing to each edge. For more information see this.
Analysis:
- O(n^{2}) running time for raw algorithm1, and O(n+m) for DFS implementation
- To obtain O(m+n) time optimization, use the following technique for Algorithm-1:
- Store two Data Structures:
- for each w, no of active nodes that have incoming edge w.r.t. w.
- the set of all active nodes that have no incoming edges from other active nodes.
So that would be all for this post. BFS, DFS, Dijkstra, Floyd-Warshall, MST, and some interesting applications. The notes are highly influenced from the book by Kleinberg, Tardos and the book by Cormen et.al. . If you are not satisfied have a read in the books, or just post a comment and I can care to explain 🙂 .
Make sure to point out any mistakes or suggestions in the comments.
See you in the next post!