Hello friends! This post will be a short review of the the most common and basic algorithms for undirected graphs.

#### Note: This is not a post intended for folks who are not at all familiar with graphs. For beginners a better starting point would be a book like this.

We will start by discussing the basic problems and then we will have a look at some common applications.

**Basic Problem 1 : Traversal**

*Problem Statement: ***graph traversal** is the problem of visiting all the nodes in a graph in a particular manner, updating and/or checking their values along the way.

#### Solution 1 : Breadth First Traversal/Search (BFS)

TL;DR: search is limited to essentially two operations: (a) visit and inspect a node of a graph, (b) gain access to visit the nodes that neighbour the currently visited node.Points to remember:

- Layer-wise search forming layers L[0],L[1],…,L[i] of nodes
- L[i] are neighbours to L[i-1] , and are not present in any of the earlier layers.
- L[j] is exactly j distant from S. (assuming non-weighted edges , each with cost 1)
- BFS tree can be obtained after running BFS.
*Any edge can be present between nodes not differing by more than one layer*- Any BFS routine produces the connected component that contains th
*e root.* - O(m+n) running time and space.
- Efficient implementation using a
.*queue*

Steps:

- Explore all neighbours of root. Call them level 1.
- Then explore all neighbours of nodes of level 1. Call them level 2.
- Keep Exploring like this 🙂

Code:

*Without Queue:*

*With Queue:*

#### Solution 2 : Depth First Traversal/Search (DFS)

TL;DR: Greedy search, where search retreats only when it encounters a dead-end of the branch,i.e. all neighbours.Points to remember:

- Like the BFS tree, a DFS tree is obtained.
- O(m+n) running time and space.
- If an edge (x,y) is not present in the DFS tree, then one is a descendant of the other
- Can be used to produce
**Topological Sorting.** - Can be used to produce
**SCCs(Strongly connected components)**using*Kosaraju’s two-pass algorithm*. - Simple recursive implementation possible that makes use of recursion stacks
- Non-recursively efficiently implemented using
**Stacks**. - Discovery times of every node follows
*parenthesis structure (Source: CLRS)* - For finding all paths, we can unmark the node as visitied when we finish the loop.
- Levels are processed in reverse order for non recursive with respect to that of recursive implementation

Steps:

- Explore first neighbour of root.
- Then explore the first unexplored neighbour of the first neighbour
- Keep exploring until you reach a node with no unexplored neighbours.
- Backtrack at that node until you reach a node with unexplored neighbours.

Code:

*With Recursion:*

*Without Recursion, with Stack:*

#### Notes on Traversal

- BFS and DFS have same running time and space efficiency
- BFS and DFS can be used to find connected components.
- BFS and DFS can be used to find connected cycles (DFS is more efficient and can be used in directed graphs also, as backward edge produces cycle).
- Classification Of Edges(directed):
**Tree Edge, Forward Edge**(connecting to descendant)**, Backward Edge**(connecting to ancestor)**, Cross Edge**(all other)**.**In DFS, only tree edge or backward edge is possible. - Classification Of Edges(un-directed):
**Tree Edge, Forward Edge**(discovered by an ancestor)**, Backward Edge**(discovered by a successor)**, Cross Edge.**In DFS, only backward or forward or tree edges are possible.In BFS, only cross edges are possible.

**Basic Problem 2 : Shortest Paths**

*Single Source Problem Statement: * Given a undirected weighted graph **G** and a source node **u**, find the shortest cost paths from **u** to all other nodes.

#### Solution: Dijkstra’s Algorithm

TL;DR: Algorithm starts at the source vertex, u, it grows a tree, T, that ultimately spans all vertices reachable from u. Vertices are added to T in order of distance i.e., first u, then the vertex closest to u, then the next closest, and so on.Steps:

- Initialise a set P and a set Q, with P initially storing u with cost 0, and all other in Q with cost infinity.
- Change the priorities of the nodes in Q having an edge with u as their edge costs.
- Dequeue the minimum key node and add it to P.
- Adjust the priorities of other nodes in Q so that the cost of addition gets minimised.
- At the end, the keys are the shortest path costs.

Cost of addition is given by

Code:

Proof of Correctness:

Proof by Induction on set P’s size: Lets assume that the paths are correct for all the previous nodes in P for some node v. Then if there exists some path with smaller cost it must cross P some time, let’s say it crosses with edge passing from x to y. Then algorithm would have selected y before v as all edge lengths are positive, which is a contradiction.

Running Time Analysis:

Total deleteMin – O(nlogn) with Priority Queue, O(n^{2}) without Priority Queue

Total change priorities –** O(mlogn) **with Priority Queue, O(m) without Priority Queue

Hence total running time = **O((n+m)logn)** with Priority Queue, **O(m+n ^{2})** without Priority Queue

Points to remember:

- Uses Adjacency List Representation of Graph
- Much Similar to Prim’s Algorithm
- Requires all edge costs to be
**non-negative**

*Okay, now let’s look at a problem where we are not required to find the shorted distances from only one source, but from each node.*

*All Pairs Shortest Paths Problem Statement: * Given a undirected weighted graph **G** and, find the shortest cost paths between all pairs.

#### Naive Solution: Run Dijkstra’s algorithm on all nodes. O(mnlogn) = O(n^{3}logn).

**Solution 1: Floyd-Warshall Algorithm**

Points to remember:

- O(n
^{3}) running time - No negative cost cycles should be present.
**In place implementation possible as the (k+1)th row, and (k+1)th column don’t change in one iteration**.

Steps:

- Maintain
**n**2d arrays D[n][][], such that D[k][i][j] gives the shortest distance between i and j using only the first k vertices in between. - Final result will be D[n-1]
- Initiate D[-1][i][j] as
*0 if i==j, cost of (i,j) if edge exists, infinity otherwise.* - Using dynamic programming, find D[k+1][i][j] = min(D[k][i][j] , D[k+1][i][k+1] + D[k+1][k+1][j]).

Steps for finding the shortest paths:

- Same kind of implementation, use a matrix N[][][]
- N[-1][i][j] = j if (i,j) exists, undefined otherwise.
- N[k+1][i][j] = N[k][i,k+1] if k+1 path is shorter, else N[k][i][j]
- Now for finding the path between i,j keep following the array N and printing the nodes.

**Basic Problem 3 : Minimum Spanning Trees**

**Definition:** A spanning tree of a graph is a tree that covers all the nodes of the original graph, e.g. BFS/DFS trees. A minimum spanning tree is a spanning tree with the minimum cost.

**The MST Lemma (Cut Property):** Let P and Q be two disjoint node sets that together give the union of the original graph nodes, then among all the edges between P and Q, the node with minimum edge cost must be present in **an MST**(there might be more than one MST)

Proof: Let T be an MST that doesn’t contain e{u,v} (the minimum edge cost), then adding e to it will produce a cycle. Traverse along that cycle , and whenever it crosses from P to Q, replace that edge with e. This produces an MST again.

**The MST Cycle Lemma (Cycle Property):** The costliest edge of a connected undirected graph must not be in an(any if edge costs are distinct in that cycle) MST.

Proof: Let T be an MST that contains the costliest edge e {u,v}. T-e gives two connected disjoint components, traverse along the cycle to get a cheaper edge that can be replaced with e.

*Problem Statement: Find a minimum spanning tree for an undirected weighted graph.*

**Solution 1: Prim’s Algorithm**

Point to remember: Nearly direct implementation of the MST lemma.

Steps:

- Start with a set P with some random root node u and a priority queue Q that contains all the remaining nodes of graph with key as infinity.
- Set priority of the nodes that share an edge with u as their edge cost.
- In the priority queue Q maintain a priority of the
**shortest edge cost from the node to P**and the respective closest neighbour - In a loop keep choosing elements from Q, s.t. d[u] is as small as possible. Add u to P.
- Update the priority of all the nodes in Q w.r.t u. i.e., compare their current priority with the edge cost with u and update if the edge cost is lesser.

Running Time Analysis:

- Adjacency lists are preferred.
- Make heap and other inititalizations –
**O(n)** - One iteration O(1) in look up and deletion – total of
**O(nlogn)**. - Total iterations of ChangePriority O(m) and each individual costs O(log(n)), hence a total of
**O(mlogn)**.

**Solution 2: Kruskal’s Algorithm**

Points to remember:

- Uses MST lemma.
- Uses MFset( also called Union-Find) Data Structure
- O(mlogn) using MFSet
- Runtime dramatically reduces near to Ackermann Function after using path compression, goes to O(n * ackermann(m)), where ackermann(m)<= 5 for all practical purposes.

Steps:

- Sort the edges in increasing order of cost.
- Keep adding edges to MST, and whenever a cycle is formed, just don’t add it.

Running Time Analysis: O(mlogn) in worst case without path compression

**Solution 3: **Reverse- Delete Algorithm

Point to remember: Uses the cycle MST lemma

Steps:

- Sort the edges in decreasing order of costs.
- Keep deleting the costliest edges as long as the tree stays connected

Running Time Analysis:** O(n ^{2}**) with usual implementation.

**Notes on MSTs**

- All of them are
**Greedy Solutions** - It turns out that any greedy approach using
**either****of the lemmas****will work.**

## Interesting Applications

#### Cycle Detection in a Graph: Detect if a directed graph contains a cycle

Points to remember:

- Direct extension of DFS.
- Cycle in a graph only if there is a back edge present in the graph.

Code 🙂

Running Time Analysis:

- Seriously?

**Testing Bipartiteness**

Bipartite Graph Definition: *A graph whose vertices can be divided into two disjoint sets X and Y such that every edge has one end in X and other in Y. Consequently, the sets X and Y are independent sets.*

Points to remember:

- Sufficient condition is the non existence of any odd cycles.
- Can use both BFS and DFS to test bipartiteness
- For BFS, check whether there is an odd cycle which can be checked by confirming if there exists an edge between two consecutive layers.
- For DFS, while performing DFS assign colors to the node with negative of parent, hence if you encounter a contradictory case, then not bipartite.

**Extending these algorithms For Directed Graphs:**

Points to remember:

- BFS and DFS remain the same.
- Strong Connectivity can be checked by running both forward and backward BFS/DFS and comparing the two components if they are the same.
- Strong Components will either be equal or disjoint because if s & u are mutually reachable, s & v are mutually reachable then u & v will also be mutually reachable.

**Topological Ordering/Sorting in a Directed Graph:**

Topological Ordering of nodes: If every directed edge is forward pointed then the ordering is a topological ordering.

Points to remember:

- A topological ordering of a graph exists iff the graph is a DAG(Directed Acyclic Graph)
- Useful in finding precedence relations
- For every DAG there must be a node with no incoming edges

Algortithms Possible:

- Recursively follow the node with no incoming edges, delete that and then recursively find the next such vertex,
- A variation of DFS, whenever a node is finished searching, add it to
**front**of a linked list. This list will be a DAG. Note to make repeated invocations of DFS as the DFS might stop before traversing to each edge. For more information see this.

Analysis:

- O(n
^{2}) running time for raw algorithm1, and O(n+m) for DFS implementation - To obtain O(m+n) time optimization, use the following technique for Algorithm-1:
- Store two Data Structures:
- for each w, no of active nodes that have incoming edge w.r.t. w.
- the set of all active nodes that have no incoming edges from other active nodes.

- Store two Data Structures:

So that would be all for this post. BFS, DFS, Dijkstra, Floyd-Warshall, MST, and some interesting applications. The notes are highly influenced from the book by Kleinberg, Tardos and the book by Cormen et.al. . If you are not satisfied have a read in the books, or just post a comment and I can care to explain 🙂 .

Make sure to point out any mistakes or suggestions in the comments.

See you in the next post!