Graph Algorithms : Network Flow

Definition of a Flow Network:
A directed graph (V,E) with the following properties:
• Associated with each edge e is a capacity c(e) , c(e)>0.
• There is a single source node s belonging to V
• There is a single sink node t belonging to V
Nodes other than s and t are called internal nodes.
Assumptions:
• All capacities are integers.
• There must be at least one edge leaving source
• There must be at least one edge entering the sink
Defining Flow:
Basically carrying traffic or flow. More formally, flow is a function mapping each edge to a value f(e) satisfying the following three constraints:
• {Capacity Constraint} For each edge e,  f(e)<=c(e)
• {Skew Symmetry} For each pair of vertices u,v ,  f(u,v) = -f(v,u)
• {Flow Conservation} For each node v other than s and t,  sum of all flow going out of v must be zero. In other words, total +ve flow coming into v (or fin(v)) equals total +ve flow going out of v (or fout(v)).
Hence f can be both negative or positive, also called the net flow of that edge. The value of network flow is thence summation of flow going out of source.
Note: Generally flow for an edge corresponds to the positive flow.
Defining Flow between sets:
f(X,Y) = sum of f(x,y) , x belongs to X, y belongs to Y.
Some Basic Identities:
• f(X,X) = 0
• f(X,Y) = -f(Y,X)
• f(XUY,Z) =  f(X,Z) + f(Y,Z) if X and Y are disjoint.
• f(s,V) = f(V,t)
Proof:
f(s,V)
= f(V,V) – f(V-s,V)
= f(V,V-s)
= f(V,t) + f(V,V-s-t)
= f(V,t) – f(V-s-t,V)
= f(V,t)

The Maximum-Flow Problem

Natural Goal – Maximizing the flow from source to sink without violating capacity constraints.
Note: Problem for multiple sources and sinks can be reduced making a global source and a global sink and connecting with infinite capacities.

Example: Formulate maximum flow as linear programming problem.

Designing The Algorithm:
• DP won’t work here.
• If start with a greedy approach then we can counter that by compensating some flows and directing it to other edges.
• Hence the concept of augmenting paths

Residual Graph: G’ defined w.r.t. some flow f.

• Node set of G’ same as G
• Consists of edges that can admit more net flow.
• For every edge having flow less than capacity (assuming forward to be the direction of the capacity)
• A forward edge with residual capacity
• A backward edge describing the flow for that edge
• Mathematically, cf(u,v)=c(u,v)-f(u,v) {c(u,v) will be zero for the reverse edge in nearly all cases}

Note: If f is a flow in G and  f’ is a flow in Gf, then f+f’ is also a flow in G. Proof is trivial using the definition of flow network.
Augmenting Flow using the residual graph:
For any residual graph G’ , a simple path P with the bottleneck edge b as the minimum cost edge in P is called an augmenting path.

fp(u,v): b if (u,v) belongs to P, -b if (v,u) belongs to P, 0 otherwise.

Augmenting flow is defined as f’(u,v) = f(u,v)+fp(u,v)
Proof of f’ being a flow:

If e is a forward edge on P ->  0 <= f(e) <=  f’(e) = f(e) + b <=  f(e) + (ce – f(e)) = ce;
If e is a backward edge on P -> ce >= f(e) >= f’(e) = f(e) – b >= f(e) – f(e) = 0;
Proof for conservation of each node:
If node not in P, no effect. If node is in P then one outgoing flow increases and one incoming flow decreases by the same amount.
Flows and Cuts:
Notion of Cut:  partition the nodes into two sets S & T, s belongs to S, t belongs to T. Then (S,T) form a cut.
Identity: Flow of a cut ( f(S,T) ) equals f(s,V).
Proof:
• f(S,T)
• = f(S,V-S)
• = f(S,V)
• = f(s,V) + f(V-s,V)
• = f(s,V)
• = |f|
Capacity of the Cut: c(S,T) sum of capacities of all edges from S to T.
Theorem: Max Flow = Min Cut for a flow network.
Proof:
• Any Flow <= Capacity of any Cut
• Trivial
• If f is maximum,
• Gf has no augmenting path
• Proof:
• |f’| = |f| + |fp| hence flow increasing with each augmentation.
• |f| = c(S,T) for some S,T
• Proof:
• Take S as all nodes reachable from s in the Residual Graph.
• As t is not reachable from s, S must not contain t.
• Take T as V-S.
• As there are no outgoing edges from S to T in Gf , f(S,T) must be equal to c(S,T).
Ford-Fulkerson Algorithm for finding Max-Flow:
Max-Flow
Initially f(e)=0 for all e in G
While there is an s-t path in the residual graph Gf
• Let P be any (not necessarily simple, simple makes it the generic flow algorithm) s-t path in Gf
• f’= augment(f, P)
• Update f to be f’
• Update the residual graph Gf to be Gf
Endwhile
Return
Points to remember:
• After every augmentation, the flow values and residual capacities are integers
• While augmenting the flow is taken along the direction of the path.
• The total value of flow increases by the bottleneck edge of s-t path cost as all other points conserved
• At most |F| iterations where |F| is maximum flow.
• Hence running time at most O(m|F|) as BFS in O(m+n) and n<=2*m as all nodes have at least one incident edge.
Edmond-Karp Algorithm for finding Max-Flow:
Same as the Generic flow algorithm, with augmentation done only with shortest BFS paths (edge weights as 1).
Running Time With This Modification: O(m2n).
Proof that running time is O(m2n):
• Let δ(s,v) be the level of v in BFS tree.
• After an augmentation, for each edge:
• Either a backward edge is added (if existing was forward)
• Or a forward edge is added (if existing was backward)
• Which means, in a BFS tree, if Li to Li+1 was removed, Li+1 to Li would be added.
• Hence, δ(s,v) must increase at every iteration.
• If an edge starting at u is chosen as the bottleneck edge, u’s level increases by 2.
• Hence a total of 2m * n/2 edges can be chosen as bottleneck.
• Total no. of iterations <= mn.
• Hence O(m2n).
Final Points:
• The flow returned by Ford-Fulkersen Algorithm is a maximum flow.
• Our time analysis assumes that the capacities are integers.
• Given a flow of maximum value, we can compute s-t cut of minimum capacity in O(m) time.
• In every flow network, the maximum value of  s-t flow is equal to the minimum capacity of s-t cut.
• Dinic’s algorithm computes Max Flow in O(n2m).

Graph Algorithms – a revision of all the basic ones.

Hello friends! This post will be a short review of the the most common and basic algorithms for undirected graphs.

Note: This is not a post intended for folks who are not at all familiar with graphs. For beginners a better starting point would be a book like this.

We will start by discussing the basic problems and then we will have a look at some common applications.

Basic Problem 1 : Traversal

Problem Statement: graph traversal is the problem of visiting all the nodes in a graph in a particular manner, updating and/or checking their values along the way.

Solution 1 : Breadth First Traversal/Search (BFS)

TL;DR: search is limited to essentially two operations: (a) visit and inspect a node of a graph, (b) gain access to visit the nodes that neighbour the currently visited node.

Points to remember:

• Layer-wise search forming layers L[0],L[1],…,L[i] of nodes
• L[i] are neighbours to L[i-1] ,  and are not present in any of the earlier layers.
• L[j] is exactly j distant from S. (assuming non-weighted edges , each with cost 1)
• BFS tree can be obtained after running BFS.
• Any edge can be present between nodes not differing by more than one layer
• Any BFS routine produces the connected component that contains the root.
• O(m+n) running time and space.
• Efficient implementation using a queue.

Steps:

1. Explore all neighbours of root. Call them level 1.
2. Then explore all neighbours of nodes of level 1. Call them level 2.
3. Keep Exploring like this 🙂

Code:

Without Queue:

 #define nodecount 10000 bool discovered[nodecount]; vector > L;//Layers void BFS(int s, vector > & adjList)//BFS for arrays, without queue, non-recursive, by @arkanathpathak { fill(discovered, discovered+nodecount, false); discovered[s] = true; for(int v=0;v)); L[0].push_back(s); while(L[i].size()!=0) { L.push_back(*(new vector)); for(int j=0;j < (int)L[i].size(); j++) { int u = L[i][j]; for(int k=0;k<(int)adjList[u].size();k++) { int v = adjList[u][k]; if(discovered[v]==false) { discovered[v] = true; L[i+1].push_back(v); } } } i++; } }

view raw
BFS, with layers
hosted with ❤ by GitHub

With Queue:

 vector adjList[N]; int visited[N]; void bfs(int u) { visited[u] = 1;//mapping the visited nodes, by default 0 queue nq;//queue used nq.push(u);//push initial node while(!nq.empty()) { int tt = nq.front(); visited[tt] = 1; nq.pop(); for(int i=0;i

view raw
BFS with Queue
hosted with ❤ by GitHub

Solution 2 : Depth First Traversal/Search (DFS)

TL;DR: Greedy search, where search retreats only when it encounters a dead-end of the branch,i.e. all neighbours.

Points to remember:

• Like the BFS tree, a DFS tree is obtained.
• O(m+n) running time and space.
• If an edge (x,y) is not present in the DFS tree, then one is a descendant of the other
• Can be used to produce Topological Sorting.
• Can be used to produce SCCs(Strongly connected components) using Kosaraju’s two-pass algorithm.
• Simple recursive implementation possible that makes use of recursion stacks
• Non-recursively efficiently implemented using Stacks.
• Discovery times of every node follows parenthesis structure (Source: CLRS)
• For finding all paths, we can unmark the node  as visitied when we finish the loop.
• Levels are processed in reverse order for non recursive with respect to that of recursive implementation

Steps:

1. Explore first neighbour of root.
2. Then explore the first unexplored neighbour of the first neighbour
3. Keep exploring until you reach a node with no unexplored neighbours.
4. Backtrack at that node until you reach a node with unexplored neighbours.

Code:

With Recursion:

 vector adjList[N]; int visited[N]; void dfsrecursive(int u) { visited[u] = 1; for(int i=0;i

view raw
DFS with Recursion
hosted with ❤ by GitHub

Without Recursion, with Stack:

 vector adjList[N]; int visited[N]; void dfsstack(int u) { stack nstack;//stack used in place of recursion stack visited[u] = 1; nstack.push(u); while(!nstack.empty()) { int tp = nstack.top(); nstack.pop(); for(int i=0;i

view raw
DFS with Stack
hosted with ❤ by GitHub

Notes on Traversal

• BFS and DFS have same running time and space efficiency
• BFS and DFS can be used to find connected components.
• BFS and DFS can be used to find connected cycles (DFS is more efficient and can be used in directed graphs also, as backward edge produces cycle).
• Classification Of Edges(directed): Tree Edge, Forward Edge(connecting to descendant), Backward Edge(connecting to ancestor), Cross Edge(all other). In DFS, only tree edge or backward edge is possible.
• Classification Of Edges(un-directed): Tree Edge, Forward Edge(discovered by an ancestor), Backward Edge(discovered by a successor), Cross Edge. In DFS, only backward or forward or tree edges are possible.In BFS, only cross edges are possible.

Basic Problem 2 : Shortest Paths

Single Source Problem Statement:  Given a undirected weighted graph G and a source node u, find the shortest cost paths from u  to all other nodes.

Solution: Dijkstra’s Algorithm

TL;DR: Algorithm starts at the source vertex, u, it grows a tree, T, that ultimately spans all vertices reachable from u. Vertices are added to T in order of distance i.e., first u, then the vertex closest to u, then the next closest, and so on.

Steps:

• Initialise a set P and a set Q, with P initially storing u with cost 0, and all other in Q with cost infinity.
• Change the priorities of the nodes in Q having an edge with u as their edge costs.
• Dequeue the minimum key node and add it to P.
• Adjust the priorities of other nodes in Q so that the cost of addition gets minimised.
• At the end, the keys are the shortest path costs.

Cost of addition is given by

Code:

 int shortestdistance[N]; int explored[N]; vector > adjList[N]; // stores edges with cost void dijkstra(int node)// Does not use priority queue, O(M+N2) implementation { shortestdistance[node] = 0; explored[node] = 1; int discovered = 1; int lastadded = node; while(discovered

view raw
Dijkstra's Algorithm
hosted with ❤ by GitHub

Proof of Correctness:
Proof by Induction on set P’s size: Lets assume that the paths are correct for all the previous nodes in P for some node v. Then if there exists some path with smaller cost it must cross P some time, let’s say it crosses with edge passing from x to y. Then algorithm would have selected y before v as all edge lengths are positive, which is a contradiction.

Running Time Analysis:
Total deleteMin – O(nlogn) with Priority Queue, O(n2) without Priority Queue
Total change priorities – O(mlogn) with Priority Queue, O(m) without Priority Queue

Hence total running time = O((n+m)logn) with Priority Queue, O(m+n2) without Priority Queue

Points to remember:

• Uses Adjacency List Representation of Graph
• Much Similar to Prim’s Algorithm
• Requires all edge costs to be non-negative

Okay, now let’s look at a problem where we are not required to find the shorted distances from only one source, but from each node.

All Pairs Shortest Paths Problem Statement:  Given a undirected weighted graph G and, find the shortest cost paths between all pairs.

Naive Solution: Run Dijkstra’s algorithm on all nodes. O(mnlogn) = O(n3logn).

Solution 1: Floyd-Warshall Algorithm

TL;DR: Dynamic Programming approach with subproblem as shortest distance between node i and node j using only the first k vertices in between.

Points to remember:

• O(n3) running time
• No negative cost cycles should be present.
• In place implementation possible as the (k+1)th row, and (k+1)th column don’t change in one iteration.

Steps:

• Maintain n 2d arrays D[n][][], such that D[k][i][j] gives the shortest distance between i and j using only the first k vertices in between.
• Final result will be D[n-1]
• Initiate D[-1][i][j] as 0 if i==j, cost of (i,j) if edge exists, infinity otherwise.
• Using dynamic programming, find D[k+1][i][j] = min(D[k][i][j] , D[k+1][i][k+1] + D[k+1][k+1][j]).

Steps for finding the shortest paths:

1. Same kind of implementation, use a matrix N[][][]
2. N[-1][i][j] = j if (i,j) exists, undefined otherwise.
3. N[k+1][i][j] = N[k][i,k+1] if k+1 path is shorter, else N[k][i][j]
4. Now for finding the path between i,j keep following the array N and printing the nodes.

Basic Problem 3 : Minimum Spanning Trees

Definition: A spanning tree of a graph is a tree that covers all the nodes of the original graph, e.g. BFS/DFS trees. A minimum spanning tree is a spanning tree with the minimum cost.

The MST Lemma (Cut Property): Let P and Q be two disjoint node sets that together give the union of the original graph nodes, then among all the edges between P and Q, the node with minimum edge cost must be present in an MST(there might be more than one MST)

Proof: Let T be an MST that doesn’t contain e{u,v} (the minimum edge cost), then adding e to it will produce a cycle. Traverse along that cycle , and whenever it crosses from P to Q, replace that edge with e. This produces an MST again.

The MST Cycle Lemma (Cycle Property): The costliest edge of a connected undirected graph must not be in an(any if edge costs are distinct in that cycle) MST.

Proof: Let T be an MST that contains the costliest edge e {u,v}. T-e gives two connected disjoint components, traverse along the cycle to get a cheaper edge that can be replaced with e.

Problem Statement: Find a minimum spanning tree for an undirected weighted graph.

Solution 1: Prim’s Algorithm

Point to remember: Nearly direct implementation of the MST lemma.

Steps:

• Start with a set P with some random root node u and a priority queue Q that contains all the remaining nodes of graph with key as infinity.
• Set priority of the nodes that share an edge with u as their edge cost.
• In the priority queue Q maintain a priority of the shortest edge cost from the node to P and the respective closest neighbour
• In a loop keep choosing elements from Q, s.t. d[u] is as small as possible. Add u to P.
• Update the priority of all the nodes in Q w.r.t u. i.e.,  compare their current priority with the edge cost with u and update if the edge cost is lesser.

Running Time Analysis:

• Make heap and other inititalizations – O(n)
• One iteration O(1) in look up and deletion – total of O(nlogn).
• Total iterations of ChangePriority O(m) and each individual costs O(log(n)), hence a total of O(mlogn).

Solution 2: Kruskal’s Algorithm

Points to remember:

• Uses MST lemma.
• Uses MFset( also called Union-Find) Data Structure
• O(mlogn) using MFSet
• Runtime dramatically reduces near to Ackermann Function after using path compression, goes to O(n * ackermann(m)), where ackermann(m)<= 5 for all practical purposes.

Steps:

1. Sort the edges in increasing order of cost.
2. Keep adding edges to MST, and whenever a cycle is formed, just don’t add it.

Running Time Analysis: O(mlogn) in worst case without path compression

Solution 3: Reverse- Delete Algorithm

Point to remember: Uses the cycle MST lemma

Steps:

1. Sort the edges in decreasing order of costs.
2. Keep deleting the costliest edges as long as the tree stays connected

Running Time Analysis: O(n2) with usual implementation.

Notes on MSTs

• All of them are Greedy Solutions
• It turns out that any greedy approach using either of the lemmas will work.

Interesting Applications

Cycle Detection in a Graph: Detect if a directed graph contains a cycle

Points to remember:

• Direct extension of DFS.
• Cycle in a graph only if there is a back edge present in the graph.

Code 🙂

 vector adjList[N]; int visited[N]; bool dfsrecursiveandcycle(int u, int parent) { cout << "visited " << u << endl;// 🙂 visited[u] = 1; bool ans = false; for(int i=0;i

Running Time Analysis:

• Seriously?

Testing Bipartiteness

Bipartite Graph Definition: A graph whose vertices can be divided into two disjoint sets X and Y such that every edge has one end in X and other in Y. Consequently, the sets X and Y are independent sets.

Points to remember:

• Sufficient condition is the non existence of any odd cycles.
• Can use both BFS and DFS to test bipartiteness
• For BFS, check whether there is an odd cycle which can be checked by confirming if there exists an edge between two consecutive layers.
• For DFS, while performing DFS assign colors to the node with negative of parent, hence if you encounter a contradictory case, then not bipartite.

Extending these algorithms For Directed Graphs:

Points to remember:

• BFS and DFS remain the same.
• Strong Connectivity can be checked by running both forward and backward BFS/DFS and comparing the two components if they are the same.
• Strong Components will either be equal or disjoint because if s & u are mutually reachable, s & v are mutually reachable then u & v will also be mutually reachable.

Topological Ordering/Sorting in a Directed Graph:

Topological Ordering of nodes: If every directed edge is forward pointed then the ordering is a topological ordering.

Points to remember:

• A topological ordering of a graph exists iff the graph is a DAG(Directed Acyclic Graph)
• Useful in finding precedence relations
• For every DAG there must be a node with no incoming edges

Algortithms Possible:

• Recursively follow the node with no incoming edges, delete that and then recursively find the next such vertex,
• A variation of DFS, whenever a node is finished searching, add it to front of a linked list. This list will be a DAG. Note to make repeated invocations of DFS as the DFS might stop before traversing to each edge. For more information see this.

Analysis:

• O(n2) running time for raw algorithm1, and O(n+m) for DFS implementation
• To obtain O(m+n) time optimization, use the following technique for Algorithm-1:
• Store two Data Structures:
• for each w, no of active nodes that have incoming edge w.r.t. w.
• the set of all active nodes that have no incoming edges from other active nodes.

So that would be all for this post. BFS, DFS, Dijkstra, Floyd-Warshall, MST, and some interesting applications. The notes are highly influenced from the book by Kleinberg, Tardos and the book by Cormen et.al. . If you are not satisfied have a read in the books, or just post a comment and I can care to explain 🙂 .

Make sure to point out any mistakes or suggestions in the comments.

See you in the next post!

Hello World! What are Greedy Algorithms?

So hello world.. This would be the first post on this blog.

The posts here would be of the kind that are easy to read, objective oriented, and food for thought.

So this one goes for the beginners… don’t worry, after reading this you will not be a one!

Greedy Algorithms

Definition: An algorithm which makes locally optimum choices with a hope to make the whole outcome optimum.

Points to remember:

• Greedy Algorithms are most commonly proved using two different approaches
• First proving approach is “Stays Ahead” where you prove that your algorithm stays ahead in terms of the underlying criteria w.r.t. to any other approach.
• Second proving approach is called “Exchange Argument”, where you start with the optimal solution and exchange it with the greedy approach using some number of steps so that at each step you prove that you are not decreasing the outcome.

Some Basic Pseudocodes:

Interval Scheduling Problem:

 Problem: Given set of intervals with starting time and end time, give the set of intervals with maximum cardinality such that no intervals overlap. Algorithm: * Initially let R be the set of all requests. * While R is not empty: * Choose a request i with earliest finish time * add to queue A * delete all incompatible from R. * EndWhile * Return A Analysis: * A is compatible * A is optimal with the "Stays Ahead" argument. * Running Time : O(n logn)

Interval Partitioning Problem:

 Problem: Given intervals with finish and end times, partition the intervals into labels with minimum amount of labels so that no two intervals in any label overlap. Algorithm: * Sort Intervals w.r.t. starting time, breaking tries arbitrarily. * Let them be I1, I2, … , In * For j = 1:n * For each (i

Minimize maximum Lateness:

 Problem: Variation of Interval Scheduling, with deadlines and duration given, all must be performed on a single resource, find the ordering with minimum maximum lateness of any interval. Algorithm: * Sort Jobs w.r.t the deadlines * Let them be d1, d2, … ,dn * Initially f=s * For (i=1:n) * Assign s(i) f, f(i) = s(i) + t(i) * f=f(i) * EndFor * Return Analysis: * There is an optimal solution with no idle time. * Proof of optimality by "Exchange Argument"