Spanning Tree Algorithm

Below is my version generalizing many "standard" spanning tree algorithms, including Depth-First Search (DFS), Bredth-First Search (BFS), Minimum-Weight Spanning Tree (MST), and Shortest Path Tree (also called Single-Source Shortest Path).

ST (G,s):

Input:     Graph G=(V, E [, W]) and a node s. The weight function W: E --> R (reals) is optional
 A tree T rooted at s and spanning its connected component (more precisely, for directed G, the tree includes all nodes reachable from s by a directed path). T is represented as a subset of E.



for each node v do v.init

S.init; <s,s [, d0]> )

T <-- emptyset

while NOT S.empty do {

<u, v [, x]> <-- S.extract

if NOT v.visited then do {

T <-- T + uv  // adds edge uv to the tree T

v.visit( [x] )

for each edge vw do <v, w [, dvw ]> )



return T;

Node data and optional parameters

Un-weighted graphs

In the case of un-weighted graphs G=(V,E) the optional parameters, denoted by "[...]" in the algorithm above are not used, and node data can be implemented using a single Boolean flag v.visited. Then we can define v.init and v.visit as follows:

v.init :     v.visited <-- false

v.visit :    v.visited <-- true

Weighted graphs

In the case of weighted graphs G=(V, E, W) the situation is somewhat more complex. In particular, the optional parameters need to be used. But the specifics of their use and the node data may depend on the specific task at hand. For example, in the case of MST, the node data can be handled the same way as for the un-weighted graphs above. And the optional parameter dvw in the (and S.extract) is simply the weight W(vw) of the corresponding edge.

But in the case of the Shortest Path tree, each node would maintain a numerical field v.dist which is initiated to infinity for all nodes except s, and then changed to the shortest path distance when the node is visited. Then the node data operations are implemented as follows:

v.init :       v.dist <-- infinity

v.visited :    return ( NOT v.dist == infinity )

v.visit(x):    v.dist <-- x

In the initialization use d0=0, and in the main part of the algorithm storing an edge is done as follows: <v, w , v.dist + W(vw)> )

That is we store an edge vw together with the "candidate shortest path" distance from s to w going through the edge vw.

Storage S

Implementing the storage using different data structures produces different types of spanning trees:

Note: using Priority Queue extracting the maximum (rather than a minimum) will result in a correct Maximum-Weight Spanning Tree, but will not result in a Longest Path Tree (in fact, Longest Path problem, though solvable in many special cases, is NP-complete for general graphs and thus is unlikely to have an efficient algorithm; see also Hamiltonian Path problem, which is a special case of the longest path problem).

Another subtle difference is that the above MST algorithm works correctly even if the edge weights are negative. But the Shortest Path Tree algorithm is guaranteed to work correctly only if all the edge weights are non-negative.