## Topological Sort

### November 19, 2010

There are several ways to represent a graph. One method is an adjacency matrix, which has rows and columns for each vertex; each matrix cell has a 1 if there is an edge from the row vertex to the column vertex and a 0 otherwise. Another method is the adjacency list, which has one list for each vertex, each list containing all the vertices for which there is an edges from the starting vertex.

There are also several ways to search a graph. Depth-first search picks a starting vertex with no predecessors, visits an unvisited successor of that vertex, than an unvisited successor of that vertex, and so on, plunging as deeply into the graph as quickly as possible. When a vertex has no unvisited successors, the recursion pops back up a level and visits another unvisited successor. This down-and-up recursion continues until it pops all the way back to the starting vertex, when it begins again if any unvisited vertices remain. Breadth-first search works the other way, first visiting all the successors of a single vertex, all at the same level, then all the children of each of those successors, and so on, completely examining all the vertices at a single level from the starting vertex before moving on to the next level. Both methods visit all the vertices in a graph, but do so in a different order, and each is appropriate in different situations.

Instead of choosing either an adjacency matrix or adjacency list, we represent a graph as a list of `(from . to)`

edges, which is convenient in many situations. Because we don’t store the adjacency information in any form, we compute it on-the-fly, when we need it. Doing things differently is sometimes fun and often educational, as it forces us to think in ways we don’t normally think.

Before we can perform topological sort, we need to know if the graph has a cycle in which one vertex has an edge to a second vertex, the second vertex has an edge to a third vertex, and so on, eventually returning to the first vertex; in the extreme, a graph could have a vertex with an edge to itself. Our test is recursive: find a *leaf* vertex that has no edges coming from it, delete the leaf and all edges pointing to it, and recur. If you come to a graph that has no leaf, the original was cyclic, but if you eventually remove all the edges and are left with the null graph, the original had no cycle:

`(define (cyclic? edges)`

(define (leaf edges)

(let loop ((froms (unique = (sort < (map car edges))))

(tos (unique = (sort < (map cdr edges)))))

(cond ((null? froms) (if (pair? tos) (car tos) #f))

((null? tos) #f)

((< (car froms) (car tos)) (loop (cdr froms) tos))

((< (car tos) (car froms)) (car tos))

(else (loop (cdr froms) (cdr tos))))))

(define (rem-tos t edges)

(let loop ((edges edges) (zs '()))

(cond ((null? edges) zs)

((= (cdar edges) t) (loop (cdr edges) zs))

(else (loop (cdr edges) (cons (car edges) zs))))))

(let loop ((edges edges))

(if (null? edges) #f

(let ((t (leaf edges)))

(if (not t) #t

(loop (rem-tos t edges)))))))

The sample graph shown on the previous page is acyclic; adding an edge from 10 to 5 causes it to become cyclic:

`> (cyclic? '((3 . 8) (3 . 10) (5 . 11) (7 . 8)`

(7 . 11) (11 . 2) (11 . 9) (11 . 10)))

#f

> (cyclic? '((3 . 8) (3 . 10) (5 . 11) (7 . 8)

(7 . 11) (10 . 5) (11 . 2) (11 . 9) (11 . 10)))

#t

If a graph is non-cyclic, its vertices can be sorted into an order in which every vertex precedes all of its successors. There are typically many possible sorts that satisfy the constraints; for the sample graph, some of these sorts are 7 5 3 11 8 2 9 10, 3 7 8 5 11 10 2 9, and 5 7 3 8 11 10 9 2. `Tsort`

loops through the list of *from* vertices, accumulating a result in `result`

, until no vertices remain. At each step, if the first vertex in the list is already in the result, it is ignored, and the loop moves to the next vertex in the list. Otherwise, the first vertex in the list is added to `result`

, as well as all of the vertices that can be reached by doing a topological sort from each vertex adjacent to the first vertex. This is easier to do than to say:

`(define (tsort edges)`

(define (adjacent v)

(map cdr (filter (lambda (vs) (= (car vs) v)) edges)))

(if (cyclic? edges) (error 'tsort "cyclic")

(let ((froms (unique = (sort < (map car edges)))))

(let loop ((froms froms) (result '()))

(cond ((null? froms) result)

((member (car froms) result)

(loop (cdr froms) result))

(else (loop (cdr froms)

(cons (car froms)

(loop (adjacent (car froms))

result)))))))))

Note that our algorithm is effectively a depth-first search through the graph, stated somewhat differently; we delay calculation of the adjacency list for a given vertex until that vertex is reached. Here is the topological sort of the example graph:

`> (tsort '((3 . 8) (3 . 10) (5 . 11) (7 . 8)`

(7 . 11) (11 . 2) (11 . 9) (11 . 10)))

(7 5 11 2 3 10 8 9)

It is not clear that our experiment in using an alternate data structure and algorithm was a success. The code shown above is longer and less clear than the standard version using adjacency lists and depth-first search, and it’s not very general, as we assumed that all the vertices are identified as integers. At least we had some fun.

We used `filter`

, `sort`

and `unique`

from the Standard Prelude. You can run the program at http://programmingpraxis.codepad.org/RjhH0OAK.

Funny timing. In my graduate Algorithms & Data Structues class (I’m a math Ph.D. student, learning programming on the side), we’re to implement a program that takes in a DAG and uses Dijkstra’s Algorithm for shortest paths.

My Haskell solution (see http://bonsaicode.wordpress.com/2010/11/19/programming-praxis-topological-sort/ for a version with comments):

Quick answer, if the Python module “networkx.py” is installed:

Currently working on my own answer.

My own answer (though I got help from several sources, all mentioned in the comments). Too

long and cumbersome an answer for my taste, but I’m calling it a day.

Oops! Made the link, but didn’t put in the address; my answer is here.

In F#

//remove vertexes until there is no more edges

let rec remove_vertexs (edgelist: (int * int) list) =

match edgelist with

| [] -> []

| _ ->

let src_vertexs = List.unzip edgelist |> fst

let dst_vertexs = List.unzip edgelist |> snd

let all_vertexs = List.append src_vertexs dst_vertexs

let vertex_no_incoming = Set.difference (all_vertexs |> Set.ofList) (dst_vertexs |> Set.ofList) |> Set.toList

let new_edgelist = List.filter (fun (x,y) -> not (x = vertex_no_incoming.[0])) edgelist

vertex_no_incoming.[0] :: (remove_vertexs new_edgelist)

//do a remove_vertexs, and append the result with the remaining isolated vertexs

let topologicalsort (edgelist: (int * int) list) =

let all_vertexs = List.append (List.unzip edgelist |> fst) (List.unzip edgelist |> snd)

let remove = remove_vertexs edgelist

let isolated = Set.difference (all_vertexs |> Set.ofList) (remove |> Set.ofList) |> Set.toList

List.append remove isolated

topologicalsort [(5,11);

(11,2);(11,10);(11,9);

(7,11);(7,8);

(8,9);

(3,10);(3,8)]

[/sourecode]

In F#

Language: Scheme (Gambit)

The code I posted just above has the advantage of doing cycle detection + sorting in one go, but to the cost of a O(n^2) worst case, for an unlucky edge list order.

Here is a modified version that drastically reduces the worst case, by reordering the edges on the fly. I also made the code somewhat more functional.

Thanks for the stimulating problems!

[…] worked with directed graphs (“digraphs”) in a recent exercise. Today’s exercise is another graph theoretical procedure with many applications: […]