Topological Sort

November 19, 2010

There are several ways to represent a graph. One method is an adjacency matrix, which has rows and columns for each vertex; each matrix cell has a 1 if there is an edge from the row vertex to the column vertex and a 0 otherwise. Another method is the adjacency list, which has one list for each vertex, each list containing all the vertices for which there is an edges from the starting vertex.

There are also several ways to search a graph. Depth-first search picks a starting vertex with no predecessors, visits an unvisited successor of that vertex, than an unvisited successor of that vertex, and so on, plunging as deeply into the graph as quickly as possible. When a vertex has no unvisited successors, the recursion pops back up a level and visits another unvisited successor. This down-and-up recursion continues until it pops all the way back to the starting vertex, when it begins again if any unvisited vertices remain. Breadth-first search works the other way, first visiting all the successors of a single vertex, all at the same level, then all the children of each of those successors, and so on, completely examining all the vertices at a single level from the starting vertex before moving on to the next level. Both methods visit all the vertices in a graph, but do so in a different order, and each is appropriate in different situations.

Instead of choosing either an adjacency matrix or adjacency list, we represent a graph as a list of (from . to) edges, which is convenient in many situations. Because we don’t store the adjacency information in any form, we compute it on-the-fly, when we need it. Doing things differently is sometimes fun and often educational, as it forces us to think in ways we don’t normally think.

Before we can perform topological sort, we need to know if the graph has a cycle in which one vertex has an edge to a second vertex, the second vertex has an edge to a third vertex, and so on, eventually returning to the first vertex; in the extreme, a graph could have a vertex with an edge to itself. Our test is recursive: find a leaf vertex that has no edges coming from it, delete the leaf and all edges pointing to it, and recur. If you come to a graph that has no leaf, the original was cyclic, but if you eventually remove all the edges and are left with the null graph, the original had no cycle:

(define (cyclic? edges) (define (leaf edges) (let loop ((froms (unique = (sort < (map car edges)))) (tos (unique = (sort < (map cdr edges))))) (cond ((null? froms) (if (pair? tos) (car tos) #f)) ((null? tos) #f) ((< (car froms) (car tos)) (loop (cdr froms) tos)) ((< (car tos) (car froms)) (car tos)) (else (loop (cdr froms) (cdr tos)))))) (define (rem-tos t edges) (let loop ((edges edges) (zs '())) (cond ((null? edges) zs) ((= (cdar edges) t) (loop (cdr edges) zs)) (else (loop (cdr edges) (cons (car edges) zs)))))) (let loop ((edges edges)) (if (null? edges) #f (let ((t (leaf edges))) (if (not t) #t (loop (rem-tos t edges)))))))

The sample graph shown on the previous page is acyclic; adding an edge from 10 to 5 causes it to become cyclic:

> (cyclic? '((3 . 8) (3 . 10) (5 . 11) (7 . 8) (7 . 11) (11 . 2) (11 . 9) (11 . 10))) #f > (cyclic? '((3 . 8) (3 . 10) (5 . 11) (7 . 8) (7 . 11) (10 . 5) (11 . 2) (11 . 9) (11 . 10))) #t

If a graph is non-cyclic, its vertices can be sorted into an order in which every vertex precedes all of its successors. There are typically many possible sorts that satisfy the constraints; for the sample graph, some of these sorts are 7 5 3 11 8 2 9 10, 3 7 8 5 11 10 2 9, and 5 7 3 8 11 10 9 2. Tsort loops through the list of from vertices, accumulating a result in result, until no vertices remain. At each step, if the first vertex in the list is already in the result, it is ignored, and the loop moves to the next vertex in the list. Otherwise, the first vertex in the list is added to result, as well as all of the vertices that can be reached by doing a topological sort from each vertex adjacent to the first vertex. This is easier to do than to say:

(define (tsort edges) (define (adjacent v) (map cdr (filter (lambda (vs) (= (car vs) v)) edges))) (if (cyclic? edges) (error 'tsort "cyclic") (let ((froms (unique = (sort < (map car edges))))) (let loop ((froms froms) (result '())) (cond ((null? froms) result) ((member (car froms) result) (loop (cdr froms) result)) (else (loop (cdr froms) (cons (car froms) (loop (adjacent (car froms)) result)))))))))

Note that our algorithm is effectively a depth-first search through the graph, stated somewhat differently; we delay calculation of the adjacency list for a given vertex until that vertex is reached. Here is the topological sort of the example graph:

> (tsort '((3 . 8) (3 . 10) (5 . 11) (7 . 8) (7 . 11) (11 . 2) (11 . 9) (11 . 10))) (7 5 11 2 3 10 8 9)

It is not clear that our experiment in using an alternate data structure and algorithm was a success. The code shown above is longer and less clear than the standard version using adjacency lists and depth-first search, and it’s not very general, as we assumed that all the vertices are identified as integers. At least we had some fun.

We used filter, sort and unique from the Standard Prelude. You can run the program at http://programmingpraxis.codepad.org/RjhH0OAK.

Posted by programmingpraxis

Filed in Exercises

10 Comments »

10 Responses to “Topological Sort”

Graham said
November 19, 2010 at 12:22 PM
Funny timing. In my graduate Algorithms & Data Structues class (I’m a math Ph.D. student, learning programming on the side), we’re to implement a program that takes in a DAG and uses Dijkstra’s Algorithm for shortest paths.

Remco Niemeijer said

November 19, 2010 at 12:27 PM

My Haskell solution (see http://bonsaicode.wordpress.com/2010/11/19/programming-praxis-topological-sort/ for a version with comments):

import Data.List

with :: (a -> b) -> [a] -> (b -> b -> Bool) -> b -> [a]
with t xs eq x = filter ((eq x) . t) xs

noIncoming :: Eq a => [(a, a)] -> [a] -> Maybe a
noIncoming es = find (null . with snd es (==))

isCyclic :: Eq a => [(a, a)] -> Bool
isCyclic = not . null . until (\x -> remove x == x) remove where
    remove es = maybe es (with fst es (/=)) . noIncoming es $ map fst es

tsort :: Eq a => [(a, a)] -> [a]
tsort xs = if isCyclic xs then error "cannot sort cyclic list"
           else f xs . nub . uncurry (++) $ unzip xs where
    f es vs = maybe [] (\v -> v : f (with fst es (/=) v) (delete v vs)) $
              noIncoming es vs

Graham said

November 19, 2010 at 6:16 PM

Quick answer, if the Python module “networkx.py” is installed:

#!/usr/bin/env python2.6

import networkx as nx
G1 = nx.DiGraph({2:[], 3:[8, 10], 5:[11], 7:[8,11], 8:[9], 9:[], 10:[],
    11:[2, 9, 10]})
G1.name = 'G1'
G2 = nx.DiGraph({2:[], 3:[8, 10], 5:[11], 7:[8,11], 8:[9], 9:[], 10:[5],
    11:[2, 9, 10]})
G2.name = 'G2'
for G in [G1, G2]:
    print "Graph: %s" % G
    if nx.is_directed_acyclic_graph(G):
        print "Topological sort:\n%s" % nx.topological_sort(G)
    else:
        print "Sorry, this graph is cyclic"

Currently working on my own answer.

Graham said
November 19, 2010 at 7:03 PM
My own answer (though I got help from several sources, all mentioned in the comments). Too
long and cumbersome an answer for my taste, but I’m calling it a day.
Graham said
November 19, 2010 at 7:05 PM
Oops! Made the link, but didn’t put in the address; my answer is here.
Khanh Nguyen said
November 29, 2010 at 10:13 PM
In F#

//remove vertexes until there is no more edges
let rec remove_vertexs (edgelist: (int * int) list) =
match edgelist with
| [] -> []
| _ ->
let src_vertexs = List.unzip edgelist |> fst
let dst_vertexs = List.unzip edgelist |> snd
let all_vertexs = List.append src_vertexs dst_vertexs
let vertex_no_incoming = Set.difference (all_vertexs |> Set.ofList) (dst_vertexs |> Set.ofList) |> Set.toList
let new_edgelist = List.filter (fun (x,y) -> not (x = vertex_no_incoming.[0])) edgelist
vertex_no_incoming.[0] :: (remove_vertexs new_edgelist)
//do a remove_vertexs, and append the result with the remaining isolated vertexs
let topologicalsort (edgelist: (int * int) list) =
let all_vertexs = List.append (List.unzip edgelist |> fst) (List.unzip edgelist |> snd)
let remove = remove_vertexs edgelist
let isolated = Set.difference (all_vertexs |> Set.ofList) (remove |> Set.ofList) |> Set.toList
List.append remove isolated

topologicalsort [(5,11);
(11,2);(11,10);(11,9);
(7,11);(7,8);
(8,9);
(3,10);(3,8)]
[/sourecode]

Khanh Nguyen said

November 29, 2010 at 10:15 PM

In F#

//remove vertexes until there is no more edges
let rec remove_vertexs (edgelist: (int * int) list) = 
    match edgelist with 
    | []  -> []
    | _   ->
              let src_vertexs = List.unzip edgelist |> fst
              let dst_vertexs = List.unzip edgelist |> snd
              let all_vertexs = List.append src_vertexs dst_vertexs
              let vertex_no_incoming =  Set.difference (all_vertexs |> Set.ofList) (dst_vertexs |> Set.ofList) |> Set.toList
              let new_edgelist = List.filter (fun (x,y) -> not (x = vertex_no_incoming.[0])) edgelist
              vertex_no_incoming.[0] :: (remove_vertexs new_edgelist)
//do a remove_vertexs, and append the result with the remaining isolated vertexs
let topologicalsort (edgelist: (int * int) list) = 
    let all_vertexs = List.append (List.unzip edgelist |> fst) (List.unzip edgelist |> snd)
    let remove = remove_vertexs edgelist
    let isolated = Set.difference (all_vertexs |> Set.ofList) (remove |> Set.ofList) |> Set.toList
    List.append remove isolated

topologicalsort [(5,11);
                 (11,2);(11,10);(11,9);
                 (7,11);(7,8);
                 (8,9);
                 (3,10);(3,8)]

Guillaume said

November 30, 2010 at 5:57 PM

Language: Scheme (Gambit)

; topological sort + cyclic graph detection 

(define (topo-sort graph) ; graph is a list of (parent . child) pairs

  (let ((t-v (make-table))
        (nmax 0))
    
    ; init
    (for-each (lambda (x)   
                (table-set! t-v (car x) 0)
                (table-set! t-v (cdr x) 0)
                ) 
              graph)

    ; propagate until there is no change anymore
    (let loop ()
      (if (> nmax (table-length t-v))
          #f  ; cyclic graph
          (let ((change #f))
            (for-each (lambda (x)
                        (let* ((par (car x))
                               (kid (cdr x))
                               (v-par (table-ref t-v par))
                               (v-kid (table-ref t-v kid))
                               (v     (max v-kid (+ 1 v-par))))
                          (and (> v v-kid)
                               (begin (table-set! t-v kid v)
                                      (set! change #t)
                                      (set! nmax (max nmax v))
                                      ))))
                      graph)
            (if change
                (loop)
                (sort (lambda (x y) (< (cdr x) (cdr y)))
                      (table->list t-v))))))))


(topo-sort `((5 . 11) (11 . 2) (11 . 10) (11 . 9) (7 . 11) (7 . 8) (8 . 9) (3 . 8) (3 . 10)))
;; ((7 . 0) (5 . 0) (3 . 0) (8 . 1) (11 . 1) (10 . 2) (9 . 2) (2 . 2))

(topo-sort `((5 . 11) (11 . 2) (11 . 10) (11 . 9) (7 . 11) (7 . 8) (8 . 9) (3 . 8) (3 . 10) (10 . 5)))
;; #f

Guillaume said

December 2, 2010 at 5:51 AM

The code I posted just above has the advantage of doing cycle detection + sorting in one go, but to the cost of a O(n^2) worst case, for an unlucky edge list order.

Here is a modified version that drastically reduces the worst case, by reordering the edges on the fly. I also made the code somewhat more functional.

Thanks for the stimulating problems!

;; Language: Scheme (Gambit)
;; In one go: topological sort + cyclic graph detection 

(define (topo-sort graph) ; graph is a list of (parent . child) pairs
  (let ((t-v (make-table)))
    ;; init
    (for-each (lambda (x)   
                (table-set! t-v (car x) 0)
                (table-set! t-v (cdr x) 0)) 
              graph)
    (let ((n-node (table-length t-v)))
      
      ;; propagate until there is no change anymore
      (let loop ((graph graph))

        (let ((result2
               
               ;; loop through the graph's edges
               (let loop2 ((rest graph) (new-g1 '()) (new-g2 '()) (change #f) (vmax 0) (prev 0))

                 (cond ((>= vmax n-node) #f) ; cyclic graph detected
                       ((null? rest)   (cons change 
                                             (append new-g1 (reverse new-g2)))) ; all edges visited -> done, return new order

                       (else (let* ((x       (car rest))
                                    (par     (car x))
                                    (kid     (cdr x))
                                    (v-par   (table-ref t-v par))
                                    (v-kid   (table-ref t-v kid))
                                    (v       (max v-kid (+ 1 v-par)))
                                    (prop    (> v v-kid))
                                    (improve (<= v prev))
                                    )
                                                                  
                               (and prop (table-set! t-v kid v)) ; propagation
                               
                               (loop2 (cdr rest)
                                      ;; The next two lines improve the edge order to reduce cost of the worst case
                                      (if improve (cons x new-g1) new-g1)
                                      (if improve new-g2 (cons x new-g2))
                                      (or change prop)
                                      (max vmax v)
                                      v)))))
               ))
                              
          (and result2   ; in the cyclic case, return #f
               (if (car result2)  ; (change)
                   (loop (cdr result2))  ; (new-graph)
                   (sort (lambda (x y) (< (cdr x) (cdr y))) ; finished -> return sorted result
                         (table->list t-v)))))))))


(pretty-print (topo-sort `((5 . 11) (11 . 2) (11 . 10) (11 . 9) (7 . 11) (7 . 8) (8 . 9) (3 . 8) (3 . 10))))
;; ((7 . 0) (5 . 0) (3 . 0) (8 . 1) (11 . 1) (10 . 2) (9 . 2) (2 . 2))

(pretty-print (topo-sort `((5 . 11) (11 . 2) (11 . 10) (11 . 9) (7 . 11) (7 . 8) (8 . 9) (3 . 8) (3 . 10) (10 . 5))))
;; #f

(pretty-print (topo-sort `((4 . 5)(3 . 4)(2 . 3)(1 . 2)(0 . 1))))
;; >> ((0 . 0) (1 . 1) (2 . 2) (3 . 3) (4 . 4) (5 . 5))
;; (3 loops)

(pretty-print (topo-sort `((4 . 5)(3 . 4)(2 . 3)(1 . 2)(0 . 1)(5 . 0))))
;; >> #f
;; (2 loops)

Dijkstra’s Algorithm « Programming Praxis said
January 7, 2011 at 9:02 AM
[…] worked with directed graphs (“digraphs”) in a recent exercise. Today’s exercise is another graph theoretical procedure with many applications: […]

Programming Praxis