1
$\begingroup$

I'm dealing with tables that show connectivity within a network. Can anyone point me toward a technique for ordering the rows and columns so that nodes that are connected tend to be close to each other in the table?

For example, if I had a network like this (connections are directional), the structure isn't immediately obvious; it just looks random.

  1 2 3 4 5
1     x   x
2   x   
3     x  x
4   x  x 
5        x 

But if you re-order the columns and rows this way, the structure is clear; you can see some organisation in the network.

  2 4 5 3 1
2 x    
4 x x   
5   x   
3     x x 
1     x x 

Of course, I'm dealing with larger tables, with ~1024 nodes. I could just code up something that tries to find the arrangement with the minimal average distance between connected nodes by some general-purpose method like gradient descent. But perhaps there's a technique that's particularly suited to this type of problem?

  • 1
    What all have you tried? I can think of some simple techniques like choosing the first entry such that it has the maximum number of neighbors and then proceed so on. However, I am not sure if it will give you the optimal bandwidth.2010-11-14
  • 2
    Two thoughts: (1) This looks a lot like a graph layout problem restricted to one dimension. (2) If your network is acyclic, like your example, you can try [topological sorting](http://en.wikipedia.org/wiki/Topological_sorting) on the nodes. It'll probably look less random, but it may not find clusters if they exist in the network.2010-11-14
  • 2
    @Sivaram: Probably not, because bandwidth minimization is NP-complete.2010-11-14
  • 0
    @ Rahul: Thought so. I was looking up if this was NP-complete. Thanks!2010-11-14
  • 0
    Bandwidth minimization can be approximated up to logarithmic factors.2010-11-14
  • 0
    Bandwidth isn't an issue; I was using the word "network" in a generic sense. Sorry for the confusion. Sivaram and Rahul, than you for the suggestions. I'll experiment with those ideas.2010-11-14
  • 1
    @mhwombat: We know what you meant. [Bandwidth](http://en.wikipedia.org/wiki/Graph_bandwidth) is a term in graph theory that measures how close to the diagonal you can pack the connectivity in an adjacency matrix (i.e. your tables) by rearranging the ordering of nodes. This is definitely related to what you want.2010-11-14

1 Answers 1

1

Rahul's link to the Wikipedia article on topological sorting led me to tsort, which is a Unix command to do topological sorting. The networks I'm dealing with might have some cycles, but it seems that tsort can deal with cycles gracefully, and report the cycles to stderr. It looks like that will do the job I need. Thank you, Rahul!