When an algorithm developed for one problem domain applies almost verbatim to another, completely unrelated domain, that is the type of insight, beauty and depth that makes computer science a science on its own, and not a branch of something else, namely mathematics, like many professionals educated in the field mistakenly believe. For example, one of the common algorithmic problems during the 60s was the scheduling of tasks on multiprocessor machines. The problem is, you are given a large set of tasks, some of which depend on others, that have to be scheduled for processing on N number of processors in such a way as to maximize processor use. A well-known algorithm for this problem is the Coffman-Graham algorithm. It assumes that there are no circular dependencies between the tasks, as is usually the case when it comes to real world tasks, except in catch 22 situations at some bureaucracies run amok! To do that, the tasks and their dependencies are modeled as a DAG (a directed acyclic graph). In mathematics, this is also known as a partial order: if a tasks T1 depends on T2, we say that T2 preceeds T1, and we write T2 < T1. The ordering is called partial because not all tasks are related in this precedence relation, some are simply independent of each other and can be safely carried out in parallel.

The Coffman-Graham algorithm works by creating a sequence of execution rounds where at each round at most N tasks execute simultaneously. The algorithm also has to make sure that all dependencies of the current round have been executed in previous rounds. Those two constraints are what makes the problem non-trivial: we want exactly N tasks at each round of execution if possible, so that all processors get used, and we also have to complete all tasks that precede a given task T before scheduling it. There are 3 basic steps to achieving the objective:

The Coffman-Graham algorithm works by creating a sequence of execution rounds where at each round at most N tasks execute simultaneously. The algorithm also has to make sure that all dependencies of the current round have been executed in previous rounds. Those two constraints are what makes the problem non-trivial: we want exactly N tasks at each round of execution if possible, so that all processors get used, and we also have to complete all tasks that precede a given task T before scheduling it. There are 3 basic steps to achieving the objective:

- Cleanup the graph so that only direct dependencies are represented. So if there is a task A that depends on B and B depends on another task C, we already know that A depends “indirectly” on C (transitivity is one of defining features of partial orders), so that dependency does not need to be stated explicitly. Sometimes the input of a problem will have such superfluous information, but in fact this could only confuse the algorithm! Removing the indirect dependencies is called transitive reduction, as opposed to the more commonly operation of transitive closure which explicitly computes all indirect dependencies.
- Order the tasks in a single list so that the dependent ones come after their dependencies and they are sort of evenly spread apart. This is the crucial and most interesting part of the algorithm. So how are we to compare two tasks and decide which one should be run first. The trick is to proceed from the starting tasks, the roots of the graph that don’t have any dependencies whatsoever, and then progressively add tasks that depend only on them and then tasks then depend only on them etc. This is called topological ordering of the dependency graph. There are usually many possible such orderings and some of them will lead to a good balanced distribution of tasks for the purpose of CPU scheduling while others will leave lots of CPUs unused. In step (3) of the algorithm, we are just going to take the tasks one by one from this ordering and assign them to execution rounds as they come. Therefore, to make it so that at each round, the number of CPUs is maximized, the ordering must somehow space the dependencies apart as much as possible. That is, if the order is written as [T1, T2, T3, …, Tn] and if Tj depends on Ti, we want j-i to be as big as possible. Intuitively, this is desirable because the closer they are, the sooner we’d have to schedule Tj for execution after Ti, and since they can’t be executed on the same parallel round, we’d end up with unused CPUs. To space the tasks apart, here is what we do. Suppose we have tasks A and B, with all their dependencies already ordered in our list and we have to pick which one is going to come next. From A’s dependencies, we take the one most recently placed in the ordering and we check if it comes before or after the most recently placed task from B’s dependencies. If it comes before, then we choose A, if it comes after then we chose B. If it turns out A and B’s most recently placed dependency is actually the same task that both depend on, we look at the next most recent dependency etc. This way, by picking the next task as the one whose closest dependency is the furthest away, at every step we space out dependencies in our ordering as much as possible.
- Assign tasks to rounds so as to maximize the number of tasks executed on each round. This is the easiest step - we just reap the rewards from doing all the hard work of the previous steps. Going through our ordering [T1, T2, T3, …, Tn], we fill up available CPUs by assigning the tasks one by one. When all CPUs are occupied, we move to the next round and we start filling CPUs again. If while at a particular round the next task to be scheduled has a dependency that’s also scheduled for that same round, we have no choice but to leave the remaining CPUs unused and start the next round. The algorithm does not take into account how long each tasks can potentially take.

Now, I said that this algorithm is also used to solve a completely different problem. The problem I was referring to is drawing networks in a visually appealing way. This is a pretty difficult problem and there are many different approaches whose effectiveness often depends on the structure of the network. When a network is devoid of cycles (paths from on node back to itself), the Coffman-Grahan algorithm just described can be applied!

The idea is to think of the network nodes as the tasks and of the network connections as the dependencies between the tasks, and then build a list of consecutive layers analogous to the task execution rounds. Instead of specifying a number of available CPUs, one specifies how many nodes per layer are allowed, which is generally convenient because the drawing is done on a fixed width computer screen. Because the algorithm does not like circular dependencies, there is an extra step here to remove a select set of connections so that the network becomes a DAG. This is in addition to transitive reduction where we only keep direct connections and drop all the rest. Once the algorithm is complete, the drawing of those provisionally removed connections can be performed on top of the nice layering produced. Thus, the Coffman-Graham is (also!) one of hierarchical drawing algorithms, a general framework for graph drawing developed by Kozo Sugiyama.

## No comments:

## Post a Comment