In this project, COMO will focus on load-based routing. Given the proposed network topology, the routing problem is situated on two levels: first the classical routing problem, and second the problem of adapting the light paths. The key question is how to couple these two levels. It is well known that a distributed approach to load-based routing easily results in oscillatory behavior of the network. These oscillations need to be kept under control, for example by introducing a bias or by adapting the reactivity at a meta-level. Given the hierarchical routing in this project, the problem of oscillations will be persistently present. The project will investigate the feasibility of Prioritized Sweeping, a variant of reinforcement learning in which important information can be propagated very quickly. To apply Prioritized Sweeping, the structure of the problem must be known, meaning that the transition probabilities must be available. This seems to contradict the model-free idea of reinforcement learning. In the routing context, however, it reduces to each router knowing its neighbors, which is no real restriction. The expected result of the study is that, based on the concept of Prioritized Sweeping, a stable load-based routing strategy can be developed by propagating information with a different priority on the different levels. We will also investigate how Prioritized Sweeping can contribute to the formation of coalitions of routers, with the aim of obtaining a better load distribution.
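To make the idea of Prioritized Sweeping concrete, the following is a minimal sketch under simplifying assumptions, not the strategy developed in the project. It assumes a single routing level, deterministic forwarding, and static link costs standing in for load. The "known structure" is just each router's neighbor list. The function name `prioritized_sweeping`, its parameters, and the toy graph are hypothetical and chosen for illustration only.

```python
import heapq


def prioritized_sweeping(neighbors, cost, dest, theta=1e-9, max_updates=10_000):
    """Estimate the cost-to-go from every router to `dest`.

    neighbors: dict mapping router -> list of adjacent routers (known structure)
    cost:      dict mapping (router, neighbor) -> link cost (assumed load measure)
    Returns (value, updates): the estimates and the number of backups performed.
    """
    inf = float("inf")
    value = {r: inf for r in neighbors}
    value[dest] = 0.0

    # Predecessors of r are the routers that can forward traffic to r.
    preds = {r: [] for r in neighbors}
    for r, adjacent in neighbors.items():
        for n in adjacent:
            preds[n].append(r)

    def backup(r):
        # One-step lookahead: cheapest link plus the neighbor's estimate.
        return min(cost[(r, n)] + value[n] for n in neighbors[r])

    queue = []  # max-priority queue emulated with negated priorities

    def push(r):
        if r == dest:
            return
        new = backup(r)
        if new != value[r]:
            priority = abs(value[r] - new)  # size of the pending change
            if priority > theta:
                heapq.heappush(queue, (-priority, r))

    for p in preds[dest]:
        push(p)

    updates = 0
    while queue and updates < max_updates:
        _, r = heapq.heappop(queue)
        new = backup(r)
        if new == value[r] or abs(value[r] - new) <= theta:
            continue  # stale queue entry, nothing left to change
        value[r] = new
        updates += 1
        # A change at r may matter to every router that forwards to r.
        for p in preds[r]:
            push(p)
    return value, updates


if __name__ == "__main__":
    # Hypothetical four-router topology with symmetric link costs.
    links = {("A", "B"): 1, ("B", "D"): 5, ("A", "C"): 2, ("C", "D"): 1}
    cost = {**links, **{(b, a): c for (a, b), c in links.items()}}
    neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
    value, updates = prioritized_sweeping(neighbors, cost, "D")
    print(value, updates)
```

In this toy graph the estimates settle at a cost-to-go of 3 for A, 4 for B, and 1 for C. The point of the sketch is the ordering: routers whose estimates would change the most are updated first, and each change is pushed only to the routers that can forward to the changed one. Assigning different priorities to the different routing levels, as the project proposes, is not modeled here.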
This research is also relevant in a broader scope. Hierarchical reinforcement learning is an important research track that should allow reinforcement learning to be applied to problems with very large state spaces and to decomposable problems. The results are also expected to support the development of routing strategies for other systems, such as complex networks of collaborative sensors. In the near future, COMO will conduct basic research on the optimization of wireless sensor webs.
Runtime: 2004 - 2007