By Yuen Chung Kwong

The third in the "Series on Scalable Computing", this work comprises five articles describing major developments in the field. It addresses topics such as clusters, parallel tools, load balancing, mobile systems, and architecture dependence.

Example text

Most often, workstation clusters are used not only for high-throughput computing in time-sharing mode but also for running complex parallel jobs in space-sharing mode. This poses several difficulties for the resource management system. It must be able to reserve computing resources for exclusive use and also to determine an optimal process mapping for a given system topology. On the basis of our CCS environment, we have presented the anatomy of a modern resource management system. CCS is built on three architectural elements: the concept of autonomous domains, the versatile resource description tool RSD, and the site management tools CIS and CRM.
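
As a rough illustration of what reserving resources for exclusive use in space-sharing mode involves, the following minimal sketch grants a job a set of nodes for a time window only if those nodes are not held by another job during that window. It is not the CCS implementation: the names `Cluster`, `Reservation`, `free_nodes`, and `reserve` are hypothetical, time is modelled as plain integers, and the sketch simply takes the first free nodes rather than computing an optimal process mapping for a topology.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Reservation:
    """Exclusive hold on a set of nodes for the half-open window [start, end)."""
    job: str
    nodes: frozenset
    start: int
    end: int


@dataclass
class Cluster:
    """Hypothetical cluster model: a list of node names plus granted reservations."""
    nodes: list
    reservations: list = field(default_factory=list)

    def free_nodes(self, start, end):
        """Return the nodes not reserved by any job during [start, end)."""
        busy = set()
        for r in self.reservations:
            # Two half-open windows overlap if each starts before the other ends.
            if r.start < end and start < r.end:
                busy |= r.nodes
        return [n for n in self.nodes if n not in busy]

    def reserve(self, job, count, start, end):
        """Grant `count` nodes exclusively to `job`, or return None if too few are free."""
        free = self.free_nodes(start, end)
        if len(free) < count:
            return None
        # Assumption: no topology-aware mapping, just the first free nodes.
        granted = Reservation(job, frozenset(free[:count]), start, end)
        self.reservations.append(granted)
        return granted


cluster = Cluster(nodes=["n0", "n1", "n2", "n3"])
first = cluster.reserve("A", count=3, start=0, end=10)
rejected = cluster.reserve("B", count=2, start=5, end=15)   # overlaps A, only n3 free
later = cluster.reserve("B", count=2, start=10, end=20)     # starts when A ends
```

In this sketch the second request is refused because only one node is free while job A runs, and it succeeds once its window is moved to start after A's reservation ends.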

The bulk of them follow the off-line approach, in which the gathered data can be analysed only after the application has terminated. Representative examples are Vampir, Pablo, and ParaGraph. On-line tools are far fewer, and fewer still are those that feature a well-defined interface to a monitoring facility, which could provide the prerequisites for interoperability. Practically every on-line tool needs such an interface to a monitoring system, through which it can observe and possibly influence the application's state. Because virtually all of the existing tools are supplied with their own proprietary monitoring systems, they are not capable of cooperating with each other.
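
To make the interoperability point concrete, here is a minimal sketch of two on-line tools sharing one monitoring interface instead of each bringing its own monitoring system. Everything in it is an assumption made for illustration: the `Monitor` interface, its `observe` and `influence` methods, the `InMemoryMonitor` stand-in, and the two toy tools do not correspond to any existing tool or monitoring system named in the text.

```python
from abc import ABC, abstractmethod


class Monitor(ABC):
    """Hypothetical common interface between on-line tools and a monitoring system."""

    @abstractmethod
    def observe(self, pid: int) -> dict:
        """Return a snapshot of the state of process `pid`."""

    @abstractmethod
    def influence(self, pid: int, command: str) -> None:
        """Change the state of process `pid` ('suspend' or 'resume')."""


class InMemoryMonitor(Monitor):
    """Toy monitoring system that tracks process status in a dictionary."""

    def __init__(self, pids):
        self._state = {pid: {"status": "running"} for pid in pids}

    def observe(self, pid):
        return dict(self._state[pid])  # copy, so tools cannot mutate state directly

    def influence(self, pid, command):
        if command == "suspend":
            self._state[pid]["status"] = "suspended"
        elif command == "resume":
            self._state[pid]["status"] = "running"
        else:
            raise ValueError(f"unknown command: {command}")


class Debugger:
    """Toy tool that influences the application through the shared interface."""

    def __init__(self, monitor: Monitor):
        self.monitor = monitor

    def pause(self, pid):
        self.monitor.influence(pid, "suspend")


class Visualizer:
    """Toy tool that only observes the application through the shared interface."""

    def __init__(self, monitor: Monitor):
        self.monitor = monitor

    def status_line(self, pid):
        return f"process {pid}: {self.monitor.observe(pid)['status']}"


monitor = InMemoryMonitor(pids=[1, 2])
debugger = Debugger(monitor)
visualizer = Visualizer(monitor)
debugger.pause(1)
line = visualizer.status_line(1)
```

Because both tools talk to the same `Monitor` object, the visualizer sees the suspension performed by the debugger; with two separate proprietary monitoring systems, neither tool would be aware of the other's actions.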

