Autonomic Management of Large Clusters and Their Integration into the Grid

Thomas Röblitz, Florian Schintke, Alexander Reinefeld, Olof Bärring, Maite Barroso Lopez, German Cancio, Sylvain Chapeland, Karim Chouikh, Lionel Cons, Piotr Poznański, Philippe Defert, Jan Iven, Thorsten Kleinwort, Bernd Panzer-Steindel, Jaroslaw Polok, Catherine Rafflin, Alan Silverman, Tim Smith, Jan Eldik, David Front, Massimo Biasotto, Cristina Aiftimiei, Enrico Ferro, Gaetano Maron, Andrea Chierici, Luca Dell'Agnello, Marco Serra, Michele Michelotto, Lord Hess, Volker Lindenstruth, Frank Pister, Timm Morten Steinbeck, David Groep, Martijn Steenbakkers, Oscar Koeroo, Wim Som Cerff, Gerben Venekamp, Paul Anderson, Tim Colles, Alexander Holt, Alastair Scobie, Michael George, Andrew Washbrook, Rafael A. García Leiva

Research output: Contribution to journal › Article › peer-review

Abstract

We present a framework for the co-ordinated, autonomic management of multiple clusters in a compute center and their integration into a Grid environment. Site autonomy and the automation of administrative tasks are central to this framework. System behavior is continuously monitored in a steering cycle, and appropriate actions are taken to resolve any problems that are detected.
All presented components were implemented in the course of the EU project DataGrid: the Lemon monitoring components, the FT fault-tolerance mechanism, the quattor system for software installation and configuration, the RMS job and resource management system, and the Gridification scheme that integrates clusters into the Grid.
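The steering cycle mentioned above (monitor, detect a problem, take a corrective action) can be illustrated with a minimal sketch. This is not the Lemon or FT implementation; the `Rule` class, the metric names, the thresholds, and the actions `drain_node` and `clean_tmp` are all hypothetical and chosen only to show the shape of one pass of such a loop.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical rule: if `metric` exceeds `threshold`, run `action`."""
    metric: str
    threshold: float
    action: Callable[[str], str]


def steering_step(samples: Dict[str, Dict[str, float]],
                  rules: List[Rule]) -> List[str]:
    """Run one monitor -> analyze -> act pass over all nodes.

    `samples` maps node name -> {metric name -> measured value}.
    Returns a log of the corrective actions that were triggered.
    """
    log = []
    for node in sorted(samples):  # deterministic order for the log
        metrics = samples[node]
        for rule in rules:
            value = metrics.get(rule.metric)
            if value is not None and value > rule.threshold:
                log.append(rule.action(node))
    return log


def drain_node(node: str) -> str:
    # Placeholder for a real action, e.g. taking the node out of the batch queue.
    return f"drain {node}"


def clean_tmp(node: str) -> str:
    # Placeholder for a real action, e.g. purging scratch space on the node.
    return f"clean /tmp on {node}"


# Assumed example rules and monitoring samples (illustrative values only).
RULES = [
    Rule("load", 8.0, drain_node),
    Rule("tmp_used_pct", 90.0, clean_tmp),
]

SAMPLES = {
    "node01": {"load": 0.7, "tmp_used_pct": 95.0},
    "node02": {"load": 12.3, "tmp_used_pct": 40.0},
}

actions = steering_step(SAMPLES, RULES)
print(actions)
```

In a running system this step would be repeated periodically on fresh monitoring data, so that the effect of each action is observed in the next pass.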
Original language: English
Pages (from-to): 247-260
Number of pages: 14
Journal: Journal of Grid Computing
Volume: 2
Issue number: 3
Publication status: Published - Sep 2005
