Conference paper, Year: 2019

Adapting Batch Scheduling to Workload Characteristics: What can we expect From Online Learning?

Abstract

Despite the impressive growth and size of super-computers, the computational power they provide still cannot match the demand. Efficient and fair resource allocation is therefore a critical task. Super-computers use Resource and Job Management Systems to schedule applications, generally by relying on generic index policies such as First Come First Served and Shortest Processing Time First in combination with Backfilling strategies. Unfortunately, such generic policies often fail to exploit specific characteristics of real workloads. In this work, we focus on improving the performance of online schedulers. We study mixed policies, which are created by combining multiple job characteristics in a weighted linear expression, as opposed to classical pure policies, which use only a single characteristic. This larger class of scheduling policies aims at providing more flexibility and adaptability. We use space coverage and black-box optimization techniques to explore this new space of mixed policies, and we study how they can adapt to changes in the workload. Through an extensive experimental campaign, we show that (1) even the best pure policy is far from optimal and (2) a carefully tuned mixed policy would significantly improve the performance of the system. (3) We also provide empirical evidence that there is no one-size-fits-all policy, by showing that the rapid workload evolution seems to prevent classical online learning algorithms from being effective.
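
To make the distinction between pure and mixed policies concrete, the following is a minimal sketch, not the authors' implementation. It assumes three hypothetical job characteristics (waiting time, estimated processing time, requested cores), a "lowest score runs first" convention, and arbitrary weights; the abstract does not specify the actual characteristics, their normalization, or how the ordering interacts with Backfilling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # Hypothetical job record; field names are illustrative assumptions.
    name: str
    submit_time: float  # seconds since the start of the trace
    est_runtime: float  # user-estimated processing time, in seconds
    cores: int          # number of requested cores


def features(job, now):
    """Return the job characteristics used by the policies (assumed set)."""
    waiting_time = now - job.submit_time
    return (waiting_time, job.est_runtime, float(job.cores))


def score(job, now, weights):
    """Weighted linear expression over the job characteristics."""
    return sum(w * f for w, f in zip(weights, features(job, now)))


def order_queue(queue, now, weights):
    """Sort the waiting queue; the job with the lowest score goes first."""
    return sorted(queue, key=lambda job: score(job, now, weights))


# Pure policies use a single characteristic (one non-zero weight).
FCFS = (-1.0, 0.0, 0.0)  # longest-waiting job first
SPF = (0.0, 1.0, 0.0)    # shortest estimated processing time first

# A mixed policy combines several characteristics; these weights are arbitrary.
MIXED = (-3.0, 1.0, 0.0)

queue = [
    Job("a", submit_time=0.0, est_runtime=100.0, cores=4),
    Job("b", submit_time=10.0, est_runtime=10.0, cores=1),
    Job("c", submit_time=20.0, est_runtime=50.0, cores=2),
]

for label, weights in (("FCFS", FCFS), ("SPF", SPF), ("MIXED", MIXED)):
    ordered = order_queue(queue, now=30.0, weights=weights)
    print(label, [job.name for job in ordered])
```

On this toy queue the three weight vectors produce three different orderings, which illustrates why the weight vector can be treated as a search space for black-box optimization.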
Main file
final_version.pdf (14.65 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02044903, version 1 (21-02-2019)

Cite

Arnaud Legrand, Denis Trystram, Salah Zrigui. Adapting Batch Scheduling to Workload Characteristics: What can we expect From Online Learning?. IPDPS 2019 - 33rd IEEE International Parallel & Distributed Processing Symposium, May 2019, Rio de Janeiro, Brazil. pp.686-695, ⟨10.1109/IPDPS.2019.00077⟩. ⟨hal-02044903⟩