Publication Details
 
Athanasios Kordelas, Thanasis Spyrou, Spyros Voulgaris, Vasileios Megalooikonomou, Nikos Deligiannis
 

Chapter in Book/ Report/ Conference proceeding

Abstract 

Apache Spark is one of the most commonly used frameworks for Big Data processing. Research on its streaming dynamic resource allocation feature has shown that large data load fluctuations, for instance in website traffic, have a negative impact on automatic scaling. Research has also indicated that the lack of data load prediction, which aims to identify the expected data load increase during peak hours/days, is the root cause of this issue. Hence, this paper proposes an enhanced solution, namely KORDI (Knowledge-based Orchestrated Resource DIstribution), which aims to optimise the allocation of Spark resources for Streaming applications in real time using a SARIMAX model. The experimental evaluation proves that the proposed solution provides a cost reduction of 38% without affecting stability.

Reference