Trisk: Task-Centric Data Stream Reconfiguration

Abstract

Due to the long-running and unpredictable nature of stream processing, any statically configured execution of stream jobs fails to process data in a timely and efficient manner. To meet performance requirements, stream jobs need to be reconfigured dynamically. In this paper, we present Trisk, a control plane that supports versatile reconfigurations while maintaining high efficiency and offering easy-to-use programming APIs. Trisk achieves versatility and usability through a task-centric abstraction, and it encapsulates primitive operations so that a reconfiguration can be described by composing those primitive operations on the abstraction. For efficiency, Trisk adopts a partial pause-and-resume design, through which the synchronization mechanisms of the native stream systems can be further leveraged. We implement Trisk on Apache Flink and demonstrate its usage and performance under realistic application scenarios. We show that Trisk executes reconfigurations with shorter completion time than, and latency comparable to, a state-of-the-art fluid mechanism for state management.
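To make the idea of composing primitive operations on a task-centric abstraction more concrete, the following is a minimal sketch in Python. It is not Trisk's API: the class names (`Task`, `ControlPlane`), the primitives (`pause`, `migrate_state`, `resume`), and the notion of integer "key groups" are all assumptions chosen for illustration. It shows a load-rebalancing reconfiguration expressed as a composition of primitives, where only the affected tasks are paused.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task: one parallel instance owning a set of key groups."""
    task_id: int
    key_groups: set = field(default_factory=set)
    paused: bool = False


class ControlPlane:
    """Illustrative task-centric view of one operator (not Trisk's real API)."""

    def __init__(self, tasks):
        self.tasks = {t.task_id: t for t in tasks}
        self.log = []  # primitive operations in the order they ran

    # --- primitive operations (names are assumptions) ---
    def pause(self, task_ids):
        for tid in task_ids:
            self.tasks[tid].paused = True
            self.log.append(("pause", tid))

    def migrate_state(self, src, dst, groups):
        source, target = self.tasks[src], self.tasks[dst]
        if not (source.paused and target.paused):
            raise RuntimeError("both tasks must be paused before state moves")
        moved = source.key_groups & set(groups)
        source.key_groups -= moved
        target.key_groups |= moved
        self.log.append(("migrate", src, dst, tuple(sorted(moved))))

    def resume(self, task_ids):
        for tid in task_ids:
            self.tasks[tid].paused = False
            self.log.append(("resume", tid))

    # --- a reconfiguration composed from the primitives ---
    def rebalance(self, src, dst, groups):
        affected = [src, dst]  # partial pause: other tasks keep running
        self.pause(affected)
        self.migrate_state(src, dst, groups)
        self.resume(affected)


plane = ControlPlane([Task(0, {0, 1, 2, 3}), Task(1, {4, 5}), Task(2, {6, 7})])
plane.rebalance(src=0, dst=1, groups={2, 3})
```

In this sketch, task 2 is never paused during the rebalance, which mirrors the partial pause-and-resume idea described in the abstract; how Trisk actually synchronizes tasks on Apache Flink is described in the paper itself.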

Publication
In the ACM Symposium on Cloud Computing, Seattle, WA, USA, November 1st-4th, 2021
Yancan Mao
3rd Year Ph.D. Student

My research interests include state management and dynamic reconfiguration of distributed stream processing.

Yuan Huang
Research Assistant

I worked on stream processing systems and other distributed systems.

Runxin Tian
MComp Student

My research interests include distributed stream processing and performance modeling.

Xin Wang
Senior Research Fellow

My research interests include social networks, network economics and distributed systems.

Richard T. B. Ma
Associate Professor

My research interests include cloud computing, big data systems and network economics.