Hani Al-Sayeh | Juggler: Autonomous Cost Optimization and Performance Prediction of Big Data Applications | #6
Summary:
Distributed in-memory processing frameworks accelerate iterative workloads by caching suitable datasets in memory rather than recomputing them in each iteration. Selecting appropriate datasets to cache and allocating a suitable cluster configuration for caching them both play a crucial role in achieving optimal performance. In practice, both are tedious, time-consuming tasks that are often neglected by end users, who are typically unaware of workload semantics, the sizes of intermediate data, and cluster specifications. To address these problems, Hani and his colleagues developed Juggler, an end-to-end framework that autonomously selects appropriate datasets for caching and recommends a correspondingly suitable cluster configuration to end users, with the aim of achieving optimal execution time and cost.
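To make the caching trade-off in the summary concrete, here is a minimal toy cost model in Python. It is only an illustrative sketch and not Juggler's actual model: the function `total_time`, its parameters, and all numbers are hypothetical, and it assumes a single dataset that is reused once per iteration.

```python
# Toy cost model: an assumption for illustration, not Juggler's actual model.
def total_time(iterations, compute_s, read_cached_s, cache):
    """Estimate the runtime (seconds) of an iterative job reusing one dataset."""
    if not cache:
        # Without caching, the dataset is recomputed in every iteration.
        return iterations * compute_s
    # With caching, it is computed once and read from memory afterwards.
    return compute_s + (iterations - 1) * read_cached_s


# Hypothetical numbers: 10 iterations, 30 s to compute, 2 s to read from cache.
no_cache = total_time(10, 30.0, 2.0, cache=False)
with_cache = total_time(10, 30.0, 2.0, cache=True)
print(f"without caching: {no_cache} s, with caching: {with_cache} s")
```

In a real deployment the benefit also depends on whether the cached dataset fits in the cluster's memory, which is why dataset selection and cluster configuration have to be decided together.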
Questions:
1:02 - Can you introduce your work and describe the current workflow for developing big data applications in the cloud?
2:49 - What is the challenge (maybe hidden challenge) facing application developers in this workflow? What harms performance?
5:36 - How does Juggler solve this problem?
11:55 - As an end user, how do I interact with Juggler?
14:07 - Can you talk us through your evaluation of Juggler? What were the key insights?
16:30 - What other tools are similar to Juggler? How do they compare?
18:17 - What are the limitations of Juggler?
21:57 - Who will find Juggler the most useful? Who is it for?
24:05 - Is Juggler publicly available?
24:23 - What is the most interesting (maybe unexpected) lesson you learned while working on this topic?
27:50 - What is next for Juggler? What do you have planned for future research?
28:49 - What attracted you to this research area?
29:45 - What do you think is the biggest challenge now in this area?
Links:
- Juggler: Autonomous Cost Optimization and Performance Prediction of Big Data Applications (SIGMOD 2022 paper)
- Juggler SIGMOD 22 presentation
- CherryPick: Adaptively Unearthing the Best Cloud Configurations for Big Data Analytics (NSDI 2017 paper)
- Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics (NSDI 2016 paper)
Contact:
- Email: hani-bassam.al-sayeh@tu-ilmenau.de
- TU Ilmenau Database and Information Systems Group