Algorithms and Techniques for Automated Deployment and Efficient Management of Large-Scale Distributed Data Analytics Services
The advent of the Internet of Things (IoT) has enabled smart applications that provide near real-time and robust predictive data analytics. The backbone of these predictive analytics services is an underlying machine learning (ML) model. Multiple challenges manifest themselves in the context of automated deployment and efficient management of predictive analytics services across the cloud-edge spectrum. First, the development and evaluation of ML models require substantial expertise. Second, provisioning these ML-based data analytics applications often incurs complex deployment and configuration challenges due to the need to handle diverse ML libraries and frameworks and to support a range of hardware and cloud platforms. Third, to handle the dynamic workload of prediction requests, resources need to scale up or down to minimize operational cost while guaranteeing the Service Level Objectives (SLOs) of prediction tasks. Fourth, the accuracy of the ML model may degrade over time as new data arrives, which requires continuous model re-training. Model update tasks can be co-located with background latency-critical tasks on edge devices; however, re-training must not hamper the SLOs of those latency-critical jobs. To address these challenges, this doctoral research makes the following contributions: First, it defines a model-driven, template-based design for the rapid development of machine learning models. Second, it presents an automated service provisioning technique that enables the rapid and agile deployment of application components across the cloud-fog-edge spectrum with minimal domain expertise. Third, it describes our novel algorithms to proactively and dynamically scale the predictive analytics application components, minimizing execution time according to the SLOs while optimizing resource usage under variable workloads.
Finally, it details a scalable and efficient framework for continual ML, particularly Deep Learning (DL)-based model re-training on heterogeneous edge devices. The framework minimizes the DL model update time while guaranteeing the SLOs of the background latency-sensitive jobs.