Abstract: Cloud service providers are adopting AI-based systems to offer data services efficiently. A data platform combines data management systems, serving storage, and compute infrastructure. A data service results from the careful orchestration of these components, with a set of users on one side and their data and knowledge requests on the other. The data service is typically composed of microservices with data as an object. Each microservice offers APIs and SDKs through which users submit tasks and queries to the data platforms. The architecting and planning of cloud-scale data platforms and data services should be automated so that users can focus on describing their workloads without worrying about the underlying architectures. Data service design is extremely challenging. The workloads are broad and scale horizontally. All the architectural components are stateful, dynamic, and performance-sensitive. Architectural complexity is high because of the vast design space and the many sets of requirements. Trade-offs among diverse metrics and concerns are crucial. These challenges are magnified further in cloud-scale systems. A simulation-based framework is built to facilitate performance- and power-accuracy exploration across heterogeneous hardware implementations. This framework is used to explore the design space of big data analytics written in a high-level domain-specific language for reconfigurable systems. An architecture transformation framework is also presented that transforms plain applications into efficient hardware blocks. It automatically performs instruction-level optimization on a petascale simulation kernel, achieving a speedup over state-of-the-art toolchains and domain-specific compilers.
Keywords: Real-Time Data Processing, Cloud-Native Architecture, Scalable Data Pipelines, AI-Driven Insights, End-to-End Data Integration, Data Lakehouse Architecture, Streaming Analytics, Machine Learning at Scale, Event-Driven Architecture, Unified Data Platform, Low-Latency AI Inference, Big Data Orchestration, Cloud Data Warehousing, Predictive Analytics in Real Time, Automated Data Engineering.
DOI: 10.17148/IJARCCE.2022.111251