Agentic workflows are AI-driven software systems that chain together multiple models and tools to perform complex tasks, such as video analysis and question answering. However, their fragmented design often leads to inefficiency, wasting computation and energy and driving up costs. To address this, researchers from MIT and Microsoft built a system that simplifies the design of these workflows and automatically optimizes how they are deployed.
With this system, developers can describe what a workflow should do in plain language, without specifying detailed application configurations. The system selects the best models and tools, along with the hardware setup and resource allocation, and adjusts them in real time based on user priorities such as lower cost or higher speed. In testing, it significantly reduced the computation and energy required, cutting costs without degrading performance.
“Agentic workflows are becoming increasingly complex and vital for cloud providers. Energy consumption is a major concern, and we must ensure efficiency to avoid wasting resources,” said Gohar Chaudhry, an EECS graduate student and lead author of the study. Co-authors include Adam Belay of MIT and Ricardo Bianchini of Microsoft Azure. The study will be presented at the USENIX Symposium on Operating Systems Design and Implementation.
Agentic workflows consist of autonomous AI agents using various models and tools to complete tasks like data processing. Developers typically must fix every technical choice up front, which is hard given the vast number of possible configurations. If a new AI model could improve an application, developers would have to start over from scratch. Chaudhry notes that this breadth of options is what makes manual optimization so difficult.
The new system, named Murakkab, aims to optimize the entire agentic workflow process. It allows developers to describe their application intentions in broad terms, automatically assembling the best models and tools. Murakkab decides which components to run sequentially or in parallel, adapting to new models or GPU accelerators without developer intervention.
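As a rough illustration of the sequential-versus-parallel decision described above, the sketch below groups workflow steps into stages from their dependencies. It is not Murakkab's scheduler: the step names and the parallel_stages helper are hypothetical, and the only assumption is that a workflow can be written as a dependency graph. Steps that land in the same stage do not depend on one another and could run side by side.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_stages(deps):
    """Group steps into ordered stages.

    `deps` maps each step name to the set of steps it depends on.
    Steps within one stage have no dependencies on each other.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        # Everything whose dependencies are already done is ready now.
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Hypothetical video question-answering workflow (names are illustrative).
workflow = {
    "extract_frames": set(),
    "transcribe_audio": set(),
    "detect_objects": {"extract_frames"},
    "answer_question": {"detect_objects", "transcribe_audio"},
}

stages = parallel_stages(workflow)
for number, steps in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(steps)}")
```

Here frame extraction and audio transcription share the first stage because neither needs the other's output, while the final answering step waits for both branches.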
When deployed by cloud providers, Murakkab configures each workflow to meet the user's constraints and maximizes efficiency by dynamically adjusting hardware allocations. It also gives providers visibility across multiple workloads, so resources can be shared efficiently without violating those constraints.
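One way to picture configuring a workflow around user constraints is as a filter-then-minimize step over candidate configurations. The sketch below is a simplified assumption, not the system's actual optimizer: the Config fields, the pick_config helper, and every number in the candidate list are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One hypothetical way to run a workflow (all values illustrative)."""
    model: str
    gpus: int
    latency_s: float
    cost_usd: float
    energy_wh: float
    accuracy: float


def pick_config(candidates, objective, min_accuracy=0.0,
                max_latency_s=float("inf")):
    """Return the feasible candidate with the lowest value of `objective`.

    `objective` names a Config field to minimize, for example
    "cost_usd", "latency_s", or "energy_wh".
    """
    feasible = [
        c for c in candidates
        if c.accuracy >= min_accuracy and c.latency_s <= max_latency_s
    ]
    if not feasible:
        raise ValueError("no configuration satisfies the constraints")
    return min(feasible, key=lambda c: getattr(c, objective))


# Made-up candidates; real values would come from profiling.
candidates = [
    Config("large-model", 8, 2.0, 1.20, 40.0, 0.92),
    Config("medium-model", 4, 3.5, 0.50, 18.0, 0.90),
    Config("small-model", 1, 6.0, 0.10, 5.0, 0.80),
]

cheapest = pick_config(candidates, "cost_usd", min_accuracy=0.88)
fastest = pick_config(candidates, "latency_s", min_accuracy=0.88)
print(cheapest.model, fastest.model)
```

With the same accuracy floor, a cost-focused user and a speed-focused user end up with different configurations, which is the kind of tradeoff the article describes.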
In tests, Murakkab met user requirements while using only about 35% of the computation, roughly 27% of the energy, and less than 25% of the cost required by other methods. It also let users balance tradeoffs, for example by significantly reducing energy consumption in exchange for a minimal loss of accuracy. In one case, the system identified an optimal model configuration for video frame selection, something that is difficult to find by hand.
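The percentages above can also be read as reduction factors. The short calculation below only restates the article's rounded figures; the reduction_factor helper is a hypothetical name, and the results are approximate because the inputs are.

```python
def reduction_factor(fraction_used):
    """Convert 'uses X of the baseline' into 'baseline is N times larger'."""
    if not 0 < fraction_used <= 1:
        raise ValueError("fraction_used must be in (0, 1]")
    return 1.0 / fraction_used


# Rounded fractions quoted in the article.
compute_factor = reduction_factor(0.35)  # about 35% of the computation
energy_factor = reduction_factor(0.27)   # roughly 27% of the energy
cost_factor = reduction_factor(0.25)     # under 25% of the cost, so a lower bound

print(f"computation: about {compute_factor:.1f}x less")
print(f"energy: about {energy_factor:.1f}x less")
print(f"cost: more than {cost_factor:.1f}x less")
```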
The researchers aim to expand the system to more complex workflows and larger clusters while exploring new optimization opportunities. “There is great potential to make these workflows more resource-efficient, but we need to consider this at the scale of major cloud platforms,” Chaudhry stated. The research received support from the Semiconductor Research Corporation and the U.S. Defense Advanced Research Projects Agency.
Original Source: news.mit.edu
