College
College of Engineering & Technology (Batten)
Department
Engineering Management and Systems Engineering
Graduate Level
Doctoral
Graduate Program/Concentration
Systems Engineering
Presentation Type
No Preference
Abstract
Large language models (LLMs) are increasingly at the core of multi-agent systems (MAS). However, high resource demands, error propagation, and the lack of adaptive evaluation mechanisms pose significant challenges to deploying these agentic solutions at scale. To address these concerns, this research proposes the Task-Aware Multi-Agent Orchestrator System (TAMOS), designed to refine the agentic framework by categorizing tasks autonomously, assigning specialized evaluation datasets, and balancing token usage against functional effectiveness. The approach rests on robust data management across three corpora: AsyncHow, Mosaic AI, and Synthetic Preference Optimization (PO). Each dataset targets a specific dimension of agent performance: dynamic task decomposition and tool integration (AsyncHow), quality-cost-latency tradeoffs (Mosaic AI), and iterative preference refinement (PO). By classifying tasks with hierarchical clustering and LLM-driven intent detection, the framework automatically aligns each task with the most relevant evaluation dataset and metrics. An integrated evaluation pipeline employs LLM judges for correctness and groundedness assessments, computes token-based cost-latency metrics, and aggregates the results into multi-objective optimization models. By evaluating agents along Pareto frontiers of performance and cost, the framework supports informed decision-making, particularly in high-stakes applications where resource constraints and reliability must be balanced. In pursuit of continual refinement, feedback loops guide iterative improvements to agent configurations, using meta-level techniques such as Llama 3.2-3B to reason over performance outcomes. This multi-agent reinforcement learning (MARL) engine maximizes task success rates while proactively minimizing resource consumption, supporting sustainable real-world deployment.
Thus, TAMOS incorporates alignment mechanisms and transparent reporting of cost-latency tradeoffs to address ethical concerns around bias, safety, and accountability.
Keywords
Large language models, Agentic Architecture, AI Agents, Reinforcement Learning, Multi-Agent Systems, Meta-Structure
Included in
Artificial Intelligence and Robotics Commons, Databases and Information Systems Commons, Systems Architecture Commons, Systems Engineering Commons
TAMOS: Task-Aware Multi-Agent Orchestrator System
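To make the Pareto-frontier evaluation concrete, the sketch below computes the set of non-dominated agent configurations over two of the objectives the abstract names: quality (higher is better, e.g. an LLM-judge correctness score) and token cost (lower is better). This is a minimal illustration, not the TAMOS implementation; the agent names and scores are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentResult:
    name: str
    quality: float      # higher is better (e.g., LLM-judge correctness score)
    token_cost: float   # lower is better (tokens consumed per task)

def pareto_frontier(results):
    """Return configurations not dominated on (quality up, token_cost down)."""
    frontier = []
    for r in results:
        dominated = any(
            # o weakly beats r on both objectives...
            o.quality >= r.quality and o.token_cost <= r.token_cost
            # ...and strictly beats it on at least one
            and (o.quality > r.quality or o.token_cost < r.token_cost)
            for o in results
        )
        if not dominated:
            frontier.append(r)
    return frontier

# Hypothetical evaluation results for three agent configurations.
results = [
    AgentResult("planner-large", quality=0.92, token_cost=4800),
    AgentResult("planner-small", quality=0.85, token_cost=1900),
    AgentResult("planner-naive", quality=0.80, token_cost=2600),
]

for r in pareto_frontier(results):
    print(r.name)
```

Here "planner-naive" is dominated by "planner-small" (lower quality and higher cost), so only the other two configurations survive; a decision-maker then chooses among frontier points according to the deployment's resource constraints.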