Requirements-Driven Prompt Engineering for Simulation Software Development
Abstract/Description/Artist Statement
Simulation software development underpins digital engineering, scientific modeling, and system analysis across discrete-event, stochastic, and continuous domains. Foundational computational components such as the sorting algorithms used in event-set management, random number generators, and numerical solvers directly influence model validity, performance, and stability. As large language models (LLMs) are increasingly used to generate executable code, their integration into simulation workflows must be evaluated against defined engineering requirements. Generative AI inherently produces variable outputs that do not consistently adhere to standards of performance, correctness, or numerical and statistical quality. The Goal, Performance, Exclusion Architecture (GPE-A), a prompt engineering framework, encodes explicit goals, performance priorities, and constraint boundaries to guide LLM-generated implementations toward convergence on defined behavioral metrics. Evaluated problems have included heapsort, whose binary correctness measure enables isolated assessment of execution performance; random number generation, which permits evaluation of statistical quality metrics and examination of extensibility toward emerging quantum-computing paradigms; and fourth-order Runge–Kutta methods for continuous-time systems. These problems also allow benchmarking against production-standard baselines. The framework has been evaluated across multiple LLMs, including ChatGPT 4o, ChatGPT 5.2, Gemini 3, and Claude Sonnet 4.5, and compared against commonly used prompting strategies such as Chain-of-Thought formulations. Collectively, these studies illustrate a requirements-driven approach to integrating LLM-generated code into simulation software development and provide a foundation for continued evaluation across broader computational contexts.
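For illustration, a minimal sketch of a GPE-A-style prompt assembler follows. The abstract does not specify the framework's exact template, so the field names, ordering, and helper function build_gpe_a_prompt below are hypothetical assumptions, not the published format.

def build_gpe_a_prompt(goal: str, performance: list[str], exclusions: list[str]) -> str:
    """Compose a requirements-driven prompt from the three GPE-A components.

    Hypothetical sketch: the published GPE-A template may differ.
    """
    lines = [
        f"Goal: {goal}",
        "Performance priorities (highest first):",
        *[f"  {i + 1}. {p}" for i, p in enumerate(performance)],
        "Exclusions (hard constraints the implementation must not violate):",
        *[f"  - {e}" for e in exclusions],
    ]
    return "\n".join(lines)

prompt = build_gpe_a_prompt(
    goal="Implement in-place heapsort for a list of comparable items.",
    performance=["O(n log n) worst-case comparisons", "O(1) auxiliary memory"],
    exclusions=["Do not call built-in sorted() or list.sort()", "No recursion"],
)
print(prompt)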
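The heapsort evaluation pattern described above, a binary pass/fail correctness check followed by timing of only the passing implementations, could take the following shape. The harness structure, trial counts, and input distribution are assumptions (the study's actual benchmarking protocol is not given in the abstract), and candidate implementations are assumed to return a new sorted list.

import random
import time

def is_correct(sort_fn, trials: int = 100, n: int = 1_000) -> bool:
    """Binary correctness: output must match the sorted() oracle on random inputs."""
    for _ in range(trials):
        data = [random.randint(-10**6, 10**6) for _ in range(n)]
        if sort_fn(list(data)) != sorted(data):
            return False
    return True

def mean_runtime(sort_fn, n: int = 100_000, reps: int = 5) -> float:
    """Mean wall-clock seconds per call; meaningful only for correct candidates."""
    total = 0.0
    for _ in range(reps):
        data = [random.randint(-10**6, 10**6) for _ in range(n)]
        start = time.perf_counter()
        sort_fn(data)
        total += time.perf_counter() - start
    return total / reps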
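For the statistical quality of generated random number generators, one standard metric is a chi-square uniformity test on binned samples, sketched below. This is only one test of the kind the abstract alludes to; the study's actual test battery is not specified.

import random

def chi_square_uniformity(samples: list[float], bins: int = 10) -> float:
    """Chi-square statistic for uniformity of samples drawn from [0, 1)."""
    counts = [0] * bins
    for x in samples:
        counts[min(int(x * bins), bins - 1)] += 1
    expected = len(samples) / bins
    return sum((c - expected) ** 2 / expected for c in counts)

# Compare against the chi-square critical value with bins - 1 = 9 degrees of freedom.
stat = chi_square_uniformity([random.random() for _ in range(100_000)])
print(stat)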
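The fourth-order Runge–Kutta method named above has a classical textbook form, shown here as the kind of reference baseline an LLM-generated solver could be checked against (for example, via convergence-order tests). The study's actual baselines and test problems are not specified in the abstract.

import math
from typing import Callable

def rk4_step(f: Callable[[float, float], float], t: float, y: float, h: float) -> float:
    """Advance y' = f(t, y) by one step of size h using classical RK4."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

# Example: y' = -y, y(0) = 1 over [0, 1]; global error should scale as O(h^4).
t, y, h = 0.0, 1.0, 0.01
for _ in range(100):
    y = rk4_step(lambda t_, y_: -y_, t, y, h)
    t += h
print(abs(y - math.exp(-1.0)))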
Faculty Advisor/Mentor
James Leathrum, Yuzhong Shen, Masha Sosonkina
Faculty Advisor/Mentor Email
jleathru@odu.edu, yshen@odu.edu, msosonki@odu.edu
Faculty Advisor/Mentor Department
Electrical and Computer Engineering
College/School Affiliation
Batten College of Engineering & Technology
Student Level Group
Undergraduate
Presentation Type
Poster