Requirements-Driven Prompt Engineering for Simulation Software Development

Abstract/Description/Artist Statement

Simulation software development underpins digital engineering, scientific modeling, and system analysis across discrete-event, stochastic, and continuous domains. Foundational computational components such as sorting algorithms used in event-set management, random number generators, and numerical solvers directly influence model validity, performance, and stability. As large language models (LLMs) are increasingly used to generate executable code, their integration into simulation workflows must be evaluated against defined engineering requirements. Generative AI inherently produces variable outputs that do not consistently adhere to standards of performance, correctness, or numerical and statistical quality. The prompt engineering framework developed in this work, the Goal, Performance, Exclusion Architecture (GPE-A), encodes explicit goals, performance priorities, and constraint boundaries to guide LLM-generated implementations toward convergence on defined behavioral metrics. Evaluated problems have included heapsort, which provides a binary correctness measure enabling isolated assessment of execution performance; random number generation, enabling evaluation of statistical quality metrics and examination of extensibility toward emerging quantum-computing paradigms; and fourth-order Runge–Kutta methods for continuous-time systems. These problems also allow benchmarking against production-standard baselines. The framework has been evaluated across multiple LLMs, including ChatGPT 4o, ChatGPT 5.2, Gemini 3, and Claude Sonnet 4.5, and compared against commonly used prompting strategies such as Chain-of-Thought formulations. Collectively, these studies illustrate a requirements-driven approach to integrating LLM-generated code into simulation software development and provide a foundation for continued evaluation across broader computational contexts.
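
To make the framework concrete, the following is a minimal Python sketch of how a GPE-A style prompt specification might be assembled; the class name, field names, and requirement wording are illustrative assumptions, not the authors' published prompt templates.

# Illustrative sketch only: field names and wording are assumptions,
# not the GPE-A authors' actual prompt templates.
from dataclasses import dataclass, field

@dataclass
class GPEPrompt:
    """Requirements-driven prompt with explicit goal, performance, and exclusion sections."""
    goal: str                                          # G: what the generated code must accomplish
    performance: list = field(default_factory=list)    # P: prioritized performance requirements
    exclusions: list = field(default_factory=list)     # E: constraint boundaries the output must not cross

    def render(self) -> str:
        """Assemble the three sections into a single prompt string for an LLM."""
        lines = [f"Goal: {self.goal}", "Performance priorities:"]
        lines += [f"  {i + 1}. {p}" for i, p in enumerate(self.performance)]
        lines.append("Exclusions:")
        lines += [f"  - {e}" for e in self.exclusions]
        return "\n".join(lines)

# Hypothetical example: requirements for an event-set heapsort implementation.
prompt = GPEPrompt(
    goal="Implement heapsort over a simulation event list keyed by timestamp.",
    performance=["O(n log n) worst-case comparisons", "in-place, no auxiliary arrays"],
    exclusions=["no calls to built-in sort routines", "no external dependencies"],
)
print(prompt.render())

Rendering the three sections as separate, explicitly labeled blocks is what distinguishes this style of prompt from a single free-form instruction: each requirement becomes an individually checkable acceptance criterion for the generated code.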
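
For the continuous-time case, the fourth-order Runge–Kutta method named in the abstract is the classical RK4 scheme; the short sketch below shows the standard single-step update against which generated solvers can be benchmarked. The function names and the decay example are illustrative, not taken from the study.

# Classical fourth-order Runge-Kutta step for y' = f(t, y); names are illustrative.
def rk4_step(f, t, y, h):
    """Advance the solution of y' = f(t, y) by one step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

# Example: exponential decay y' = -y with y(0) = 1, integrated to t = 1.
y, t, h = 1.0, 0.0, 0.1
while t < 1.0 - 1e-12:
    y = rk4_step(lambda t, y: -y, t, y, h)
    t += h
print(y)  # close to exp(-1), approximately 0.367879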

Presenting Author Name/s

Abigail Berardi

Faculty Advisor/Mentor

James Leathrum, Yuzhong Shen, Masha Sosonkina

Faculty Advisor/Mentor Email

jleathru@odu.edu, yshen@odu.edu, msosonki@odu.edu

Faculty Advisor/Mentor Department

Electrical and Computer Engineering

College/School Affiliation

Batten College of Engineering & Technology

Student Level Group

Undergraduate

Presentation Type

Poster
