Date of Award
Spring 2024
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Engineering Management & Systems Engineering
Program/Concentration
Engineering Management and Systems Engineering
Committee Director
James D. Moreland, Jr.
Committee Member
Saikou Y. Diallo
Committee Member
Andrew J. Collins
Abstract
The process of extracting structured data from unstructured and semi-structured text is manual, time-consuming, and error-prone. Current natural language processing approaches for automating this process are difficult to verify for non-trivial and context-sensitive corpora. Large Language Models (LLMs) such as ChatGPT have become a subject of considerable interest, opening a promising avenue of exploration. However, there is limited evidence on the performance of LLMs for information extraction.
In this dissertation, an approach is proposed to evaluate the accuracy of Stanford OpenIE and OpenAI's ChatGPT for this purpose. The approach compares Resource Description Framework (RDF) triples extracted by each of these semi-automated methods to hand-extracted triples. For identified discrepancies and noteworthy extractions, qualitative indicators were collected, analyzed, and discussed. The F2 score, which combines precision and recall with greater weight on recall, was calculated for each method as a measure of accuracy.
Results show that ChatGPT correctly identified manually extracted RDF triples, with no statistically significant difference from manual extraction and an F2 score of 95.9%. OpenIE had an F2 score of 20.7%. While LLMs still require human verification, this research demonstrates that LLMs improve on the state of the art.
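As an illustration of the F2 score named in the abstract, the sketch below compares a set of machine-extracted triples against a hand-extracted reference set and computes precision, recall, and F2 using the standard F-beta formula with beta = 2. The example triples, the function names `f_beta` and `score_extraction`, and the use of exact string matching between triples are all assumptions made for illustration; the abstract does not state how triples were matched in the dissertation.

```python
from typing import Dict, Set, Tuple

# An RDF-style triple as (subject, predicate, object).
Triple = Tuple[str, str, str]


def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score; beta = 2 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def score_extraction(extracted: Set[Triple], reference: Set[Triple]) -> Dict[str, float]:
    """Score extracted triples against a hand-extracted reference set.

    Assumption: a triple counts as correct only on an exact match.
    """
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f2": f_beta(precision, recall, beta=2.0),
    }


# Hypothetical hand-extracted reference triples (not from the dissertation).
reference = {
    ("ship", "carries", "radar"),
    ("radar", "detects", "aircraft"),
    ("aircraft", "hasRole", "target"),
    ("ship", "hasRole", "platform"),
}

# Hypothetical machine output: three correct triples and two spurious ones.
extracted = {
    ("ship", "carries", "radar"),
    ("radar", "detects", "aircraft"),
    ("ship", "hasRole", "platform"),
    ("radar", "hasRole", "platform"),
    ("aircraft", "carries", "ship"),
}

scores = score_extraction(extracted, reference)
print(scores)
```

In this toy case precision is 3/5 and recall is 3/4, and the F2 score sits closer to recall than to precision, which is the intended effect of the recall weighting.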
Rights
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
DOI
10.25777/pyyz-j671
ISBN
9798382770529
Recommended Citation
Koski, Samuel R. "Performing Information Extraction for Mission Engineering Applications" (2024). Doctor of Philosophy (PhD), Dissertation, Engineering Management & Systems Engineering, Old Dominion University, DOI: 10.25777/pyyz-j671
https://digitalcommons.odu.edu/emse_etds/245