Date of Award

Summer 2024

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

Program/Concentration

Computer Science

Committee Director

Jian Wu

Committee Member

Vikas Ashok

Committee Member

Meng Jiang

Abstract

Large Language Models (LLMs) have rapidly advanced the field of Natural Language Processing and become powerful tools for generating and evaluating scientific text. Although LLMs have shown promise as evaluators for certain text generation tasks, a gap remains before they can serve as reliable general-purpose text evaluators. In this thesis project, I attempted to fill this gap by examining the ability of LLMs to discern human-written from LLM-generated scientific news. This research demonstrated that although it was relatively straightforward for humans to discern scientific news written by humans from scientific news generated by GPT-3.5 using basic prompts, the task is challenging for most state-of-the-art LLMs without instruction-tuning. To unlock the potential evaluation capability of LLMs on this task, we propose guided-few-shot (GFS), an instruction-tuning method that significantly improves the ability of LLMs to discern human-written from LLM-generated scientific news. To evaluate our method, we built a new dataset, SA News, containing about 362 triplets, each consisting of a scientific news text, an LLM-generated news text, and the abstract of the scientific paper on which the news articles were based. This work is a first step toward understanding the feasibility of using LLMs as automated evaluators of scientific news quality.

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

DOI

10.25777/perk-7b13

ISBN

9798384455080

ORCID

0000-0002-7089-6354
