Histopathological Vision Language Model Risk-Score for Cancer Diagnosis
Abstract/Description/Artist Statement
Healthcare and medical applications require high standards of trust and reliability due to critical safety concerns. Vision language models (VLMs), such as GPT-4V/o [1], are AI systems that process both vision and language inputs to classify, predict, or generate vision and/or language outputs. VLMs can achieve human-level performance on complex reasoning tasks, but struggle with novel data, uncertainty, and the consequences of clinical errors. The clinical impact of diagnostic errors arising from computer-based diagnosis may quantify device safety more informatively than traditionally used accuracy scores. Explainability and risk aversion are important in AI models because they improve clinical trust, ensure accountability, enable legal and regulatory compliance, allow biases and errors to be mitigated, and ultimately prioritize patient safety. Our work proposes an approach to evaluating the safety of clinical histopathology VLMs through the development of a clinical cancer risk-score metric as an alternative to traditionally used AI accuracy scores. Medical VLMs will be used to evaluate several multimodal (image and text) pathology datasets (SLAKE, PathVQA, etc.) to extract cancer-related histopathology samples. Then, using a clinically grounded Simplified Risk Quantification score based on NCI SEER*Explorer statistical data on cancer incidence and survivability, we will comparatively evaluate our risk metric across open-source and publicly available VLMs on the developed dataset. This method will inform potential users of the impact of AI diagnostics beyond accuracy scores.
Faculty Advisor/Mentor
Murat Kuzlu
Faculty Advisor/Mentor Email
mkuzlu@odu.edu
Faculty Advisor/Mentor Department
BCET
College/School Affiliation
Batten College of Engineering & Technology
Student Level Group
Graduate/Professional
Presentation Type
Oral Presentation
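The abstract's Simplified Risk Quantification idea can be sketched as follows. This is a hypothetical illustration only: the `risk_weight` function, the placeholder incidence and survival values, and the normalization scheme are assumptions for demonstration, not the proposed metric or actual NCI SEER*Explorer statistics.

```python
# Hypothetical sketch of a simplified clinical risk-quantification score.
# The numbers below are illustrative placeholders, NOT actual SEER data;
# a real implementation would look up per-cancer incidence and 5-year
# relative survival from NCI SEER*Explorer.

def risk_weight(incidence_per_100k: float, five_year_survival: float) -> float:
    """Weight a cancer type by how common and how lethal it is."""
    lethality = 1.0 - five_year_survival  # fraction not surviving 5 years
    return incidence_per_100k * lethality

def risk_score(predictions, labels, weights):
    """Penalize each misclassification by the clinical risk weight of the
    true diagnosis, so a miss on a lethal cancer costs more than a miss on
    an indolent one. Normalized by the worst case (all samples missed)."""
    total = sum(weights[y] for y in labels)
    penalty = sum(weights[y] for p, y in zip(predictions, labels) if p != y)
    return penalty / total if total else 0.0

# Placeholder weights for two hypothetical cancer types:
weights = {
    "type_A": risk_weight(incidence_per_100k=50.0, five_year_survival=0.90),
    "type_B": risk_weight(incidence_per_100k=10.0, five_year_survival=0.20),
}

preds = ["type_A", "type_A"]   # model misses the lethal type_B case
labels = ["type_A", "type_B"]
print(round(risk_score(preds, labels, weights), 3))  # → 0.615
```

Under this sketch, two models with identical accuracy can receive very different risk scores depending on *which* diagnoses they miss, which is the gap between accuracy and clinical safety that the abstract highlights.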