Date of Award
Spring 2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Electrical & Computer Engineering
Program/Concentration
Electrical and Computer Engineering
Committee Director
Jiang Li
Committee Member
Chunsheng Xin
Committee Member
Hongyi Wu
Abstract
This dissertation addresses critical challenges in computer vision and machine learning, focusing on three key areas: image translation, denoising, and model security. The research develops novel methodologies and models that advance existing techniques. Its contributions are valuable to the academic community and hold significant potential for practical applications in domains ranging from surveillance to autonomous systems.
The dissertation pursues three goals. First, we present new deep learning approaches for converting optical videos to infrared (IR) videos. Applying powerful deep-learning-based algorithms for object detection and classification to infrared videos requires more training data in order to build high-performance models. However, in many surveillance applications, far more optical videos are available than infrared videos. This shortage of IR video datasets can be mitigated if optical-to-infrared video conversion is possible. We propose Attention GAN and MWIRGAN for converting optical videos to mid-wave infrared (MWIR) videos, and we show that synthetic MWIR videos can improve target detection and classification performance at night.
Second, we propose generative approaches for improved document enhancement and segmentation with small unpaired datasets. Scanned PDF documents, which serve as digital representations of historical shipbuilding drawings, often suffer from imperfections such as noise and artifacts introduced during the scanning process. Because the noise is heavy and heterogeneous, these imperfections render the documents unsuitable for seamless integration into modern computer-aided design (CAD) platforms such as AutoCAD. Traditional denoising methods, designed for images that follow well-defined noise models, prove ineffective against the complex noise structures present in scanned shipbuilding drawings. To address these challenges, we developed a document enhancement strategy designed specifically for shipbuilding drawings stored as scanned PDF files. We also evaluated generative adversarial networks for ship deck segmentation in traditional engineering documents, customizing the Pix2pix GAN for our experiments. Moreover, we used an instruction-based Stable Diffusion model and visual prompting to clean document images of ship drawings by removing text and label-related information.
Third, we propose Adversarial Interpretation Mitigation (AIM), a novel method for detecting adversarial model manipulation aimed at fooling saliency-map-based interpretations. Our approach leverages convolutional kernel responses to identify model manipulation attacks on interpretation methods. AIM needs only clean images to detect manipulation in a model and proves highly effective in practical applications. Through extensive experiments, we demonstrate that AIM can identify a variety of fooling attacks across different interpretation methods, network architectures, and datasets.
Rights
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
DOI
10.25777/f5zs-8r83
ISBN
9798280747142
Recommended Citation
Uddin, Mohammad S. "From Image Enhancement to Model Protection Integrating Generative AI and Secure Learning in Computer Vision" (2025). Doctor of Philosophy (PhD), Dissertation, Electrical & Computer Engineering, Old Dominion University, DOI: 10.25777/f5zs-8r83
https://digitalcommons.odu.edu/ece_etds/604
ORCID
0000-0002-2466-1212