Research Experiences for Undergraduates in Disinformation Detection & Analytics
The goal of this REU Site was to engage participating students in real-world projects studying disinformation from the perspectives of data analytics, information retrieval, applied machine learning, web archiving, and social computing. See the REU site for Computer & Information Science for more information about the program. Student presentations are available below.
Program Coordinators: Dr. Sampath Jayarathna and Dr. Jian Wu
Research Mentors:
- Dr. Michele C. Weigle, Computer Science
- Dr. Michael L. Nelson, Computer Science, VMASC
- Dr. Erika Frydenlund, VMASC
- Dr. Anne Perrotti, Communication Disorders & Special Education
- Dr. Vikas G. Ashok, Computer Science
- Dr. Faryaneh Poursardar, Computer Science
-
Discovering the Traces of Disinformation on Instagram
2022 Haley Bragg and Michele C. Weigle (Mentor)
Disinformation, which is fabricated, misleading content spread with the intent to deceive others, is accumulating substantial engagement and reaching a vast audience on Instagram. However, the temporary nature of the platform and the security guidelines that remove malicious content make studying this disinformation a challenge. The only way to access removed content and banned accounts that are no longer on the live web is by searching the web archives. In this study, we set out to quantify the replayability and quality of past captures of Instagram accounts, specifically focusing on a group of anti-vax content creators known as the Disinformation Dozen. We found that the number of mementos listed for these accounts on the Internet Archive’s Wayback Machine can be misleading, because a majority of the mementos are actually redirections to the Instagram login page, and of the remaining replayable mementos, many are missing post images. In fact, 96.13% of mementos from the Disinformation Dozen accounts redirect to the login page, and only 27.16% of the remaining replayable mementos contain every post image. Combined, these results reveal that merely 1.05% of mementos for the Disinformation Dozen accounts are replayable with complete post images. Furthermore, we found that the percentage of replayable mementos is decreasing over time, with a particular lack of replayable mementos for the years 2021 and 2022.
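The combined figure follows from the two reported percentages. The short Python sketch below reproduces that arithmetic; the function and variable names are illustrative and not taken from the study.

```python
# Percentages reported in the abstract above.
REDIRECT_PCT = 96.13   # mementos that redirect to the Instagram login page
COMPLETE_PCT = 27.16   # share of the remaining replayable mementos with every post image


def combined_complete_pct(redirect_pct, complete_pct):
    """Percentage of all mementos that are replayable AND have complete post images."""
    replayable_pct = 100.0 - redirect_pct          # mementos that do not redirect
    return replayable_pct * complete_pct / 100.0   # portion of those that are complete


result = combined_complete_pct(REDIRECT_PCT, COMPLETE_PCT)
print(f"{result:.2f}%")  # prints 1.05%
```

Only 3.87% of mementos avoid the login redirect, and 27.16% of that 3.87% gives roughly 1.05% of all mementos.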
-
Protecting Blind Screen-Reader Users From Deceptive Content
2022 Ash Dobrenen and Vikas Ashok (Mentor)
Visually impaired people rely on screen readers to use a computer independently. This research takes the first steps toward building a Chrome extension that helps users navigate the internet more safely with a screen reader. To begin collecting data, a screen reader was used to identify items on a website that might take the user somewhere they did not mean to go because the screen reader could not describe the link or image sufficiently. Next, those items were tagged with data-attribute="deceptive". After that, the tagged items were extracted and recorded with values for various features, plus a code at the end indicating whether the item was deceptive. Six different machine learning models were then created to predict whether an item on a website is deceptive. The best model for this data set was the Random Forest classifier from the scikit-learn Python library. There is much more to be done to improve the accuracy and usability of the models, and then to develop the Chrome extension, but this research provides a starting point for future work.
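The extraction step described above could be sketched as follows, using only Python's standard-library HTML parser. This is a minimal illustration, not the project's actual tooling: the sample markup and the recorded fields (tag, href, alt) are assumptions chosen for the example, and the abstract does not say which features were used.

```python
from html.parser import HTMLParser


class DeceptiveItemExtractor(HTMLParser):
    """Collect elements whose start tag carries data-attribute="deceptive"."""

    def __init__(self):
        super().__init__()
        self.items = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)  # attrs arrives as a list of (name, value) pairs
        if attr_map.get("data-attribute") == "deceptive":
            # Hypothetical features; the study's real feature set is not given.
            self.items.append({
                "tag": tag,
                "href": attr_map.get("href"),
                "alt": attr_map.get("alt"),
            })


def extract_deceptive_items(html):
    """Return one record per element tagged as deceptive in the given HTML."""
    parser = DeceptiveItemExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Made-up page fragment for illustration only.
sample_html = """
<a href="/download" data-attribute="deceptive">Click here</a>
<a href="/about">About us</a>
<img src="ad.png" alt="" data-attribute="deceptive">
"""

for item in extract_deceptive_items(sample_html):
    print(item)
```

Each record could then be extended with further feature values and a label column before being passed to a classifier.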
-
An Assessment of Scientific Claim Verification Frameworks: Final Presentation
2022 Ethan Landers and Jian Wu (Mentor)
-
Networks of Disinformation: The Proliferation of Hate Speech in Chile and Colombia During the Venezuelan Migration Crisis
2022 Isabelle Valdes and Erika Frydenlund (Mentor)