Abstract

Organizations inform clients about their data collection and sharing practices primarily through privacy policies. Recent research has proposed tools to help users better comprehend these lengthy and intricate legal documents that summarize collection and sharing. However, these tools have a significant flaw: they overlook the possibility of contradictions within a single policy. This paper introduces PolicyLint, a privacy policy analysis tool that simultaneously considers negation and the varying semantic levels of data objects and entities. PolicyLint uses sentence-level natural language processing to automatically create ontologies from a large corpus of privacy policies and to capture both positive and negative statements about data collection and sharing. Using PolicyLint, I examined the policies of 300 apps and found that some contained contradictions that could indicate false statements. I manually checked 100 contradictions and identified troubling patterns, including misleading presentation, attempts to redefine commonly understood terms, and the tracking of information enabled by sharing or collecting data that can be used to derive sensitive information. PolicyLint therefore significantly improves automated privacy policy analysis.

Document Type

Paper

Disciplines

Information Security

DOI

10.25776/ka6p-cc66

Publication Date

12-9-2022

Investigating Privacy Policies using PolicyLint Tool