Data Politics, Privacy, and Surveillance
Data collection should be guided by a clear purpose and by policies that govern how the data is used. Companies that collect data need to state their motives and spell out the details in their data privacy policies. However, recent studies indicate some reluctance to prioritize data security and the protection of personal information. Therefore, the rules that control the integrity, quality, security, and use of data need to be defined before collection begins. These data policies form the foundation for how data is categorized and for the rules guiding its use. In this era of big data, artificial intelligence, and machine learning, people have lost control over how their data is collected and used. Data privacy has different dimensions depending on the context under discussion. In general, privacy entails having reasonable control over the flow of information (Pelteret and Ophoff 2016). Data exists in different forms, some of which are encoded or require close monitoring to yield insights. Surveillance entails closely monitoring activities, behaviors, or information in order to gather information, direct or manage a phenomenon, or influence an event. Data privacy, data policies, and surveillance affect people both positively and negatively through data collection, profiling, and protection.
How does the reliance on data categorization affect people?
First, data categorization affects how an algorithm interprets a data set. Matching a data set to a particular trait generalizes that trait across the whole set, regardless of the differences among its members. Data categorization also erodes data privacy and disrupts the norm of consent. For instance, according to Karen Hao (2021), researchers motivated by deep learning often do not seek people’s consent when categorizing data for surveillance. In addition, data categorization leads to messier data sets. The algorithms used may unintentionally include details that do not belong to a given category. Some algorithms are unreliable and surface photos of minors, sexist or racist labels, and inconsistent outcomes. Miscategorized data harms innocent people during surveillance, especially when authorities use it to implement social control measures. A good example is the false arrest of two Black men in the Detroit area in 2020 (Hao 2021). The arrests led to unrest in the Black community.
Data categorization, as often used in facial recognition, has proven effective in resolving criminal cases. However, for the sake of accuracy, it needs to be combined with other mechanisms. There have been reports of false profiling and misidentification of suspects because of data categorization. A common scenario is the profiling of the Black community and its association with crime. Therefore, continued research is needed to develop categorization techniques that eliminate bias.
To what extent do web-based data collection processes violate individual privacy?
Web-based data collection is the least safe form of data collection and openly violates individual privacy. Web-based data collection algorithms gather sensitive information about individuals, and this information is hoarded to build predictive models. Two key factors contribute to the privacy issues: web-based processes are inherently open, and the exchange of information involves sensitive personal data. Information is collected through cookies, software downloads, web beacons, federated identity, and screen scraping. Website users are largely unaware of these mechanisms, which makes them more vulnerable.
Because most web-based data collection algorithms are unknown to users, some users unknowingly provide personal information that may even have financial implications. For instance, the 2017 Equifax breach compromised the personal identification information of more than 140 million Americans (Kerry 2018). Data breaches are among the biggest ethical issues and show how web-based data collection can affect individual privacy. Consumers who had never transacted with the company were affected after the breach rippled through the financial system. As a result, the unauthorized web-based data mining exposed innocent consumers to negative economic impacts tied to Equifax’s credit scores.
Is surveillance necessary to produce accurate, reliable, and large-scale data sets?
Surveillance focuses on tracking behaviors, activities, and information in order to gather information, direct or manage a phenomenon, or influence an event. Data surveillance helps government entities formulate predictions and prepare for future events. Disease control and management is one area where surveillance has found numerous applications. Surveillance is necessary for producing accurate, reliable, and large-scale data sets. For instance, since the outbreak of COVID-19, public health surveillance data has been used to monitor disease trends, detect outbreak phases, and guide response activities. Countries are currently implementing COVID-19 action plans and policies and evaluating health interventions in response to the pandemic. At the peak of COVID-19 infections in the US, for example, the government enhanced testing efforts. These actions responded to predictions made from surveillance data (Holtgrave et al. 2021).
Recent technological developments have introduced new surveillance tools capable of producing large data sets and processing data for security purposes. The capacity of big data intensifies surveillance by expanding the related data sets and analytical tools. Existing dynamics such as risk management and control need to be monitored, because new techniques increase both the scope and the speed of surveillance. The quest for pattern discovery is used to justify unprecedented access to data (Lyon 2014). New surveillance technologies increase surveillance capacity, and tools like CCTV enhance data accuracy.
In conclusion, many companies are investing in data mining, analysis, and big data. These investments are geared toward developing predictive algorithms that help determine customer needs. In the process, many companies violate data privacy and protection policies, and users are the most affected group. Despite the existence of data privacy policies and guidelines, users unknowingly give their data to websites by accepting cookies and signing up for accounts. Therefore, data privacy remains a major concern as we move toward big data and artificial intelligence. More work is needed to raise awareness of data privacy, surveillance, and data protection policies. While research continues on the effects of data categorization and on privacy violations by web-based data collection processes, it is the user’s role to remain vigilant while using websites.
Marc Pelteret and Jacques Ophoff. 2016. A review of information privacy and its importance to consumers and organizations. Informing Science: The International Journal of an Emerging Transdiscipline 19 (2016), 277–301. DOI: https://doi.org/10.28945/3573
Karen Hao. 2021. This is how we lost control of our faces. Retrieved November 1, 2021 from https://www.technologyreview.com/2021/02/05/1017388/ai-deep-learning-facial-recognition-data-history/
Cameron F. Kerry. 2018. Why protecting privacy is a losing game today—and how to change the game. (July 2018). Retrieved November 1, 2021 from https://www.brookings.edu/research/why-protecting-privacy-is-a-losing-game-today-and-how-to-change-the-game/
David R. Holtgrave, Sten H. Vermund, and Leana S. Wen. 2021. Potential benefits of expanded COVID-19 surveillance in the US. JAMA 326, 5 (July 2021), 381–382. DOI: https://dx.doi.org/10.1001/jama.2021.11211
David Lyon. 2014. Surveillance, Snowden, and Big Data: Capacities, consequences, critique. Big Data & Society 1, 2 (July 2014), 1–13. DOI: https://doi.org/10.1177/2053951714541861