High quality anonymization in Big Data

Think through the reading on privacy. Both the MK book and the articles focus on issues around anonymization. Think about the Netflix Prize and the geolocation data obtained by the NYT. How important do you think high-quality anonymization is going to be to the future of Big Data for use by researchers, industry, and government? Do you think there are cases where actors like the government should be able to pierce privacy? Think about how much you value privacy when answering the above questions, and contextualize it for your own life.


Big Data and Privacy

Data anonymization protects sensitive and private information by encrypting or erasing the identifiers that link an individual to the stored data. For instance, running Personally Identifiable Information such as social security numbers, names, and phone numbers through anonymization helps safeguard a person’s privacy. According to Arbuckle and El Emam (2020), big data affects operations in the healthcare, transport, and communication sectors, among other industries. However, big data and data privacy go hand in hand. Governments regulate and regularly assess how companies collect, process, store, and use the public’s private information. Data anonymization is fundamental to big data privacy. If done well, data anonymization prevents anyone from tracing a given entry in a large dataset back to the individual it describes.
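As a minimal sketch of the "encryption or erasure of identifiers" idea described above, the Python snippet below drops some identifiers and replaces another with a keyed hash. The field names, the sample record, and the helper `anonymize_record` are hypothetical and exist only for illustration. The keyed hash (HMAC-SHA256) is a stand-in chosen here, strictly a pseudonymization step rather than encryption, and not a method taken from the readings.

```python
import hashlib
import hmac

# Hypothetical field names, chosen only for this illustration.
ERASED_FIELDS = {"name", "phone"}
PSEUDONYMIZED_FIELDS = {"ssn"}


def anonymize_record(record: dict, secret_key: bytes) -> dict:
    """Return a copy of `record` with identifiers erased or replaced by tokens."""
    cleaned = {}
    for field, value in record.items():
        if field in ERASED_FIELDS:
            # Erasure: the identifier is left out of the output entirely.
            continue
        if field in PSEUDONYMIZED_FIELDS:
            # Keyed hash: the same value and key always give the same token,
            # so records can still be linked without exposing the raw value.
            token = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            cleaned[field] = token.hexdigest()
        else:
            cleaned[field] = value
    return cleaned


# Made-up sample record; the key is an assumption and would be stored
# separately from the dataset in practice.
record = {
    "name": "Jane Doe",
    "phone": "555-0100",
    "ssn": "000-00-0000",
    "zip": "10001",
    "diagnosis": "asthma",
}
safe = anonymize_record(record, secret_key=b"example-key-kept-outside-the-dataset")
print(safe)
```

Removing direct identifiers is only a first step. The fields that remain (here `zip` and `diagnosis`) can sometimes still single out a person when combined with outside data, which is the concern raised by the Netflix Prize and geolocation examples in the prompt.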

High-quality big data anonymization will be important in the future to governments, researchers, and industries because it safeguards data against insider exploitation and misuse. According to Weber and Heinrich (2018), governments and regulators issue rules and regulations on big data, and anonymization is one of the main ways to comply with them. Anonymization also helps governments and companies maintain the public’s trust, as it provides strong methods of securing personal, sensitive, and confidential data. Governments hold extremely complex data that must be handled with caution and confidentiality, and anonymization is central to providing that protection. Anonymization also supports consistency and data governance, since it lets organizations keep accurate and clean data on behalf of the public. According to Arbuckle and El Emam (2020), anonymization safeguards data both in databases and on public platforms such as apps and web applications. Anonymization also limits how far companies can personalize the user experience or target marketing at specific customers.

There are cases where actors like the government should be able to pierce privacy. For instance, governments could pierce privacy if the public’s security is at stake. Data on cybercriminals or hackers held by governments or companies could be made available if the need arises for the public’s safety. According to Birkinshaw and Varney (2019), the government might also pierce privacy if it needs access to private information about outlawed people or gangs that terrorize the public. However, government employees should not pierce or access private data for personal gain, such as spreading campaign propaganda.


Arbuckle, L., & El Emam, K. (2020). Building an Anonymization Pipeline: Creating Safe Data. Sebastopol: O’Reilly Media.

Birkinshaw, P., & Varney, M. (2019). Government and information: The law relating to access, disclosure and regulation. London: Bloomsbury Professional.

Weber, R. H., & Heinrich, U. I. (2018). Anonymization. London: Springer London.
