Data Privacy in AI Chatbots and the Looming Dark Web
With the constant change and development in the world of technology, the newest fascination is the advent of advanced artificial intelligence tools. Beginning with ChatGPT, an AI chatbot created by OpenAI that appears able to answer almost any query presented to it, the techno-sphere is on a roll, with every tech giant rushing to create a competing AI chatbot. These AI-based technologies process large volumes of data to produce their outputs, which often requires them to access our preferences and the information we disclose in order to cater to our needs and generate tailor-made results. Since AI chatbots are trained on data that is readily accessible to the public, they can be said to utilise open-source intelligence (OSINT). Although they do not actively search the internet or other sources, AI chatbots rely on the data they were trained on to produce responses to user inputs. This raises the evergreen question of data protection and privacy. While a chatbot like ChatGPT may access only websites or databases, being a language model and information source rather than a conversationalist, other bots such as Replika or Woebot engage in actual conversations that may include personal information and other data capable of being misused, making data privacy a serious concern.
How AI Works with Data
AI systems come in various types, and each requires a different set of data. An AI’s machine-learning algorithms will gather data and learn as they are programmed to do, because the system is designed to carry out its responsibilities as efficiently as possible. Unless the right data-gathering frameworks and parameters are built into its code, an AI will not itself account for data privacy. To illustrate how much data can be assimilated, Amazon recently opened its ‘Just Walk Out’ stores, where customers enter the shop by entering a linked credit or debit card, scanning a QR code in the Amazon app, or scanning their palm.
As customers leave the shop, they are immediately charged for the products in their virtual cart and receive a digital receipt. The AI tracks the products they pick while browsing the aisles. The data it collects is not limited to what the consumer picks; it extends to what they browse and what their demands are. With such rich data, the company can easily profile each consumer and offer personalised options. This prima facie benefit is only the tip of the iceberg of the personal data being used by AI tools.
AI chatbots follow measures that include end-to-end encryption, access control, data anonymization, transparency of compliance, and so on. However, this does not rule out harm to data privacy from other factors, such as security breaches or AI self-learning that causes the bot to act outside its programmed safety measures. Each of these possibilities is discussed below.
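As a minimal sketch of one such measure, data anonymization can be approximated by replacing direct identifiers with salted hashes before a record is stored. The field names below are hypothetical and the approach is only illustrative, not any particular vendor’s implementation:

```python
import hashlib
import os

def anonymize(record: dict, pii_fields: set) -> dict:
    """Replace direct identifiers with truncated, salted SHA-256 digests."""
    salt = os.urandom(16)  # fresh per-record salt (an illustrative choice)
    out = {}
    for key, value in record.items():
        if key in pii_fields:
            digest = hashlib.sha256(salt + str(value).encode()).hexdigest()
            out[key] = digest[:12]  # pseudonym stands in for the identifier
        else:
            out[key] = value  # non-identifying fields pass through unchanged
    return out

record = {"name": "A. User", "email": "a@example.com", "query": "book tickets"}
anon = anonymize(record, {"name", "email"})
```

Real systems layer further safeguards on top of this, such as key management and re-identification risk analysis, but the sketch shows the basic idea: the useful content survives while the identifiers do not.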
The Dark Web and its Impact on AI
The dark web is often known as the go-to place for cybercrime, given its anonymity. It is a hidden network of websites that can only be accessed with a specialised web browser. To index websites on the surface web, traditional search engines employ “web crawlers”. Crawling is the practice of gathering websites from the internet so that search engines may classify and index them. Content on the deep (and dark) web, however, may not be indexed by typical search engines for a variety of reasons, including that it may be unstructured, unlinked or transitory.
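The indexing gap described above can be illustrated with a toy crawl. The sketch below uses a hypothetical in-memory link graph rather than live HTTP: a breadth-first crawler indexes every page reachable by links from a seed, so a page that nothing links to is simply never found, which mirrors why unlinked deep-web content stays out of search indexes:

```python
from collections import deque

# Toy link graph (hypothetical addresses). The "hidden" page is never
# linked from any crawled page, mirroring unlinked deep-web content.
LINKS = {
    "seed.example": ["a.example", "b.example"],
    "a.example": ["b.example"],
    "b.example": [],
    "hidden.onion": [],  # reachable only if you already know the address
}

def crawl(seed: str) -> set:
    """Breadth-first crawl: index every page reachable by links from the seed."""
    index, frontier = set(), deque([seed])
    while frontier:
        page = frontier.popleft()
        if page in index:
            continue  # already indexed; skip
        index.add(page)
        frontier.extend(LINKS.get(page, []))
    return index

indexed = crawl("seed.example")
```

Running the crawl from the seed indexes the three linked surface pages and never touches the unlinked one.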
On the dark web, hackers offer a variety of services, such as breaking into websites and obtaining private data like financial and personal information. They may also mount phishing attacks, which trick victims into divulging sensitive data such as login credentials. Once the hackers obtain this data, they can sell it on the dark web, giving cybercriminals a market in which to profit from stolen data.
In March 2017, attackers stole the personally identifiable information of hundreds of millions of people from Equifax, one of the credit reporting companies that evaluates the financial situation of almost everyone in the United States. Because its systems were not sufficiently isolated from one another, the attackers were able to move from the web portal to other servers, where they uncovered usernames and passwords stored in plain text that subsequently allowed them to access even more systems.
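The plain-text credentials were what turned one compromised server into many. The standard mitigation, sketched below with Python’s standard library (this is illustrative, not a claim about how Equifax’s systems worked), is to store only a salted, slow hash of each password, so that even a full database dump does not yield usable credentials:

```python
import hashlib
import hmac
import os

def store_password(password: str) -> tuple:
    """Return (salt, digest): a salted PBKDF2 hash instead of the plain text."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the hash and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)

salt, digest = store_password("correct horse")
```

An attacker who steals `salt` and `digest` still has to guess the password; the plain text is never stored at all.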
Added problems: Laws and the public
The issue with the legal regime is twofold: the laws against the dark web and the laws governing privacy. India is lacking on both fronts, leaving consumers vulnerable to hacking attacks if and when they happen. While programmers may try their best to prevent such mishaps through the measures discussed earlier, a breach is not impossible, and in the event of one, no law in India fixes liability for the resulting damage and losses.
Firstly, in respect of the dark web, accessing it is legal in India, and this presents unique difficulties for law enforcement. India’s regulations governing the internet are not strict; the laws have many flaws, and the dark web poses its own special issues. The Information Technology Act, 2000 contains only six sections dealing with cybercrimes in India. While surfing the dark web is legal, illegal activities performed there are neither protected nor outside the legal regime. However, for a breach of data privacy to be illegal, the data protection laws must deem it so, which brings us to the real problem.
India’s data privacy laws are not exhaustive enough to cover crimes such as data leakage and the unlawful trading of data. They are nascent and provide only basic protection of privacy through Article 21 and the IT Act, which are not very stringent in cracking down on the issue at hand, as no penalty for such an act exists as of yet. While Section 43A of the IT Act makes a body corporate that is negligent in maintaining reasonable security of the sensitive data it possesses liable to compensate those affected, it has never been enforced in that sense. The only other recourse lies in invoking the GDPR, which recognises and penalises breaches of personal data. But this is a far-fetched remedy that is not viable in any practical sense. While the GDPR is an extensive policy that covers multiple aspects and could help tackle the issue if adopted in some form, in its current state it cannot be enforced in Indian territory.
Another grave issue is the general public’s lack of awareness in India of the importance of data safety and the risks of such breaches. The average Indian neither realises the risks of a breach nor understands how important it is not to disclose such personal information. People tend to be indifferent to privacy protection when they are unaware of the harm caused by the misuse of their personal information. This makes regulation and enforcement even more difficult than it already was.
Additionally, the classic problem of the inability to fix liability on an AI bot, owing to the dilemma of its legal personhood, adds to the plethora of existing problems. Who is to be made liable when the AI fails to safeguard the data it consumes because it acted outside the scope of its programmed measures is a question that needs answers. One option could be to adopt vicarious liability, or alternatively strict liability, which pins the blame on the programmer as the creator of the AI, but this has its own limitations.
AI developing at such a rapid pace calls for new laws and regulations to ensure the safety of consumer data. Not only must the law on AI be newly developed, but the laws governing data privacy and other technology-based offences must also be revamped to make penalties more stringent.
A viable legal regime could require AI programmers to adopt transparent and stringent compliance measures such as monitoring, strict access controls, periodic security checks, high-level encryption and multiple levels of authorisation for access, as in the case of health information protection under HIPAA. It could also mandate secure storage vaults and data-minimisation policies, as with the information protected under the Aadhaar data guidelines in India. In the absence of a clear legal personhood for an AI bot, liability would naturally shift onto the programmer and the organisation collecting and using the data, for negligence in failing to maintain security, in addition to liability for the persons causing the breach, if any. Adopting the GDPR, modified to suit the Indian regime, could be a big step towards data protection.
However, the primary target in this respect should be to educate and enable the public at large to prevent even the possibility of a data breach. If people are cautious about what data they provide to an AI bot, or on the internet in general, the amount of data requiring such heightened protection would drastically decrease in the first place.
While the entire regime pertaining to both artificial intelligence and data privacy is yet to be developed in general, a law governing the issues arising out of their interface is the need of the hour.
 Alford A, ‘OpenAI Releases Conversational AI Model ChatGPT’ (2022) InfoQ https://www.infoq.com/news/2022/12/openai-chatgpt/ accessed 30 March 2023.
 J Rajamäki, ‘Privacy in Open Source Intelligence and Big Data Analytics: Case “MARISA” for Maritime Surveillance’ (2020) 19(1) Journal of Information Warfare 12-25 https://www.jstor.org/stable/27033606 accessed 30 March 2023.
 Replika, https://replika.com/ accessed 30 March 2023.
 Woebot Health, https://woebothealth.com/ accessed 30 March 2023.
 Thaichon Park and Sara Quach, Artificial Intelligence for Marketing Management (1st ed., 2023) 163.
 J Porter, ‘Amazon brings its cashierless tech to a full-size grocery store for the first time’, The Verge, 15 June 2021, https://www.theverge.com/2021/6/15/22534570/amazon-fresh-full-size-grocery-store-just-walk-out-cashierless-technology-bellevue-washington accessed 30 March 2023.
 OpenAI API, ‘Safety Best Practices’, Openai.com, https://platform.openai.com/docs/guides/safety-best-practices accessed 30 March 2023.
 Kaspersky, ‘What is the Deep and Dark Web?’, Kaspersky Resource Centre, https://www.kaspersky.com/resource-center/threats/deep-web accessed 30 March 2023.
 WIRED, ‘Dark Web’, https://www.wired.com/tag/dark-web/ accessed 30 March 2023.
 J Fruhlinger, ‘Equifax data breach FAQ: What happened, who was affected, what was the impact?’, CSO Online, https://www.csoonline.com/article/3444488/equifax-data-breach-faq-what-happened-who-was-affected-what-was-the-impact.html accessed 30 March 2023.
 Diganth Raj Sehgal, ‘Laws relating to the dark web in India’, iPleaders, https://blog.ipleaders.in/laws-relating-dark-web-india/#Legality_of_accessing_the_dark_web_in_India accessed 30 March 2023.
 The Constitution of India, art 21.
 Information Technology Act 2000, s 43A.
 The Wire, ‘Why Is No One Ever Penalised for Data Breaches in India?’, The Wire, https://thewire.in/tech/terminal-india-data-breach-software-punishment-diksha-ekstep accessed 30 March 2023.
General Data Protection Regulation (GDPR) 2016 (European Union) art 4.
 Deepak K and Sowmya V, ‘Data privacy in India: Current outlook and the future’, Times of India Blog, https://timesofindia.indiatimes.com/blogs/voices/data-privacy-in-india-current-outlook-and-the-future/ accessed 30 March 2023.
 European Commission, Report from the Expert Group on Liability and New Technologies – New Technologies Formation, ‘Liability for Artificial Intelligence and Other Emerging Digital Technologies’, https://www.europarl.europa.eu/meetdocs/2014_2019/plmrep/COMMITTEES/JURI/DV/2020/01-09/AI-report_EN.pdf accessed 30 March 2023.
 Health Insurance Portability and Accountability Act of 1996 (US).
 Centralised Aadhaar Vault, National Informatics Centre, https://www.nic.in/servicecontents/centralised-aadhaar-vault/ accessed 30 March 2023.