Within months of its release, ChatGPT broke the record for the fastest-growing consumer application in history. Over 100 million people tried out its advanced capabilities as a chatbot: answering questions in a somewhat accurate and fairly human-sounding manner, as well as writing computer code and even music.
Much has been discussed about AI’s potential to disrupt sectors like art and law, but much less has been said about its implications for our privacy. With ChatGPT running afoul of regulators in Italy, which temporarily banned the service from March 31 to April 28, that’s finally changing.
After all, AI is trained using data from the web, and if you’ve ever written anything online, chances are, ChatGPT has read it. Part of that massive pile of data? Some of our personal information.
How does ChatGPT collect your personal data?
One way ChatGPT gets hold of personal information is through the data gathered in bulk from the web to train the model. That’s troubling in itself because you don’t get to consent to the use of your information. For example, if you recounted a personal story in a Reddit AMA, that comment could be used by OpenAI (ChatGPT’s creator) to train ChatGPT.
Another way? Through using ChatGPT. The chatbot collects various details about your session, including your conversations and anything else you type. According to OpenAI’s privacy policy, collected data includes the following (a rough sketch of what such a record might look like appears after the list):
- Log data. Your IP address, browser type and settings, the date and time of your usage, and how you interacted with the tool. ChatGPT also collects and stores your chats.
- Usage data. Aside from how you use and engage with the service, usage data also entails the collection of your location by time zone and country, software version, type of device, connection information, and more.
- Device information. The operating system and device you use to access ChatGPT, along with browser information, get collected.
- Cookies. OpenAI uses cookies to store data about your browsing activity, both for analytics and to track you across the web.
- User content. OpenAI collects information that you upload or enter into ChatGPT. This means that everything you’ve typed or uploaded into the tool gets stored.
- Communication information. If you reach out to OpenAI support or opt in to their newsletters, your personal information and the messages you send get stored.
- Social media information. If you interact with OpenAI on social media, they collect the information made available on your profile. This may include your phone number and email address if you’ve filled in those fields.
- Account information. Information you provide to create an account, like your name, contact information, and payment information, gets stored.
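To make these categories more concrete, here’s a rough, hypothetical sketch of what a single stored session record could look like if you pulled the pieces above together. The field names, structure, and values are illustrative assumptions made for this article, not OpenAI’s actual schema.

```python
# Hypothetical illustration only: these field names and values are assumptions
# made for this article, not OpenAI's actual data schema.
session_record = {
    "log_data": {
        "ip_address": "203.0.113.42",       # where you connected from
        "browser": "Chrome 112 on macOS",
        "timestamp": "2023-05-02T14:31:07Z",
    },
    "usage_data": {
        "country": "US",
        "time_zone": "America/New_York",
        "device_type": "desktop",
    },
    "user_content": [                        # everything you type is stored
        {"role": "user", "text": "Summarize these meeting notes: ..."},
        {"role": "assistant", "text": "Here's a summary: ..."},
    ],
    "account_information": {
        "name": "Jane Doe",
        "email": "jane@example.com",
        "payment_on_file": True,
    },
    "cookies": {"analytics_id": "abc123"},
}
```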
You might be thinking that this isn’t much different from what other websites collect, and you wouldn’t be wrong. But pay particular attention to the collection of “user content” and how ChatGPT works.
ChatGPT isn’t designed to work like a search engine; it’s designed to converse with you. While you can use it to look up the recipe for chocolate cake, it won’t feed you results in the same way Google would. Instead, ChatGPT “talks” about the recipe with you.
This can lead to a false sense of security and might tempt users to share information they normally wouldn’t type into a Google search. ChatGPT’s effectiveness as a work tool has also led to leaks of confidential information, as Samsung learned the hard way when employees used the chatbot to transcribe meeting recordings and check proprietary code. Because OpenAI collects all this user content, that information, private or not, may be stored and used for training unless you take steps to opt out.
ChatGPT gets blocked in Italy
On March 31, Italy issued a temporary ban on ChatGPT over its use of personal information in its training data. Italy’s data regulator (Garante per la Protezione dei Dati Personali) contended that OpenAI didn’t have the legal right to use personal information to train ChatGPT. The ban was lifted toward the end of April after OpenAI made privacy-related changes for European users.
Other countries are also scrutinizing ChatGPT on its privacy protections, or lack thereof. So what specifically did Italy object to?
As a European Union member, Italy enforces GDPR rules, which generally require a legal basis, such as express consent, before someone’s data can be collected. Consent is typically obtained with a pop-up that lets you accept or reject the collection of your data, something OpenAI failed to provide. Italy’s regulator identified four problems with ChatGPT under GDPR rules:
- No age controls to stop people under 13 from using ChatGPT
- ChatGPT could provide false information about people
- People were not told that their data was collected
- There was no “legal basis” for the collection of personal information in data used to train ChatGPT
Italy’s temporary ban on ChatGPT marked the first instance of a Western nation taking action against the generative AI tool, and it’s unlikely to be the last.
Privacy complaints push more European regulators to confront ChatGPT privacy issues
Artists and media companies have complained about the permissionless use of their works to train generative AI. European regulators are now saying the same about the use of people’s personal information. Since Italy blocked ChatGPT, several other European countries have started investigating the AI tool over its privacy practices.
In France, data privacy regulator CNIL is investigating ChatGPT following “several” privacy complaints. The French Digital Ministry has said that ChatGPT is not in compliance with GDPR but has stopped short of banning it. However, the CNIL could enact a ban.
Elsewhere, Spain has requested the involvement of the European Data Protection Board in tackling privacy concerns with ChatGPT. The move could put ChatGPT on the agenda of the board’s plenary sessions, potentially driving coordinated GDPR action on AI models across the EU.
ChatGPT privacy concerns
Let’s recap some of the main privacy concerns surrounding ChatGPT.
1. ChatGPT collects a lot of your data
We’ve already gone over this above, but it bears emphasizing. Anything you input into ChatGPT could be added to ChatGPT’s database. This information could be sensitive, such as Samsung’s confidential meeting transcripts or client information that a lawyer wants to incorporate into legal documents.
2. Transparency about data use and GDPR compliance
One of Italy’s primary complaints was that ChatGPT failed to inform users about the personal data it collects and lacked a legal basis for collecting and storing personal data in the datasets used to train the platform’s algorithms.
OpenAI has since published a description of the personal data used to train its algorithms and explicitly states that anyone can opt out. However, experts say it has yet to establish a strong legal basis for processing people’s information.
OpenAI also made its privacy policy more visible on its home page.
3. Ability to opt out of personal data collection
Since Italy’s ban, OpenAI has made it easier for users to find the opt-out form to exclude their user content from being used to improve AI model performance. It has also added a new form that lets EU users object to the use of their personal data to train its models.
However, experts have raised questions about how ChatGPT could comply with the GDPR’s “right to be forgotten” rule, which lets people in the EU request that their personal data be erased. All the information you might inadvertently give to ChatGPT could be used to train its algorithms, and analysts have said that separating it back out of a trained model could be almost impossible.
4. Phone number registration
When you sign up for ChatGPT, you need to provide a phone number and enter a verification code sent to you. While this measure checks that you are human and helps the platform keep spam bots off its service, it also raises concerns about a lack of anonymity for users.
5. Age controls
Countries all over the world have laws that protect the use and collection of children’s data. Under GDPR rules, it is unlawful for companies to process the data of children under 16 without parental consent, although member states may lower that age to 13.
In OpenAI’s terms of use, they state that users of ChatGPT “must be at least 13 years old.” Among the changes made in April was a button for users in Italy to confirm that they meet the age requirement.
As is the case for numerous platforms, it is difficult to enforce age requirements, and young children’s data is easily collected along the way.
How to protect your privacy when using ChatGPT
1. Don’t share sensitive information
Confidential information is confidential for a reason. You should avoid sharing sensitive information of any sort, be it work or personal, since what you enter into ChatGPT gets stored on OpenAI’s servers. It’s probably best to assume that someone has access to this data. Plus, in the event of a security breach, your private information could fall into the wrong hands.
2. Use a VPN
When you connect to the internet via a VPN while using ChatGPT, your traffic is encrypted between your device and the VPN server, protecting it from malicious actors on your network looking to steal or tamper with your data. A VPN can also increase your anonymity when using ChatGPT by masking your IP address and location, reducing the amount of data you hand over to ChatGPT.
3. Opt out of ChatGPT’s personal data processing
Amid increased scrutiny, OpenAI introduced new ChatGPT data controls. You can disable your chat history through your account settings, and your conversations with ChatGPT will then be deleted after 30 days. However, this doesn’t seem to stop the use of your data in training ChatGPT. To opt out of having your personal and private information used to train ChatGPT, fill out the OpenAI Data Opt Out Google form. People in Europe covered under GDPR rules can ask OpenAI to remove their personal data using another form.
FAQ: About ChatGPT privacy issues
Why does ChatGPT want my phone number?
According to OpenAI, a phone number is required when creating an account to verify that you are human and stave off spam bots. During the account creation process, you will need to enter a verification code that is sent to your mobile phone via the number you’ve entered. Once you’ve verified yourself, you can start using ChatGPT.
Note that you are unable to remove or change the number associated with your OpenAI account.
What are the issues with ChatGPT?
The rise of ChatGPT and generative AI tools in general has brought with it many concerns. Broadly, they are issues pertaining to copyright violations, identity fraud, loss of privacy, misinformation, and AI’s potential to replace jobs.
Does ChatGPT leak data?
Like all software, ChatGPT is vulnerable to cybersecurity risks that could result in a compromise of data. In March 2023, a bug caused a data leak that exposed the conversation titles and some payment details of a small number of users.
A greater source of concern is users leaking confidential data to ChatGPT, and the risk of ChatGPT then regurgitating that data to other users. Information entered into ChatGPT is stored on OpenAI’s servers and used to further train the AI tool, which is why users should exercise caution and not enter confidential or personal information into ChatGPT.
How many people are using ChatGPT?
As of January 2023, ChatGPT reportedly had 100 million users. The number has likely risen since then.