Is DeepSeek really sending data to China? Let’s decode

Last week, Chinese startup DeepSeek sent shockwaves through the AI community with its frugal yet highly performant open-source release, DeepSeek-R1. The model uses pure reinforcement learning (RL) to match OpenAI's o1 on a range of benchmarks, challenging the longstanding notion that only large-scale training with powerful chips can lead to high-performing AI.

However, with the blockbuster release, many have also started pondering the implications of the Chinese model, including the possibility of DeepSeek transmitting personal user data to China. 

The concerns started with the company’s privacy policy. Soon, the issue snowballed, with OpenAI technical staff member Steven Heidel indirectly suggesting that Americans love to “give away their data” to the Chinese Communist Party to get free stuff.

The allegations are significant from a security standpoint, but the fact is that DeepSeek stores data on Chinese servers only when the models are used through the company's own ChatGPT-like service.

If the open-source model is hosted locally or orchestrated via GPUs in the U.S., the data does not go to China. 

Concerns about DeepSeek’s privacy policy

In its privacy policy, which was itself unavailable for a couple of hours, DeepSeek notes that the company collects information in different ways, including when users sign up for its services or use them. This means everything from account setup information — names, email addresses, phone numbers and passwords — to usage data such as text or audio input prompts, uploaded files, feedback and broader chat history goes to the company.

But, that’s not all. The policy further states that the information collected will be stored in secure servers located in the People’s Republic of China and may be shared with law enforcement agencies, public authorities and others for reasons such as helping investigate illegal activities or just complying with applicable law, legal process or government requests. 

The latter is important as China’s data protection laws allow the government to seize data from any server in the country with minimal pretext.

With such a range of information sitting on Chinese servers, a number of risks open up, including profiling of individuals and organizations, leakage of sensitive business data and even cyber-surveillance campaigns.

The catch

While the policy can easily raise security and privacy alarms (as it already has for many), it is important to note that it applies only to DeepSeek’s own services — apps, websites and software — using the R1 model in the cloud.

If you have signed up for the DeepSeek Chat website or are using the DeepSeek AI assistant on your Android or iOS device, there’s a good chance that your device data, personal information and prompts so far have been sent to and stored in China. 

The company has not shared its stance on the matter, but given that the iOS DeepSeek app has been trending as #1, even ahead of ChatGPT, it’s fair to say that many people may have already signed up for the assistant to test out its capabilities — and shared their data at some level in the process. 

The service's Android app has also crossed a million downloads.

DeepSeek-R1 itself is open source

As for the core DeepSeek-R1 model, there's no inherent risk of data transmission.

R1 is fully open-source, which means teams can run it locally for their targeted use case through open-source implementation tools like Ollama. This ensures the model does its job effectively while keeping data restricted to the machine itself. According to Emad Mostaque, founder and former CEO of Stability AI, the R1-distill-Qwen-32B model can run smoothly on the new Macs with 16GB of VRAM.
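To make that concrete, here is a minimal sketch in Python of querying a locally running R1 variant through Ollama's built-in REST API, which listens on localhost by default. The model tag below is an assumption; substitute whichever distilled R1 variant you have actually pulled.

```python
# A minimal sketch of keeping DeepSeek-R1 traffic entirely on-device:
# run a distilled R1 variant under Ollama and query its local REST API.
# The tag "deepseek-r1:32b" is an assumption -- check `ollama list` for
# the models available on your machine and substitute accordingly.
import requests

response = requests.post(
    "http://localhost:11434/api/chat",  # Ollama's default local endpoint
    json={
        "model": "deepseek-r1:32b",
        "messages": [
            {"role": "user", "content": "Explain chain-of-thought reasoning in one paragraph."}
        ],
        "stream": False,  # return a single JSON object instead of a token stream
    },
    timeout=300,
)
response.raise_for_status()
print(response.json()["message"]["content"])
```

Because the request never leaves localhost, prompts, uploaded files and model outputs stay on the machine rather than touching any external server.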

As an alternative, teams can also use GPU clusters from third-party orchestrators to train, fine-tune and deploy the model — without data transmission risks. One such option is Hyperbolic Labs, which lets users rent GPUs to host R1 and also offers inference via a secured API.
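As a rough illustration, assuming the GPU host exposes an OpenAI-compatible endpoint (a common pattern among orchestrators), calling a self-hosted R1 deployment might look like the sketch below. The base URL and model name are placeholders for illustration, not Hyperbolic Labs' documented values; swap in whatever your provider's dashboard specifies.

```python
# A hedged sketch, assuming the GPU host exposes an OpenAI-compatible
# chat completions endpoint. The base_url and model name are hypothetical
# placeholders -- replace them with your provider's actual values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-gpu-host.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",                          # issued by the GPU provider
)

completion = client.chat.completions.create(
    model="deepseek-r1",  # placeholder model identifier
    messages=[
        {"role": "user", "content": "Summarize the key idea behind pure-RL training."}
    ],
)
print(completion.choices[0].message.content)
```

The point is that the inference traffic goes to infrastructure you (or your chosen orchestrator) control, not to DeepSeek's own servers in China.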

That said, if you simply want to chat with DeepSeek-R1 to solve a particular reasoning problem, the easiest option right now is Perplexity. The company has just added R1 to its model selector, allowing users to do deep web research with chain-of-thought reasoning.

According to Aravind Srinivas, the CEO of Perplexity, the company has enabled this use case for its customers by hosting the model in data centers located in the U.S. and Europe.

Long story short: your data is safe as long as it's going to a self-hosted version of DeepSeek-R1, whether that's on your own machine or on a GPU cluster somewhere in the West.
