Microsoft probing if DeepSeek-linked group improperly obtained OpenAI data

Dina Bass and Shirin Ghaffary, Bloomberg News on Jan 29, 2025

Microsoft Corp. and OpenAI are investigating whether data output from OpenAI’s technology was obtained in an unauthorized manner by a group linked to Chinese artificial intelligence startup DeepSeek, according to people familiar with the matter.

Microsoft’s security researchers in the fall observed individuals they believe may be linked to DeepSeek exfiltrating a large amount of data using the OpenAI application programming interface, or API, said the people, who asked not to be identified because the matter is confidential. Software developers can pay for a license to use the API to integrate OpenAI’s proprietary artificial intelligence models into their own applications.

Microsoft, an OpenAI technology partner and its largest investor, notified OpenAI of the activity, the people said. Such activity could violate OpenAI’s terms of service or could indicate the group acted to get around OpenAI’s restrictions on how much data they could obtain, the people said.

DeepSeek earlier this month released a new open-source artificial intelligence model called R1 that can mimic the way humans reason, upending a market dominated by OpenAI and U.S. rivals such as Google and Meta Platforms Inc. The Chinese upstart said R1 rivaled or outperformed leading U.S. developers’ products on a range of industry benchmarks, including for mathematical tasks and general knowledge — and was built at a fraction of the cost. The potential threat to the U.S. firms’ edge in the industry sent technology stocks tied to AI, including Microsoft, Nvidia Corp., Oracle Corp. and Google parent Alphabet Inc., tumbling on Monday, erasing a total of almost $1 trillion in market value.

OpenAI didn’t respond to a request for comment, and Microsoft declined to comment. DeepSeek and hedge fund High-Flyer, where DeepSeek was started, didn’t immediately respond to requests for comment via email.

Major tech stocks including Nvidia and Microsoft have gained ground since the market rout on Monday. Shares of Nvidia fell 2.7% as the markets opened on Wednesday after closing at $128.99 on Tuesday, a gain of almost 9% over Monday. Microsoft dropped 1.2% after closing at $447.20, a 3% increase. Chip-machine maker ASML Holding NV, which closed 7% lower on Monday, posted its biggest intraday gain in four years after beating earnings expectations on Wednesday.

David Sacks, President Donald Trump’s artificial intelligence czar, said Tuesday there’s “substantial evidence” that DeepSeek leaned on the output of OpenAI’s models to help develop its own technology. In an interview with Fox News, Sacks described a technique called distillation whereby one AI model uses the outputs of another for training purposes to develop similar capabilities.

“There’s substantial evidence that what DeepSeek did here is they distilled knowledge out of OpenAI models and I don’t think OpenAI is very happy about this,” Sacks said, without detailing the evidence.

In a statement responding to Sacks’ comments, OpenAI didn’t directly address his comments about DeepSeek. “We know PRC based companies — and others — are constantly trying to distill the models of leading U.S. AI companies,” an OpenAI spokesperson said in the statement, referring to the People’s Republic of China. “As the leading builder of AI, we engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models, and believe as we go forward that it is critically important that we are working closely with the U.S. government to best protect the most capable models from efforts by adversaries and competitors to take U.S. technology.”

In its own research, DeepSeek said it had “distilled” models from its R1 system based on other open-source systems. Unlike OpenAI’s closed systems, some models such as Meta’s Llama are open-source and freely available for use.