
Hugging Face wants to reverse engineer DeepSeek R1’s reasoning model

WHY THIS MATTERS IN BRIEF

Reverse engineering an AI model is very difficult, so it’s going to be interesting to see how this goes.

 


After Chinese researchers released DeepSeek R1, which tipped the US Artificial Intelligence (AI) industry on its head and wiped over $1 Trillion off the value of US tech stocks, researchers from Hugging Face say they’re attempting to recreate the Chinese startup’s R1 “reasoning model” by reverse engineering it.

 


 

The initiative comes after R1 stunned the AI community by matching the performance of the most capable models built by US firms, despite reportedly being built for just $5.6 million, a fraction of the cost. Hugging Face researchers say the Open-R1 project aims to create a fully open-source duplicate of the R1 model and make all of its components available to the AI community.

 

Inside DeepSeek R1, by AI Futurist Matthew Griffin

 

Elie Bakouch, one of the Hugging Face engineers leading the project, told TechCrunch that although DeepSeek claims R1 is open source because it can be used without any restrictions, it doesn’t meet the standard definition of open-source software. That’s because many of the components used to build it, as well as the data it was trained on, have not been made publicly available.

The lack of information about what goes into DeepSeek means that it’s really just another “black box,” similar to proprietary models such as OpenAI’s GPT series, making it impossible for the AI community to build on or improve, he said.

 


 

DeepSeek, which is operated by Hangzhou DeepSeek AI Company and Beijing DeepSeek AI Company, hit the headlines last week when it made its two primary reasoning models – DeepSeek-R1-Zero and DeepSeek-R1 – available on Hugging Face. At the same time, it also published a paper on arXiv outlining the development process behind the models.

The R1 model has caused intense excitement with its apparent ability to match the performance of advanced LLMs like OpenAI’s GPT-4o and Anthropic PBC’s Claude, even though it was built at a total cost of just $5.6 million, according to its developer. In contrast, OpenAI and other American firms like Google and Meta have spent billions of dollars on developing their own models.

DeepSeek’s model demonstrates that it’s possible to make the same kind of progress without breaking the bank, and the revelation caused chaos in the financial markets with the stocks of US companies involved in AI development tanking last Monday. The AI chipmaker Nvidia saw its stock fall 15%, while Broadcom shares were down 16% and TSMC dropped 14%.

 


 

At the same time, DeepSeek’s iOS chatbot application, which provides free access to the R1 model, emerged from nowhere to become the No. 1 productivity app on the Apple App Store this week.

The Chinese company claims that it developed R1 with fewer, and much less advanced, GPUs than the ones used to develop models like GPT-4o and Llama 3, raising questions about whether the multibillion-dollar investments being made in AI are really necessary. On a number of benchmarks, R1 has matched or even surpassed the performance of OpenAI’s o1 reasoning model.

Reasoning models are notable for their ability to “fact-check” their responses before they output them, helping to avoid the “hallucinations” that plague more standard large language models. They generally take longer to generate their responses because of these extra accuracy checks, but that makes them much more reliable in domains such as math, physics and other sciences.
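The propose-then-verify idea behind reasoning models can be sketched with a toy example. This is a purely illustrative Python sketch, not DeepSeek’s actual method: the candidate answers, the verifier and all function names here are invented for the illustration.

```python
def propose_answers(question):
    # Stub "model": a real reasoning model samples several chains of
    # thought; here we hard-code noisy candidates for "17 * 24".
    return [398, 408, 418]

def verify(candidate):
    # Stub verifier: re-derive the answer independently and compare.
    # Reasoning models spend extra tokens on this kind of self-check,
    # which is why they are slower but more reliable.
    return candidate == 17 * 24

def answer_with_check(question):
    # Return the first candidate that survives verification; if none
    # does, admit failure instead of "hallucinating" an answer.
    for candidate in propose_answers(question):
        if verify(candidate):
            return candidate
    return None

print(answer_with_check("What is 17 * 24?"))  # 408
```

The extra verification pass is where the added latency, and the added reliability, comes from.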

 


 

Hugging Face says it’s attempting to replicate R1 to benefit the AI research community, and it intends to do so in just a few weeks. To do this, it will leverage the company’s dedicated research server, the “Science Cluster,” which is powered by 768 Nvidia H100 GPUs. The plan is to reverse engineer the R1 model to understand what data it was trained on and which components were used in its creation.

The Open-R1 project is seeking assistance from the broader AI research community to recreate the training datasets used by DeepSeek, and it has garnered a lot of interest so far, with its associated GitHub page getting more than 100,000 stars just three days after its launch.

Despite the initial enthusiasm from the AI community, it may be difficult for Hugging Face’s researchers to pull this off and make a version of R1 that’s close to the real thing, analyst Holger Mueller of Constellation Research says.

 


 

“Hugging Face wants to reverse engineer DeepSeek’s model because it has all of the attention right now, and if it can do this, it will increase transparency and improve confidence for users,” Mueller said. “But without the underlying datasets used by DeepSeek, it will be challenging for them to do this. Still, Hugging Face’s researchers are good at what they do, so let’s wait and see what they come up with.”

Bakouch said the project is not a zero-sum game, but rather the start of something that will hopefully be much more beneficial for the wider AI industry. He said he hopes that whatever they manage to build will eventually become the foundation of a new generation of even more advanced open-source reasoning models. If they can recreate R1, the entire AI community will be able to look at how it works and try to improve on it, he explained.

“Open-source development immediately benefits everyone, including the frontier labs and the model providers, as they can all use the same innovations,” he said.
