If you hadn’t heard the fuss about DeepSeek over the weekend, there is a good chance you’ll at least have heard the name by now. It rose to fame because it provided a genuine competitor to ChatGPT at a fraction of the price, and it has caused turmoil in the stock market, sending tech share prices plummeting. Nvidia in particular suffered a record-breaking drop of roughly $600 billion in market value, the largest single-day loss for any company in history.
Released by a Chinese startup of the same name, DeepSeek is a free AI chatbot with ambitions to take on the likes of OpenAI’s ChatGPT. There are also new models with some multimodal capabilities, mainly in image creation and analysis. It has taken the AI world by storm and is still the number one app in Apple’s App Store in the United States and the United Kingdom.
The app and website proved popular, with DeepSeek experiencing an outage and a reported ‘malicious attack’ the same day it rose to fame.
While Sam Altman, OpenAI’s chief executive, responded, we also heard from Nvidia, arguably the global leader in AI chips, which has risen in prominence as the AI wave has continued to grow.
In an emailed statement to TechRadar, Nvidia wrote, “DeepSeek is an excellent AI advancement and a perfect example of Test Time Scaling. DeepSeek’s work illustrates how new models can be created using that technique, leveraging widely-available models and compute that is fully export control compliant. Inference requires significant numbers of NVIDIA GPUs and high-performance networking. We now have three scaling laws: pre-training and post-training, which continue, and new test-time scaling.”
The statement is certainly a strong one, calling DeepSeek “an excellent AI advancement,” which speaks to the performance of DeepSeek’s R1 model. It also confirms what we knew: new models can be built using existing models and chips rather than creating entirely new ones.
Nvidia clearly wants to remain a key part of the story, noting that inference for this type of model requires a lot of Nvidia GPUs and playing off the fact that DeepSeek used China-specific Nvidia GPUs. Reading between the lines, it also hints that DeepSeek will need more of its chips … at some point.
DeepSeek claims it used an innovative new training process to develop its LLMs, one that relies on trial and error to self-improve. You could say it trained its LLMs in the same way that humans learn, by receiving feedback based on their actions. It also utilized an MoE (Mixture-of-Experts) architecture, meaning it activates only a small fraction of its parameters at any given time, which significantly reduces the computational cost and makes it more efficient.
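To make the Mixture-of-Experts idea a little more concrete, here is a toy sketch of how a router might pick a handful of experts for each input and leave the rest idle. It is an illustration only, not DeepSeek’s actual architecture: the expert functions, the `gate_weights` values, the `moe_forward` name and the choice of two active experts out of eight are all assumptions made for this example.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    # Hypothetical "expert": a stand-in for a feed-forward sub-network.
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts only."""
    # Router score per expert: dot product of x with that expert's gate vector.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Keep the indices of the top_k experts; the others are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalise the chosen experts' scores into mixing weights.
    weights = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for weight, idx in zip(weights, chosen):
        expert_out = experts[idx](x)
        output = [o + weight * y for o, y in zip(output, expert_out)]
    return output, chosen


random.seed(0)  # fixed seed so the toy gate weights are reproducible
num_experts, dim = 8, 4
experts = [make_expert(i + 1) for i in range(num_experts)]
gate_weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]

x = [0.5, -1.0, 0.25, 2.0]
output, active = moe_forward(x, gate_weights, experts, top_k=2)
print(f"active experts: {active} ({len(active)} of {num_experts})")
```

In this sketch only two of the eight experts do any work for a given input, which is the source of the compute savings described above. Real MoE models apply the same kind of routing per token inside each layer, with learned gate weights rather than random ones.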
Sam Altman also praised DeepSeek’s R1 model, “particularly around what they’re able to deliver for the price.” He reiterated that OpenAI will “obviously deliver much better models,” but he welcomed the competition. Nvidia seems to be keeping its future cards closer to its chest.
It’s still a waiting game of sorts to see when DeepSeek AI will turn new sign-ups back on and get back to full performance, but if you’re curious about its staying power, read the thoughts of my colleague Lance Ulanoff, TechRadar’s Editor-at-Large, on its chances of sticking around in the United States, as well as our hands-on of DeepSeek AI versus ChatGPT from John-Anthony Disotto, one of TechRadar’s AI experts.