An AI model that changed the fortunes of Silicon Valley overnight. DeepSeek has been released open source, and requires far less hardware and investment. Mike Pound is based at the University of Nottingham.
EXTRA BITS: https://youtu.be/tMm7DYTGJ44
Computerphile is supported by Jane Street. Learn more about them (and exciting career opportunities) at: https://jane-st.co/computerphile
The DeepSeek papers:
https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf
https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf
This video was filmed and edited by Sean Riley.
Computerphile is a sister project to Brady Haran’s Numberphile. More at https://www.bradyharanblog.com
From this, at this point in time, you'd expect OpenAI to crash (getting way too pricey), NVIDIA to do even better (because everyone will now want hardware, as stuff comes within reach), and companies like Meta to be harder to predict, but perhaps more capable of pivoting. As for Microsoft, it'll be interesting to see who they want to buy next and whether they are too deeply into OpenAI.
Nice explanation, but it seems it is still not possible to reproduce what they have done.
I like to give each new AI model a simple test to see how reliable its answers will be: I ask for the plot of a specific movie from the 1930s, of which I own one of the last reels in the world, and which does not exist on the internet apart from an IMDB page acknowledging that the movie was made. Every single AI model I've tested has given me a nonsense answer, guessing at the content based on the genre. When a model just straight up tells me it doesn't know anything about that movie, I'll know that it can be reliable and doesn't hallucinate to fill gaps in its knowledge. DeepSeek invented details, so that's a fail on this metric for DeepSeek.
I call it the "humility test" and when AI can finally pass that I would bet that the information it provides will be significantly more accurate.
Am I the only one that thinks that having smaller specialised areas that can communicate with other specialised areas of the neural network is much closer to how actual brains work?
Careful out there folks, this is new science that has been blessed to the public. The things that can be done with this are quite limitless and unexplored. Horrifying and powerful are the dangerous vocabulary you want to be looking out for.
Can you please do this fella a favor and adjust your cam white balance next time…
so what? the tech will catch up, what deepseek did is gonna be outclassed by somebody else, it's how tech goes. now what's interesting is how people who factor the NASDAQ, they could just have sold nvidia a week ago, now they buy low due to 'new chinese ai' and one month from now they just make the big money when a new gpt model outclasses deepseek. it's all in the rumors and news, and here we are discussing the details.
i've been looking for source code cause everyone keeps saying open source. there is no source code.
AI got stale for 2 years. DeepSeek was the answer for lack of innovation by lazy companies only throwing money at problems with no true value add.
do it smarter not harder
I would argue NVIDIA is going to sell even more videocards now in the future, since AI seems to be democratizing into mainstream, and there's always going to be a need for the cutting edge too. Companies on the cutting edge will just scale it up even further.
Do you see the reasoning capabilities of such models being able to solve unsolved maths/physics/compsci problems? If I provide it with an endless source of electricity, state of the art hardware and the correct supervision, would a reasoning model be able to solve the Goldbach conjecture eventually? As humans, we have limits to how much we can THINK during a single day, which translates to a week, to a year, to our whole lifespans; a neural network mathematician could think forever at ever increasing speeds. I am really excited to see where this takes subjects like pure maths where thinking is essentially the whole job (unlike physics and applied maths, where you have to do experiments and whatnot).
Tiananmen square happened.
6:47 was very much waiting for the number 42 here
Thank you DeepSeek, your answers prove that you are a political propaganda machine.
The server is busy. Please try again later.
Why did this not come out of a university ?
Nice speech
I'm so happy for this channel
This was super helpful, thanks!
Deepseek another pump and dump instrument. AI contributed give any quality of life.any benefit the investors.
The fact that DeepSeek is so openly propaganda is beyond concerning to me, open source or not. The model is babbling about the unification of the motherland and people gobble it up because it's better than the things that came before.
F*ck Silicon Valley and Nvidia
Well, for now I welcome DeepSeek!
ClosedAI and co, in general are way too expensive.
Truly fabulous explanation. Thank you very much.
Rare China W
I tried a couple of integrals (1st year calculus) that 4-o could solve with some intermediate prompting. Even with a bunch of help, DeepSeek (R1) could never come up with the completely correct answers. It did fine on programming problems and stuff like coming up with meal suggestions and shopping lists. Claude did better on the programming problems, but can’t really render math in a readable way.
Could be what Linux did to Unix.
cool
Has any of what the company behind DeepSeek claims been confirmed? So far I see a lot of claims but everything comes from Chinese media.
Deepfake
I think that the Chinese just demonstrated how they can burst US economy bubbles when it comes to tech 😅 and we are all the ones who will benefit from that. That is the most powerful equalizer.
I love me some chinese propaganda
Yayy!! I'm so glad this is truly open (and just in time for fossdem 😊)
But what about Skynet?
Er, so what is a reward to AI?
Wow, so much naivety!
My research group and I had the idea of a sort of “load balanced llm” where we’d have multiple trained models, all trained to be masters of their specific topics. Math models would be trained to have this inner monologue that could solve problems step by step, coding models trained to analyze code, etc, but we turned away from it since initial tests showed we’d need a lot more time to make it work, which wasn’t viable for the sort of project we were trying to do. Kinda cool to see that a similar idea to ours was able to throw Wall Street into a frenzy
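For anyone curious what "routing to specialists" might look like in code, here is a minimal toy sketch of top-k gating, the general idea behind mixture-of-experts style routing. It is not DeepSeek's implementation and not the commenter's project: the names `softmax`, `route`, `mixture` and `EXPERTS` are made up for illustration, the "experts" are plain arithmetic functions, and the gate scores are passed in by hand rather than learned.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(scores, k=2):
    """Pick the k highest-scoring experts and renormalise their weights to sum to 1."""
    probs = softmax(scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}

# Hypothetical stand-ins for specialised models: each maps a number to a number.
EXPERTS = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2]

def mixture(x, scores, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    weights = route(scores, k)
    return sum(w * EXPERTS[i](x) for i, w in weights.items())

if __name__ == "__main__":
    # Hand-picked gate scores; a real system would learn these from the input.
    print(route([2.0, 1.0, -1.0], k=2))
    print(mixture(3.0, [2.0, 1.0, -1.0], k=2))
```

Only the selected experts are evaluated, which is where the compute saving comes from: the unselected ones cost nothing for that input.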
Another huge technology that communism contributed to the free world.
What DeepSeek's model actually did was to expose the fraud of a multibillion-dollar group of companies (mostly US-based), which insist that training the so-called "AI" models requires continuous investment in new and expensive hardware. DeepSeek created an "AI" model that has better performance compared to the well-known models at a fraction of the cost in hardware and energy. That's a marvel of engineering and creative thinking, which is 100% an HI (Human Intelligence) outcome. It's not a secret that the company which lost $600B of capitalization in a single day is Nvidia.