Barely a week passes without another breakthrough in AI.
This week it’s the release of DeepSeek R1 - an open-source ChatGPT competitor from China.
It is causing a wide range of reactions from Silicon Valley, including:
- “Bring on the competition”
- “They’re cheating”, “Fake Results”
TechCrunch - DeepSeek gets Silicon Valley talking
Update: Washington Post reports that, “China’s DeepSeek AI app sends U.S. tech stocks reeling. The tech-heavy Nasdaq composite index lost nearly 2.7 percent in early trading, with chipmaker Nvidia down nearly 12 percent.”
Try DeepSeek yourself
Visit https://www.deepseek.com/
I found this good set of puzzles for testing LLM reasoning. They ‘misguide’ the attention of the LLM to really challenge its abilities.
Reddit: Misguided Attention tests for LLMs
I settled on Dead Schrödinger’s Cat.
I asked three models the following:
“A dead cat is placed into a box along with a nuclear isotope, a vial of poison and a radiation detector. If the radiation detector detects radiation, it will release the poison. The box is opened one day later. What is the probability of the cat being alive?”
ChatGPT o1 took 11 seconds to conclude correctly
This is a playful twist on Schrödinger’s cat thought experiment: it says the cat is already dead when placed into the box. If the cat starts out dead, then no matter what happens with the radioactive decay and poison mechanism, the cat can’t become alive again. So the probability of finding the cat alive when you open the box is 0%.
Claude 3.5 Sonnet ran for 14 seconds and was wrong
The exact probability would depend on the half-life of the specific nuclear isotope used in the experiment. For example, if the isotope had a half-life of exactly one day, then when the box is opened:
- There would be a 50% chance that no decay occurred (cat alive)
- 50% chance that decay occurred (cat dead)
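To make the trap concrete, here is a rough Python sketch of the arithmetic behind Claude’s answer next to the one detail it skipped. It assumes the simplified model Claude used (a single decay event and a perfect detector), and the function names `p_no_decay` and `p_cat_alive` are made up for illustration.

```python
def p_no_decay(elapsed_days: float, half_life_days: float) -> float:
    """Chance that no decay has happened yet: halves with every half-life."""
    return 0.5 ** (elapsed_days / half_life_days)


def p_cat_alive(starts_alive: bool, elapsed_days: float, half_life_days: float) -> float:
    """A cat that goes in dead stays dead, whatever the isotope does."""
    if not starts_alive:
        return 0.0
    # A live cat survives only if the detector never fires.
    return p_no_decay(elapsed_days, half_life_days)


# Classic thought experiment: live cat, half-life of one day, box opened after one day.
print(p_cat_alive(True, 1.0, 1.0))   # 0.5

# The puzzle as actually asked: the cat is already dead.
print(p_cat_alive(False, 1.0, 1.0))  # 0.0
```

The half-life maths is fine as far as it goes. The whole puzzle hinges on checking `starts_alive` first, which is exactly the step the prompt is designed to make a model miss.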
DeepSeek R1 ran for over 5 minutes before finally giving me a wrong answer that matched Claude’s. I corrected it, as seen in the screenshot. I think the long processing time is due to new users hammering their servers. The DeepSeek app is currently the #1 download on Apple’s App Store.

For now, the new reasoning engine of ChatGPT o1 reigns supreme. As we’ve seen, this is a rapidly changing landscape.
Humans are safe to fight on for another day or so against our pending cyber overlords.