DeepSeek’s R1 AI Matches Google and Anthropic in Coding Capability Benchmark


  • DeepSeek’s updated R1 model has matched the coding performance of Google and Anthropic in the WebDev Arena competition, scoring 1,408.84.
  • The model tied for first place with Google’s Gemini-2.5 and Anthropic’s Claude Opus 4, demonstrating strong capability in coding tasks.
  • DeepSeek’s R1 has shown consistent performance close to leading models in various benchmark tests since its launch in January.
  • The R1-0528 update included improvements in reasoning and creative writing, as well as a 50% reduction in hallucinations.
  • DeepSeek’s open-source approach has facilitated rapid adoption and influenced other tech giants in China to consider similar strategies.
