anthropic.com

Claude Sonnet 5

marinesebastian · 1.2K points · 715 comments · 19 hours ago

Comments

Jcampuzano2 · 18 hours ago

I'm struggling to understand why I'd ever use this instead of just running Opus at a lower effort level, given that on many of the listed benchmarks the cost per task rises above Opus at anything higher than medium effort. The only case I can think of is when someone is out of Opus credits. Of course there are API billing use cases, but I'd probably still just use Opus on low.

conradkay · 19 hours ago

Wow, seems worse even on price/performance than GLM 5.2, which is only 744B parameters. From the system card: "On CyberGym vulnerability discovery, Claude Sonnet 5 is less capable than Sonnet 4.6, and far less capable than Opus 4.8 and Mythos 5. As with the other evaluations in this section, these results were achieved with all safeguards turned off. When run with our default mitigations, Sonnet 5 scored a 0 on CyberGym."

theHocineSaad · 35 minutes ago

What's interesting is that Claude Sonnet 5 costs more per task ($2.29) than Opus 4.8 ($1.80), even though the latter is obviously better! It actually costs more per task than every other model except Claude Fable 5. Source: https://artificialanalysis.ai/?cost=cost-per-task#price-and-..., as of writing this comment (the results change frequently).

microtonal · 19 hours ago

> Claude Sonnet 5 is built to be the most agentic Sonnet model yet. It can make plans, use tools like browsers and terminals, and run autonomously at a level that, just a few months ago, required larger and more expensive models.
I have been using Sonnet 4.6 more than Opus, because I'm mostly doing agent-assisted development and not fully agent-driven development. This announcement does not make me optimistic. I have found that the more models are optimized for fully agentic development, the worse they get at assisted development, and they often start doing too much despite very strict and specific instructions. I have been moving more and more to K2.7 Code and GLM-5.2 over the last few weeks. They are often good enough for assistance, very fast, and cheap.

nijave · 1 hour ago

> Me: What was the sushi place near latitude 41 in Columbus? Did it go out of business I don't see it on Google maps anymore
> Sonnet 5 (medium): None of these past chats mention a sushi place; I don't have anything on record about that. Do you remember the name, or roughly which part of Columbus (neighborhood/street) it was near? That'll help me search and check its current status.
Not impressed. It got the name right on high effort in one shot but hallucinated the date relativity (Jan 2026 is not last month...). Worked okay on extra. Sonnet 4.6 worked fine on medium, high, and extra in one shot.
Edit: as the replies point out, the prompt is definitely ambiguous; however, Sonnet 5 didn't even extract the semantic meaning "looking for <place> near <compound place>", which all the human replies seem to understand. Even Haiku 4.5 identifies the semantic meaning, although it fails to retrieve the correct results:
> Haiku 4.5 (reasoning off): I'll help you find information about that sushi place. Let me search for sushi restaurants that were near latitude 41 in Columbus and check their current status.
> _Searched the web_
> I see that "Latitude 41" is a restaurant in Columbus, but it's actually a modern American restaurant, not specifically a sushi place. However, based on your mention of latitude 41, that's what came up. Let me search more specifically for sushi restaurants that may have closed in Columbus around that area.