Anthropic launches Claude Sonnet 4.5: improvements and new features for developers

This month, Anthropic unveiled Claude Sonnet 4.5, an updated version of its large language model and a favorite among developers. The company describes the release as the "best coding model in the world," presenting it as evidence of its focus on software development.

Claude Sonnet 4.5: Remarkable Advances in Performance

The new version of Claude Sonnet promises more reliable instruction following and a better ability to refactor existing code. On SWE-bench Verified, a benchmark that measures how models perform on tasks drawn from real GitHub issues and pull requests, Sonnet 4.5 scored 77.2%, rising to 82% when parallel test-time compute was applied. Anthropic presents these results as significant progress over previous versions.

In certain areas, the company claims that Sonnet 4.5 surpasses Opus 4.1, its flagship model, especially in tasks related to the financial services industry. On OSWorld, a benchmark that measures how AI models perform on real-world computer tasks, Sonnet 4.5 leads with a 61.4% success rate. That is a notable increase from the 43.9% achieved by Sonnet 4, and it also beats Opus 4.1, which remained around 44%.

Another significant change is that Sonnet 4.5 can work autonomously on complex tasks for 30 hours, compared with the 7 hours that the earlier Opus 4 could manage. Anthropic asserts that Sonnet 4.5 maintains focus and performance over that time, although further testing will be needed to confirm this in real-world situations.

Comparison with Competitors

In most coding tests, Sonnet 4.5 outperformed competitors such as OpenAI's GPT-5 and Google's Gemini 2.5 Pro. However, in visual reasoning benchmarks, where Anthropic's models have generally performed more modestly, the competition appears to remain in the lead.

Anthropic has also expanded the model's access to additional features, bringing it in line with its coding agent, Claude Code. These features include access to virtual machines, better context management, and support for multiple agents.

Innovative Features of Claude Code

Alongside the launch of Sonnet 4.5, Anthropic is also updating Claude Code, its programming agent. The product has proven valuable, generating over $500 million in annual revenue, with usage growth of over 10% in the last three months. Claude Code now has a native extension for Visual Studio Code that lets developers see the agent's changes in real time.

The command line interface has also been improved, with clearer status visibility and a searchable command history that makes it easier to reuse past commands. Checkpoints have been introduced as well, making it simple to roll code back when Claude Code does not act as expected and removing the need for manual backups.
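
The checkpoint idea can be illustrated with a minimal Python sketch. This is not how Claude Code implements checkpoints; the CheckpointStore class and its save and restore methods are hypothetical names chosen for illustration, and the "workspace" is just an in-memory mapping of file paths to contents.

```python
class CheckpointStore:
    """Hypothetical in-memory checkpoint store (illustration only)."""

    def __init__(self) -> None:
        self._snapshots: list[dict[str, str]] = []

    def save(self, files: dict[str, str]) -> int:
        # Copy the mapping so later edits do not alter the snapshot.
        self._snapshots.append(dict(files))
        return len(self._snapshots) - 1  # checkpoint id

    def restore(self, checkpoint_id: int) -> dict[str, str]:
        # Return a copy so the stored snapshot stays intact.
        return dict(self._snapshots[checkpoint_id])


store = CheckpointStore()
workspace = {"app.py": "print('v1')"}

checkpoint = store.save(workspace)       # snapshot before the agent edits
workspace["app.py"] = "print('broken')"  # an edit we want to undo
workspace = store.restore(checkpoint)    # roll back to the snapshot
```

The point of the design is that a snapshot is taken automatically before each change, so a rollback never depends on the developer having made a backup by hand.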

Launch of the Claude Agent SDK

For developers who want to create their own agents, Anthropic has introduced the Claude Agent SDK. This toolkit lets developers build custom agents on the same infrastructure that powers Claude Code. The SDK includes essential features such as memory management, context management, and permission administration.

In addition, developers will get a memory tool in the API that helps agents maintain context across prolonged tasks. Anthropic is also adding an automatic context management feature that adjusts Claude's context window by removing outdated data as necessary.
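
To make the idea of automatic context management concrete, here is a minimal Python sketch of one possible policy: when a conversation exceeds a token budget, drop the oldest tool results first. It does not use Anthropic's API. The Message class, the rough_token_count heuristic (about four characters per token), and the trim_context function are all assumptions made for illustration, and the article does not say which data Anthropic's feature actually removes.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "user", "assistant", or "tool"
    text: str


def rough_token_count(text: str) -> int:
    # Crude stand-in for a real tokenizer: about 4 characters per token.
    return max(1, len(text) // 4)


def total_tokens(messages: list[Message]) -> int:
    return sum(rough_token_count(m.text) for m in messages)


def trim_context(messages: list[Message], budget: int) -> list[Message]:
    """Drop the oldest tool results until the history fits the budget.

    The most recent message is never dropped, and the input list is
    left unmodified.
    """
    kept = list(messages)
    i = 0
    while total_tokens(kept) > budget and i < len(kept) - 1:
        if kept[i].role == "tool":
            del kept[i]  # the next message shifts into position i
        else:
            i += 1
    return kept
```

A real implementation would count tokens with the model's tokenizer and might summarize old content instead of deleting it; the sketch only shows the general shape of trimming stale data to stay within a window.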

“Imagine with Claude”: Generating Software in Real Time

One of the most intriguing features launched alongside Sonnet 4.5 is “Imagine with Claude,” an experiment illustrating how to generate software and user interfaces in real time. Anthropic explains that “no functionality is predefined; no code is pre-written. What you see is Claude creating in real time, responding to and adapting to requests as you interact.”

The experiment, which will be available exclusively to Claude Max subscribers for the next five days, shows significant potential for on-demand software creation, although it remains unclear what happens internally while Claude builds these applications.

As the field of artificial intelligence progresses, tools that generate software instantly are attracting growing attention. While products like Lovable come close to this idea, Anthropic's proposal to create the software a user wants on the spot points to a promising direction for the industry.

Conclusion

With the launch of Claude Sonnet 4.5, Anthropic continues to set the standard in the development of artificial intelligence models applied to coding. Improvements in instruction-following capabilities, autonomy in complex tasks, and the introduction of new functionalities reinforce its position in the market.

Pricing remains at $3 per million input tokens and $15 per million output tokens, the same rates Anthropic charged for Sonnet 4.
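
As a rough illustration of what those rates mean in practice, the following Python sketch estimates the cost of a request from its token counts. The two prices come from the article; the function name and the example workload of 200,000 input tokens and 20,000 output tokens are hypothetical.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars."""
    return (
        input_tokens * INPUT_USD_PER_MTOK
        + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


# Hypothetical workload: 200,000 input tokens and 20,000 output tokens.
print(f"${estimate_cost_usd(200_000, 20_000):.2f}")  # prints $0.90
```

Because output tokens cost five times as much as input tokens, a task that generates a lot of text or code can cost more than one that only reads a large amount of context.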

Readers interested in learning more about advancements in artificial intelligence and software development are invited to continue exploring the content on this blog.