Anthropic Releases AI Tool That Can Use Your PC
- By Paul Mah
- October 23, 2024
Anthropic has released a new tool that can use a computer the way a person does: by moving the mouse cursor, clicking on buttons, and typing on the keyboard.
Simply known as “Computer Use,” the tool is in beta and available exclusively with the firm's mid-range 3.5 Sonnet model via the API.
Computer use feature
According to Anthropic, users can direct it with multi-step instructions to accomplish tasks “by looking at a screen, moving a cursor, clicking buttons, and typing text.”
“With computer use, we're trying something fundamentally new. Instead of making specific tools to help Claude complete individual tasks, we're teaching it general computer skills – allowing it to use a wide range of standard tools and software programs designed for people,” wrote Anthropic on its blog.
To achieve this, Anthropic built an API that allows Claude to perceive and interact with computer interfaces through what it calls an “action-execution layer.” Under the hood, Claude takes screenshots to “see,” then counts how many pixels it must move the cursor vertically or horizontally to click in the correct place.
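The pixel-counting step can be illustrated with a small Python sketch. This is not Anthropic's implementation or its API; the names `Point`, `Box`, `button_center`, and `cursor_offset` are hypothetical and chosen only for illustration. It assumes screen coordinates measured in pixels from the top-left corner and a button already located in a screenshot.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: int  # pixels from the left edge of the screen
    y: int  # pixels from the top edge of the screen


@dataclass(frozen=True)
class Box:
    # Hypothetical bounding box of a button found in a screenshot.
    left: int
    top: int
    width: int
    height: int


def button_center(box: Box) -> Point:
    """Return the pixel at the middle of the box, a safe place to click."""
    return Point(box.left + box.width // 2, box.top + box.height // 2)


def cursor_offset(cursor: Point, target: Point) -> tuple[int, int]:
    """Return (dx, dy): pixels to move right (+) or left (-), and down (+) or up (-)."""
    return target.x - cursor.x, target.y - cursor.y


# Illustrative values only: a cursor position and a "Submit" button location.
cursor = Point(100, 200)
submit = Box(left=400, top=300, width=120, height=40)
dx, dy = cursor_offset(cursor, button_center(submit))
print(f"move {dx}px horizontally and {dy}px vertically, then click")
```

In a real agent, a calculation like this would sit inside a loop that takes a fresh screenshot after every action, since the screen may change between steps.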
Developers can pass instructions such as “use data from my computer and online to fill out this form” and use this capability to automate repetitive tasks, build and test software, or even conduct open-ended tasks.
However, Anthropic cautioned that the capability is still in its nascent stages and that tasks people perform effortlessly, such as scrolling, dragging, and zooming, currently present challenges for Claude. Developers are encouraged to begin with low-risk tasks first while Anthropic works to improve its capabilities.
Claude performance
Anthropic also announced a major upgrade to its Claude family of AI models, updating the already impressive Claude 3.5 Sonnet and releasing the new Claude 3.5 Haiku.
Sonnet outperforms OpenAI’s GPT-4o and Google’s Gemini 1.5 Pro on graduate-level reasoning tasks, coding, and visual analysis, says Anthropic. The upgraded model shows a significant improvement in AI-powered coding in particular, and GitLab found it delivered up to 10% stronger reasoning across use cases with no added latency. This makes it an ideal choice for multi-step software development.
As reported on TechCrunch, the upgraded 3.5 Sonnet self-corrects and retries tasks when it encounters obstacles. It can also work toward objectives that require dozens or hundreds of steps.
In addition, the new Claude 3.5 Haiku is now as capable as Claude 3 Opus, Anthropic's largest model. Fitted with a more concise natural language model, it is said to be three times faster than its peers and to outperform Claude 3.5 Sonnet and GPT-4o in most tests.
The upgraded Claude 3.5 Sonnet is now available to all users, while the new Claude 3.5 Haiku will be released later this month.
Image credit: Anthropic
Paul Mah
Paul Mah is the editor of DSAITrends, where he reports on the latest developments in data science and AI. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose.