Will LLMs Control Computers in the Future?
The idea of computers handling tasks for us autonomously has long fascinated people. Imagine asking your computer to complete complex operations while you sit back and relax. With Anthropic’s Claude 3.5, this is no longer just imagination; it is becoming a reality. In a new case study, researchers explored how well Large Language Models (LLMs) like Claude can control a computer through its Graphical User Interface (GUI). The results? A promising glimpse into the future.
How Claude 3.5 Uses Computers
Claude doesn’t just process text like traditional LLMs; it works with computer interfaces much as a person would, by looking at the screen and operating the mouse and keyboard. The process works as follows:
- Instructions and Screenshots: Claude receives textual instructions and screenshots of the current computer screen.
- Action Simulation: It interprets the screenshot, formulates a plan, and simulates human actions like mouse movements, clicks, and typing.
- Execution and Iteration: The task is carried out step by step, with the cycle repeating until the task is complete or Claude needs more input.
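
The three steps above form a loop: observe, decide, act, repeat. Here is a minimal Python sketch of that loop. It is an illustration under stated assumptions, not Anthropic's implementation: `Action`, `FakeDesktop`, `run_agent`, and `scripted_model` are all hypothetical names, the "screenshot" is a plain string rather than an image, and a small scripted function stands in for the LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One simulated GUI action (illustrative fields, not a real API)."""
    kind: str          # "click", "type", "done", or "ask_user"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Stand-in for a real screen; it only records what the agent did."""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real system would return image data; a text summary is used here.
        return f"screen after {len(self.log)} actions"

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.log.append(f"click({action.x},{action.y})")
        elif action.kind == "type":
            self.log.append(f"type({action.text!r})")


def run_agent(
    instruction: str,
    desktop: FakeDesktop,
    choose_action: Callable[[str, str], Action],
    max_steps: int = 10,
) -> str:
    """Observe the screen, pick an action, apply it, and repeat."""
    for _ in range(max_steps):
        shot = desktop.screenshot()                # 1. instruction + screenshot
        action = choose_action(instruction, shot)  # 2. plan the next action
        if action.kind in ("done", "ask_user"):    # 3. finished or needs input
            return action.kind
        desktop.apply(action)                      # otherwise act and iterate
    return "step_limit"


def scripted_model(instruction: str, screenshot: str) -> Action:
    """Hypothetical policy standing in for the LLM: click, type, then stop."""
    if screenshot == "screen after 0 actions":
        return Action("click", x=120, y=40)
    if screenshot == "screen after 1 actions":
        return Action("type", text=instruction)
    return Action("done")


if __name__ == "__main__":
    desktop = FakeDesktop()
    status = run_agent("wireless mouse", desktop, scripted_model)
    print(status, desktop.log)
```

The `max_steps` cap is a design choice of this sketch: an agent that acts on its own screenshots can get stuck repeating itself, so bounding the loop keeps a confused run from going on forever.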
Tested Scenarios
The study examined Claude’s performance across a variety of real-world scenarios:
- Web Search: Searching for items, comparing products, and adding items to a shopping cart.