Will LLMs Control Computers in the Future?

Nov 20, 2024

The idea of computers autonomously handling tasks for us has long been appealing. Imagine asking your computer to complete complex operations while you sit back and relax. With Anthropic’s Claude 3.5, this is moving from imagination toward reality. In a recent case study, researchers explored the ability of Large Language Models (LLMs) like Claude to control computers through a Graphical User Interface (GUI). The results offer a promising glimpse into the future.

How Claude 3.5 Uses Computers

Claude doesn’t just process text like traditional LLMs; it interacts with computer interfaces in a very human-like way. The process works as follows:

Instructions and Screenshots: Claude receives textual instructions and screenshots of the current computer screen.
Action Simulation: It interprets the screenshot, formulates a plan, and simulates human actions like mouse movements, clicks, and typing.
Execution and Iteration: Claude carries out the task step by step, repeating the cycle until the task is complete or more input is needed.
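
To make the three steps above concrete, here is a minimal sketch of that screenshot, plan, act loop in Python. It is not Anthropic's implementation and calls no real API: `FakeDesktop`, `scripted_model`, and `run_agent` are hypothetical names invented for this illustration, the "screenshot" is just a text summary, and the model is replaced by a fixed script so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    kind: str        # "click", "type", "done", or "ask_user"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a real screen: it records actions instead of performing them."""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture pixels; here the "screenshot" is a text summary.
        return f"screen after {len(self.log)} actions"

    def perform(self, action: Action) -> None:
        if action.kind == "click":
            self.log.append(f"click({action.x}, {action.y})")
        elif action.kind == "type":
            self.log.append(f"type({action.text!r})")
        else:
            raise ValueError(f"unsupported action: {action.kind}")


def scripted_model(instruction: str, screenshot: str, history: List[Action]) -> Action:
    """Hypothetical placeholder for the LLM: replays a fixed plan, one step per call."""
    script = [
        Action("click", 200, 40),          # focus a search box (coordinates are made up)
        Action("type", text=instruction),  # type the query
        Action("done"),                    # report that the task is finished
    ]
    return script[min(len(history), len(script) - 1)]


ChooseAction = Callable[[str, str, List[Action]], Action]


def run_agent(instruction: str, desktop: FakeDesktop,
              choose_action: ChooseAction, max_steps: int = 10) -> str:
    """Loop: take a screenshot, ask the model for the next action, execute it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        shot = desktop.screenshot()
        action = choose_action(instruction, shot, history)
        if action.kind in ("done", "ask_user"):
            return action.kind  # finished, or more input is needed
        desktop.perform(action)
        history.append(action)
    return "step_limit"  # safety stop so the loop cannot run forever


if __name__ == "__main__":
    desktop = FakeDesktop()
    status = run_agent("wireless mouse", desktop, scripted_model)
    print(status, desktop.log)
```

In a real setup, `scripted_model` would be replaced by a call to the model with the instruction and the latest screenshot, and `FakeDesktop` by something that actually captures the screen and drives the mouse and keyboard. The step limit is a design choice of this sketch, not something described in the study.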

Tested Scenarios

The study examined Claude’s performance across a variety of real-world scenarios:

Web Search: Searching for items, comparing products, and adding items to a shopping cart.

Written by Emad Dehnavi