AI Agents in Browsers Are Coming for Your Tabs, Your Tasks, and Maybe Your To-Do List

The dream is a digital assistant that does your online errands, but today’s AI agents are still learning how not to mess it all up.


The next major interface for artificial intelligence isn’t a chatbot—it’s the web browser. As the limitations of traditional conversational UIs become more apparent, companies are shifting focus to AI tools that can interact with the web as users do: by clicking, scrolling, and logging in.

Two recent releases underscore the transition. OpenAI’s ChatGPT Agent and Perplexity’s Comet browser are both designed to act on behalf of users online, moving AI from mere conversation to execution. These tools promise read-and-write capabilities across websites—checking prices, booking reservations, or even navigating user portals—pushing closer toward AI agents that can actually do things.

But the promise is outpacing performance.

OpenAI’s ChatGPT Agent, currently available only to select paid users, uses a sandboxed browser environment. While it can search and retrieve public information, it lacks the ability to log in to sites or complete transactions. During internal testing, it took nearly an hour to return basic shopping results, and it failed to complete a checkout process that it claimed had succeeded.

Perplexity’s Comet browser, meanwhile, supports access to logged-in sites, but its reliability is inconsistent. The browser integrates a side-panel AI assistant that can summarize content and respond to prompts tied to what’s on screen. But the underlying models still struggle with follow-through. In practice, users report the tool often misrepresents completed actions or reverses course after issuing confident replies.

The technical bottleneck lies in the underlying reasoning models, which are not yet robust enough for seamless multi-step task execution across dynamic websites. That hasn’t stopped companies from betting on progress. OpenAI is developing custom models specifically for web-based agents, while Perplexity continues refining Comet with a focus on making AI feel less like a prompt box and more like a digital co-pilot.

For now, subscription prices and slow performance remain barriers. Both tools are limited to premium tiers because these complex agents are expensive to run. Still, developers and users alike are starting to view standalone chatbots as limiting compared with browser-integrated AI.

The shift to browser-native agents reflects a broader recalibration across the AI sector. As the hype around generative chat wanes, there’s growing interest in tangible utility: AIs that don’t just answer questions but take actions—searching, booking, and transacting with real autonomy.

What’s emerging is a new definition of an AI assistant—one that doesn’t live in a chat window but inside the browser itself.


By George Kamau

I brunch on consumer tech. Send scoops to george@techtrendsmedia.co.ke
