The Web Scraping Club

THE LAB #97: My first week with OpenClaw

160,000 Stars in Two Months: What OpenClaw Means for Scrapers

Pierluigi Vinciguerra
Feb 05, 2026

On November 24, 2025, a new open-source project appeared on GitHub. Two months later, it has over 160,000 stars. The project is OpenClaw, and it describes itself as “the AI that actually does things.” If you follow tech and AI, you have probably heard about it, given how quickly it has gained popularity in recent weeks. There is even a podcast episode with its creator on The Pragmatic Engineer newsletter, which I highly recommend.

Of course, I had to try it, and I have spent the past ten days playing with OpenClaw. What follows is what I learned, what surprised me, and why I think it can also be interesting for someone doing web scraping, even if that is not its core use.

Before proceeding, let me thank NetNut, the platinum partner of the month. They have prepared a juicy offer for you: up to 1 TB of web unblocker for free.


What OpenClaw actually is

OpenClaw is an AI assistant that runs locally on your machine. It can control your computer: read and write files, execute shell commands, manage your calendar, send emails, and browse the web. You interact with it through messaging apps you already use: Telegram, WhatsApp, Discord, Signal, Slack, and iMessage. You message it like you would message a coworker, and it does things on your behalf.

The key difference from cloud-based assistants is that everything runs on your infrastructure. The agent lives on your computer, sees what you allow it to see, and acts within the boundaries you define. It supports multiple AI backends: Anthropic Claude, OpenAI models, or local models if you prefer to keep everything offline.

The tagline on the repository says it well: “A smart model with eyes and hands at a desk with keyboard and mouse.”

What makes OpenClaw more than a simple automation wrapper is the breadth of its integration layer. Out of the box, it connects to over 50 services and tools: Gmail for email, Obsidian for notes, GitHub for repositories, Spotify for music, Philips Hue for smart home control. Each integration is a capability the agent can invoke when relevant. You do not need to specify which tool to use. You describe what you want, and the agent figures out which integration applies.

The architecture includes a skill system that deserves attention. Skills are modular capabilities that the agent can learn, create, and modify. Users have reported that OpenClaw has written its own extensions and updated its own prompts autonomously. This is not science fiction. The agent has file system access and can modify its own configuration. Whether this is exciting or terrifying depends on your perspective.

There is also a memory layer. OpenClaw remembers context across sessions. It learns your preferences, your common requests, and your workflow patterns. Over time, it becomes more useful because it accumulates knowledge about how you work. This persistence is stored locally, which matters for privacy, but it also means the agent builds an increasingly detailed model of your behavior.

Background execution is another differentiator. OpenClaw can run cron jobs, scheduled reminders, and background tasks. You can tell it to check something every hour or to notify you when a condition is met. It is not just reactive. It can be proactive, monitoring and acting without constant prompting.

Finally, it works in group chats. You can add your OpenClaw bot to a Telegram group, and it will participate in conversations, responding when mentioned or when configured to do so. This opens possibilities for shared assistants across teams, though it also multiplies the security considerations.


If you don’t use LLMs for scraping, you need IPs with a good reputation. For this reason, we use a proxy provider like our partner Ping Proxies, which is sharing this offer with TWSC readers.

💰 - Use TWSC for 15% OFF | $1.75/GB Residential Bandwidth | ISP Proxies in 15+ Countries


Setting it up

Installation is straightforward. A one-liner gets you started:

curl -fsSL https://openclaw.ai/install.sh | bash

Or via npm:

npm i -g openclaw
openclaw onboard

I went with the npm route. The onboarding process walks you through connecting your AI provider and setting up your first communication channel.

For Telegram, you create a bot through BotFather, grab the token, and configure it in OpenClaw. The CLI guides you through the pairing process. When someone messages your bot for the first time, you approve them with a simple command:

openclaw pairing approve telegram <CODE>

After that, you can message your bot, and it responds as your personal assistant.
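
Before pasting a BotFather token into the onboarding flow, it can help to confirm that the token is live. The sketch below is not part of OpenClaw: it calls the public Telegram Bot API `getMe` method with `curl`. The function names and the loose shape check are assumptions made for illustration.

```shell
# Sketch: sanity-check a BotFather token before configuring it anywhere.
# All function names here are hypothetical helpers, not OpenClaw commands.

# Loose shape check: numeric bot id, a colon, then the secret part.
token_looks_valid() {
  printf '%s' "$1" | grep -Eq '^[0-9]+:[A-Za-z0-9_-]+$'
}

# Build the Bot API URL for getMe, which returns the bot's own profile.
getme_url() {
  printf 'https://api.telegram.org/bot%s/getMe\n' "$1"
}

# Call Telegram and print the JSON reply. A live token yields "ok":true;
# curl -f makes the function fail on an HTTP error such as 401.
check_token() {
  if ! token_looks_valid "$1"; then
    echo "this does not look like a BotFather token" >&2
    return 1
  fi
  curl -fsS "$(getme_url "$1")"
}
```

Running `check_token "$TELEGRAM_BOT_TOKEN"` (an assumed variable name) should print the bot's username if the token is valid. Keep in mind the token then appears in the process list for a moment, so run it on a machine you trust.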

The browser extension required a few extra steps. The official documentation says it works with Chrome only, but since Brave is Chromium-based, I decided to try it anyway. It worked. You install the extension with:

openclaw browser extension install

Then load it as an unpacked extension in Brave by enabling Developer Mode and pointing to the path returned by `openclaw browser extension path`. The extension lets you attach specific tabs to OpenClaw’s control. Only attached tabs can be controlled, which is a reasonable security measure.



Living with it

I configured OpenClaw to use Claude as its backend. The experience has been surprisingly natural. I message it on Telegram with requests, and it executes them. Check my calendar for conflicts. Find a file I worked on last week. Send a reminder at 3 pm.

What makes it different from simply asking ChatGPT is that it actually does things. It does not tell me how to check my calendar. It checks my calendar and tells me what it found.

I also connected it to my TWSC accounting system, which has some APIs. In a few steps, OpenClaw built a small app that lets me use Telegram to check invoice status, revenues, expenses, and more.

The browser extension adds another dimension. I can ask it to navigate to a website, read specific content, fill out forms, or extract information. It operates within a real browser session, with real cookies, logged into my real accounts if I choose to attach those tabs.

This is what I used to automate posting a note on Substack, which doesn’t have an API. I just gave it the message I wanted to post, and the LLM (Claude 4.5) worked out where notes are posted on the Substack website and how to create one.


The elephant in the room

I need to address something that should be obvious by now. Using OpenClaw means giving an AI agent extensive access to your computer. It can see your browser sessions, your files, your credentials. It can execute commands.

Consider what this means in practice. When you attach a browser tab, the agent can read every cookie, every session token, every piece of data on that page. If you attach a tab where you are logged into your email, the agent can read your email. If you attach a tab with your banking session, the agent theoretically has access to that session. The same applies to files. If you give OpenClaw access to your file system, it can read your SSH keys, your environment files with API credentials, your password manager exports if you have any lying around.
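
One way to make this concrete is to look for secret-bearing files in a directory before giving the agent access to it. The following is a minimal sketch using plain `find`; it is independent of OpenClaw, the function name is hypothetical, and the list of file name patterns is illustrative rather than complete.

```shell
# Sketch: list files under a directory that commonly hold secrets.
# find_sensitive_files is a hypothetical helper; extend the patterns
# to match whatever credentials live on your own machine.
find_sensitive_files() {
  dir="${1:-.}"
  find "$dir" -type f \( \
      -name '.env' -o -name '.env.*' \
      -o -name 'id_rsa' -o -name 'id_ed25519' \
      -o -name '*.pem' \
    \) 2>/dev/null | sort
}
```

For example, `find_sensitive_files ~/projects` prints one path per line. An empty result does not prove the directory is safe, since credentials can sit in files with any name, but a non-empty one is a clear signal to move those files or pick a narrower directory.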

The agent can also execute shell commands. This means it can install software, modify system configuration, create network connections, and run arbitrary code. OpenClaw does have a sandbox mode that restricts some of these capabilities, but the default configuration is permissive because that is what makes it useful.

With my current setup using Claude as the backend, every interaction passes through Anthropic’s servers. The model sees what I ask, sees the context from my computer, and processes it on their infrastructure. When I ask OpenClaw to read a balance sheet, the request goes to Anthropic’s API, is processed, and returns. I trust Anthropic’s privacy practices, but this is still a significant amount of sensitive data leaving my machine.

There is also the question of prompt injection and model behavior. What happens if you navigate to a malicious page that contains instructions designed to manipulate the agent? Modern LLMs are susceptible to prompt injection attacks where content on a webpage could potentially influence the agent’s behavior. This is not theoretical. It is an active area of security research, and there are no perfect defenses yet.
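
To give a feel for what such content looks like, here is a deliberately crude pre-filter that flags instruction-like phrases in page text. It is a sketch built on `grep`, not an OpenClaw feature and not a defense: the function name and the phrase list are assumptions, and an attacker can evade the patterns trivially.

```shell
# Sketch: flag lines containing phrases often seen in prompt-injection
# attempts. Reads the named files, or standard input if none are given.
# Prints matching lines with line numbers; exits non-zero if nothing matched.
flag_injection_phrases() {
  grep -inE \
    'ignore (all |any )?(previous|prior|above) instructions|disregard (the |your )?(system )?prompt' \
    "$@"
}
```

Piping saved page text through `flag_injection_phrases` shows which lines deserve a second look. Treat a clean result as meaningless, for the reason stated above: there are no perfect defenses yet.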

The risk profile changes depending on your configuration. Using a cloud model like Claude or GPT-4 means your data flows through external servers, but those models are also more capable and more likely to handle edge cases correctly. Using a local model keeps everything on your machine, but local models may make mistakes that a more capable model would avoid.

This is why my next step is to migrate to a local model. OpenClaw supports running with local LLMs, which means the entire pipeline can stay on my hardware. The tradeoff is capability. Local models are not yet at the level of Claude or GPT-4 for complex reasoning tasks. But for an assistant that executes relatively simple commands, they might be good enough. And critically, my cookies, my session tokens, my file contents never leave my network.

Practical recommendations if you decide to use OpenClaw: use a dedicated browser profile for attached tabs, separate from your personal browsing. Do not attach tabs with active banking or financial sessions. Be cautious about which directories you give the agent access to. Consider running in sandbox mode until you understand the tool’s behavior. And seriously consider the local model option if you plan to use this for anything sensitive.
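
The dedicated-profile recommendation can be sketched with the Chromium `--user-data-dir` switch, which Brave and Chrome both accept. The profile location, the `OPENCLAW_PROFILE_DIR` variable, the function names, and the `brave-browser` binary name are all assumptions; the binary name in particular differs between operating systems.

```shell
# Sketch: keep tabs attached to the agent in an isolated browser profile.
# OPENCLAW_PROFILE_DIR is a hypothetical variable, not read by OpenClaw.
PROFILE_DIR="${OPENCLAW_PROFILE_DIR:-$HOME/.openclaw-browser-profile}"

# Print the command line instead of running it, so it can be reviewed first.
browser_cmd() {
  printf '%s --user-data-dir=%s\n' "${1:-brave-browser}" "$PROFILE_DIR"
}

# Create the profile directory, readable only by you, and start the browser
# in the background. The first argument overrides the assumed binary name.
launch_isolated_browser() {
  mkdir -p "$PROFILE_DIR" && chmod 700 "$PROFILE_DIR" || return 1
  "${1:-brave-browser}" --user-data-dir="$PROFILE_DIR" >/dev/null 2>&1 &
}
```

A browser started this way has its own cookies, history, and extensions, so the OpenClaw extension would need to be loaded into this profile, and none of your personal sessions are present unless you log in there on purpose.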

If you are going to run an AI agent with this level of access, running it locally is the only configuration that makes full sense from a security perspective. The convenience of cloud models is real, but so is the exposure.



Why this matters for web scraping
