Fork Purpose

This fork of browser-use/web-ui adds CLI support specifically designed for AI agents like Cursor Agent. It enables direct command-line interaction with browser automation tasks, making it ideal for integration with AI development environments and automated workflows.

CLI Documentation

See the CLI Guide for comprehensive documentation on:

Available LLM providers and models
Detailed command reference
Environment configuration
Example usage patterns

Quick Start

# Run a task (browser will auto-start if needed)
browser-use run "go to example.com and create a report about the page structure"

# Run with specific provider and vision capabilities
browser-use run "analyze the layout and visual elements" --provider Google --vision

# Run with specific model selection
browser-use run "analyze the page" --provider Anthropic --model-index 1

# Explicitly start browser with custom options (optional)
browser-use start --headless --window-size 1920x1080

# Close browser when done
browser-use close
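
For agents that drive the CLI from a script, the commands above can be assembled programmatically. The following is a minimal Python sketch and not part of this repository: `build_run_command` and `run_task` are hypothetical helper names, and the sketch assumes `browser-use` is on the PATH and accepts `--provider`, `--vision`, and `--model-index` exactly as shown in the Quick Start.

```python
import subprocess
from typing import List, Optional


def build_run_command(
    task: str,
    provider: Optional[str] = None,
    vision: bool = False,
    model_index: Optional[int] = None,
) -> List[str]:
    """Build the argv list for `browser-use run` (hypothetical helper)."""
    cmd = ["browser-use", "run", task]
    if provider is not None:
        cmd += ["--provider", provider]
    if model_index is not None:
        cmd += ["--model-index", str(model_index)]
    if vision:
        cmd.append("--vision")
    return cmd


def run_task(task: str, **options) -> subprocess.CompletedProcess:
    """Run a task and capture its output.

    check=True makes subprocess raise CalledProcessError on a non-zero exit.
    """
    return subprocess.run(
        build_run_command(task, **options),
        capture_output=True,
        text=True,
        check=True,
    )
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the task description contains quotes or other shell metacharacters.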

Supported LLM Providers

OpenAI (gpt-4o) - Vision-capable model for advanced analysis
Anthropic (claude-3-5-sonnet-latest, claude-3-5-sonnet-20241022) - Advanced language understanding
Google (gemini-1.5-pro, gemini-2.0-flash) - Fast and efficient processing
DeepSeek (deepseek-chat) - Cost-effective default option

See the CLI Guide for detailed provider configuration and usage examples.
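
When a script needs to validate a provider and model choice before calling the CLI, the list above can be mirrored as a small lookup table. This is a sketch under assumptions: `SUPPORTED_MODELS` and `resolve_model` are hypothetical names, and the mapping from `--model-index` to a model (0-based, in the order listed above) is a guess that should be checked against the CLI Guide.

```python
from typing import Dict, List

# Provider -> models, copied from the provider list in this README.
SUPPORTED_MODELS: Dict[str, List[str]] = {
    "OpenAI": ["gpt-4o"],
    "Anthropic": ["claude-3-5-sonnet-latest", "claude-3-5-sonnet-20241022"],
    "Google": ["gemini-1.5-pro", "gemini-2.0-flash"],
    "DeepSeek": ["deepseek-chat"],
}


def resolve_model(provider: str, model_index: int = 0) -> str:
    """Return the model name for a provider.

    ASSUMPTION: indices are 0-based and follow the order listed above;
    the real --model-index semantics are defined by the CLI, not here.
    """
    try:
        models = SUPPORTED_MODELS[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider!r}") from None
    if not 0 <= model_index < len(models):
        raise ValueError(
            f"{provider} has {len(models)} model(s); "
            f"index {model_index} is out of range"
        )
    return models[model_index]
```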

CLI Commands

start - (Optional) Initialize browser session with custom options:
  - --headless - Run in headless mode
  - --window-size - Set window dimensions (e.g., "1920x1080")
  - --disable-security - Disable security features
  - --user-data-dir - Use custom Chrome profile
  - --proxy - Set proxy server
run - Execute tasks (auto-starts browser if needed):
  - --model - Choose LLM (deepseek-chat, gemini, gpt-4, claude-3)
  - --vision - Enable visual analysis
  - --record - Record browser session
  - --trace-path - Save debugging traces
  - --max-steps - Limit task steps
  - --add-info - Provide additional context
close - Clean up browser session
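
The start, run, close lifecycle described above maps naturally onto a context manager, so the browser session is closed even when a task fails. The sketch below is illustrative only: `browser_session` and `build_start_command` are hypothetical helpers, the sketch assumes `browser-use` is on the PATH, and it uses only the `start` flags listed above.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, List, Optional

Runner = Callable[[List[str]], object]


def _default_runner(cmd: List[str]) -> object:
    # Runs the real CLI; check=True raises if the command exits non-zero.
    return subprocess.run(cmd, check=True)


def build_start_command(
    headless: bool = False,
    window_size: Optional[str] = None,
    proxy: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `browser-use start` (hypothetical helper)."""
    cmd = ["browser-use", "start"]
    if headless:
        cmd.append("--headless")
    if window_size is not None:
        cmd += ["--window-size", window_size]
    if proxy is not None:
        cmd += ["--proxy", proxy]
    return cmd


@contextmanager
def browser_session(
    runner: Runner = _default_runner, **start_options
) -> Iterator[Runner]:
    """Start the browser, yield the runner, and always close the session."""
    runner(build_start_command(**start_options))
    try:
        yield runner
    finally:
        # Runs on normal exit and when the body raises.
        runner(["browser-use", "close"])
```

Inside a `with browser_session(headless=True) as run:` block, each call such as `run(["browser-use", "run", "analyze the page"])` reuses the session opened on entry. The `runner` parameter exists so the lifecycle can be exercised with a stub instead of the real CLI.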

Example Tasks

The browser-tasks-example.ts file provides ready-to-use task sequences for:

Product research automation
Documentation analysis
Page structure analysis
Debug sessions with tracing

Configuration

See .env.example for all available configuration options, including:

API keys for different LLM providers
Browser settings
Session persistence options
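
Before launching a task, a script can check which provider API keys are present in the environment. This is a rough sketch, not the project's own loader: `missing_api_keys` is a hypothetical helper, and the variable names below are assumptions based on each provider's usual convention. The authoritative names are the ones in .env.example.

```python
import os
from typing import Dict, Mapping, Optional

# ASSUMPTION: these variable names are guesses based on common convention;
# replace them with the names actually used in .env.example.
PROVIDER_KEY_VARS: Dict[str, str] = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google": "GOOGLE_API_KEY",
    "DeepSeek": "DEEPSEEK_API_KEY",
}


def missing_api_keys(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return {provider: variable} for every provider whose key is unset or blank."""
    env = os.environ if env is None else env
    return {
        provider: var
        for provider, var in PROVIDER_KEY_VARS.items()
        if not env.get(var, "").strip()
    }
```

Calling `missing_api_keys()` with no argument inspects the current process environment; passing a plain dict makes the check easy to test.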

This project builds upon the foundation of browser-use, which is designed to make websites accessible for AI agents.

We would like to officially thank WarmShao for his contribution to this project.

WebUI: Built on Gradio, the WebUI supports most browser-use functionality. It is designed to be user-friendly and enables easy interaction with the browser agent.

Expanded LLM Support: We've integrated support for various Large Language Models (LLMs), including Gemini, OpenAI, Azure OpenAI, Anthropic, DeepSeek, and Ollama. We plan to add support for more models in the future.

Custom Browser Support: You can use your own browser with our tool, eliminating the need to re-login to sites or deal with other authentication challenges. This feature also supports high-definition screen recording.

Persistent Browser Sessions: You can choose to keep the browser window open between AI tasks, allowing you to see the complete history and state of AI interactions.


Installation Options

Option 1: Local Installation

Read the quickstart guide or follow the steps below to get started.

Python 3.11 or higher is required.

First, we recommend using uv to set up the Python environment:

uv venv --python 3.11

Then activate it:

source .venv/bin/activate

Install the dependencies:

uv pip install -r requirements.txt

Then install the Playwright browsers:

playwright install
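
After these steps, a short preflight check can confirm that the environment meets the stated requirements. This is only a sketch: `preflight_problems` is a hypothetical helper, and it assumes the `playwright` Python package is pulled in by requirements.txt. It checks that the package is importable, not that the browser binaries from `playwright install` are present.

```python
import importlib.util
import sys
from typing import List, Sequence


def preflight_problems(version: Sequence[int] = sys.version_info) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems: List[str] = []
    # The README states that Python 3.11 or higher is required.
    if tuple(version[:2]) < (3, 11):
        problems.append(
            f"Python 3.11+ required, found {version[0]}.{version[1]}"
        )
    # find_spec returns None when the top-level package cannot be located.
    if importlib.util.find_spec("playwright") is None:
        problems.append("playwright is not installed in this environment")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems():
        print(problem)
```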
