A web-based tool that automates browser tasks using AI. Watch the automation happen in real time while you control it through a modern web interface.
- Real-time browser automation visualization
- Clean web interface for task control
- Automation powered by OpenAI GPT-4o
- Support for both local and containerized deployment
- Python 3.11 or higher
- Poetry package manager
- OpenAI API key
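The first two prerequisites above can be checked before installing anything. The following is a minimal sketch, not part of the project: the helper names `python_ok`, `tool_on_path`, and `preflight` are hypothetical, and it checks the interpreter that runs the script, which may differ from the one Poetry selects for the project.

```python
import shutil
import sys

# Minimum Python version taken from the prerequisites list above.
MIN_PYTHON = (3, 11)


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the given (or running) interpreter meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum


def tool_on_path(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def preflight():
    """Collect readable problems; an empty list means both checks passed."""
    problems = []
    if not python_ok():
        problems.append("Python 3.11 or higher is required")
    if not tool_on_path("poetry"):
        problems.append("Poetry was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print(f"Missing prerequisite: {issue}")
    if not issues:
        print("Python and Poetry look ready.")
```

The OpenAI API key is not checked here because it is entered in the web interface rather than read from the environment.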
- Clone the repository:
git clone <repository-url>
cd browser-automation
- Install Poetry if you haven't already:
curl -sSL https://install.python-poetry.org | python3 -
- Install dependencies:
poetry install
- Install the Playwright Chromium browser:
poetry run playwright install chromium
- Start the application:
poetry run python api.py
- Open http://localhost:8000 in your browser
When you run a task, the automated browser window opens directly on your desktop.
- Docker installed on your system
- OpenAI API key
- Build the Docker image:
docker build -t browser-automation .
- Run the container:
docker run -p 8000:8000 -p 6080:6080 browser-automation
- Access the application:
  - Open http://localhost:8000 in your browser
  - The browser automation will be visible in the embedded VNC viewer
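After `docker run`, the services inside the container may take a moment to start listening. The sketch below polls a TCP port until it accepts connections. It is an illustration only: `port_is_listening` and `wait_for_port` are hypothetical helper names, and the default timeouts are arbitrary assumptions.

```python
import socket
import time


def port_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: nothing is accepting yet.
        return False


def wait_for_port(host, port, deadline_seconds=30.0, interval=0.5):
    """Poll until host:port accepts connections or the deadline passes."""
    end = time.monotonic() + deadline_seconds
    while time.monotonic() < end:
        if port_is_listening(host, port):
            return True
        time.sleep(interval)
    return False
```

For example, `wait_for_port("localhost", 8000)` returns `True` once the web interface accepts connections, and the same call with port `6080` does the same for the VNC endpoint. A successful connection only shows that something is listening on the port, not that the application has finished loading.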
- Enter your OpenAI API key in the web interface
- Type your automation task (e.g., "Search for the cheapest guitar on Amazon")
- Click "Run Task"
- Watch the automation happen in real time
- View the results in the interface
- When running locally, Playwright opens a browser window on your system
- When using Docker, the automation is visible through the embedded VNC viewer
- The default VNC password is "password"
- The application uses ports 8000 (web interface) and 6080 (VNC when using Docker)
- If the VNC viewer shows a black screen, try refreshing the page
- If tasks fail, check that your OpenAI API key is valid and that the task description is clear
- For Docker issues, ensure that ports 8000 and 6080 are not already in use by another process
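One way to confirm that both ports are available before starting the container is to try binding them. This is a minimal sketch under stated assumptions: `port_is_free` is a hypothetical helper, and it binds on `0.0.0.0` on the assumption that Docker publishes the ports on all interfaces, as it does by default with `-p 8000:8000`.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if `port` can be bound on `host` right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": another process holds it.
            return False
    return True


if __name__ == "__main__":
    # Ports used by the application: 8000 (web interface), 6080 (VNC).
    for port in (8000, 6080):
        state = "free" if port_is_free(port) else "already in use"
        print(f"port {port}: {state}")
```

Run it before `docker run`. If a port is reported as in use, stop the process that holds it or map the container port to a different host port.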