Rebrowser Patches Not Effectively Bypassing Bot Detection #80

Closed
Antonio-Margheriti opened this issue Dec 30, 2024 · 2 comments
Comments

@Antonio-Margheriti
Antonio-Margheriti commented Dec 30, 2024

I have been attempting to use Rebrowser patches with both the npm and Python packages, using the patched version directly. Below is the Python script I am using to launch the browser and connect via CDP:

import time

from rebrowser_playwright.sync_api import sync_playwright

with sync_playwright() as playwright:
    # Create a persistent context with custom options
    context = playwright.chromium.launch_persistent_context(
        user_data_dir='',
        headless=False,  # Required for `--headless=new`
        ignore_default_args=["--disable-extensions"],
        viewport={"width": 1440, "height": 900},
        screen={"width": 1440, "height": 900},
        args=['--headless=new',
              '--disable-blink-features=AutomationControlled',
              '--disable-software-rasterizer',
              '--ignore-gpu-blocklist',
              '-enable-webgl',
              '--enable-features=WebRTC-H264WithOpenH264FFmpeg,WebGL2ComputeRendering',
              '--disable-background-timer-throttling',
              '--disable-renderer-backgrounding',
              '--no-sandbox',
              '--disable-gpu',
              '--remote-allow-origins=*',
              '--remote-debugging-address=0.0.0.0',
              '--remote-debugging-port=9222',
              '--disable-dev-shm-usage'],
    )
    try:
        # Keep the browser running until interrupted with Ctrl+C
        while True:
            time.sleep(1)  # Sleep to prevent high CPU usage
    except KeyboardInterrupt:
        print("\nClosing the browser...")
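
The launch arguments above expose Chrome's remote debugging interface on port 9222. As a minimal sketch (not part of the original report), the following standard-library check queries the DevTools HTTP endpoint `/json/version` to confirm the browser is reachable before any client attaches. The helper names `cdp_version_url` and `fetch_cdp_version` are hypothetical; the default host and port are taken from the script.

```python
import json
import urllib.request


def cdp_version_url(host="localhost", port=9222):
    """Build the URL of the DevTools HTTP endpoint that reports browser metadata."""
    return f"http://{host}:{port}/json/version"


def fetch_cdp_version(host="localhost", port=9222, timeout=2.0):
    """Return the parsed JSON from /json/version.

    Raises urllib.error.URLError if nothing is listening on host:port.
    """
    with urllib.request.urlopen(cdp_version_url(host, port), timeout=timeout) as resp:
        return json.load(resp)
```

With the browser running, `fetch_cdp_version()` should return a dictionary that includes the browser version, its user agent string, and the WebSocket URL that CDP clients attach to.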

I have experimented with various argument combinations, but regardless of the setup, I encounter the same issue. Specifically, when I navigate to https://bot-detector.rebrowser.net/ and take a screenshot through Playwright, I consistently receive the following detection result:

[image: screenshot of the detection result]

Steps to Reproduce

Use rebrowser_playwright version 1.49.1.
Use the provided script to launch a persistent Chromium context with Rebrowser patches.
Navigate to https://bot-detector.rebrowser.net/ via CDP, using something like this:

const { chromium } = require('playwright');
const fs = require('fs');

// Wrapped in an async function because CommonJS modules do not support top-level await.
(async () => {
  // Attach to the browser that the Python script exposed on port 9222.
  const browser = await chromium.connectOverCDP("http://localhost:9222");
  const context = browser.contexts()[0];
  const page = context.pages()[0];
  await page.goto("https://bot-detector.rebrowser.net/");
  const screenshotBuffer = await page.screenshot();
  fs.writeFileSync('screenshot.png', screenshotBuffer);
})();

In this specific case I'm using Python 3.12.7; however, this reproduces in Node as well.
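
For reference, the CDP step can also be written in Python. This is a minimal sketch, assuming `rebrowser_playwright` mirrors the standard Playwright Python API (as the launch script's import suggests). It attaches with `rebrowser_playwright` rather than stock `playwright`; whether that choice affects the detection result is not established in this thread. The function name `screenshot_over_cdp` and its parameters are hypothetical.

```python
def screenshot_over_cdp(
    endpoint="http://localhost:9222",
    url="https://bot-detector.rebrowser.net/",
    path="screenshot.png",
):
    """Attach to an already running browser over CDP and save a screenshot."""
    # Imported lazily so this module loads even where the package is not installed.
    from rebrowser_playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = playwright.chromium.connect_over_cdp(endpoint)
        # Reuse the persistent context and the first page the launch script created.
        context = browser.contexts[0]
        page = context.pages[0]
        page.goto(url)
        page.screenshot(path=path)
    return path
```

Calling `screenshot_over_cdp()` while the launch script is running should write `screenshot.png` to the current directory.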

I would appreciate any guidance on what might be missing in my setup or any configuration changes required to address this issue.

Thank you in advance for your assistance!

@nwebson
Contributor

nwebson commented Dec 30, 2024

Could you please share the code you use to reproduce it in Node?
It looks like it isn't actually using the patched files for some reason.
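
One rough way to start investigating that on the Python side is to check which installation the interpreter resolves. The sketch below uses only the standard library; the helper name `describe_package` is hypothetical, and the distribution name `rebrowser-playwright` is an assumption. It reports where the module is loaded from and which version is installed, but it does not prove that the patches are applied at runtime.

```python
import importlib.util
from importlib import metadata


def describe_package(module_name, dist_name):
    """Report where a top-level module is loaded from and its installed version."""
    spec = importlib.util.find_spec(module_name)
    origin = spec.origin if spec is not None else None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"module": module_name, "origin": origin, "version": version}
```

For example, `describe_package("rebrowser_playwright", "rebrowser-playwright")` returns the module's file path and version, or `None` for either value when it cannot be found in the active environment.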

@Antonio-Margheriti
Author

Of course! I wanted to isolate the problem as much as possible and not to confuse it with this.
This is my code snippet:

let { chromium } = require('rebrowser-playwright');
const fs = require('fs');

(async () => {
  try {
    const context = await chromium.launchPersistentContext('', {
      headless: false,
      ignoreDefaultArgs: ["--disable-extensions"],
      args: [
        '--headless=new',
        '--disable-blink-features=AutomationControlled',
        '--disable-software-rasterizer',
        '--ignore-gpu-blocklist',
        '-enable-webgl',
        '--enable-features=WebRTC-H264WithOpenH264FFmpeg,WebGL2ComputeRendering',
        '--disable-background-timer-throttling',
        '--disable-renderer-backgrounding',
        '--no-sandbox',
        '--disable-gpu',
        '--remote-allow-origins=*',
        '--remote-debugging-address=0.0.0.0',
        '--remote-debugging-port=9222',
        '--disable-dev-shm-usage',
      ]
    });
    const page = context.pages()[0];
    await page.goto('https://bot-detector.rebrowser.net/');
    await new Promise(r => setTimeout(r, 2000));
    const screenshotBuffer = await page.screenshot();
    fs.writeFileSync('screenshot.png', screenshotBuffer);
    // process.exit(0);
  } catch (error) {
    console.error('Error launching browser:', error);
    process.exit(1);
  }
})();

Interestingly, without the sleep I actually get this:
[image: screenshot-remote]
However, if I do wait I still get this:
[image: screenshot]

Upon further inspection, it turns out that the Node code is actually working well and it was indeed this issue after all :)
I have no idea what causes the Python version not to work, but you can regard this as closed :)
