response.content.text in HAR is undecodable base64 #80

n-kb · 2018-01-28T11:55:19Z

When I do a plain-vanilla test, using this code:

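from browsermobproxy import Server
from selenium import webdriver

# Start the BrowserMob Proxy server and create a proxy instance.
server = Server("venv/bin/browsermob-proxy-2.1.4/bin/browsermob-proxy", options={'port': 8008})
server.start()
proxy = server.create_proxy()

# Route Firefox traffic through the proxy.
profile = webdriver.FirefoxProfile()
profile.set_proxy(proxy.selenium_proxy())
driver = webdriver.Firefox(firefox_profile=profile)

# Record a HAR that includes response bodies (text and binary).
proxy.new_har("google", options={"captureContent": True, "captureBinaryContent": True})
driver.get("http://www.google.co.uk")
har = proxy.har  # read the captured HAR before shutting the proxy down

server.stop()
driver.quit()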

Some responses have response.content.text base64-encoded, but it doesn't decode to what it should. In this example, what should be an HTML page decodes to gibberish: https://gist.github.com/n-kb/8b8818230c54be998007ee855e037404#file-google-har-L191

Because the problem only arises with text and not with images, my hypothesis is that the issue comes from this passage in RFC 1341: "A CRLF sequence in base64 data should be converted to a quoted-printable line break, but ONLY when converting text data". That said, I'm not at all familiar with these things.

Any idea how this could be solved?


Fireclunge · 2018-08-14T19:44:29Z

I hope this helps anyone with the same issue (after spending way too much time looking for a solution)

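It appears to be caused by Brotli compression: the response body seems to be still Brotli-compressed when it gets base64-encoded, so decoding the base64 alone yields compressed bytes rather than readable text. Reversing the process worked for me :)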

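import base64

import brotli  # third-party package, not in the standard library

# `entry` is one item from the HAR's log.entries list.
# Undo the two layers in reverse order: base64 first, then Brotli.
decoded_text = brotli.decompress(
    base64.b64decode(entry['response']['content']['text'])
).decode()  # assumes the body is UTF-8 text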
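
For anyone processing a whole HAR rather than a single entry, here is a slightly more defensive sketch of the same idea. It is not part of BrowserMob Proxy or its Python client: decode_har_body is a hypothetical helper name, and it assumes that the entry marks base64 bodies with content.encoding set to "base64" (as the HAR 1.2 format describes) and that the recorded response headers still carry the original Content-Encoding value. Only "br" and "gzip" are handled; anything else is returned as the raw decoded bytes.

```python
import base64
import gzip


def decode_har_body(entry):
    """Return the response body of one HAR entry as bytes.

    Hypothetical helper. Assumes base64 bodies are flagged with
    content.encoding == "base64" and that the Content-Encoding
    response header was recorded in the entry.
    """
    response = entry["response"]
    content = response["content"]
    text = content.get("text", "")

    # Bodies not flagged as base64 are stored as plain text.
    if content.get("encoding") != "base64":
        return text.encode("utf-8")

    raw = base64.b64decode(text)

    # Header names are case-insensitive, so normalise them first.
    headers = {
        h["name"].lower(): h["value"].strip().lower()
        for h in response.get("headers", [])
    }
    coding = headers.get("content-encoding", "")

    if coding == "br":
        import brotli  # third-party; imported only when actually needed
        return brotli.decompress(raw)
    if coding == "gzip":
        return gzip.decompress(raw)
    return raw  # no (or unhandled) content coding: return bytes as-is
```

With the Python client, this could be applied to each item of proxy.har["log"]["entries"], read before the server is stopped. The result is bytes, so call .decode() with the page's charset if you need a string.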

ericbeland · 2019-06-04T21:31:22Z

If anyone would like to try, we have a fork of the BrowserMob Proxy, renamed the BrowserUp Proxy, with Brotli support merged. It should be a drop-in replacement for the binary and should be compatible when used via REST, the only exception being that we dropped the deprecated legacy routes. We are actively maintaining this and adding to it, whereas development on the BrowserMob Proxy itself has been dead for a few years.
