ext/har: add HAR logger extension #610

Open
wants to merge 10 commits into
base: master
110 changes: 110 additions & 0 deletions ext/har/logger.go
@@ -0,0 +1,110 @@
package har

import (
	"encoding/json"
	"net/http"
	"os"
	"sync"
	"time"

	"github.com/elazarl/goproxy"
)

// Logger implements a HAR logging extension for goproxy
type Logger struct {
	mu sync.Mutex
	har *Har
	captureContent bool
}

// NewLogger creates a new HAR logger instance
func NewLogger() *Logger {
	return &Logger{
		har: New(),
	}
}

// OnRequest handles incoming HTTP requests
func (l *Logger) OnRequest(req *http.Request, ctx *goproxy.ProxyCtx) (*http.Request, *http.Response) {
	// Store the start time in context for later use
	if ctx != nil {
		ctx.UserData = time.Now()
	}
	return req, nil
}

// OnResponse handles HTTP responses
func (l *Logger) OnResponse(resp *http.Response, ctx *goproxy.ProxyCtx) *http.Response {
	if resp == nil || ctx == nil || ctx.Req == nil || ctx.UserData == nil {
		return resp
	}

	startTime, ok := ctx.UserData.(time.Time)
	if !ok {
		return resp
	}

	// Create HAR entry
	entry := Entry{
		StartedDateTime: startTime,
		Time: time.Since(startTime).Milliseconds(),
		Request: ParseRequest(ctx.Req, l.captureContent),
		Response: ParseResponse(resp, l.captureContent),
		Cache: Cache{},
		Timings: Timings{
			Send: 0,
			Wait: time.Since(startTime).Milliseconds(),
			Receive: 0,
		},
	}

	// Add server IP
	entry.FillIPAddress(ctx.Req)

	// Add to HAR log thread-safely
	l.mu.Lock()
	l.har.AppendEntry(entry)
	l.mu.Unlock()

	return resp
}

// SetCaptureContent enables or disables request/response body capture
func (l *Logger) SetCaptureContent(capture bool) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.captureContent = capture
}

// SaveToFile writes the current HAR log to a file
func (l *Logger) SaveToFile(filename string) error {
	l.mu.Lock()
	defer l.mu.Unlock()

	file, err := os.Create(filename)
	if err != nil {
		return err
	}
	defer file.Close()

	encoder := json.NewEncoder(file)
	encoder.SetIndent("", " ")
	return encoder.Encode(l.har)
}

// Clear resets the HAR log
func (l *Logger) Clear() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.har = New()
}

// GetEntries returns a copy of the current HAR entries
func (l *Logger) GetEntries() []Entry {
	l.mu.Lock()
	defer l.mu.Unlock()
	entries := make([]Entry, len(l.har.Log.Entries))
	copy(entries, l.har.Log.Entries)
	return entries
}
Collaborator

The HAR exporting part must be redesigned from scratch; there are a number of issues here.
Let me know if you want to do that; otherwise I can fix it myself.

Basically, instead of writing to a file, the most common use case will be periodically sending the HAR entries (e.g. every 5 minutes) to an external logging API service, Grafana, or whatever.
So we have to accept a function from the user. Periodically, we take the list of queued entries, pass it to this user function, and remove those entries from our queue.
The user can do whatever they want with this list of entries, because they passed us their own function.

A goroutine will read from the entries channel and append to a local slice.
For the function call, there can be two options that the user can select:

  • every N requests we call the function (we implement this using a counter)
  • every time.Duration (e.g. 5 minutes) we call the user function

Let me know about this

Author

I'm actually fixing all the issues you have pointed out right now. I can do this part tonight, or if not, tomorrow. I haven't worked with goroutines before, but I think I understand the general idea for the saving function. Thank you.

Collaborator

Yeah don't worry, it's holiday time, so feel free to take your time.
Your help is really appreciated here!
It's not an urgent feature.

Author

Nah, I should be thanking you! The feedback is appreciated.

Author

@CameronBadman, Jan 1, 2025

@ErikPelli I went ahead and updated the code to the specification you asked for. I found it pretty difficult, and it required a surprising number of updates to the rest of the package, but I think it works correctly. Hope to hear from you soon.

Collaborator

@CameronBadman you're polling every 100 milliseconds with mutex locks. This will dangerously increase CPU usage, and it's not the correct approach for Go.
You should create a channel to which the OnResponse() function can send the parsed data.
The function that you called exportLoop() will then continuously collect from this channel (no polling at all).
If the user specified a duration, exportLoop() will have a select between a ticker with this duration and the data channel read.
When you read from the data channel, you append to a local response slice.
If the user specified a length amount (> 0) and it has been reached, or if the ticker has been triggered and the slice length is > 0, you pass the slice to the callback and set the local variable to nil.
You don't need mutexes at all with this approach, nor polling.

114 changes: 114 additions & 0 deletions ext/har/logger_test.go
@@ -0,0 +1,114 @@

package har_test

import (
	"bytes"
	"encoding/json"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"os"
	"testing"

	"github.com/elazarl/goproxy"
	"github.com/elazarl/goproxy/ext/har"
)

type ConstantHandler string

func (h ConstantHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	io.WriteString(w, string(h))
}

func oneShotProxy(proxy *goproxy.ProxyHttpServer) (client *http.Client, s *httptest.Server) {
	s = httptest.NewServer(proxy)

	proxyUrl, _ := url.Parse(s.URL)
	tr := &http.Transport{Proxy: http.ProxyURL(proxyUrl)}
	client = &http.Client{Transport: tr}
	return
}

func TestHarLogger(t *testing.T) {
	// Create a response we expect
	expected := "hello world"
	background := httptest.NewServer(ConstantHandler(expected))
	defer background.Close()

	// Set up the proxy with HAR logger
	proxy := goproxy.NewProxyHttpServer()
	logger := har.NewLogger()
	logger.SetCaptureContent(true)

	proxy.OnRequest().DoFunc(logger.OnRequest)
	proxy.OnResponse().DoFunc(logger.OnResponse)

	client, proxyserver := oneShotProxy(proxy)
	defer proxyserver.Close()

	// Make a request
	resp, err := client.Get(background.URL)
	if err != nil {
		t.Fatal(err)
	}

	// Read the response
	msg, err := io.ReadAll(resp.Body)
	if err != nil {
		t.Fatal(err)
	}
	resp.Body.Close()

	if string(msg) != expected {
		t.Errorf("Expected '%s', actual '%s'", expected, string(msg))
	}

	// Test POST request with content
	postData := "test=value"
	req, err := http.NewRequest("POST", background.URL, bytes.NewBufferString(postData))
	if err != nil {
		t.Fatal(err)
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")

	resp, err = client.Do(req)
	if err != nil {
		t.Fatal(err)
	}
	resp.Body.Close()

	// Save HAR file and verify content
	tmpfile := "test.har"
	err = logger.SaveToFile(tmpfile)
	if err != nil {
		t.Fatal(err)
	}
	defer os.Remove(tmpfile)

	// Read and verify HAR content
	harData, err := os.ReadFile(tmpfile)
	if err != nil {
		t.Fatal(err)
	}

	var harLog har.Har
	if err := json.Unmarshal(harData, &harLog); err != nil {
		t.Fatal(err)
	}

	// Verify we captured both requests
	if len(harLog.Log.Entries) != 2 {
		t.Errorf("Expected 2 entries in HAR log, got %d", len(harLog.Log.Entries))
	}

	// Verify GET request
	if harLog.Log.Entries[0].Request.Method != "GET" {
		t.Errorf("Expected GET request, got %s", harLog.Log.Entries[0].Request.Method)
	}

	// Verify POST request
	if harLog.Log.Entries[1].Request.Method != "POST" {
		t.Errorf("Expected POST request, got %s", harLog.Log.Entries[1].Request.Method)
	}
}