Handle functions stored and called from struct fields #23

picatz · 2024-01-03T16:00:38Z

Today, functions that are stored and called from struct fields aren't handled properly when constructing the call graph. Let's look at the following example program where a function called (run) is used in the command struct:

package main

import (
	"database/sql"
	"fmt"
	"net/http"
	"os"
)

func login(db *sql.DB, q string) error {
	_, err := db.Query(q) // want "potential sql injection"
	if err != nil {
		return err
	}
	return nil
}

func startServer() error {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		return err
	}

	mux := http.NewServeMux()

	mux.HandleFunc("/login", func(w http.ResponseWriter, r *http.Request) {
		pass, _ := r.URL.User.Password()
		login(db, pass)
	})

	return http.ListenAndServe(":8080", mux)
}

type command struct {
	name string
	run  func(args []string) error
}

type commands []*command

func (c commands) run(args []string) error {
	for _, cmd := range c {
		if cmd.name == args[0] {
			return cmd.run(args[1:])
		}
	}

	return fmt.Errorf("unknown command: %s", args[0])
}

type cli struct {
	commands commands
}

func (c *cli) run(args []string) error {
	return c.commands.run(args)
}

func main() {
	c := &cli{
		commands{
			{
				name: "start-server",
				run: func(args []string) error {
					startServer()
					return nil
				},
			},
		},
	}

	err := c.run(os.Args[1:])
	if err != nil {
		panic(err)
	}
}

The resulting call graph ends up being:

n10:net/http.ListenAndServe

n2:main$1
        → n3:startServer

n3:startServer
        → n7:database/sql.Open
        → n8:net/http.NewServeMux
        → n9:(*net/http.ServeMux).HandleFunc
        → n10:net/http.ListenAndServe

n6:(*database/sql.DB).Query

n5:startServer$1
        → n4:login
        → n11:(*net/url.Userinfo).Password

n7:database/sql.Open

n8:net/http.NewServeMux

n9:(*net/http.ServeMux).HandleFunc
        → n5:startServer$1

n11:(*net/url.Userinfo).Password

n0:main
        → n1:(*cli).run

n1:(*cli).run

n4:login
        → n6:(*database/sql.DB).Query

☝️ Notice how (*database/sql.DB).Query is there, but there's no call path that can be constructed back to main. To illustrate this, we can start at the SQL query, and work our way back:

n6:(*database/sql.DB).Query → n4:login → n5:startServer$1 → n9:(*net/http.ServeMux).HandleFunc → n3:startServer  → n2:main$1

n2:main$1, while it looks like it's n0:main, and that's even almost the case, it's really the anonymous function defined in main (func(args []string) error {...}). It's also just a coincidence of the example, and that "chain" could be broken otherwise; like, if c := &cli{ ... } happened in a global variable instead.

So, we need a way to link n2:main$1 back to n0:main to solve this issue.

The text was updated successfully, but these errors were encountered:

picatz · 2024-01-19T00:45:16Z

A visualization of the call graph being constructed, hopefully demonstrating the missing link that needs to be achieved to fix this bug:

picatz · 2024-01-19T01:10:18Z

Some other call graph algorithm visualizations for fun:

data = {
  const csvData = await FileAttachment("[email protected]").csv();
  
  // assuming your data has source and target columns
  const links = csvData.map(d => ({source: d.source_func, target: d.target_func}));
  
  // get unique nodes
  const nodes = Array.from(new Set(links.flatMap(l => [l.source, l.target])), id => ({id}));

  return {nodes, links};
}

ForceGraph(data, {
  nodeId: (d) => d.id, // node identifier, to match links
  nodeTitle: (d) => d.id, // hover text
  width: 4000,
  height: 4000,
  invalidation // stop when the cell is re-run
})

`callgraphutil.NewVulncheckCallGraph(ctx, mainFn.Prog, srcFns)`

`rta.Analyze([]*ssa.Function{mainFn}, true).CallGraph`

`cha.CallGraph(mainFn.Prog)`

`static.CallGraph(mainFn.Prog)`

`cg := vta.CallGraph(vtaFns, cha.CallGraph(mainFn.Prog))`

vtaFns := make(map[*ssa.Function]bool)
for _, fn := range srcFns {
	vtaFns[fn] = true
}
vtaFns[mainFn] = true

picatz · 2024-01-19T01:27:54Z

It's interesting seeing the similar structure in the CHA or RTA construction and the one from callgraphutil being used right now in sqli and other analyzers:

Note

While these structures look very similar, there are some key differences in the construction that aid in later analysis.

For example, looking at rta the "middle" node is actually a (reflect.Value).Call, not an (*net/http.ServerMux).HandleFunc.

`callgraphutil.NewGraph(mainFn, buildSSA.SrcFuncs...)`

`rta.Analyze([]*ssa.Function{mainFn}, true).CallGraph`

`cha.CallGraph(mainFn.Prog)`

picatz · 2024-02-19T20:45:33Z

Continuing to revisit this problem, attempting to rephrase it: I'm trying to find instances of function calls that are reachable from main. Functions stored and called from struct fields, especially those called extra indirectly from slice indexing, need to be accounted.

This also begs the question, for the case of SQL injection analysis, if there might be merit in dropping the "reachable from main" requirement for results. If we worked "backwards" in the analysis, starting from a (*database/sql.DB).Query node, we could identify if it was used within the http.Handler, and the related *http.Request object used with it:

n6:(*database/sql.DB).Query → n4:login → n5:startServer$1 → n9:(*net/http.ServeMux).HandleFunc

Today, we identify paths from the "root" (main), and then find paths to our sinks (like the SQL query):

taint/check.go

Line 70 in ca0546c

sinkPaths := callgraphutil.PathsSearchCallTo(cg.Root, sink)

🤔 If we started at n6, we could work out who directly called it (n4), and continue along until we reach one of our sources. I'm curious if it may also end up being faster / cheaper in the long run, because we don't always need to re-start out searches from main.

But, it might be nice to have both "forward" and "backward" call graph path analysis that are robust to use the "right tool for the job" when possible.

picatz added the bug Something isn't working label Jan 3, 2024

picatz mentioned this issue Jan 4, 2024

Use standard golang.org/x/tools/go/callgraph package #24

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Handle functions stored and called from struct fields #23

Handle functions stored and called from struct fields #23

picatz commented Jan 3, 2024 •

edited

Loading

picatz commented Jan 19, 2024 •

edited

Loading

picatz commented Jan 19, 2024

picatz commented Jan 19, 2024 •

edited

Loading

picatz commented Feb 19, 2024

Handle functions stored and called from struct fields #23

Handle functions stored and called from struct fields #23

Comments

picatz commented Jan 3, 2024 • edited Loading

picatz commented Jan 19, 2024 • edited Loading

picatz commented Jan 19, 2024

callgraphutil.NewVulncheckCallGraph(ctx, mainFn.Prog, srcFns)

rta.Analyze([]*ssa.Function{mainFn}, true).CallGraph

cha.CallGraph(mainFn.Prog)

static.CallGraph(mainFn.Prog)

cg := vta.CallGraph(vtaFns, cha.CallGraph(mainFn.Prog))

picatz commented Jan 19, 2024 • edited Loading

callgraphutil.NewGraph(mainFn, buildSSA.SrcFuncs...)

rta.Analyze([]*ssa.Function{mainFn}, true).CallGraph

cha.CallGraph(mainFn.Prog)

picatz commented Feb 19, 2024

picatz commented Jan 3, 2024 •

edited

Loading

picatz commented Jan 19, 2024 •

edited

Loading

`callgraphutil.NewVulncheckCallGraph(ctx, mainFn.Prog, srcFns)`

`rta.Analyze([]*ssa.Function{mainFn}, true).CallGraph`

`cha.CallGraph(mainFn.Prog)`

`static.CallGraph(mainFn.Prog)`

`cg := vta.CallGraph(vtaFns, cha.CallGraph(mainFn.Prog))`

picatz commented Jan 19, 2024 •

edited

Loading

`callgraphutil.NewGraph(mainFn, buildSSA.SrcFuncs...)`

`rta.Analyze([]*ssa.Function{mainFn}, true).CallGraph`

`cha.CallGraph(mainFn.Prog)`