Skip to content

Commit

Permalink
Merge pull request #1066 from projectdiscovery/dev
Browse files Browse the repository at this point in the history
v1.1.1
  • Loading branch information
dogancanbakir authored Oct 28, 2024
2 parents 9ba3bb8 + 9495585 commit f8486d4
Show file tree
Hide file tree
Showing 45 changed files with 702 additions and 965 deletions.
46 changes: 23 additions & 23 deletions .github/dependabot.yml
Original file line number Diff line number Diff line change
Expand Up @@ -20,26 +20,26 @@ updates:
allow:
- dependency-name: "github.com/projectdiscovery/*"

# # Maintain dependencies for GitHub Actions
# - package-ecosystem: "github-actions"
# directory: "/"
# schedule:
# interval: "weekly"
# target-branch: "dev"
# commit-message:
# prefix: "chore"
# include: "scope"
# labels:
# - "Type: Maintenance"
#
# # Maintain dependencies for docker
# - package-ecosystem: "docker"
# directory: "/"
# schedule:
# interval: "weekly"
# target-branch: "dev"
# commit-message:
# prefix: "chore"
# include: "scope"
# labels:
# - "Type: Maintenance"
# Maintain dependencies for GitHub Actions
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"
target-branch: "dev"
commit-message:
prefix: "chore"
include: "scope"
labels:
- "Type: Maintenance"

# Maintain dependencies for docker
- package-ecosystem: "docker"
directory: "/"
schedule:
interval: "weekly"
target-branch: "dev"
commit-message:
prefix: "chore"
include: "scope"
labels:
- "Type: Maintenance"
2 changes: 2 additions & 0 deletions .goreleaser/windows.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,8 @@ builds:
goos:
- windows
goarch:
- 386
- arm64
- amd64

archives:
Expand Down
5 changes: 2 additions & 3 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -1,13 +1,12 @@
FROM golang:1.20.6-alpine AS builder
FROM golang:1.21-alpine AS build-env
RUN apk add --no-cache git gcc musl-dev
WORKDIR /app
COPY . /app
RUN go mod download
RUN go build ./cmd/katana

FROM alpine:3.18.5
RUN apk -U upgrade --no-cache \
&& apk add --no-cache bind-tools ca-certificates chromium
RUN apk add --no-cache bind-tools ca-certificates chromium
COPY --from=builder /app/katana /usr/local/bin/

ENTRYPOINT ["katana"]
10 changes: 3 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,6 @@

- Fast And fully configurable web crawling
- **Standard** and **Headless** mode
- **Active** and **Passive** mode
- **JavaScript** parsing / crawling
- Customizable **automatic form filling**
- **Scope control** - Preconfigured field / Regex
Expand All @@ -44,7 +43,7 @@
katana requires **Go 1.18** to install successfully. To install, just run the below command or download pre-compiled binary from [release page](https://github.com/projectdiscovery/katana/releases).

```console
go install github.com/projectdiscovery/katana/cmd/katana@latest
CGO_ENABLED=1 go install github.com/projectdiscovery/katana/cmd/katana@latest
```

**More options to install / run katana-**
Expand Down Expand Up @@ -157,10 +156,6 @@ HEADLESS:
-cwu, -chrome-ws-url string use chrome browser instance launched elsewhere with the debugger listening at this URL
-xhr, -xhr-extraction extract xhr request url,method in jsonl output

PASSIVE:
-ps, -passive enable passive sources to discover target endpoints
-pss, -passive-source string[] passive source to use for url discovery (waybackarchive,commoncrawl,alienvault)

SCOPE:
-cs, -crawl-scope string[] in scope url regex to be followed by crawler
-cos, -crawl-out-scope string[] out of scope url regex to be excluded by crawler
Expand Down Expand Up @@ -193,6 +188,7 @@ OUTPUT:
-o, -output string file to write output to
-sr, -store-response store http requests/responses
-srd, -store-response-dir string store http requests/responses to custom directory
-sfd, -store-field-dir string store per-host field to custom directory
-or, -omit-raw omit raw requests/responses from jsonl output
-ob, -omit-body omit response body from jsonl output
-j, -jsonl write output in jsonl format
Expand Down Expand Up @@ -688,7 +684,7 @@ katana -u https://tesla.com -f email,phone
*`-store-field`*
---

To compliment `field` option which is useful to filter output at run time, there is `-sf, -store-fields` option which works exactly like field option except instead of filtering, it stores all the information on the disk under `katana_field` directory sorted by target url.
To compliment `field` option which is useful to filter output at run time, there is `-sf, -store-fields` option which works exactly like field option except instead of filtering, it stores all the information on the disk under `katana_field` directory sorted by target url. Use `-sfd` or `-store-field-dir` to store data in a different location.

```
katana -u https://tesla.com -sf key,fqdn,qurl -silent
Expand Down
16 changes: 11 additions & 5 deletions cmd/katana/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ import (
"github.com/projectdiscovery/katana/pkg/types"
errorutil "github.com/projectdiscovery/utils/errors"
fileutil "github.com/projectdiscovery/utils/file"
folderutil "github.com/projectdiscovery/utils/folder"
"github.com/rs/xid"
)

Expand Down Expand Up @@ -68,7 +69,14 @@ func main() {
gologger.Fatal().Msgf("could not execute crawling: %s", err)
}

// on successful execution remove the resume file in case it exists
// on successful execution:

// deduplicate the lines in each file in the store-field-dir
//use options.StoreFieldDir once https://github.com/projectdiscovery/katana/pull/877 is merged
storeFieldDir := "katana_field"
_ = folderutil.DedupeLinesInFiles(storeFieldDir)

// remove the resume file in case it exists
if fileutil.FileExists(resumeFilename) {
os.Remove(resumeFilename)
}
Expand Down Expand Up @@ -126,10 +134,6 @@ pipelines offering both headless and non-headless crawling.`)
flagSet.StringVarP(&options.ChromeWSUrl, "chrome-ws-url", "cwu", "", "use chrome browser instance launched elsewhere with the debugger listening at this URL"),
flagSet.BoolVarP(&options.XhrExtraction, "xhr-extraction", "xhr", false, "extract xhr request url,method in jsonl output"),
)
flagSet.CreateGroup("passive", "Passive",
flagSet.BoolVarP(&options.Passive, "passive", "ps", false, "enable passive sources to discover target endpoints"),
flagSet.StringSliceVarP(&options.PassiveSource, "passive-source", "pss", nil, "passive source to use for url discovery (waybackarchive,commoncrawl,alienvault)", goflags.NormalizedStringSliceOptions),
)

flagSet.CreateGroup("scope", "Scope",
flagSet.StringSliceVarP(&options.Scope, "crawl-scope", "cs", nil, "in scope url regex to be followed by crawler", goflags.FileCommaSeparatedStringSliceOptions),
Expand Down Expand Up @@ -168,6 +172,8 @@ pipelines offering both headless and non-headless crawling.`)
flagSet.StringVarP(&options.OutputFile, "output", "o", "", "file to write output to"),
flagSet.BoolVarP(&options.StoreResponse, "store-response", "sr", false, "store http requests/responses"),
flagSet.StringVarP(&options.StoreResponseDir, "store-response-dir", "srd", "", "store http requests/responses to custom directory"),
flagSet.BoolVarP(&options.NoClobber, "no-clobber", "ncb", false, "do not overwrite output file"),
flagSet.StringVarP(&options.StoreFieldDir, "store-field-dir", "sfd", "", "store per-host field to custom directory"),
flagSet.BoolVarP(&options.OmitRaw, "omit-raw", "or", false, "omit raw requests/responses from jsonl output"),
flagSet.BoolVarP(&options.OmitBody, "omit-body", "ob", false, "omit response body from jsonl output"),
flagSet.BoolVarP(&options.JSON, "jsonl", "j", false, "write output in jsonl format"),
Expand Down
85 changes: 44 additions & 41 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -11,23 +11,21 @@ require (
github.com/lukasbob/srcset v0.0.0-20190730101422-86b742e617f3
github.com/mitchellh/mapstructure v1.5.0
github.com/pkg/errors v0.9.1
github.com/projectdiscovery/dsl v0.0.48
github.com/projectdiscovery/fastdialer v0.0.64
github.com/projectdiscovery/goflags v0.1.45
github.com/projectdiscovery/gologger v1.1.12
github.com/projectdiscovery/hmap v0.0.41
github.com/projectdiscovery/mapcidr v1.1.16
github.com/projectdiscovery/ratelimit v0.0.33
github.com/projectdiscovery/retryablehttp-go v1.0.53
github.com/projectdiscovery/useragent v0.0.41
github.com/projectdiscovery/utils v0.0.85
github.com/projectdiscovery/wappalyzergo v0.0.115
github.com/projectdiscovery/dsl v0.2.5
github.com/projectdiscovery/fastdialer v0.2.9
github.com/projectdiscovery/goflags v0.1.64
github.com/projectdiscovery/gologger v1.1.27
github.com/projectdiscovery/hmap v0.0.63
github.com/projectdiscovery/mapcidr v1.1.34
github.com/projectdiscovery/ratelimit v0.0.60
github.com/projectdiscovery/retryablehttp-go v1.0.82
github.com/projectdiscovery/utils v0.2.15
github.com/projectdiscovery/wappalyzergo v0.1.24
github.com/remeh/sizedwaitgroup v1.0.0
github.com/rs/xid v1.5.0
github.com/shirou/gopsutil/v3 v3.23.7
github.com/stretchr/testify v1.9.0
go.uber.org/multierr v1.11.0
golang.org/x/net v0.21.0
golang.org/x/net v0.29.0
gopkg.in/yaml.v3 v3.0.1
)

Expand All @@ -37,46 +35,49 @@ require (
github.com/Masterminds/semver/v3 v3.2.1 // indirect
github.com/Mzack9999/gcache v0.0.0-20230410081825-519e28eab057 // indirect
github.com/VividCortex/ewma v1.2.0 // indirect
github.com/alecthomas/chroma v0.10.0 // indirect
github.com/alecthomas/chroma/v2 v2.14.0 // indirect
github.com/andybalholm/brotli v1.0.6 // indirect
github.com/aymanbagabas/go-osc52/v2 v2.0.1 // indirect
github.com/charmbracelet/glamour v0.6.0 // indirect
github.com/bits-and-blooms/bitset v1.13.0 // indirect
github.com/charmbracelet/glamour v0.8.0 // indirect
github.com/charmbracelet/lipgloss v0.13.0 // indirect
github.com/charmbracelet/x/ansi v0.3.2 // indirect
github.com/cheggaaa/pb/v3 v3.1.4 // indirect
github.com/cloudflare/circl v1.3.7 // indirect
github.com/ditashi/jsbeautifier-go v0.0.0-20141206144643-2520a8026a9c // indirect
github.com/dlclark/regexp2 v1.8.1 // indirect
github.com/dlclark/regexp2 v1.11.4 // indirect
github.com/docker/go-units v0.5.0 // indirect
github.com/fatih/color v1.15.0 // indirect
github.com/gaukas/godicttls v0.0.4 // indirect
github.com/gaissmai/bart v0.9.5 // indirect
github.com/golang/protobuf v1.5.3 // indirect
github.com/google/go-github/v30 v30.1.0 // indirect
github.com/google/go-querystring v1.1.0 // indirect
github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510 // indirect
github.com/google/uuid v1.3.1 // indirect
github.com/hashicorp/go-version v1.6.0 // indirect
github.com/hdm/jarm-go v0.0.7 // indirect
github.com/kataras/jwt v0.1.8 // indirect
github.com/klauspost/compress v1.16.7 // indirect
github.com/klauspost/compress v1.17.4 // indirect
github.com/klauspost/pgzip v1.2.5 // indirect
github.com/kr/pretty v0.3.1 // indirect
github.com/lucasb-eyer/go-colorful v1.2.0 // indirect
github.com/mattn/go-colorable v0.1.13 // indirect
github.com/mattn/go-isatty v0.0.19 // indirect
github.com/mattn/go-runewidth v0.0.14 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/mattn/go-runewidth v0.0.16 // indirect
github.com/mholt/archiver/v3 v3.5.1 // indirect
github.com/minio/selfupdate v0.6.1-0.20230907112617-f11e74f84ca7 // indirect
github.com/muesli/reflow v0.3.0 // indirect
github.com/muesli/termenv v0.15.1 // indirect
github.com/olekukonko/tablewriter v0.0.5 // indirect
github.com/muesli/termenv v0.15.3-0.20240618155329-98d742f6907a // indirect
github.com/pierrec/lz4/v4 v4.1.2 // indirect
github.com/projectdiscovery/asnmap v1.1.0 // indirect
github.com/projectdiscovery/asnmap v1.1.1 // indirect
github.com/projectdiscovery/blackrock v0.0.1 // indirect
github.com/projectdiscovery/gostruct v0.0.2 // indirect
github.com/projectdiscovery/machineid v0.0.0-20240226150047-2e2c51e35983 // indirect
github.com/projectdiscovery/stringsutil v0.0.2 // indirect
github.com/quic-go/quic-go v0.37.7 // indirect
github.com/refraction-networking/utls v1.5.4 // indirect
github.com/rivo/uniseg v0.4.4 // indirect
github.com/refraction-networking/utls v1.6.7 // indirect
github.com/rivo/uniseg v0.4.7 // indirect
github.com/rogpeppe/go-internal v1.12.0 // indirect
github.com/sashabaranov/go-openai v1.14.2 // indirect
github.com/shirou/gopsutil/v3 v3.23.7 // indirect
github.com/shoenig/go-m1cpu v0.1.6 // indirect
github.com/smacker/go-tree-sitter v0.0.0-20230720070738-0d0a9f78d8f8 // indirect
github.com/spaolacci/murmur3 v1.1.0 // indirect
Expand All @@ -90,10 +91,13 @@ require (
github.com/tidwall/tinyqueue v0.1.1 // indirect
github.com/ysmood/fetchup v0.2.3 // indirect
github.com/ysmood/got v0.34.1 // indirect
github.com/yuin/goldmark v1.5.4 // indirect
github.com/yuin/goldmark-emoji v1.0.1 // indirect
github.com/yuin/goldmark v1.7.4 // indirect
github.com/yuin/goldmark-emoji v1.0.3 // indirect
github.com/zcalusic/sysinfo v1.0.2 // indirect
golang.org/x/oauth2 v0.11.0 // indirect
golang.org/x/term v0.17.0 // indirect
golang.org/x/sync v0.8.0 // indirect
golang.org/x/term v0.24.0 // indirect
golang.org/x/time v0.5.0 // indirect
google.golang.org/appengine v1.6.7 // indirect
google.golang.org/protobuf v1.33.0 // indirect
)
Expand All @@ -110,38 +114,37 @@ require (
github.com/dsnet/compress v0.0.2-0.20210315054119-f66993602bf5 // indirect
github.com/go-ole/go-ole v1.2.6 // indirect
github.com/golang/snappy v0.0.4 // indirect
github.com/gorilla/css v1.0.0 // indirect
github.com/gorilla/css v1.0.1 // indirect
github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 // indirect
github.com/microcosm-cc/bluemonday v1.0.25 // indirect
github.com/microcosm-cc/bluemonday v1.0.27 // indirect
github.com/miekg/dns v1.1.56 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/nwaples/rardecode v1.1.3 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/power-devops/perfstat v0.0.0-20210106213030-5aafc221ea8c // indirect
github.com/projectdiscovery/networkpolicy v0.0.8
github.com/projectdiscovery/retryabledns v1.0.58 // indirect
github.com/projectdiscovery/networkpolicy v0.0.9
github.com/projectdiscovery/retryabledns v1.0.81 // indirect
github.com/saintfish/chardet v0.0.0-20230101081208-5e3ef4b5456d // indirect
github.com/syndtr/goleveldb v1.0.0 // indirect
github.com/tklauser/go-sysconf v0.3.12 // indirect
github.com/tklauser/numcpus v0.6.1 // indirect
github.com/ulikunitz/xz v0.5.11 // indirect
github.com/weppos/publicsuffix-go v0.30.1-0.20230422193905-8fecedd899db // indirect
github.com/xi2/xz v0.0.0-20171230120015-48954b6210f8 // indirect
github.com/yl2chen/cidranger v1.0.2 // indirect
github.com/ysmood/goob v0.4.0 // indirect
github.com/ysmood/gson v0.7.3 // indirect
github.com/ysmood/leakless v0.8.0 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
github.com/zmap/rc2 v0.0.0-20190804163417-abaa70531248 // indirect
github.com/zmap/zcrypto v0.0.0-20230422215203-9a665e1e9968 // indirect
go.etcd.io/bbolt v1.3.7 // indirect
golang.org/x/crypto v0.19.0 // indirect
golang.org/x/exp v0.0.0-20230905200255-921286631fa9
golang.org/x/mod v0.12.0 // indirect
golang.org/x/sys v0.17.0 // indirect
golang.org/x/text v0.14.0 // indirect
golang.org/x/tools v0.13.0 // indirect
golang.org/x/crypto v0.27.0 // indirect
golang.org/x/exp v0.0.0-20230905200255-921286631fa9 // indirect
golang.org/x/mod v0.17.0 // indirect
golang.org/x/sys v0.25.0 // indirect
golang.org/x/text v0.18.0 // indirect
golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d // indirect
gopkg.in/djherbis/times.v1 v1.3.0 // indirect
gopkg.in/yaml.v2 v2.4.0
)
Loading

0 comments on commit f8486d4

Please sign in to comment.