Support tagfiltertree for fast matching metricIDs to queries #4310

Open. Wants to merge 8 commits into base: master. Changes from 7 commits.
1 change: 0 additions & 1 deletion .golangci.yml
@@ -186,7 +186,6 @@ linters:
- gci
- goconst
- gocritic
- golint
Collaborator:

Why?

Contributor Author:

golint is now deprecated, and it mainly fails to detect types correctly when generics are used.

- gosimple
- govet
- ineffassign
22 changes: 22 additions & 0 deletions src/metrics/tagfiltertree/README.md
@@ -0,0 +1,22 @@
# Tag Filter Tree

## Motivation
There are many instances where we want to match an input metricID against
a set of tag filters. One such use-case is metric attribution to namespaces.
Iterating through each filter individually and matching them is extremely expensive
since it has to be done on each incoming metricID. Therefore, this data structure
pre-compiles a set of tag filters in order to optimize matches against an input metricID.
Collaborator:

I don't really understand this paragraph.

  1. "attribution to namespaces" - what does this mean? What is a namespace in this context?
  2. How does this pre-compiled data structure prevent you from having to do matching on each incoming metricID?

Perhaps a diagram or example would help here.

Contributor Author:

Updated the Readme


## Usage
First create a trie using New() and then add tagFilters using AddTagFilter().
Collaborator:

And then I guess you use Match somehow? A code example here would be useful.

Contributor Author:

Done!

The tags within a filter can be specified in any order. To keep the compiled trie
compact, specify the tags shared by most filters first, and keep them in the same order
across filters.
For instance, if you expect a tag "service" to be present in all filters, specify it
first and then specify the remaining tags in the filter.
The trie also supports "*" as a tag value, which can be used to require the existence of a tag
in the input metricID.
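
To make the matching semantics concrete, here is a naive baseline that checks every filter against the tags of one metricID. It is a sketch of the per-filter iteration the Motivation section calls expensive, not the package's API: `tagFilter`, `matches`, and `matchAll` are hypothetical names, and representing a filter as a map from tag name to value is an assumption made for illustration.

```go
package main

import "fmt"

// tagFilter maps a tag name to its required value. The value "*" means
// "the tag must exist, with any value". Hypothetical representation.
type tagFilter map[string]string

// matches reports whether tags satisfies every requirement in f.
func matches(f tagFilter, tags map[string]string) bool {
	for name, want := range f {
		got, ok := tags[name]
		if !ok {
			return false // required tag is missing
		}
		if want != "*" && want != got {
			return false // tag exists but has a different value
		}
	}
	return true
}

// matchAll is the naive baseline: it tests every filter against the input
// and returns the indexes of the filters that match.
func matchAll(filters []tagFilter, tags map[string]string) []int {
	var out []int
	for i, f := range filters {
		if matches(f, tags) {
			out = append(out, i)
		}
	}
	return out
}

func main() {
	filters := []tagFilter{
		{"service": "api", "env": "prod"},
		{"service": "api", "region": "*"},
		{"service": "db"},
	}
	tags := map[string]string{"service": "api", "env": "prod", "region": "us-east"}
	fmt.Println(matchAll(filters, tags)) // prints [0 1]
}
```

The first two filters both start with `service=api`. A trie keyed on tags in a consistent order can share that prefix, which is why the ordering advice above helps keep the compiled structure small.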

## Caveats
The trie might return duplicates; it is up to the caller to deduplicate the results.
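
One way a caller might deduplicate results is sketched below. The element type returned by the trie is not shown in this README, so the helper is written generically; `dedup` is a hypothetical name, and the only assumption is that the result values are comparable.

```go
package main

import "fmt"

// dedup returns in with repeated values removed, keeping first-seen order.
// Hypothetical helper; it is not part of the tagfiltertree package.
func dedup[T comparable](in []T) []T {
	seen := make(map[T]struct{}, len(in))
	out := make([]T, 0, len(in))
	for _, v := range in {
		if _, ok := seen[v]; ok {
			continue // already emitted
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(dedup([]string{"ns-a", "ns-b", "ns-a"})) // prints [ns-a ns-b]
}
```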
30 changes: 30 additions & 0 deletions src/metrics/tagfiltertree/options.go
@@ -0,0 +1,30 @@
package tagfiltertree

import "github.com/m3db/m3/src/metrics/filters"

// Options is a set of options for the attributor.
type Options interface {
	TagFilterOptions() filters.TagsFilterOptions
	SetTagFilterOptions(tf filters.TagsFilterOptions) Options
}

type options struct {
	tagFilterOptions filters.TagsFilterOptions
}

// NewOptions creates a new set of options.
func NewOptions() Options {
	return &options{}
}

// TagFilterOptions returns the tag filter options.
func (o *options) TagFilterOptions() filters.TagsFilterOptions {
	return o.tagFilterOptions
}

// SetTagFilterOptions sets the tag filter options.
func (o *options) SetTagFilterOptions(tf filters.TagsFilterOptions) Options {
	opts := *o
	opts.tagFilterOptions = tf
	return &opts
}
40 changes: 40 additions & 0 deletions src/metrics/tagfiltertree/pointer_set.go
@@ -0,0 +1,40 @@
package tagfiltertree

import "math/bits"

// PointerSet is a set of pointers backed by a bitmap to
// represent a sparse set of at most 128 pointers.
type PointerSet struct {
	bits [2]uint64 // Using 2 uint64 gives us 128 bits (0 to 127).
}

// Set adds a pointer at index i (0 <= i <= 127).
// Indexes above 127 are silently ignored because the shift overflows to zero.
func (ps *PointerSet) Set(i byte) {
	if i < 64 {
		ps.bits[0] |= (1 << i)
	} else {
		ps.bits[1] |= (1 << (i - 64))
	}
}

// IsSet checks if a pointer is present at index i.
func (ps *PointerSet) IsSet(i byte) bool {
	if i < 64 {
		return ps.bits[0]&(1<<i) != 0
	}
	return ps.bits[1]&(1<<(i-64)) != 0
}

// CountSetBitsUntil counts how many bits are set to 1 up to index i (inclusive).
// At i = 63 and i = 127 the shift count is 64, so the shift yields 0 and
// subtracting 1 wraps around to an all-ones mask, which is the intended result.
func (ps *PointerSet) CountSetBitsUntil(i byte) int {
	if i < 64 {
		// Count bits in the first uint64 up to index i.
		return bits.OnesCount64(ps.bits[0] & ((1 << (i + 1)) - 1))
	}

	// Count all bits in the first uint64.
	count := bits.OnesCount64(ps.bits[0])
	// Count bits in the second uint64 up to index i - 64.
	count += bits.OnesCount64(ps.bits[1] & ((1 << (i - 64 + 1)) - 1))
	return count
}
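
The diff does not show how `PointerSet` is consumed by the trie, so the following is only a sketch of one common use of a bitmap with a popcount: a sparse array that allocates a slot only for occupied indexes and uses the number of set bits below an index as the slot position. `sparseChildren` and its methods are hypothetical names, and `rank` here counts bits strictly below `i`, whereas `CountSetBitsUntil` above is inclusive.

```go
package main

import (
	"fmt"
	"math/bits"
)

// sparseChildren stores up to 128 values keyed by an index in [0, 127],
// allocating only as many slots as there are occupied indexes.
// Hypothetical type for illustration; not part of the PR.
type sparseChildren struct {
	present [2]uint64 // bit i set => index i has a value
	values  []string  // values ordered by ascending index
}

// has reports whether index i is occupied.
func (s *sparseChildren) has(i byte) bool {
	return i < 128 && s.present[i/64]&(uint64(1)<<(i%64)) != 0
}

// rank returns how many occupied indexes are strictly below i (i <= 127).
func (s *sparseChildren) rank(i byte) int {
	if i < 64 {
		return bits.OnesCount64(s.present[0] & (uint64(1)<<i - 1))
	}
	return bits.OnesCount64(s.present[0]) +
		bits.OnesCount64(s.present[1]&(uint64(1)<<(i-64)-1))
}

// put stores v at index i. Indexes above 127 are ignored.
func (s *sparseChildren) put(i byte, v string) {
	if i > 127 {
		return
	}
	pos := s.rank(i)
	if s.has(i) {
		s.values[pos] = v // overwrite the existing slot
		return
	}
	// Open a gap at pos so values stay ordered by index.
	s.values = append(s.values, "")
	copy(s.values[pos+1:], s.values[pos:])
	s.values[pos] = v
	s.present[i/64] |= uint64(1) << (i % 64)
}

// get returns the value at index i, if any.
func (s *sparseChildren) get(i byte) (string, bool) {
	if !s.has(i) {
		return "", false
	}
	return s.values[s.rank(i)], true
}

func main() {
	var s sparseChildren
	s.put('s', "service")
	s.put('e', "env")
	s.put('r', "region")
	v, ok := s.get('e')
	fmt.Println(v, ok, len(s.values)) // prints env true 3
}
```

With an inclusive count such as `CountSetBitsUntil`, the same slot position is the count minus one whenever the bit at `i` is set.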
61 changes: 61 additions & 0 deletions src/metrics/tagfiltertree/pointer_set_test.go
@@ -0,0 +1,61 @@
package tagfiltertree

import (
	"math"
	"testing"

	"github.com/stretchr/testify/require"
)

func TestPointerSetCountBits(t *testing.T) {
	tests := []struct {
		name     string
		setBits  []uint64
		expected int
	}{
		{
			name:     "empty set",
			setBits:  []uint64{0, 0},
			expected: 0,
		},
		{
			name:     "single set bit",
			setBits:  []uint64{0, 1},
			expected: 1,
		},
		{
			name:     "multiple set bits",
			setBits:  []uint64{7, 7},
			expected: 6,
		},
		{
			name:     "all set bits",
			setBits:  []uint64{math.MaxUint64, math.MaxUint64},
			expected: 128,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			ps := PointerSet{}
			l := tt.setBits[0]
			r := tt.setBits[1]
			var i byte
			for i = 0; i < 128; i++ {
				if i < 64 {
					if l&0x1 == 1 {
						ps.Set(i)
					}
					l >>= 1
				} else {
					if r&0x1 == 1 {
						ps.Set(i)
					}
					r >>= 1
				}
			}

			require.Equal(t, tt.expected, ps.CountSetBitsUntil(127))
		})
	}
}