Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CBG-3909: use deltas for pv and mv when persisting to the bucket #7096

Merged
merged 19 commits into from
Oct 11, 2024
Merged
43 changes: 43 additions & 0 deletions base/util.go
Original file line number Diff line number Diff line change
Expand Up @@ -1018,6 +1018,49 @@ func HexCasToUint64(cas string) uint64 {
return binary.LittleEndian.Uint64(casBytes[0:8])
}

// HexCasToUint64ForDelta will convert hex cas to uint64 accounting for any stripped zeros in delta calculation
func HexCasToUint64ForDelta(casByte []byte) (uint64, error) {
var decoded []byte

// as we strip any zeros off the end of the hex value for deltas, the input delta could be odd length
if len(casByte)%2 != 0 {
casByte = append(casByte, '0')
}

// create byte array for decoding into
decodedLen := hex.DecodedLen(len(casByte))
// binary.LittleEndian.Uint64 expects length 8 byte array, if larger we should error, if smaller
// (because of stripped 0's then we should make it length 8).
if decodedLen > 8 {
return 0, fmt.Errorf("corrupt hex value, decoded length larger than expected")
}
if decodedLen < 8 {
// can be less than 8 given we have stripped the 0's for some values, in this case we need to ensure large eniough
decoded = make([]byte, 8)
} else {
decoded = make([]byte, decodedLen)
}

if _, err := hex.Decode(decoded, casByte); err != nil {
return 0, err
}
res := binary.LittleEndian.Uint64(decoded)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't binary.LittleEndian.Uint64 expect a length 8 []byte as input? Here it looks like we're setting decoded to MaxDecodedLength (18) - can you help me understand the case where casByte[2:] is longer than 8 bytes but we need to support this and are intentionally ignoring trailing bytes?

I thought this was just handling the case where it's less than 8 bytes because we've trimmed trailing zeros.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I don't think the max decoded length line is needed, I have removed it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, hex.Decode is going to panic if casByte[2:] overflows decoded, so it's possible the previous code was defensive. https://go.dev/play/p/_LEScaOT1FG
Our HexCasToUint64 has an explicit length check to protect against this. Possibly we just want to do the same (but also supporting the case here that the length can be less than 8): doing something like return error if len(casByte) > 10 at the top of the function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some safety checks around this area, let me know what you think

return res, nil
}

// Uint64ToLittleEndianHexAndStripZeros will convert a uint64 type to little endian hex, stripping any zeros off the end
// + stripping 0x from start
func Uint64ToLittleEndianHexAndStripZeros(cas uint64) string {
hexCas := Uint64CASToLittleEndianHex(cas)

i := len(hexCas) - 1
for i > 2 && hexCas[i] == '0' {
i--
}
// strip 0x from start
return string(hexCas[2 : i+1])
}

func HexToBase64(s string) ([]byte, error) {
decoded := make([]byte, hex.DecodedLen(len(s)))
if _, err := hex.Decode(decoded, []byte(s)); err != nil {
Expand Down
37 changes: 37 additions & 0 deletions base/util_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1735,3 +1735,40 @@ func TestCASToLittleEndianHex(t *testing.T) {
littleEndianHex := Uint64CASToLittleEndianHex(casValue)
require.Equal(t, expHexValue, string(littleEndianHex))
}

func TestUint64CASToLittleEndianHexAndStripZeros(t *testing.T) {
hexLE := "0x0000000000000000"
u64 := HexCasToUint64(hexLE)
hexLEStripped := Uint64ToLittleEndianHexAndStripZeros(u64)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running this test locally, the value of hexLEStripped is "000". Is this what we intend?
It looks like any hex value with only two non-zero digits retains an extra zero. (e.g. 0xa500000000000000 results in a50 when I would have expected a5. Is there a requirement for a minimum length of three?

Copy link
Contributor Author

@gregns1 gregns1 Oct 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot, fixed this so now that delta is just 0, also added new test case for this

u64Stripped, err := HexCasToUint64ForDelta([]byte(hexLEStripped))
require.NoError(t, err)
assert.Equal(t, u64, u64Stripped)

hexLE = "0xffffffffffffffff"
u64 = HexCasToUint64(hexLE)
hexLEStripped = Uint64ToLittleEndianHexAndStripZeros(u64)
u64Stripped, err = HexCasToUint64ForDelta([]byte(hexLEStripped))
require.NoError(t, err)
assert.Equal(t, u64, u64Stripped)

hexLE = "0xd123456e789a0bcf"
u64 = HexCasToUint64(hexLE)
hexLEStripped = Uint64ToLittleEndianHexAndStripZeros(u64)
u64Stripped, err = HexCasToUint64ForDelta([]byte(hexLEStripped))
require.NoError(t, err)
assert.Equal(t, u64, u64Stripped)

hexLE = "0xd123456e78000000"
u64 = HexCasToUint64(hexLE)
hexLEStripped = Uint64ToLittleEndianHexAndStripZeros(u64)
u64Stripped, err = HexCasToUint64ForDelta([]byte(hexLEStripped))
require.NoError(t, err)
assert.Equal(t, u64, u64Stripped)

hexLE = "0xa500000000000000"
u64 = HexCasToUint64(hexLE)
hexLEStripped = Uint64ToLittleEndianHexAndStripZeros(u64)
u64Stripped, err = HexCasToUint64ForDelta([]byte(hexLEStripped))
require.NoError(t, err)
assert.Equal(t, u64, u64Stripped)
}
8 changes: 5 additions & 3 deletions db/document.go
Original file line number Diff line number Diff line change
Expand Up @@ -1117,7 +1117,8 @@ func (doc *Document) UnmarshalWithXattrs(ctx context.Context, data, syncXattrDat
}
}
if hlvXattrData != nil {
err := base.JSONUnmarshal(hlvXattrData, &doc.SyncData.HLV)
// parse the raw bytes of the hlv and convert deltas back to full values in memory
err := base.JSONUnmarshal(hlvXattrData, &doc.HLV)
if err != nil {
return pkgerrors.WithStack(base.RedactErrorf("Failed to unmarshal HLV during UnmarshalWithXattrs() doc with id: %s (DocUnmarshalAll/Sync). Error: %v", base.UD(doc.ID), err))
}
Expand Down Expand Up @@ -1157,7 +1158,8 @@ func (doc *Document) UnmarshalWithXattrs(ctx context.Context, data, syncXattrDat
}
}
if hlvXattrData != nil {
err := base.JSONUnmarshal(hlvXattrData, &doc.SyncData.HLV)
// parse the raw bytes of the hlv and convert deltas back to full values in memory
err := base.JSONUnmarshal(hlvXattrData, &doc.HLV)
if err != nil {
return pkgerrors.WithStack(base.RedactErrorf("Failed to unmarshal HLV during UnmarshalWithXattrs() doc with id: %s (DocUnmarshalNoHistory). Error: %v", base.UD(doc.ID), err))
}
Expand Down Expand Up @@ -1251,7 +1253,7 @@ func (doc *Document) MarshalWithXattrs() (data, syncXattr, vvXattr, mouXattr, gl
}
}
if doc.SyncData.HLV != nil {
vvXattr, err = base.JSONMarshal(&doc.SyncData.HLV)
vvXattr, err = base.JSONMarshal(doc.SyncData.HLV)
if err != nil {
return nil, nil, nil, nil, nil, pkgerrors.WithStack(base.RedactErrorf("Failed to MarshalWithXattrs() doc vv with id: %s. Error: %v", base.UD(doc.ID), err))
}
Expand Down
36 changes: 22 additions & 14 deletions db/document_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -239,24 +239,16 @@ const doc_meta_no_vv = `{
"time_saved": "2017-10-25T12:45:29.622450174-07:00"
}`

const doc_meta_vv = `{
"cvCas":"0x40e2010000000000",
"src":"cb06dc003846116d9b66d2ab23887a96",
"ver":"0x40e2010000000000",
"mv":{
"s_LhRPsa7CpjEvP5zeXTXEBA":"c0ff05d7ac059a16",
"s_NqiIe0LekFPLeX4JvTO6Iw":"1c008cd6ac059a16"
},
"pv":{
"s_YZvBpEaztom9z5V/hDoeIw":"f0ff44d6ac059a16"
}
}`
const doc_meta_vv = `{"cvCas":"0x40e2010000000000","src":"cb06dc003846116d9b66d2ab23887a96","ver":"0x40e2010000000000",
"mv":["c0ff05d7ac059a16@s_LhRPsa7CpjEvP5zeXTXEBA","1c008cd6@s_NqiIe0LekFPLeX4JvTO6Iw"],
"pv":["f0ff44d6ac059a16@s_YZvBpEaztom9z5V/hDoeIw"]
}`

func TestParseVersionVectorSyncData(t *testing.T) {
mv := make(HLVVersions)
pv := make(HLVVersions)
mv["s_LhRPsa7CpjEvP5zeXTXEBA"] = 1628620455147864000 //"c0ff05d7ac059a16"
mv["s_NqiIe0LekFPLeX4JvTO6Iw"] = 1628620455139868700
mv["s_LhRPsa7CpjEvP5zeXTXEBA"] = 1628620455147864000
mv["s_NqiIe0LekFPLeX4JvTO6Iw"] = 1628620458747363292
pv["s_YZvBpEaztom9z5V/hDoeIw"] = 1628620455135215600

ctx := base.TestCtx(t)
Expand Down Expand Up @@ -295,6 +287,22 @@ func TestParseVersionVectorSyncData(t *testing.T) {
assert.True(t, reflect.DeepEqual(pv, doc.SyncData.HLV.PreviousVersions))
}

const doc_meta_vv_corrupt = `{"cvCas":"0x40e2010000000000","src":"cb06dc003846116d9b66d2ab23887a96","ver":"0x40e2010000000000",
"mv":["c0ff05d7ac059a16@s_LhRPsa7CpjEvP5zeXTXEBA","1c008cd61c008cd61c008cd6@s_NqiIe0LekFPLeX4JvTO6Iw"],
"pv":["f0ff44d6ac059a16@s_YZvBpEaztom9z5V/hDoeIw"]
}`

func TestParseVersionVectorCorruptDelta(t *testing.T) {

ctx := base.TestCtx(t)

sync_meta := []byte(doc_meta_no_vv)
vv_meta := []byte(doc_meta_vv_corrupt)
_, err := unmarshalDocumentWithXattrs(ctx, "doc1", nil, sync_meta, vv_meta, nil, nil, nil, nil, 1, DocUnmarshalAll)
require.Error(t, err)

}

// TestRevAndVersion tests marshalling and unmarshalling rev and current version
func TestRevAndVersion(t *testing.T) {

Expand Down
Loading
Loading