Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect hashes on small strings #16

Open
josefnpat opened this issue Sep 20, 2019 · 7 comments
Open

Incorrect hashes on small strings #16

josefnpat opened this issue Sep 20, 2019 · 7 comments

Comments

@josefnpat
Copy link

I was double checking my indexes, and found a collision. Surpised, I did some more research:

nothanks@DESKTOP-616042V ~/repos/roguecraft-squadron
$ cat test1.txt
"Computer, what happened to the IEE Bishop?"

nothanks@DESKTOP-616042V ~/repos/roguecraft-squadron
$ cat test1.txt  | md5sum.exe
5f0d58eb72664a1e2848b191edfd1bee *-

nothanks@DESKTOP-616042V ~/repos/roguecraft-squadron
$ cat test2.txt
"Computer, what is Miho Takeda's current status?"

nothanks@DESKTOP-616042V ~/repos/roguecraft-squadron
$ cat test2.txt | md5sum.exe
03f3a365b8dfc21ddad179289a0138f6 *-


nothanks@DESKTOP-616042V ~/repos/roguecraft-squadron
$ cat test.lua
md5 = require"src.libs.md5"

print(md5.tohex('"Computer, what happened to the IEE Bishop?"'))
print(md5.tohex('"Computer, what is Miho Takeda\'s current status?"'))


nothanks@DESKTOP-616042V ~/repos/roguecraft-squadron
$ lua test.lua
22436f6d70757465722c207768617420
22436f6d70757465722c207768617420

It seems that I not only found a collision, but also that hashes for smaller strings are not correct.

Am I doing something wrong, or is this a bug?

@pablomayobre
Copy link
Contributor

Heyo! It seems like you are not performing the sum on your data. Try the following:

md5 = require"src.libs.md5"

print(md5.tohex(md5.sum('"Computer, what happened to the IEE Bishop?"')))
print(md5.tohex(md5.sum('"Computer, what is Miho Takeda\'s current status?"')))

Or simpler:

md5 = require"src.libs.md5"

print(md5.sumhexa('"Computer, what happened to the IEE Bishop?"'))
print(md5.sumhexa('"Computer, what is Miho Takeda\'s current status?"'))

md5.tohex is just a nice shorthand for string.formatting the hex data, it does not perform the sum.

You either need md5.sum or md5.sumhexa for that, the former returns the MD5 sum as raw bytes in a string, while the later uses hexadecimal digits to represent the MD5 sum.

@tst2005
Copy link

tst2005 commented Sep 23, 2019

Hello,

md5.lua is probably buggy...
but in your testcase your data is not equal to your file content.
You must add a new line (\n) at the end of data to reflect the file content.

assert( "5f0d58eb72664a1e2848b191edfd1bee" == md5sum('"Computer, what happened to the IEE Bishop?"\n'))
assert( "03f3a365b8dfc21ddad179289a0138f6" == md5sum('"Computer, what is Miho Takeda\'s current status?"\n'))

@pablomayobre
Copy link
Contributor

Why do you think md5.lua is probably buggy @tst2005?

And yeah, the newline is needed if you are comparing against md5sum.exe as seen here

@tst2005
Copy link

tst2005 commented Sep 23, 2019

Because the md5sum of your both sample (with or without ending new line) are different.

Tested with a linux shell

$ echo -n '"Computer, what happened to the IEE Bishop?"' |md5sum
3f299a0664f45f2eb57e629a6c433326  -

$ echo -n '"Computer, what is Miho Takeda'\''s current status?"' |md5sum
1485157c7d8b58fc0cfee7a614ae507f  -

Tested with lua-lockbox :

local function md5sum(k)
        require("lockbox").ALLOW_INSECURE=true

        local Stream = require("lockbox.util.stream");
        local Digest = require("lockbox.digest.md5");

        return Digest()
                .update(Stream.fromString(k))
                .finish()
                .asHex()
end
assert( "3f299a0664f45f2eb57e629a6c433326" == md5sum('"Computer, what happened to the IEE Bishop?"'))
assert( "1485157c7d8b58fc0cfee7a614ae507f" == md5sum('"Computer, what is Miho Takeda\'s current status?"'))

assert( "5f0d58eb72664a1e2848b191edfd1bee" == md5sum('"Computer, what happened to the IEE Bishop?"\n'))
assert( "03f3a365b8dfc21ddad179289a0138f6" == md5sum('"Computer, what is Miho Takeda\'s current status?"\n'))

@pablomayobre
Copy link
Contributor

Ummm did you read my comment though?

The hashes @josefnpat posted are wrong because he didn't actually perform the sum on the string. That in addition to missing the newline will make it report wrong hashes.

So I encourage to test this yourself:

local function md5sum(k)
        local md5 = require("md5")
        return md5.sumhexa(k)
end
assert( "3f299a0664f45f2eb57e629a6c433326" == md5sum('"Computer, what happened to the IEE Bishop?"'))
assert( "1485157c7d8b58fc0cfee7a614ae507f" == md5sum('"Computer, what is Miho Takeda\'s current status?"'))

assert( "5f0d58eb72664a1e2848b191edfd1bee" == md5sum('"Computer, what happened to the IEE Bishop?"\n'))
assert( "03f3a365b8dfc21ddad179289a0138f6" == md5sum('"Computer, what is Miho Takeda\'s current status?"\n'))

@josefnpat
Copy link
Author

Ok, i got the release tomorrow, but I'll come back to this in a day or two to test. I'm suspecting it's the bloody newlines as well. Thanks

@Grocel
Copy link

Grocel commented Oct 1, 2019

I can confirm that the hash is fine for these cases. I ran that test myself and can confirm that the missing newline changed the result as intended. I also used a third party conversion tool to make sure that all four results are correct for their input strings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants