Inconsistency in database #30

Open
devl00p opened this issue Nov 2, 2022 · 1 comment
devl00p commented Nov 2, 2022

While working on improving the htp module in Wapiti (wapiti-scanner/wapiti#344), I noticed several inconsistencies in the hashtheplanet database.

A version can appear in the hash table without having a counterpart in the version table:

sqlite> select count(*) from hash where versions like "%4.0.0-alpha4x%" and technology = "WordPress";
202
sqlite> select count(*) from version where technology = "WordPress" and version = "4.0.0-alpha4x";
0
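
To go beyond a single version, the same check can be run over the whole database. The following is only a sketch: it assumes the `hash` table has `technology` and `versions` columns, that the `version` table has `technology` and `version` columns (as the queries above suggest), and that `versions` holds JSON of the form `{"versions": [...]}`, possibly JSON-encoded twice as the escaped output in this issue hints. The helper names `decode_versions` and `find_orphan_versions` are made up for illustration.

```python
import json
import sqlite3


def decode_versions(raw):
    """Return the version list stored in one hash row.

    Assumption: the column holds JSON like {"versions": [...]},
    possibly encoded twice (a JSON string containing JSON).
    """
    value = json.loads(raw)
    if isinstance(value, str):  # double-encoded: decode once more
        value = json.loads(value)
    return value.get("versions", [])


def find_orphan_versions(conn):
    """Map (technology, version) to the number of hash rows that
    reference it although the version table has no matching row."""
    known = set(conn.execute("SELECT technology, version FROM version"))
    orphans = {}
    for technology, raw in conn.execute("SELECT technology, versions FROM hash"):
        for version in decode_versions(raw):
            key = (technology, version)
            if key not in known:
                orphans[key] = orphans.get(key, 0) + 1
    return orphans


# Tiny in-memory stand-in for the real database (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE version (technology TEXT, version TEXT)")
conn.execute("CREATE TABLE hash (technology TEXT, hash TEXT, versions TEXT)")
conn.execute("INSERT INTO version VALUES ('WordPress', '5.2')")
double_encoded = json.dumps(json.dumps({"versions": ["5.2", "4.0.0-alpha4x"]}))
conn.execute("INSERT INTO hash VALUES ('WordPress', 'abc', ?)", (double_encoded,))

orphans = find_orphan_versions(conn)
print(orphans)  # {('WordPress', '4.0.0-alpha4x'): 1}
```

Against the real file, `sqlite3.connect()` would take the path to the database instead of `":memory:"`; the column names may need adjusting to the actual schema.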

This is particularly true of the aforementioned version, which is attached to a lot of hashes, including hashes for other technologies (I cut the output):

GET https://blog.logrocket.com/wp-includes/js/tinymce/license.txt (0) led to technology ('magento2', '"{\\"versions\\": [\\"2.3.0\\", \\"2.3.1\\", \\"2.3.2\\", \\"2.3.2-p2\\", \\"2.3.3\\", \\"2.3.3-p1\\", \\"2.3.4\\", \\"2.3.4-p2\\", \\"2.3.5\\", \\"2.3.5-p1\\", \\"2.3.5-p2\\", \\"2.3.6\\", \\"2.3.6-p1\\", \\"2.3.7\\", \\"2.3.7-p1\\", \\"2.3.7-p2\\", \\"2.3.7-p3\\", \\"2.3.7-p4\\", \\"2.4.0\\", \\"2.4.0-p1\\", \\"2.4.1\\", \\"2.4.1-p1\\", \\"2.4.2\\", \\"2.4.2-p1\\", \\"2.4.2-p2\\", \\"2.4.3\\", \\"2.4.3-p1\\", \\"2.4.3-p2\\", \\"2.4.3-p3\\", \\"2.4.4\\", \\"4.0.0-alpha1\\", \\"4.0.0-alpha10\\", \\"4.0.0-alpha11\\", \\"4.0.0-alpha12\\", \\"4.0.0-alpha2\\", \\"4.0.0-alpha3\\", \\"4.0.0-alpha4\\", \\"4.0.0-alpha4x\\"]}"')

GET https://blog.logrocket.com/wp-includes/js/mediaelement/mediaelementplayer.css (0) led to technology ('joomla-cms', '"{\\"versions\\": [\\"4.0.0-alpha4x\\"]}"')

GET https://blog.logrocket.com/wp-includes/sodium_compat/src/Core/Curve25519/README.md (0) led to technology ('WordPress', '"{\\"versions\\": [\\"5.2\\", \\"3.10.0\\", \\"3.10.0-alpha1\\", \\"3.10.0-alpha2\\",  \\"4.0.0\\", \\"4.0.0-alpha1\\", \\"4.0.0-alpha10\\", \\"4.0.0-alpha11\\", \\"4.0.0-alpha12\\", \\"4.0.0-alpha2\\", \\"4.0.0-alpha3\\", \\"4.0.0-alpha4\\", \\"4.0.0-alpha4x\\", \\"4.0.0-alpha5\\", \\"4.0.0-alpha6\\", \\"psr12anchor\\"]}"')


GET https://blog.logrocket.com/wp-content/themes/twentytwentytwo/templates/blank.html (0) led to technology ('underscore', '"{\\"versions\\": [\\"1.12.1\\", \\"1.13.0-0\\", \\"1.13.0-2\\", \\"1.13.0-1\\", \\"8.0-alpha10\\", \\"8.0-alpha11\\", \\"8.0-alpha12\\", \\"8.0-alpha13\\", \\"8.0-alpha2\\", \\"8.0-alpha3\\", \\"8.0-alpha4\\", \\"8.0-alpha5\\", \\"8.0-alpha6\\", \\"8.0-alpha7\\", \\"8.0-alpha8\\",  \\"4.0.0\\", \\"4.0.0-alpha1\\", \\"4.0.0-alpha10\\", \\"4.0.0-alpha11\\", \\"4.0.0-alpha12\\", \\"4.0.0-alpha2\\", \\"4.0.0-alpha3\\", \\"4.0.0-alpha4\\", \\"4.0.0-alpha4x\\", \\"4.0.0-alpha5\\", \\"4.0.0-alpha6\\", \\"4.0.0-alpha7\\", \\"4.0.0-alpha8\\", \\"4.0.0-alpha9\\", \\"4.0.0-beta\\", \\"4.0.0-beta2\\", \\"4.0.0-beta3\\", \\"4.0.0-beta4\\", \\"4.0.0-beta5\\", \\"4.0.0-beta6\\", \\"4.0.0-beta7\\", \\"4.0.0-rc1\\", \\"psr12anchor\\", \\"psr12final\\", \\"search1\\"]}"')

Only the joomla-cms entry is relevant because that tag is specific to Joomla: https://github.com/joomla/joomla-cms/releases/tag/4.0.0-alpha4x

The same problem occurs with the tags psr12anchor and psr12final, and certainly with others.

Also, some hashes should probably be blacklisted because they match files that can be found in a lot of software, such as (from the previous output):

  • a file with a single empty line (blank.html)
  • the default LGPL licence file
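
One possible way to find blacklist candidates is to look for hashes shared by many unrelated technologies. This is a rough sketch only: it assumes the `hash` table has a `hash` column next to `technology` (the real column name may differ), and the function name and the threshold of 3 are arbitrary choices for illustration.

```python
import sqlite3


def widely_shared_hashes(conn, min_technologies=3):
    """Return (hash, technology_count) rows for hashes seen in at least
    `min_technologies` distinct technologies, most widespread first."""
    query = """
        SELECT hash, COUNT(DISTINCT technology)
        FROM hash
        GROUP BY hash
        HAVING COUNT(DISTINCT technology) >= ?
        ORDER BY COUNT(DISTINCT technology) DESC, hash
    """
    return conn.execute(query, (min_technologies,)).fetchall()


# In-memory demo with made-up hash values.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE hash (technology TEXT, hash TEXT, versions TEXT)")
demo.executemany(
    "INSERT INTO hash VALUES (?, ?, '{}')",
    [
        ("WordPress", "h_blank"),
        ("joomla-cms", "h_blank"),
        ("underscore", "h_blank"),
        ("WordPress", "h_wp_only"),
    ],
)

candidates = widely_shared_hashes(demo)
print(candidates)  # [('h_blank', 3)]
```

Hashes returned this way would still need a manual look before being blacklisted, since a file shared by a few related projects can be a legitimate signal.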

Those invalid version numbers certainly have an impact on the database size (issue #28).

@tarraschk
Member

Thanks @devl00p, we'll take a look.
