sbserver differs from online/browser lookup? #30
Comments
Thanks for the bug report. We'll look into it shortly. |
Anything new on this? |
Bumping this... Anything new? |
I am having this exact same issue with the following URL: |
Just wanted to confirm that sblookup also reports this URL as safe:
Plus, this is the output of the test as indicated in the README file:
Finally, I can confirm that there is no problem with my API key since I can successfully query this URL using https://github.com/afilipovich/gglsbl on the same machine:
|
With further testing I noticed that when I specified a database file for sbserver and sblookup, the created file is only 6 megabytes. In comparison, the gglsbl Python module creates a local sqlite database that is over 1.4 gigs in size. So maybe what's happening here is that the go client is silently failing to download and/or save the hash database locally. |
Just wanted to share that the lack of feedback on this issue has led me to file this repository under "abandonware". I am using https://github.com/afilipovich/gglsbl instead. It works great, is fast and the author is very very responsive to reported issues. Would recommend that @serpiente, @gliwka and @Heavenwalker take a look at this alternative too if they haven't found another already. |
@asieira Thanks for the hint! Unfortunately I need the REST API, although it should be possible to combine gglsbl with Flask to get there. @dsnet @colonelxc Any progress on this? Sbserver isn't working correctly at this point, and the worst part is that it's failing silently! This could leave applications depending on it, and their users, vulnerable! |
/cc: @alexwoz |
I have actually built a Flask + gunicorn dockerized REST server on top of gglsbl and was planning on open sourcing it. Would that help? |
@asieira Sure, that would be amazing :-) |
I do not work in this team anymore, but I can assure you that this project is not abandonware. |
Hi everyone, Thank you for all of your contributions to this repo and your patience while we investigated -- based on your reports/comments we've been able to clarify the issue. As part of our API, some clients receive a different list of threats due to data sharing restrictions. This is why you may see discrepancies between the Go client and Safe Browsing-enabled browsers like Chrome. Upon investigating the bugs filed in this repo, we realized that there was a different problem afoot - a bug on the server-side - that will be patched in the coming weeks. Thanks, |
@asieira any updates? |
Finally published the repo I had talked about before, you can find it at https://github.com/mlsecproject/gglsbl-rest if you want to try it out. Any comments and suggestions are most welcome. |
@alexwoz @colonelxc |
@gliwka This issue should be resolved. Please update this bug if you continue to experience any inconsistencies. |
I'm running into the same issues described by other users who commented earlier in this issue thread. Notably, if I use https://transparencyreport.google.com/safe-browsing/search to search for a known malware URL such as 999fitness.com, I'm correctly told "Some pages on this site are unsafe". Yet when I use Postman/cURL/sblookup to classify 999fitness.com, I receive an "empty" 200 response, indicating there is nothing wrong with the URL. When I use the Google API Explorer (https://developers.google.com/apis-explorer/?hl=en_US#p/safebrowsing/v4/safebrowsing.threatMatches.find) to classify the same URL, it just "spins" endlessly. As of right now the explorer has been running for 23 minutes without actually returning a response. Reviewing the Google Cloud Platform API monitor, I'm told everything is just fine, and every one of my queries returned a 200. I was going to post a question on the Google Safe Browsing API forum (https://groups.google.com/forum/#!forum/google-safe-browsing-api) but ironically it is full of spam. Not complaining; just trying to figure out what exactly is going on with this service. Jason |
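For anyone trying to reproduce the "empty" 200 response described above, here is a minimal sketch of what a v4 threatMatches.find request body looks like and how an empty JSON object is read as "no match". It makes no network call. The client ID, client version, and the helper names build_lookup_request and flagged_urls are assumptions made for illustration, not part of any official client.

```python
import json

# Endpoint for the v4 Lookup API; the API key is passed as the `key` query parameter.
API_ENDPOINT = "https://safebrowsing.googleapis.com/v4/threatMatches:find"


def build_lookup_request(urls, client_id="example-client", client_version="0.1"):
    """Build the JSON body for a threatMatches.find call.

    client_id and client_version are hypothetical placeholder values.
    """
    return {
        "client": {"clientId": client_id, "clientVersion": client_version},
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": u} for u in urls],
        },
    }


def flagged_urls(response_body):
    """Return the URLs named in a threatMatches.find response body.

    An empty JSON object ({}) means that none of the submitted URLs matched.
    """
    matches = json.loads(response_body).get("matches", [])
    return sorted({m["threat"]["url"] for m in matches})


if __name__ == "__main__":
    print(json.dumps(build_lookup_request(["http://999fitness.com/"]), indent=2))
    print(flagged_urls("{}"))
```

The body would be POSTed to API_ENDPOINT with a Content-Type of application/json. A 200 status with `{}` is therefore a valid "nothing matched this exact URL" answer rather than an error.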
Essentially the API is focused on answering the question, "Do we think it is safe to go to this site right now?". For foo.com, it is. The malware was on a different (more specific) path (or subdomain). This often happens when a site has been hacked. The attacker will add their own content and redirect users from other sites to the specific path/subdomain. This sometimes has no impact on the rightful content of the site, and so we try to minimize the scope of what is blocked to only the paths that will actually try to infect you. The transparency report does API-style checks, but it also checks if there are more specific paths/subdomains that are known to be bad. So for the second and third URLs, it is responding the same as the API does. For the first URL, it knows that there are more specific paths that are known to be bad. So it says some pages are unsafe, even though foo.com is fine to visit on its own. Does that help? |
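The scoping described above can be illustrated with a simplified sketch of the host-suffix/path-prefix expressions that a v4 client derives from a URL before checking its lists. It assumes the URL is already canonicalized, ignores IP-address hosts, and compares plain strings, whereas a real client compares SHA-256 hashes of the expressions. The blocklist entry and all function names here are hypothetical.

```python
from urllib.parse import urlsplit


def host_suffixes(host):
    """Exact host plus up to four suffixes taken from its last five labels."""
    labels = host.split(".")
    suffixes = [host]
    # Start from at most the last five labels and drop leading labels one at
    # a time, stopping before the bare top-level domain.
    start = max(len(labels) - 5, 1)
    for i in range(start, len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate not in suffixes:
            suffixes.append(candidate)
    return suffixes


def path_prefixes(path, query):
    """Exact path (with and without query) plus up to four prefixes from the root."""
    path = path or "/"
    prefixes = []
    if query:
        prefixes.append(path + "?" + query)
    prefixes.append(path)
    # Build "/", "/a/", "/a/b/", ... from every component except the last one.
    components = [c for c in path.split("/") if c]
    candidates = ["/"]
    current = "/"
    for comp in components[:-1]:
        current += comp + "/"
        candidates.append(current)
    for candidate in candidates[:4]:
        if candidate not in prefixes:
            prefixes.append(candidate)
    return prefixes


def lookup_expressions(url):
    """All host/path combinations that would be checked for this URL."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    return [h + p for h in host_suffixes(host)
            for p in path_prefixes(parts.path, parts.query)]


def is_flagged(url, blocklist):
    """True if any expression derived from the URL appears in the blocklist."""
    return any(expr in blocklist for expr in lookup_expressions(url))


if __name__ == "__main__":
    # Hypothetical list entry: only one hacked page on foo.com is listed.
    blocklist = {"foo.com/hacked/login.php"}
    print(is_flagged("http://foo.com/", blocklist))
    print(is_flagged("http://foo.com/hacked/login.php", blocklist))
```

Under these assumptions, the root URL produces only the expression foo.com/, which is not on the list, while the hacked page produces an expression that is. That is consistent with the API reporting foo.com as safe even though the Transparency Report says some pages on the site are unsafe.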
Hi @wjgilmore, Thanks for your message, and apologies for the confusion. I can see why the Transparency Report wording and Safe Browsing API responses appear to contradict one another. The Transparency Report communicates the extent to which the provided site is bad; in this case, the site is only "partially" bad ("Some pages on this site..."). The Safe Browsing API, however, will only return a verdict when the provided URL is definitively bad; i.e. we have determined that all URLs (including the root domain) are unsafe for a user to access. Hopefully that makes sense! Alex |
Hi @colonelxc and @alexwoz Thank you both for these detailed explanations. To summarize:
Is my understanding correct? Our project attempts to determine whether any URLs found in an incoming text message contain potentially dangerous links (phishing, malware, etc). We were under the impression the Safe Browsing API would offer an ideal solution. However it is certainly possible the URL found in a text message would be "safe" yet ultimately lead the unsuspecting user to a subsequently dangerous endpoint. So it sounds like we're going to have to look for an alternative solution. Thanks again, I really appreciate your time. Jason |
Hey @wjgilmore, As @colonelxc mentioned, the Safe Browsing API answers the question of whether the provided URL is safe for a user to access at this time. Your use case sounds very well-suited for this check. The Safe Browsing lists are intended to contain URL expressions from various points of the navigation, including those that users receive links to (e.g. through an SMS). If the initial URL redirects a user to an unsafe endpoint, then there's a good chance that the initial URL and those of subsequent navigations are all on a Safe Browsing list. Hopefully that addresses some of your concerns. Alex |
@alexwoz @colonelxc I'm finding differences between the Safe Browsing API (what's returned from running the The transparency report is saying that the url is unsafe but |
Is it possible that results from the API are more up to date than https://transparencyreport.google.com or are they using the same api? |
Thanks @summera Yeah, I saw such discrepancy in the past but I cannot tell which source is more up to date as I am not affiliated with Google. |
@afilipovich Thanks for the response! Very weird. So have you or anyone else been able to determine how accurate this is in a real world production environment? It seems to me, based on what's been reported in this issue and the google group and with my own simple tests, that there are a lot of false negatives being returned from the API. Since phishing and malware urls are constantly changing it's challenging to determine whether this is really going to catch much and how accurate it will be. |
Due to data sharing restrictions, the set of URLs accessible via the Safe Browsing API, Transparency Report, and web browser integrations may differ. It is our goal to ensure these discrepancies are as rare as possible, but it's not guaranteed. |
I think any detection technology will have false negatives; no solution can claim to catch everything. So that is something we should already expect. In particular, it seems to me the Google Safe Browsing API must be removing malicious entries from its database, either through an aging process or by detecting when they are no longer active. In any case, I will take a solution that does that to minimize false positives over a very noisy one every time. |
You can try to compare results from It does not use local cache so it has performance limitations, but it excludes possible issues with |
Which database is specified at line 110 of the database.go file? |
same issue at http://58.194.172.18/Thesis/, any update? |
@alexwoz Can Google please update the GSB developers page and mention this fact there?
|
Still confused! does the Did anyone get the response of full hashes after submitting hash prefixes? |
gglsbl and this client both get the same lists. There was a bug in the past that caused them to get different lists. It is still true that browser clients (chrome, safari, firefox, etc) can receive slightly different threat lists. As alexwoz pointed out, this is due to data sharing restrictions we have with a subset of our data. The Safe Browsing team works hard to improve our detection capabilities to get good coverage for all clients. If you're looking for URLs to test, try some of the top ones in http://testsafebrowsing.appspot.com/. If you have issues with a different client implementation, you can start a new thread, or post on https://groups.google.com/forum/#!forum/google-safe-browsing-api |
Has this issue been fixed? |
I have noticed that sbserver returns an empty response for some URLs, while the Chrome browser and the online lookup tool ( https://www.google.com/transparencyreport/safebrowsing/diagnostic/ ) do return a correct danger response. I have checked, and the server is updating its list. Does anyone know what is happening?
A sample url for which this happens.
http://www.precision-mouldings.com/.ls/.https:/.www.paypal.co.uk/uk.web.apps.mpp.home.sign.in.country.a.GB.locale.a.en.GB-6546refhs8ehgf8-890b7fefut9546954543ds867hgf9-1egey3ds4820435t546ggc-u4ydstgu5438gjksssGB/plmgeo.php