Centralize what flags as "malicious" #95
Comments
This seems like a desirable end goal, albeit I'm not keen on redeploying just to fix the weighting. I pitched the idea of an additional table to track malicious packages (to the outrage of everyone), but I do think that segregating detections from the global package list can help us aggregate data better on what exactly we're detecting and why. This'll be a lot more relevant when we introduce additional detection schemas such as the AST idea, where we'll want to know what context something was detected in, since both detection systems will be using YARA. Do I think it's a perfect idea? Nah. But if we look at how we're trying to do the BigQuery dataset polling, where the bot simply loops every 'x' seconds and advances the query window to cover the latest notifications, it seems like we might be able to apply that here. To clarify, because I think it's kind of confusing: being able to query our table every minute, or providing a callback for the cronjob to run that query as well, might be useful. (I.e. when the cronjob runs and adds jobs to the package queue, it could also query the current state of the database using the window between the last time it ran and now, and report all the detections via a webhook.) This simplifies the model, at least in my head.
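A rough sketch of that cronjob callback, assuming SQLAlchemy 2.0-style models; all names here (the `Scan` model, its columns, the webhook URL, the threshold value) are illustrative stand-ins, not the project's actual ones:

```python
# Hypothetical sketch: alongside queueing new jobs, the cronjob queries detections
# recorded since its previous run and reports them via a webhook.
from datetime import datetime, timezone

import httpx
from sqlalchemy import DateTime, Integer, String, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column


class Base(DeclarativeBase):
    pass


class Scan(Base):  # stand-in for the project's real results model
    __tablename__ = "scans"
    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    package_name: Mapped[str] = mapped_column(String)
    score: Mapped[int] = mapped_column(Integer)
    finished_at: Mapped[datetime] = mapped_column(DateTime(timezone=True))


WEBHOOK_URL = "https://example.invalid/detections"  # placeholder URL
SCORE_THRESHOLD = 5  # assumed threshold from the discussion


def report_new_detections(session: Session, last_run: datetime) -> datetime:
    """Report detections recorded between `last_run` and now; return the new watermark."""
    now = datetime.now(timezone.utc)
    stmt = (
        select(Scan)
        .where(Scan.finished_at >= last_run, Scan.finished_at < now)
        .where(Scan.score >= SCORE_THRESHOLD)
    )
    for scan in session.scalars(stmt):
        httpx.post(WEBHOOK_URL, json={"package": scan.package_name, "score": scan.score})
    return now  # the caller persists this as the next `last_run`
```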
Here's another idea that was proposed: as for malicious packages, we could simply dispatch those as they come in from clients (so it'd be real-time).
This doesn't make sense to me (dispatching from the clients), as we'd lose the premise of this: the score threshold. Unless we're pushing it down to the clients themselves through get job. Which, ehhhh... I mean, I guess?
The way this would work is that clients would send their result up to the API as usual (including the score, the rules matched, etc.). If this sent score exceeds some threshold we've set server-side, we'd trigger a webhook (from the server). This causes no change in client behaviour.
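A minimal sketch of that flow, assuming a FastAPI-style handler; the route, field names, threshold, and webhook URL are illustrative assumptions, not the project's actual API:

```python
# Hypothetical sketch: when a client submits its result, the server compares the
# reported score against a server-side threshold and fires a webhook if it exceeds it.
import httpx
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

SCORE_THRESHOLD = 5                           # assumed server-side setting
WEBHOOK_URL = "https://example.invalid/hook"  # placeholder


class ScanResult(BaseModel):
    package_name: str
    score: int
    rules_matched: list[str]


@app.post("/package/result")  # illustrative route
async def submit_result(result: ScanResult) -> dict:
    # ... persist the result as usual ...
    if result.score >= SCORE_THRESHOLD:
        async with httpx.AsyncClient() as client:
            await client.post(WEBHOOK_URL, json=result.model_dump())
    return {"ok": True}
```

The client keeps sending exactly what it sends today; only the server gains the threshold check and the webhook call.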
As for this, we could probably just save it in the database and have an endpoint to tweak it. Perhaps a function in the bot to hit this endpoint, so we can tweak the weights without having to make an HTTP request ourselves.
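A minimal sketch of that endpoint, assuming the threshold lives in persistent settings rather than constants.py; the route, the in-memory `settings` dict (a stand-in for a real settings table), and the field name are all hypothetical:

```python
# Hypothetical sketch: the threshold is stored server-side and adjustable at runtime,
# so no redeploy is needed to change it.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Stand-in for a real settings table; a database row would replace this dict.
settings = {"score_threshold": 5}


class ThresholdUpdate(BaseModel):
    score_threshold: int


@app.put("/settings/score-threshold")  # illustrative route
async def update_threshold(body: ThresholdUpdate) -> dict:
    settings["score_threshold"] = body.score_threshold
    return settings

# A bot command could then simply PUT the new value, e.g.
#   httpx.put(f"{API_URL}/settings/score-threshold", json={"score_threshold": 7})
```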
Both sound reasonable to me, thanks for clarifying. You're cleared hot for implementation unless anyone has some other nits.
On further thought, this might be more difficult than anticipated to do within the API 🤔
I propose a new endpoint, perhaps something along the lines of...
Going to close this as succeeded by #260
The way the current system works is that the API returns all packages that have been scanned within the given constraints in the request (it quite literally just dumps the SQLAlchemy query result). This means that the consumer (the bot, in this case) has to filter through the response for the packages it wants to display (here, packages with a score greater than or equal to 5).
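A sketch of what that consumer-side filtering looks like today; the route, query parameters, and response shape are illustrative assumptions, not the project's actual API:

```python
# Hypothetical sketch of the current behaviour: the API dumps every scanned package,
# and the bot filters for scores at or above 5 before displaying anything.
import httpx

API_URL = "https://example.invalid/package/search"  # placeholder route
SCORE_THRESHOLD = 5

response = httpx.get(API_URL, params={"since": 1700000000})  # illustrative constraint
all_packages = response.json()
malicious = [pkg for pkg in all_packages if pkg["score"] >= SCORE_THRESHOLD]
```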
It has been expressed numerous times that what constitutes "malicious" should live in a centralized location (such as this API). There are a few ways of going about this; I'd like to get ideas on the table in this issue. A basic solution we could start off with is a field in constants.py that we can tweak (though worth noting we would have to redeploy to change it). The API response would then return a list of packages scanned, and a list of malicious packages. We can also discuss having the API itself dispatch a webhook to the appropriate channels instead of having the bot poll the API every 60 seconds. I'm leaning more towards this approach.
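A minimal sketch of the basic constants.py approach, assuming a constant name and response keys that are purely illustrative:

```python
# Hypothetical sketch: a threshold in constants.py and a response body that separates
# "all scanned" packages from those flagged as malicious.

# constants.py
MALICIOUS_SCORE_THRESHOLD = 5  # changing this currently requires a redeploy


# in the API handler
def build_response(scanned: list[dict]) -> dict:
    malicious = [pkg for pkg in scanned if pkg["score"] >= MALICIOUS_SCORE_THRESHOLD]
    return {"all_scanned": scanned, "malicious": malicious}
```

The webhook-based alternative avoids the 60-second polling loop entirely, which is the direction leaned towards above.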