-
Notifications
You must be signed in to change notification settings - Fork 222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Defender export provider inadvertently downloads user data when retrieving a count of users without advanced auditing #1404
Comments
I marked this as high priority because I think we should investigate to ensure that the tool performs within a reasonable time threshold against larger tenants to contribute towards higher user satisfaction. |
Can you check whether adding $top=1 or some such improves performance by sending back less data? |
Yes. Already got a fix working with that approach yesterday in MS Graph Explorer but now trying to replicate the fix using Cmdlets or Invoke-MgGraphRequest calls. Also running the profiler to ensure that indeed this is what slows down Defender with larger tenant counts and we don't chase the wrong root cause. |
Size dataBased on some tests today I calculated the average downloaded size of a user record from the Get-MgBetaUser cmdlet. 1,144 users = 28,228,552 bytes Therefore, if a tenant had 20k users |
How the fix works under the hoodIf you were to run the query in MS Graph Explorer you can see how the fix works a little bit under the hood. Note that the fix requires adding a request header named ConsistencyLevel = eventual and a querystring of $count=true&$top=1. The count is returned in a variable named @odata.count inside the response body. |
💡 Summary
Recently we noticed that when the tenant size becomes larger (e.g. 20,000 users) the Defender provider takes significantly longer to execute than previously observed and that prompted my code review which resulted in this issue.
The Defender export provider contains code with calls the server-side API associated with Get-MgBetaUser using a filter to get a count of users that do not have the advanced auditing feature enabled. The way the code is structured, the only piece of data that ends up being used is a result count value which is passed to the Rego policy checks. I manually ran the cmdlet with the same parameters used in the code against a test tenant and noticed that it not only provides data in the count variable, but the cmdlet also returns a dataset of all the users that match the filter with a comprehensive set of properties in Powershell objects. This leads me to believe that for large tenants, particularly ones that have a lot of users that are NOT assigned the advanced audit feature, this code may take some time to complete because it is downloading the dataset from the server. It could also be the case that downloading the data is fast and the query simply takes a long time to complete on the server side so that should be investigated.
I think this should be investigated and anything that we can do to improve the performance of this policy check for larger tenants should be considered. This is especially pertinent because many large tenants might not have the advanced audit feature turned on for a wide subset of users.
Here is the code that performs the count against the server using a Filter parameter.
Observe below when I ran the same command manually against a test tenant that it returns the full dataset of matching users and their properties. Even though we are not using this information in the code because the cmdlet is being piped to Out-Null, the data is still being downloaded which could take time, not to mention a Powershell object is being created for every downloaded user with a host of properties in RAM.
Get-MgBetaUser -Filter "not assignedPlans/any(a:a/servicePlanId eq 2f442157-a11c-46b9-ae5b-6e39ff4e5849 and a/capabilityStatus eq 'Enabled')" -Count UsersWithoutAdvancedAuditCount -ConsistencyLevel eventual -All
Each downloaded user gets its own Powershell object each containing 128 property keys and associated values:
Motivation and context
Improving the performance of the tool is paramount to user satisfaction. Also with larger tenants this code seems to significantly slow down our automated functional tests when running Defender so if we correct this it will enable us to test features for larger tenants without performance impacts.
The text was updated successfully, but these errors were encountered: