Defender export provider inadvertently downloads user data when retrieving a count of users without advanced auditing #1404

tkol2022 · 2024-11-05T18:33:49Z

💡 Summary

Recently we noticed that when the tenant size becomes larger (e.g. 20,000 users) the Defender provider takes significantly longer to execute than previously observed and that prompted my code review which resulted in this issue.

The Defender export provider contains code with calls the server-side API associated with Get-MgBetaUser using a filter to get a count of users that do not have the advanced auditing feature enabled. The way the code is structured, the only piece of data that ends up being used is a result count value which is passed to the Rego policy checks. I manually ran the cmdlet with the same parameters used in the code against a test tenant and noticed that it not only provides data in the count variable, but the cmdlet also returns a dataset of all the users that match the filter with a comprehensive set of properties in Powershell objects. This leads me to believe that for large tenants, particularly ones that have a lot of users that are NOT assigned the advanced audit feature, this code may take some time to complete because it is downloading the dataset from the server. It could also be the case that downloading the data is fast and the query simply takes a long time to complete on the server side so that should be investigated.

I think this should be investigated and anything that we can do to improve the performance of this policy check for larger tenants should be considered. This is especially pertinent because many large tenants might not have the advanced audit feature turned on for a wide subset of users.

Here is the code that performs the count against the server using a Filter parameter.

Observe below when I ran the same command manually against a test tenant that it returns the full dataset of matching users and their properties. Even though we are not using this information in the code because the cmdlet is being piped to Out-Null, the data is still being downloaded which could take time, not to mention a Powershell object is being created for every downloaded user with a host of properties in RAM.

Get-MgBetaUser -Filter "not assignedPlans/any(a:a/servicePlanId eq 2f442157-a11c-46b9-ae5b-6e39ff4e5849 and a/capabilityStatus eq 'Enabled')" -Count UsersWithoutAdvancedAuditCount -ConsistencyLevel eventual -All

Each downloaded user gets its own Powershell object each containing 128 property keys and associated values:

Motivation and context

Improving the performance of the tool is paramount to user satisfaction. Also with larger tenants this code seems to significantly slow down our automated functional tests when running Defender so if we correct this it will enable us to test features for larger tenants without performance impacts.

The text was updated successfully, but these errors were encountered:

tkol2022 · 2024-11-05T18:37:04Z

I marked this as high priority because I think we should investigate to ensure that the tool performs within a reasonable time threshold against larger tenants to contribute towards higher user satisfaction.

gdasher · 2024-11-06T15:23:03Z

Can you check whether adding $top=1 or some such improves performance by sending back less data?

tkol2022 · 2024-11-06T16:02:32Z

Can you check whether adding $top=1 or some such improves performance by sending back less data?

Yes. Already got a fix working with that approach yesterday in MS Graph Explorer but now trying to replicate the fix using Cmdlets or Invoke-MgGraphRequest calls. Also running the profiler to ensure that indeed this is what slows down Defender with larger tenant counts and we don't chase the wrong root cause.

tkol2022 · 2024-11-06T16:59:24Z

Size data

Based on some tests today I calculated the average downloaded size of a user record from the Get-MgBetaUser cmdlet.

1,144 users = 28,228,552 bytes
Each user's downloaded record is ~ 24,675 bytes

Therefore, if a tenant had 20k users
20,000 users = 493,500,000 or about 1/2 Gigabyte of data would be downloaded

tkol2022 · 2024-11-06T20:42:31Z

Description of the fix

To correct the problem of downloading unnecessary data and to make the query run faster I made the following code change to the lines in ExportDefenderProvider.psm1.

I replaced "All = $true" with "Top = 1" in the query parameters when calling the Get-MgBetaUser cmdlet. This still runs the query in the back-end but only returns a single record which seems to make the operation lightning fast.

I verified that the count that is created by the query is still correct in the JSON.

In a test tenant that has 2,041 users, the new query takes 0.65 seconds.

In a test tenant that has 2,041 users, the old query (without the fix) takes 10.76 seconds.

tkol2022 · 2024-11-06T20:45:09Z

How the fix works under the hood

If you were to run the query in MS Graph Explorer you can see how the fix works a little bit under the hood. Note that the fix requires adding a request header named ConsistencyLevel = eventual and a querystring of $count=true&$top=1. The count is returned in a variable named @odata.count inside the response body.

tkol2022 added the enhancement This issue or pull request will add new or improve existing functionality label Nov 5, 2024

tkol2022 self-assigned this Nov 6, 2024

tkol2022 added this to the Kraken milestone Nov 6, 2024

tkol2022 linked a pull request Nov 6, 2024 that will close this issue

improve performance of Defender query to count users without advanced auditing #1406

Open

20 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Defender export provider inadvertently downloads user data when retrieving a count of users without advanced auditing #1404

Defender export provider inadvertently downloads user data when retrieving a count of users without advanced auditing #1404

tkol2022 commented Nov 5, 2024

tkol2022 commented Nov 5, 2024

gdasher commented Nov 6, 2024

tkol2022 commented Nov 6, 2024 •

edited

Loading

tkol2022 commented Nov 6, 2024

tkol2022 commented Nov 6, 2024

tkol2022 commented Nov 6, 2024

Defender export provider inadvertently downloads user data when retrieving a count of users without advanced auditing #1404

Defender export provider inadvertently downloads user data when retrieving a count of users without advanced auditing #1404

Comments

tkol2022 commented Nov 5, 2024

💡 Summary

Motivation and context

tkol2022 commented Nov 5, 2024

gdasher commented Nov 6, 2024

tkol2022 commented Nov 6, 2024 • edited Loading

tkol2022 commented Nov 6, 2024

Size data

tkol2022 commented Nov 6, 2024

Description of the fix

tkol2022 commented Nov 6, 2024

How the fix works under the hood

tkol2022 commented Nov 6, 2024 •

edited

Loading