Redundant and excessive data in JSON output file when using "group-members" flag #69

Closed
TasteOfSpaghetti opened this issue Dec 13, 2023 · 5 comments · Fixed by #84

Comments

@TasteOfSpaghetti

Hi,

When running AzureHound with the "group-members" flag, the JSON output contains a large amount of irrelevant data. This is a problem in large environments: the file can grow to multiple gigabytes and then cannot be ingested into BHCE because of the upload size limit (around 400MB from my testing). Using Chophound to cut the file into smaller pieces might help, but some single group nodes within the JSON are themselves above 400MB, so they still exceed the BHCE limit for a single file. As a result, the data cannot be ingested at all.
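To check whether splitting could help at all, one can measure each record on its own. The following is only a sketch: `oversized_records` is a hypothetical helper, and it assumes the records have already been loaded into a Python list. The 400 MB default is the approximate limit observed in testing above, not a documented value.

```python
import json

# Assumed limit: roughly 400 MB, the figure observed in testing above.
UPLOAD_LIMIT_BYTES = 400 * 1024 * 1024


def oversized_records(records, limit=UPLOAD_LIMIT_BYTES):
    """Return (index, size_in_bytes) for each record whose JSON encoding exceeds limit."""
    result = []
    for index, record in enumerate(records):
        # Size of this single record once serialized as UTF-8 JSON.
        size = len(json.dumps(record).encode("utf-8"))
        if size > limit:
            result.append((index, size))
    return result
```

Any record this reports cannot be brought under the limit by splitting the file between records; its contents would have to shrink.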

Looking through the JSON file, it appears that attributes on each of the group members like:

assignedLicenses
assignedPlans
provisionedPlans

take up a large portion of the file.

Additionally, attributes like:

country
department
faxNumber

and many more are present in the data for each group member.
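A rough way to quantify which attributes dominate is to sum their serialized size across all member objects. This is a sketch under assumptions: `attribute_sizes` is a hypothetical helper, and `members` is assumed to be a list of member dictionaries already extracted from the output file.

```python
import json
from collections import Counter


def attribute_sizes(members):
    """Sum the serialized size in bytes of each top-level attribute across member objects."""
    totals = Counter()
    for member in members:
        for key, value in member.items():
            # Count only the value's JSON encoding; key names are ignored.
            totals[key] += len(json.dumps(value).encode("utf-8"))
    return totals
```

Calling `attribute_sizes(members).most_common(10)` would then list the ten heaviest attributes first.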

I do not see why this data is part of the "group-members" ingestion.

I would think that 95% of the data collected can be removed.

In my mind, only the raw membership data should be included, as in:

groupId (Group ID)
memberId (ID of the groups or users that are members of said group)

All other data about the groups and users themselves should not be included in this data collection type.
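As an illustration of the reduction proposed above, here is a minimal sketch that keeps only the two IDs. The input layout is an assumption and has not been checked against AzureHound's actual schema: each record is taken to carry a `groupId` and a `members` list whose entries hold a `member` object with an `id`. The name `membership_pairs` is hypothetical.

```python
def membership_pairs(group_record):
    """Yield {"groupId", "memberId"} pairs from one group-members record.

    Assumed (unverified) input layout:
    {"groupId": "...", "members": [{"member": {"id": "...", ...}}, ...]}
    """
    group_id = group_record["groupId"]
    for entry in group_record.get("members", []):
        member = entry.get("member", {})
        # Skip entries without an ID rather than emitting incomplete pairs.
        if "id" in member:
            yield {"groupId": group_id, "memberId": member["id"]}
```

Whether BHCE would accept such a reduced shape is not established in this thread; the sketch only shows how little data the raw membership needs.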

@JonasBK

JonasBK commented Dec 13, 2023

Confirmed. I see the same thing for AZGroupOwner and AZAppOwner, so it might be a general issue with potential to remove redundant data in several places.

@TasteOfSpaghetti
Author

It would be great if the majority of the data could be removed. I bet it would benefit both performance when running AzureHound and ingestion into BHCE, thanks to smaller file sizes.

Also, I don't know this for sure, but maybe a large portion of the data isn't even ingested into BHCE and used for anything. It might just be collected and never used.

@malacupa

I agree with @TasteOfSpaghetti, and I created a pull request to change this in the past. The BloodHound team then came up with #67, which should have fixed this. I didn't manage to test it before #64 was closed, but now I can see that #67 did not help and #64 still makes sense.
BloodHound team, could you reconsider merging #64 or implementing something along those lines?

@StephenHinck
Contributor

I've asked the team to re-review #64 after this recent change.

@egilas

egilas commented Jul 2, 2024

Until the fix has been implemented:

I've written a small script to trim the AzGroupMembers down to just IDs instead of all extended properties - https://github.com/egilas/AzureHoundTrimmer/

3.3GB AzureHound output --> 350MB.
