
[Windows] Stop enabling sysmon per default to avoid unhealthy agent status #13893


Open
wants to merge 2 commits into
base: main

Conversation

SimonKoetting
Contributor

Currently, Sysmon collection is enabled by default when you add the Windows integration.
Since version 9, this can cause the Agent to be reported as unhealthy in Fleet when the Sysmon channel does not exist.
See: elastic/elastic-agent#7448

Because Sysmon isn't installed on Windows by default, it shouldn't be enabled by default in the integration either. This improves the experience for users who add the Windows integration without changing the default values.

@SimonKoetting SimonKoetting requested review from a team as code owners May 13, 2025 06:57
@SimonKoetting SimonKoetting requested review from belimawr and rdner May 13, 2025 06:57
@elastic-vault-github-plugin-prod

🚀 Benchmarks report

Package windows 👍(6) 💚(2) 💔(1)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
| --- | --- | --- | --- | --- |
| forwarded | 1240.69 | 915.75 | -324.94 (-26.19%) | 💔 |

To see the full report comment with /test benchmark fullreport

@elasticmachine

💚 Build Succeeded


@andrewkroh andrewkroh added Integration:windows Windows Team:Security-Windows Platform Security Windows Platform team [elastic/sec-windows-platform] labels May 13, 2025
@elasticmachine

Pinging @elastic/sec-windows-platform (Team:Security-Windows Platform)

@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Data-Plane Agent Data Plane team [elastic/elastic-agent-data-plane] label May 14, 2025
@elasticmachine

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@cmacknz
Member

cmacknz commented May 20, 2025

This feels like a breaking change for users who actually do use Sysmon. It is less disruptive for people without Sysmon to turn it off than to turn it off unconditionally and break anyone who relies on it.

I think ideally we'd be able to detect that sysmon isn't installed and just not run the input at all, but that would require a change in agent.
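As a rough illustration of the detection idea (not how the agent works today), the sketch below checks whether an event log channel exists by calling the Windows `wevtutil gl` command and treating a non-zero exit code as "channel not present". The function names `channel_exists` and `inputs_to_run` are hypothetical, and the assumption that a missing channel always yields a non-zero exit code should be verified on a real host.

```python
import subprocess

# Channel name used by Sysmon; everything else in this sketch is hypothetical.
SYSMON_CHANNEL = "Microsoft-Windows-Sysmon/Operational"


def channel_exists(channel, run=subprocess.run):
    """Return True if `wevtutil gl <channel>` exits with code 0.

    `run` is injectable so the check can be exercised without a Windows host.
    """
    try:
        result = run(["wevtutil", "gl", channel], capture_output=True, text=True)
    except OSError:
        # wevtutil is not available (for example, on a non-Windows host).
        return False
    return result.returncode == 0


def inputs_to_run(configured_channels, exists=channel_exists):
    """Keep only the configured channels that are present on this host."""
    return [c for c in configured_channels if exists(c)]
```

With a check like this, a host without Sysmon would simply end up with no Sysmon input to start, rather than an input that fails at runtime.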

An even easier way to fix this that isn't a breaking change is to stop marking the winlog input as degraded/failed when it can't find the sysmon channel.
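The status-handling alternative could look roughly like the following sketch, which keeps the input healthy when the channel is simply missing and only reports a degraded state for other failures. `Status`, `ChannelNotFound`, and `input_status` are invented for illustration and do not correspond to the real winlog input or Elastic Agent code.

```python
from enum import Enum


class Status(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"


class ChannelNotFound(Exception):
    """Hypothetical error raised when an event log channel does not exist."""


def input_status(open_channel, channel, log=print):
    """Report HEALTHY for a missing channel, DEGRADED for any other failure."""
    try:
        open_channel(channel)
    except ChannelNotFound:
        # A missing channel is logged but does not change the reported status.
        log(f"channel {channel!r} not found; skipping it without changing status")
        return Status.HEALTHY
    except Exception as err:
        log(f"failed to open {channel!r}: {err}")
        return Status.DEGRADED
    return Status.HEALTHY
```

The trade-off is that a genuinely misconfigured channel name would also be reported as healthy, so the log message is the only signal left for that case.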

@SimonKoetting
Contributor Author

@cmacknz but this change would only apply to new policies, right?
Existing installations won't notice any change, which also means that all users who already have a policy in place will still get the error.
So the only change is that new users who have Sysmon installed would have to turn it on. All other users with a basic Windows installation can just use the defaults and won't get any errors.

@cmacknz
Member

cmacknz commented May 21, 2025

If that is how it works, then that is what we want to happen. I made an assumption that by explicitly adding a new default for a configuration value that wasn't specified before, we'd update that field value to the new explicit value everywhere (new and old installs).

If the upgrade behavior handles this properly, that would be ideal. Did we explicitly test the upgrade behavior here to confirm this is how it works?

@SimonKoetting
Contributor Author

@cmacknz, yes, but we don't introduce a new value here. We only change the default value, so the value is already set in all existing policies.
I just tried this locally to double-check. As soon as I create a Windows policy without adjusting anything, the data stream sysmon_operational has
enabled: true
set.
When I manually install the new package version and update the integration policy, the value stays
enabled: true

Only when I create a new integration policy now, without changing anything, is it set to
enabled: false
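
The behavior described in this test can be modeled with a small sketch: creating a policy copies the package defaults into the policy, while upgrading keeps whatever is already stored. This is only a toy model of the observed result. The function names and dictionaries are hypothetical and are not Fleet or Kibana APIs.

```python
# Hypothetical model of how a package default and a stored policy value interact.
OLD_PACKAGE_DEFAULTS = {"sysmon_operational": {"enabled": True}}
NEW_PACKAGE_DEFAULTS = {"sysmon_operational": {"enabled": False}}


def create_policy(package_defaults):
    """Creating a policy copies the package defaults into the policy."""
    return {stream: dict(settings) for stream, settings in package_defaults.items()}


def upgrade_policy(policy, package_defaults):
    """Upgrading keeps stored values and only fills in settings that are missing."""
    upgraded = {stream: dict(settings) for stream, settings in policy.items()}
    for stream, defaults in package_defaults.items():
        stored = upgraded.setdefault(stream, {})
        for key, value in defaults.items():
            stored.setdefault(key, value)
    return upgraded


existing = create_policy(OLD_PACKAGE_DEFAULTS)            # policy created before the change
upgraded = upgrade_policy(existing, NEW_PACKAGE_DEFAULTS)  # same policy after the package upgrade
fresh = create_policy(NEW_PACKAGE_DEFAULTS)                # policy created after the change

print(upgraded["sysmon_operational"]["enabled"])  # True
print(fresh["sysmon_operational"]["enabled"])     # False
```

Under this model, only policies created after the package upgrade pick up the new default, which matches what the local test showed.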

@cmacknz
Member

cmacknz commented May 23, 2025

If this works out to be purely a change of default for new installations, and upgrades of the package work properly, then LGTM, no concerns.

Labels
Integration:windows Windows Team:Elastic-Agent-Data-Plane Agent Data Plane team [elastic/elastic-agent-data-plane] Team:Security-Windows Platform Security Windows Platform team [elastic/sec-windows-platform]

6 participants