[REQUEST]: Troubleshooting guide for agentless agents #1466

Open
jen-huang opened this issue Nov 14, 2024 · 6 comments

@jen-huang
Contributor

Description

Troubleshooting guide for agentless agents that can be linked from the status info work done in https://github.com/elastic/ingest-dev/issues/3933.

Resources

TBD

Collaboration

TBD. The docs and product team will work together to determine the best path forward.

Point of contact.

Main contact: TBD

Stakeholders: TBD

@jen-huang
Contributor Author

@smriti0321 I heard you are working with @benironside on the overall product documentation for agentless. Should I move this issue elsewhere?

@benironside

benironside commented Nov 22, 2024

@jen-huang This is addressed in: elastic/security-docs#6184
So as far as I'm concerned this issue can be closed.

@jen-huang
Contributor Author

@benironside Thanks for letting me know about those docs! The work that needs to link to troubleshooting docs is generic for all agentless integrations, not just CSPM/security ones. For example, it will be linked from Elastic connector integrations too. IMO there should be a general place for agentless integration docs that links out to Solution-specific docs as needed.

@nimarezainia @kilfoyle Do we plan to add general agentless docs to existing Fleet and Integrations docs?

@kilfoyle
Contributor

@jen-huang I'm sadly not at all up to speed on the agentless work. I can add a page identical to the one Ben created here into the Fleet docs, if that makes sense. Would that suffice or do you think there's more required for the generic content?

@jen-huang
Contributor Author

@kilfoyle I think that would be a good start. I didn't diff the ESS vs Serverless docs so not sure if there is differing info that could be combined or split into multiple sections on the same page.

@nimarezainia
Contributor

@jen-huang I agree that we need a generic troubleshooting guide to deal with agentless infrastructure issues. We don't really have anything on the roadmap at the moment to deal with this.

I'll discuss with @kpollich and @cmacknz what this type of guide needs to include. At the moment, the CSPM team owns and operates the agentless infra.

jen-huang added a commit to elastic/kibana that referenced this issue Nov 26, 2024
## Summary

Resolves elastic/ingest-dev#3933. For
deployments that support agentless, integrations with agentless
deployment mode enabled will allow the status of agentless integration
policies to be tracked.

### Key technical changes

- A new field `supports_agentless` was added to package policies. This
field already exists on agent policies. When an agentless integration is
created, `supports_agentless: true` is now added to both the package
policy and its parent agent policy.
- This allows easier filtering for agentless integrations as we avoid
having to retrieve & check against every parent agent policy.
- This also means existing agentless policies do not get this new status
tracking UI, only new ones created after this change. Since agentless is
not yet GA, I think this is okay.
- `/api/fleet/agent_status/data` now takes optional query params
`pkgName` and `pkgVersion`. When both are specified, the API will check
if agent(s) have ingested data for only that package's datastreams.
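
As a rough illustration of the two changes above, the sketch below filters policies on the package policy's own `supports_agentless` flag and builds the request path for the data-status check. The `PackagePolicyLike` shape and both helper names are hypothetical and not taken from the Kibana codebase; only the field name, the endpoint path, and the `pkgName`/`pkgVersion` params come from the description above.

```typescript
// Hypothetical, trimmed-down shape. Real Fleet package policies carry many more fields.
interface PackagePolicyLike {
  id: string;
  name: string;
  supports_agentless?: boolean;
}

// Split policies using only the flag stored on the package policy itself,
// so no parent agent policy has to be fetched and checked.
function splitByAgentless(policies: PackagePolicyLike[]) {
  const agentless = policies.filter((p) => p.supports_agentless === true);
  const agentBased = policies.filter((p) => p.supports_agentless !== true);
  return { agentless, agentBased };
}

// Build the path for the data-status check. The package filter is only
// appended when both values are present, mirroring "when both are specified".
function buildAgentDataStatusPath(pkgName?: string, pkgVersion?: string): string {
  const base = '/api/fleet/agent_status/data';
  if (!pkgName || !pkgVersion) {
    return base;
  }
  const query =
    `pkgName=${encodeURIComponent(pkgName)}` +
    `&pkgVersion=${encodeURIComponent(pkgVersion)}`;
  return `${base}?${query}`;
}
```

Policies created before this change have no flag set, so in this sketch they land in the agent-based group, which matches the note that only new agentless policies get the status tracking UI.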

## UI walkthrough
<details>
<summary>🖼️ Click to show screenshots</summary>

1. **Integration policies** page now shows two tables for integrations
meeting the above condition, one for agentless policies and one for
agent-based policies:


![image](https://github.com/user-attachments/assets/58c6a932-9bda-4229-ba5f-d341bdbd539a)

2. Clicking the status badge in the agentless policies table opens a
flyout with two steps: confirm agentless enrollment and confirm incoming
data:


![image](https://github.com/user-attachments/assets/e19e6ba0-f40d-48a7-a524-0373934ac46a)

3. The **Confirm agentless enrollment** step polls for an agent enrolled
into that integration policy's agent policy. If that agent is reporting
an unhealthy status, the integration component UI is shown. This UI is
the same one used on the Fleet > Agents > Agent details page and shows
all components reported by that agent:


![image](https://github.com/user-attachments/assets/ce214f7f-4bdd-48e5-a5eb-a1e8fcc7a512)

4. Once a healthy agentless enrollment is established, the **Confirm
incoming data** step starts polling for data for that integration
ingested by that agent ID in the past 5 minutes:


![image](https://github.com/user-attachments/assets/7f3de40b-3418-4174-b529-e805407949b6)

5. If data could not be retrieved in 5 minutes, an error message shows
while polling continues in the background:


![image](https://github.com/user-attachments/assets/a3fd198e-1570-4357-9b7f-e541a769d33f)

6. If data is retrieved, a success message is shown:


![image](https://github.com/user-attachments/assets/f4e442af-ca60-4448-9bfb-3f244cd03c2d)
</details>
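
The flyout flow in the walkthrough can be read as a small state derivation. The sketch below is one possible model of it, not the actual component code: the type names, the `deriveFlyoutState` function, and the input shape are all assumptions, and the 5-minute timeout constant is taken from the description in step 5.

```typescript
// Hypothetical model of the two flyout steps described in the walkthrough.
type FlyoutState =
  | { step: 'enrollment'; status: 'polling' | 'unhealthy' }
  | { step: 'data'; status: 'polling' | 'timed_out' | 'success' };

interface FlyoutInput {
  // Agent enrolled into the integration policy's agent policy, if found yet.
  agent?: { healthy: boolean };
  // Whether data for the integration has been seen for that agent.
  hasData: boolean;
  // Milliseconds since the data step started polling.
  msSinceDataPollingStarted: number;
}

const DATA_TIMEOUT_MS = 5 * 60 * 1000;

function deriveFlyoutState(input: FlyoutInput): FlyoutState {
  if (!input.agent) {
    // Step 1: still waiting for an agent to enroll.
    return { step: 'enrollment', status: 'polling' };
  }
  if (!input.agent.healthy) {
    // Step 1: agent enrolled but unhealthy, so the component UI would be shown.
    return { step: 'enrollment', status: 'unhealthy' };
  }
  if (input.hasData) {
    // Step 2: data arrived, show the success message.
    return { step: 'data', status: 'success' };
  }
  if (input.msSinceDataPollingStarted >= DATA_TIMEOUT_MS) {
    // Step 2: show the error, while the caller keeps polling in the background.
    return { step: 'data', status: 'timed_out' };
  }
  return { step: 'data', status: 'polling' };
}
```

Because `timed_out` is derived on every poll rather than stored, a later poll that finds data moves the flyout to `success` without any extra reset logic.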

## Testing
The easiest way to test is to use the Cloud deployment from this PR.
Enable Beta integrations and navigate to CSPM. Add a CSPM integration
using the `Agentless` setup technology. You can then track the status of
the agentless deployment on the Integration policies tab.

For local testing, the following is required to simulate an agentless
agent:
1. Add the following to kibana.dev.yml:
```
xpack.cloud.id: 'anything-to-pass-cloud-validation-checks'
xpack.fleet.agentless.enabled: true
xpack.fleet.agentless.api.url: 'https://localhost:8443'
xpack.fleet.agentless.api.tls.certificate: './config/certs/ess-client.crt'
xpack.fleet.agentless.api.tls.key: './config/certs/ess-client.key'
xpack.fleet.agentless.api.tls.ca: './config/certs/ca.crt'
```
2. Apply [this
patch](https://gist.github.com/jen-huang/dfc3e02ceb63976ad54bd1f50c524cb4)
to prevent Kibana from attempting to create an agentless pod
3. Enroll a Fleet Server as usual
4. Enable Beta integrations and navigate to CSPM. Add a CSPM integration
using `Agentless` setup technology.
5. Enroll a regular Elastic Agent in the agent policy for that CSPM
integration, using the token from Enrollment tokens
## To-do
- [x] API tests
- [x] Unit UI tests
- [x] Manual Cloud tests
- [x] File docs request
  - elastic/ingest-docs#1466
- [ ] Update troubleshooting guide link once available

### Checklist

Delete any items that are not applicable to this PR.

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
paulinashakirova pushed a commit to paulinashakirova/kibana that referenced this issue Nov 26, 2024
jen-huang added a commit to jen-huang/kibana that referenced this issue Nov 27, 2024
(cherry picked from commit 3188cda)

# Conflicts:
#	oas_docs/bundle.json
#	oas_docs/bundle.serverless.json
#	oas_docs/output/kibana.serverless.yaml
#	oas_docs/output/kibana.yaml
#	src/core/server/integration_tests/ci_checks/saved_objects/check_registered_types.test.ts
#	x-pack/plugins/fleet/public/components/package_policy_actions_menu.test.tsx
#	x-pack/plugins/fleet/public/components/package_policy_actions_menu.tsx
#	x-pack/plugins/fleet/server/routes/agent/handlers.ts
#	x-pack/plugins/fleet/server/types/models/package_policy.ts
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this issue Dec 12, 2024

4 participants