chore(doc): some troubleshooting tip
squakez committed Oct 3, 2023
1 parent fe46bdd commit 4042c02
Showing 12 changed files with 87 additions and 42 deletions.
2 changes: 1 addition & 1 deletion docs/modules/ROOT/nav-end.adoc
@@ -70,7 +70,7 @@
** xref:observability/monitoring.adoc[Monitoring]
*** xref:observability/monitoring/operator.adoc[Operator]
*** xref:observability/monitoring/integration.adoc[Integration]
-* Troubleshooting
+* xref:troubleshooting/troubleshooting.adoc[Troubleshooting]
** xref:troubleshooting/debugging.adoc[Debugging]
** xref:troubleshooting/operating.adoc[Operating]
** xref:troubleshooting/known-issues.adoc[Known Issues]
31 changes: 0 additions & 31 deletions docs/modules/ROOT/pages/troubleshooting/known-issues.adoc
@@ -1,37 +1,6 @@
[[known-issues]]
= Known Issues

-== `Error during unshare(CLONE_NEWUSER): Invalid argument`
-
-Buildah is best used with the OCI container runtime.
-When used with the Docker container runtime, it may not have the permissions to perform some required system calls.
-
-From https://github.com/containers/buildah/issues/1901[containers/buildah#1901], it seems a system call, that's forbidden by default with the Docker container runtime, is still necessary when the user doesn't have the `CAP_SYS_ADMIN` capability.
-
-The only option is to change the Docker container runtime to use a different _seccomp_ profile, e.g.:
-
-[source,console]
-----
-$ docker run --security-opt seccomp=/usr/share/containers/seccomp.json
-----
-
-However, that requires being able to configure your cluster container runtime.
-
-A work-around is to use another builder strategy, like Kaniko or Spectrum, e.g., when installing Camel K:
-
-[source,console]
-----
-$ kamel install --build-publish-strategy=kaniko
-----
-
-Or by patching your `IntegrationPlatform` resource directly if you have Camel K already installed, e.g.:
-
-[source,console]
-----
-$ kubectl patch ip camel-k --type='merge' -p '{"spec":{"build":{"publishStrategy":"kaniko"}}}'
-----
-
-
== `[Openshift] Repeated install/uninstall and removal of CamelCatalog leads to re-creation of builder image`

Openshift's internal container image registry operates on image streams instead of directly on images. As a side effect in a non production usage it can lead to an increase of the container image storage. This is because the `uninstall` command will remove the CamelCatalog but can't remove the actual container image.
59 changes: 59 additions & 0 deletions docs/modules/ROOT/pages/troubleshooting/troubleshooting.adoc
@@ -0,0 +1,59 @@
= Troubleshooting Camel K Integrations

As soon as you start using Camel K for complex integrations, you may run into Integration failures that you need to resolve. Most of the time, the first level of troubleshooting is to check the logs or the custom resources bound to a Camel application.

In particular, if an application does not start up properly after you run it (for example, `kamel run test.yaml`), verify the following resources.

[[troubleshoot-integration-pod]]
== Checking Integration pod

Most of the time, your Integration build cycle runs fine, and a Deployment (and therefore a Pod) is started. However, there may be an application-level reason why the Pod is not starting.

First of all, check the application log with `kamel logs test` or `kubectl logs test-7856cb497b-smfkq`. A problem within your Camel application typically surfaces at runtime only, so reading the logs to understand the reason for the failure is usually the easiest approach.

NOTE: Use the logging trait to change the log level, if needed.

[[troubleshoot-integration-cr]]
== Checking Integration custom resource

The custom resource that triggers the creation of a Camel application is the Integration custom resource. If something goes wrong during the build, look at `.status.phase` and `.status.conditions` to understand what is going on. For example, `kubectl get it -o yaml` may show:
```
status:
  conditions:
  ...
  - lastTransitionTime: "2023-09-29T13:53:17Z"
    lastUpdateTime: "2023-09-29T13:57:50Z"
    message: 'integration kit default/kit-ckbddjd5rv6c73cr99fg is in state "Error".
      Failure: Get "https://1.2.3.4/v2/": dial tcp 1.2.3.4:443: i/o timeout; Get
      "http://1.2.3.4/v2/": dial tcp 1.2.3.4:80: i/o timeout'
    reason: IntegrationKitAvailable
    status: "False"
    type: IntegrationKitAvailable
  ...
  phase: Error
```
This tells us that the build could not connect to the configured registry, which is why it failed. Check this status often to understand the health of your Integration. It also stores further conditions related to the different services Camel K offers.
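
Scanning the conditions by hand gets tedious once there are several of them. The following is a minimal Go sketch, not part of Camel K, that filters the conditions whose status is `"False"`. The `Condition` and `Integration` types are hypothetical stand-ins that mirror only the fields shown in the output above, and `sample` is made-up data shaped like what `kubectl get it test -o json` might return for a single Integration (an assumption; a bare `kubectl get it -o json` returns a list instead).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Condition mirrors only the condition fields shown in the YAML above.
type Condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

// Integration is a hypothetical, minimal view of the Integration resource.
type Integration struct {
	Status struct {
		Phase      string      `json:"phase"`
		Conditions []Condition `json:"conditions"`
	} `json:"status"`
}

// failedConditions returns the phase and every condition whose status is "False".
func failedConditions(raw []byte) (string, []Condition, error) {
	var it Integration
	if err := json.Unmarshal(raw, &it); err != nil {
		return "", nil, err
	}
	var failed []Condition
	for _, c := range it.Status.Conditions {
		if c.Status == "False" {
			failed = append(failed, c)
		}
	}
	return it.Status.Phase, failed, nil
}

// sample is illustrative data only, not real operator output.
const sample = `{"status":{"phase":"Error","conditions":[
 {"type":"IntegrationPlatformAvailable","status":"True","reason":"IntegrationPlatformAvailable","message":"default/camel-k"},
 {"type":"IntegrationKitAvailable","status":"False","reason":"IntegrationKitAvailable","message":"dial tcp 1.2.3.4:443: i/o timeout"}
]}}`

func main() {
	phase, failed, err := failedConditions([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println("phase:", phase)
	for _, c := range failed {
		fmt.Printf("%s: %s\n", c.Type, c.Message)
	}
}
```

In practice the JSON would be piped in from `kubectl` rather than embedded; the embedded sample just keeps the sketch self-contained.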

[[troubleshoot-integration-kit]]
== Checking IntegrationKit custom resource

The IntegrationKit is the second custom resource to look at if your Integration failed. Most of the time, errors that happen here are bubbled up into the Integration, but inspecting the IntegrationKit can give you more information (`kubectl get ik kit-ckbddjd5rv6c73cr99fg -o yaml`).

[[troubleshoot-integration-build]]
== Checking Build custom resource

The Build is another custom resource to look at if your Integration failed. It has an even higher level of detail, giving a summary of each execution of the pipeline tasks used to build and publish the IntegrationKit. Run `kubectl get build kit-ckbddjd5rv6c73cr99fg -o yaml` to see these details, which are especially useful if you're running with the builder `pod` strategy (which runs the build in a separate pod).

[[troubleshoot-other-cr]]
== Checking other custom resources

If you're still in trouble, other resources that can help you better understand the state of your configuration are `IntegrationPlatform` (`kubectl get IntegrationPlatform`) and `CamelCatalog` (`kubectl get CamelCatalog`). If they are in an error phase for any reason, their status will show it.

[[troubleshoot-operator-log]]
== Checking Camel K operator or builder pod log

Finally, after checking the status and conditions of all the custom resources, you can check the health of the Camel K operator by watching its log (for example, `kubectl logs camel-k-operator-7856cb497b-smfkq`).

If you're running the build with the `pod` strategy, it may also be useful to look at the execution of the builder pod: `kubectl logs camel-k-kit-ckbddjd5rv6c73cr99fg`. Make sure to look at all the pipeline containers to get a complete view of where the error could be.

NOTE: Use the `--log-level` parameter to change the operator log level, if needed.
2 changes: 1 addition & 1 deletion e2e/common/support/startup_test.go
@@ -46,6 +46,6 @@ func TestCommonCamelKInstallStartup(t *testing.T) {
Expect(KamelInstallWithIDAndKameletCatalog(ns.GetName(), ns.GetName()).Execute()).To(Succeed())
Eventually(OperatorPod(ns.GetName())).ShouldNot(BeNil())
Eventually(Platform(ns.GetName())).ShouldNot(BeNil())
-Eventually(PlatformConditionStatus(ns.GetName(), v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+Eventually(PlatformConditionStatus(ns.GetName(), v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
Should(Equal(corev1.ConditionTrue))
}
2 changes: 1 addition & 1 deletion e2e/commonwithcustominstall/builder_test.go
@@ -41,7 +41,7 @@ func TestBuilderTimeout(t *testing.T) {
Expect(KamelInstallWithID(operatorID, ns).Execute()).To(Succeed())
Eventually(OperatorPod(ns)).ShouldNot(BeNil())
Eventually(Platform(ns)).ShouldNot(BeNil())
-Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
Should(Equal(corev1.ConditionTrue))

pl := Platform(ns)()
4 changes: 2 additions & 2 deletions e2e/commonwithcustominstall/catalog_builder_test.go
@@ -44,7 +44,7 @@ func TestCamelCatalogBuilder(t *testing.T) {
Expect(KamelInstallWithID(operatorID, ns).Execute()).To(Succeed())
Eventually(OperatorPod(ns)).ShouldNot(BeNil())
Eventually(Platform(ns)).ShouldNot(BeNil())
-Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
Should(Equal(corev1.ConditionTrue))
catalogName := fmt.Sprintf("camel-catalog-%s", strings.ToLower(defaults.DefaultRuntimeVersion))
Eventually(CamelCatalog(ns, catalogName)).ShouldNot(BeNil())
@@ -167,7 +167,7 @@ func TestCamelCatalogBuilder(t *testing.T) {
},
))

-Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
Should(Equal(corev1.ConditionTrue))
catalogName := fmt.Sprintf("camel-catalog-%s", strings.ToLower(defaults.DefaultRuntimeVersion))

2 changes: 1 addition & 1 deletion e2e/install/cli/global_test.go
@@ -62,7 +62,7 @@ func TestRunGlobalInstall(t *testing.T) {

t.Run("Global CamelCatalog reconciliation", func(t *testing.T) {
Eventually(Platform(operatorNamespace)).ShouldNot(BeNil())
-Eventually(PlatformConditionStatus(operatorNamespace, v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+Eventually(PlatformConditionStatus(operatorNamespace, v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
Should(Equal(corev1.ConditionTrue))
catalogName := fmt.Sprintf("camel-catalog-%s", strings.ToLower(defaults.DefaultRuntimeVersion))
Eventually(CamelCatalog(operatorNamespace, catalogName)).ShouldNot(BeNil())
2 changes: 1 addition & 1 deletion e2e/install/cli/install_test.go
@@ -52,7 +52,7 @@ func TestBasicInstallation(t *testing.T) {
Expect(KamelInstallWithID(operatorID, ns).Execute()).To(Succeed())
Eventually(OperatorPod(ns)).ShouldNot(BeNil())
Eventually(Platform(ns)).ShouldNot(BeNil())
-Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
Should(Equal(corev1.ConditionTrue))

// Check if default security context has been applyed
2 changes: 1 addition & 1 deletion e2e/knative/support/startup_test.go
@@ -46,6 +46,6 @@ func TestKNativeCamelKInstallStartup(t *testing.T) {
Expect(KamelInstallWithIDAndKameletCatalog(ns.GetName(), ns.GetName(), "--trait-profile", "knative").Execute()).To(Succeed())
Eventually(OperatorPod(ns.GetName())).ShouldNot(BeNil())
Eventually(Platform(ns.GetName())).ShouldNot(BeNil())
-Eventually(PlatformConditionStatus(ns.GetName(), v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+Eventually(PlatformConditionStatus(ns.GetName(), v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
Should(Equal(corev1.ConditionTrue))
}
2 changes: 1 addition & 1 deletion pkg/apis/camel/v1/integrationplatform_types.go
@@ -205,7 +205,7 @@ const (
IntegrationPlatformPhaseDuplicate IntegrationPlatformPhase = "Duplicate"

// IntegrationPlatformConditionReady is the condition if the IntegrationPlatform is ready.
-// Deprecated: use IntegrationPlatformConditionTypeCreated
+// Deprecated: use IntegrationPlatformConditionTypeCreated.
IntegrationPlatformConditionReady = "Ready"
// IntegrationPlatformConditionTypeCreated is the condition if the IntegrationPlatform has been created.
IntegrationPlatformConditionTypeCreated IntegrationPlatformConditionType = "Created"
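
One Go detail worth noting in this hunk: the deprecated constant is declared without a type, while its replacement is typed. An untyped string constant converts implicitly wherever a typed string is expected, which is presumably why the e2e tests compiled with either name. The sketch below uses hypothetical stand-in names (`ConditionType`, `ConditionReady`, `ConditionTypeCreated`, `conditionStatus`), not the real Camel K API, and the condition list in `main` is made up for illustration.

```go
package main

import "fmt"

// ConditionType is a hypothetical stand-in for IntegrationPlatformConditionType.
type ConditionType string

const (
	// ConditionReady is untyped, like the deprecated constant above.
	ConditionReady = "Ready"
	// ConditionTypeCreated is typed, like its replacement.
	ConditionTypeCreated ConditionType = "Created"
)

// Condition is a minimal, illustrative condition record.
type Condition struct {
	Type   ConditionType
	Status string
}

// conditionStatus returns the status of the first condition of type t,
// or "Unknown" when no such condition is present.
func conditionStatus(conds []Condition, t ConditionType) string {
	for _, c := range conds {
		if c.Type == t {
			return c.Status
		}
	}
	return "Unknown"
}

func main() {
	// Made-up list that only carries a "Created" condition.
	conds := []Condition{{Type: ConditionTypeCreated, Status: "True"}}

	// Both calls compile: the untyped constant converts to ConditionType.
	fmt.Println(conditionStatus(conds, ConditionReady))       // Unknown
	fmt.Println(conditionStatus(conds, ConditionTypeCreated)) // True
}
```

Because the compiler accepts both spellings, only the deprecation comment signals which one callers should prefer.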
4 changes: 2 additions & 2 deletions pkg/controller/integrationplatform/monitor.go
@@ -69,14 +69,14 @@ func (action *monitorAction) Handle(ctx context.Context, platform *v1.Integratio
"IntegrationPlatformRegistryAvailable",
"registry not available because provided by Openshift")
} else {
-if &platform.Status.Build.Registry == nil || platform.Status.Build.Registry.Address == "" {
+if platform.Status.Build.Registry.Address == "" {
// error, we need a registry if we're not on Openshift
platform.Status.Phase = v1.IntegrationPlatformPhaseError
platform.Status.SetCondition(
v1.IntegrationPlatformConditionTypeRegistryAvailable,
corev1.ConditionFalse,
"IntegrationPlatformRegistryAvailable",
-"registry not available")
+"registry address not available, you need to set one")
} else {
platform.Status.Phase = v1.IntegrationPlatformPhaseReady
platform.Status.SetCondition(
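
The dropped `&platform.Status.Build.Registry == nil` clause could never be true, assuming `Registry` is a struct-valued field (as the remaining `.Address` access suggests): taking the address of a field always yields a non-nil pointer. A minimal sketch with hypothetical, trimmed-down types (not the real Camel K API structs) illustrates why checking the address string alone is enough:

```go
package main

import "fmt"

// Hypothetical, trimmed-down stand-ins for the platform types.
type Registry struct{ Address string }
type Build struct{ Registry Registry }
type Status struct{ Build Build }
type Platform struct{ Status Status }

// registryMissing reports whether no registry address has been set.
func registryMissing(p *Platform) bool {
	return p.Status.Build.Registry.Address == ""
}

func main() {
	p := &Platform{}

	// The address of a struct field is never nil, even when the field
	// still holds its zero value.
	ptr := &p.Status.Build.Registry
	fmt.Println(ptr != nil) // true

	fmt.Println(registryMissing(p)) // true
	p.Status.Build.Registry.Address = "registry.example.com"
	fmt.Println(registryMissing(p)) // false
}
```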
17 changes: 17 additions & 0 deletions pkg/resources/resources.go

Some generated files are not rendered by default.