chore(doc): some troubleshooting tip

apache · Oct 3, 2023 · 4042c02
1 parent fe46bdd

Showing 12 changed files with 87 additions and 42 deletions.
diff --git a/docs/modules/ROOT/nav-end.adoc b/docs/modules/ROOT/nav-end.adoc
@@ -70,7 +70,7 @@
 ** xref:observability/monitoring.adoc[Monitoring]
 *** xref:observability/monitoring/operator.adoc[Operator]
 *** xref:observability/monitoring/integration.adoc[Integration]
-* Troubleshooting
+* xref:troubleshooting/troubleshooting.adoc[Troubleshooting]
 ** xref:troubleshooting/debugging.adoc[Debugging]
 ** xref:troubleshooting/operating.adoc[Operating]
 ** xref:troubleshooting/known-issues.adoc[Known Issues]

diff --git a/docs/modules/ROOT/pages/troubleshooting/known-issues.adoc b/docs/modules/ROOT/pages/troubleshooting/known-issues.adoc
@@ -1,37 +1,6 @@
 [[known-issues]]
 = Known Issues
 
-== `Error during unshare(CLONE_NEWUSER): Invalid argument`
-
-Buildah is best used with the OCI container runtime.
-When used with the Docker container runtime, it may not have the permissions to perform some required system calls.
-
-From https://github.com/containers/buildah/issues/1901[containers/buildah#1901], it seems a system call, that's forbidden by default with the Docker container runtime, is still necessary when the user doesn't have the `CAP_SYS_ADMIN` capability.
-
-The only option is to change the Docker container runtime to use a different _seccomp_ profile, e.g.:
-
-[source,console]
-----
-$ docker run --security-opt seccomp=/usr/share/containers/seccomp.json
-----
-
-However, that requires being able to configure your cluster container runtime.
-
-A work-around is to use another builder strategy, like Kaniko or Spectrum, e.g., when installing Camel K:
-
-[source,console]
-----
-$ kamel install --build-publish-strategy=kaniko
-----
-
-Or by patching your `IntegrationPlatform` resource directly if you have Camel K already installed, e.g.:
-
-[source,console]
-----
-$ kubectl patch ip camel-k --type='merge' -p '{"spec":{"build":{"publishStrategy":"kaniko"}}}'
-----
-
-
 == `[Openshift] Repeated install/uninstall and removal of CamelCatalog leads to re-creation of builder image`
 
 Openshift's internal container image registry operates on image streams instead of directly on images. As a side effect in a non production usage it can lead to an increase of the container image storage. This is because the `uninstall` command will remove the CamelCatalog but can't remove the actual container image.

diff --git a/docs/modules/ROOT/pages/troubleshooting/troubleshooting.adoc b/docs/modules/ROOT/pages/troubleshooting/troubleshooting.adoc
@@ -0,0 +1,59 @@
+= Troubleshooting Camel K Integrations
+
+As soon as you start using Camel K in complex integrations, you may encounter failures that you need to resolve. Most of the time, the first level of troubleshooting is to check the log or the Custom Resources which are bound to a Camel application.
+
+In particular, after you run an application (e.g., `kamel run test.yaml`), if it does not start up properly, you will need to verify the following resources.
+
+[[troubleshoot-integration-pod]]
+== Checking Integration pod
+
+Most of the time, your Integration build cycle runs fine. Then a Deployment and therefore a Pod are started. However, there could be an application-level reason why the Pod does not start.
+
+First of all, check the log of the application. Try using `kamel logs test` or `kubectl logs test-7856cb497b-smfkq`. If there is a problem within your Camel application, you will typically discover it only at runtime. Checking the logs and understanding the reason for the failure there should be the easiest approach.
+
+NOTE: use the logging trait to change the log level, if needed.
+
+[[troubleshoot-integration-cr]]
+== Checking Integration custom resource
+
+The custom resource that triggers the creation of a Camel application is the Integration custom resource. If something goes wrong during the build, you can look at the `.status.phase` and `.status.conditions` to understand what's going on. For example, `kubectl get it -o yaml`:
+```
+  status:
+    conditions:
+...
+    - lastTransitionTime: "2023-09-29T13:53:17Z"
+      lastUpdateTime: "2023-09-29T13:57:50Z"
+      message: 'integration kit default/kit-ckbddjd5rv6c73cr99fg is in state "Error".
+        Failure: Get "https://1.2.3.4/v2/": dial tcp 1.2.3.4:443: i/o timeout; Get
+        "http://1.2.3.4/v2/": dial tcp 1.2.3.4:80: i/o timeout'
+      reason: IntegrationKitAvailable
+      status: "False"
+      type: IntegrationKitAvailable
+...
+    phase: Error
+```
+This tells us that we were not able to connect to the configured registry, which is why the build failed. This is the place you want to monitor often in order to understand the health of your Integration. More conditions are stored there, related to the different services Camel K offers.
+
+[[troubleshoot-integration-kit]]
+== Checking IntegrationKit custom resource
+
+The IntegrationKit is the second custom resource you want to look at if your Integration failed. Most of the time, the errors happening here are bubbled up into the Integration, but the IntegrationKit analysis can give you more information (`kubectl get ik kit-ckbddjd5rv6c73cr99fg -o yaml`).
+
+[[troubleshoot-integration-build]]
+== Checking Build custom resource
+
+The Build is another custom resource you want to look at if your Integration failed. It has an even higher level of detail, giving a summary of each execution of the pipeline tasks used to build and publish the IntegrationKit. Run `kubectl get build kit-ckbddjd5rv6c73cr99fg -o yaml` to see it, above all if you're running with the builder `pod` strategy (which runs the build in a separate pod).
+
+[[troubleshoot-other-cr]]
+== Checking other custom resources
+
+If you're still in trouble, other resources that can help you better understand the state of your configuration are `IntegrationPlatform` (`kubectl get IntegrationPlatform`) and `CamelCatalog` (`kubectl get CamelCatalog`). If they are in an error phase for any reason, you will find out by looking at their status.
+
+[[troubleshoot-operator-log]]
+== Checking Camel K operator or builder pod log
+
+Finally, after checking the status and conditions of all the custom resources, you can check the health of the Camel K operator by watching its log (e.g., `kubectl logs camel-k-operator-7856cb497b-smfkq`).
+
+If you're running the build with the `pod` strategy, it may also be worth looking at the execution of the builder pod: `kubectl logs camel-k-kit-ckbddjd5rv6c73cr99fg`. Make sure to look at all the pipeline containers in the pod to have a complete view of where the error could be.
+
+NOTE: use the `--log-level` parameter to change the operator log level, if needed.
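
Note on the page above: the `.status.conditions` check it describes amounts to scanning the list for entries whose `status` is `"False"` and reading their `message`. A minimal Go sketch of that scan follows. The `Condition` struct and the `failing` helper are hypothetical stand-ins for illustration, not the Camel K API types, and the sample condition names are only examples.

```go
package main

import "fmt"

// Condition is a hypothetical, trimmed-down mirror of one entry
// in `.status.conditions`; it is not the real Camel K type.
type Condition struct {
	Type    string
	Status  string
	Message string
}

// failing returns the conditions whose status is "False",
// preserving their original order.
func failing(conds []Condition) []Condition {
	var out []Condition
	for _, c := range conds {
		if c.Status == "False" {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	// Sample data loosely modelled on the YAML excerpt in the page.
	conds := []Condition{
		{Type: "IntegrationPlatformAvailable", Status: "True"},
		{
			Type:    "IntegrationKitAvailable",
			Status:  "False",
			Message: `integration kit default/kit-ckbddjd5rv6c73cr99fg is in state "Error"`,
		},
	}
	for _, c := range failing(conds) {
		fmt.Printf("%s: %s\n", c.Type, c.Message)
	}
}
```

With the sample data, only the `IntegrationKitAvailable` entry is reported, which matches the condition the page singles out.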
diff --git a/e2e/common/support/startup_test.go b/e2e/common/support/startup_test.go
@@ -46,6 +46,6 @@ func TestCommonCamelKInstallStartup(t *testing.T) {
 	Expect(KamelInstallWithIDAndKameletCatalog(ns.GetName(), ns.GetName()).Execute()).To(Succeed())
 	Eventually(OperatorPod(ns.GetName())).ShouldNot(BeNil())
 	Eventually(Platform(ns.GetName())).ShouldNot(BeNil())
-	Eventually(PlatformConditionStatus(ns.GetName(), v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+	Eventually(PlatformConditionStatus(ns.GetName(), v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
 		Should(Equal(corev1.ConditionTrue))
 }
diff --git a/e2e/commonwithcustominstall/builder_test.go b/e2e/commonwithcustominstall/builder_test.go
@@ -41,7 +41,7 @@ func TestBuilderTimeout(t *testing.T) {
 		Expect(KamelInstallWithID(operatorID, ns).Execute()).To(Succeed())
 		Eventually(OperatorPod(ns)).ShouldNot(BeNil())
 		Eventually(Platform(ns)).ShouldNot(BeNil())
-		Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+		Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
 			Should(Equal(corev1.ConditionTrue))
 
 		pl := Platform(ns)()

diff --git a/e2e/commonwithcustominstall/catalog_builder_test.go b/e2e/commonwithcustominstall/catalog_builder_test.go
@@ -44,7 +44,7 @@ func TestCamelCatalogBuilder(t *testing.T) {
 		Expect(KamelInstallWithID(operatorID, ns).Execute()).To(Succeed())
 		Eventually(OperatorPod(ns)).ShouldNot(BeNil())
 		Eventually(Platform(ns)).ShouldNot(BeNil())
-		Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+		Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
 			Should(Equal(corev1.ConditionTrue))
 		catalogName := fmt.Sprintf("camel-catalog-%s", strings.ToLower(defaults.DefaultRuntimeVersion))
 		Eventually(CamelCatalog(ns, catalogName)).ShouldNot(BeNil())
@@ -167,7 +167,7 @@ func TestCamelCatalogBuilder(t *testing.T) {
 			},
 		))
 
-		Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+		Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
 			Should(Equal(corev1.ConditionTrue))
 		catalogName := fmt.Sprintf("camel-catalog-%s", strings.ToLower(defaults.DefaultRuntimeVersion))
 

diff --git a/e2e/install/cli/global_test.go b/e2e/install/cli/global_test.go
@@ -62,7 +62,7 @@ func TestRunGlobalInstall(t *testing.T) {
 
 		t.Run("Global CamelCatalog reconciliation", func(t *testing.T) {
 			Eventually(Platform(operatorNamespace)).ShouldNot(BeNil())
-			Eventually(PlatformConditionStatus(operatorNamespace, v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+			Eventually(PlatformConditionStatus(operatorNamespace, v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
 				Should(Equal(corev1.ConditionTrue))
 			catalogName := fmt.Sprintf("camel-catalog-%s", strings.ToLower(defaults.DefaultRuntimeVersion))
 			Eventually(CamelCatalog(operatorNamespace, catalogName)).ShouldNot(BeNil())

diff --git a/e2e/install/cli/install_test.go b/e2e/install/cli/install_test.go
@@ -52,7 +52,7 @@ func TestBasicInstallation(t *testing.T) {
 		Expect(KamelInstallWithID(operatorID, ns).Execute()).To(Succeed())
 		Eventually(OperatorPod(ns)).ShouldNot(BeNil())
 		Eventually(Platform(ns)).ShouldNot(BeNil())
-		Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+		Eventually(PlatformConditionStatus(ns, v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
 			Should(Equal(corev1.ConditionTrue))
 
 			// Check if default security context has been applied

diff --git a/e2e/knative/support/startup_test.go b/e2e/knative/support/startup_test.go
@@ -46,6 +46,6 @@ func TestKNativeCamelKInstallStartup(t *testing.T) {
 	Expect(KamelInstallWithIDAndKameletCatalog(ns.GetName(), ns.GetName(), "--trait-profile", "knative").Execute()).To(Succeed())
 	Eventually(OperatorPod(ns.GetName())).ShouldNot(BeNil())
 	Eventually(Platform(ns.GetName())).ShouldNot(BeNil())
-	Eventually(PlatformConditionStatus(ns.GetName(), v1.IntegrationPlatformConditionReady), TestTimeoutShort).
+	Eventually(PlatformConditionStatus(ns.GetName(), v1.IntegrationPlatformConditionTypeCreated), TestTimeoutShort).
 		Should(Equal(corev1.ConditionTrue))
 }
diff --git a/pkg/apis/camel/v1/integrationplatform_types.go b/pkg/apis/camel/v1/integrationplatform_types.go
@@ -205,7 +205,7 @@ const (
 	IntegrationPlatformPhaseDuplicate IntegrationPlatformPhase = "Duplicate"
 
 	// IntegrationPlatformConditionReady is the condition if the IntegrationPlatform is ready.
-	// Deprecated: use IntegrationPlatformConditionTypeCreated
+	// Deprecated: use IntegrationPlatformConditionTypeCreated.
 	IntegrationPlatformConditionReady = "Ready"
 	// IntegrationPlatformConditionTypeCreated is the condition if the IntegrationPlatform has been created.
 	IntegrationPlatformConditionTypeCreated IntegrationPlatformConditionType = "Created"

diff --git a/pkg/controller/integrationplatform/monitor.go b/pkg/controller/integrationplatform/monitor.go
@@ -69,14 +69,14 @@ func (action *monitorAction) Handle(ctx context.Context, platform *v1.Integratio
 			"IntegrationPlatformRegistryAvailable",
 			"registry not available because provided by Openshift")
 	} else {
-		if &platform.Status.Build.Registry == nil || platform.Status.Build.Registry.Address == "" {
+		if platform.Status.Build.Registry.Address == "" {
 			// error, we need a registry if we're not on Openshift
 			platform.Status.Phase = v1.IntegrationPlatformPhaseError
 			platform.Status.SetCondition(
 				v1.IntegrationPlatformConditionTypeRegistryAvailable,
 				corev1.ConditionFalse,
 				"IntegrationPlatformRegistryAvailable",
-				"registry not available")
+				"registry address not available, you need to set one")
 		} else {
 			platform.Status.Phase = v1.IntegrationPlatformPhaseReady
 			platform.Status.SetCondition(
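
Note on the hunk above: the removed `&platform.Status.Build.Registry == nil` comparison could never be true, because taking the address of an addressable struct field always yields a non-nil pointer. That is presumably why the check was reduced to the `Address` test. A minimal Go sketch of the difference follows. `Registry`, `Build`, and both helper functions are hypothetical stand-ins, not the real Camel K types.

```go
package main

import "fmt"

// Registry and Build are hypothetical stand-ins for the real types.
type Registry struct {
	Address string
}

type Build struct {
	Registry Registry
}

// addressIsNil mirrors the removed check: for a non-nil *Build,
// the address of the Registry field is never nil, so this is always false.
func addressIsNil(b *Build) bool {
	return &b.Registry == nil
}

// registryMissing mirrors the simplified check: only the address matters.
func registryMissing(b Build) bool {
	return b.Registry.Address == ""
}

func main() {
	empty := Build{}
	configured := Build{Registry: Registry{Address: "registry.example.com"}}

	fmt.Println(addressIsNil(&empty))        // false, even though nothing is set
	fmt.Println(registryMissing(empty))      // true
	fmt.Println(registryMissing(configured)) // false
}
```

The first call shows that the pointer comparison says nothing about whether a registry was configured; only the `Address` test distinguishes the two cases.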

diff --git a/pkg/resources/resources.go b/pkg/resources/resources.go