From 9cfbc03100d9f922567a3b753435a2833377a225 Mon Sep 17 00:00:00 2001 From: Jan Calanog Date: Tue, 4 Feb 2025 23:13:21 +0100 Subject: [PATCH 01/15] github-action: Add AsciiDoc freeze warning (#264) * github-action: Add AsciiDoc freeze warning * github-action: Add AsciiDoc freeze warning --------- Co-authored-by: Brandon Morelli --- .../workflows/comment-on-asciidoc-changes.yml | 21 +++++++++++++++++++ 1 file changed, 21 insertions(+) create mode 100644 .github/workflows/comment-on-asciidoc-changes.yml diff --git a/.github/workflows/comment-on-asciidoc-changes.yml b/.github/workflows/comment-on-asciidoc-changes.yml new file mode 100644 index 000000000..8e5f836b1 --- /dev/null +++ b/.github/workflows/comment-on-asciidoc-changes.yml @@ -0,0 +1,21 @@ +--- +name: Comment on PR for .asciidoc changes + +on: + # We need to use pull_request_target to be able to comment on PRs from forks + pull_request_target: + types: + - synchronize + - opened + - reopened + branches: + - main + - master + - "9.0" + +jobs: + comment-on-asciidoc-change: + permissions: + contents: read + pull-requests: write + uses: elastic/docs-builder/.github/workflows/comment-on-asciidoc-changes.yml@main From 2459798a66c8cc66bfc5b31a2c998d47f81093c3 Mon Sep 17 00:00:00 2001 From: Marci W <333176+marciw@users.noreply.github.com> Date: Tue, 4 Feb 2025 17:53:41 -0500 Subject: [PATCH 02/15] Fix incorrectly mapped troubleshooting page (#298) * Fix incorrectly mapped page * Fix internal links * remove aspirational anchor links for now * Couple more aspirational anchors * Fix raw migrated files too --------- Co-authored-by: Brandon Morelli --- cloud-account/multifactor-authentication.md | 2 +- .../cloud-organization/billing/billing-faq.md | 6 +- raw-migrated-files/cloud/cloud/ec-about.md | 2 +- .../cloud/cloud/ec-deployment-no-op.md | 2 +- .../cloud/cloud/ec-faq-getting-started.md | 2 +- .../cloud/cloud/ec-monitoring.md | 2 +- .../ec-scenario_why_are_shards_unavailable.md | 4 +- .../ec-scenario_why_is_my_node_unavailable.md | 6 +- troubleshoot/toc.yml | 1 - troubleshoot/troubleshoot/cloud.md | 65 ------------------- troubleshoot/troubleshoot/index.md | 12 +++- 11 files changed, 24 insertions(+), 80 deletions(-) delete mode 100644 troubleshoot/troubleshoot/cloud.md diff --git a/cloud-account/multifactor-authentication.md b/cloud-account/multifactor-authentication.md index 25741b898..3f52bbdd4 100644 --- a/cloud-account/multifactor-authentication.md +++ b/cloud-account/multifactor-authentication.md @@ -58,7 +58,7 @@ To enable multifactor authentication using an authenticator app, you must verify You can remove a multifactor authentication method after it’s added by clicking **Remove**. -Before you remove an authentication method, you must set up an alternate method. If you can’t use any of your configured authentication methods — for example, if your device is lost or stolen — then [contact support](../troubleshoot/troubleshoot/cloud.md). +Before you remove an authentication method, you must set up an alternate method. If you can’t use any of your configured authentication methods — for example, if your device is lost or stolen — then [contact support](../troubleshoot/troubleshoot/index.md). 
## Frequently asked questions [ec-account-security-mfa-faq] diff --git a/deploy-manage/cloud-organization/billing/billing-faq.md b/deploy-manage/cloud-organization/billing/billing-faq.md index d196f4801..961dbec57 100644 --- a/deploy-manage/cloud-organization/billing/billing-faq.md +++ b/deploy-manage/cloud-organization/billing/billing-faq.md @@ -68,7 +68,7 @@ $$$faq-payment$$$What are the available payment methods on Elasticsearch Service : For month-to-month payments only credit cards are accepted. We also allow payments by bank transfer for annual subscriptions. $$$faq-contact$$$Who can I contact for more information? -: If you have any further questions about your credit card statement, billing, or receipts, please send an email to `ar@elastic.co` or open a [Support case](../../../troubleshoot/troubleshoot/cloud.md) using the *Billing issue* category. +: If you have any further questions about your credit card statement, billing, or receipts, please send an email to `ar@elastic.co` or open a [Support case](../../../troubleshoot/troubleshoot/index.md) using the *Billing issue* category. $$$faq-charge$$$Why is my credit card charged? : If you are on a monthly plan, the charge is a recurring fee for using our hosted Elasticsearch Service. The fee is normally charged at the start of each month, but it can also be charged at other times during the month. If a charge is unsuccessful, we will try to charge your card again at a later date. @@ -86,10 +86,10 @@ $$$faq-chargednotusing$$$Why am I being charged if I don’t use any of my deplo : Even if you have no activity on your account and you haven’t logged into the [Elasticsearch Service Console](https://cloud.elastic.co?page=docs&placement=docs-body), your active deployments still incur costs that we need to charge you for. To avoid being charged for active but unused deployments, you can simply delete them. Your account will stay active with no charges, and you can always spin up more capacity when you need it. $$$faq-deleteaccount$$$How can I delete my Elastic Cloud account? -: To have your account removed, you can contact support through the Elasticsearch Service [Support form](https://cloud.elastic.co/support?page=docs&placement=docs-body) or use one of these [alternative contact methods](../../../troubleshoot/troubleshoot/cloud.md). For details about our data erasure policy, check [Privacy Rights and Choices](https://www.elastic.co/legal/privacy-statement#privacy-rights-and-choices?page=docs&placement=docs-body) in our General Privacy Statement. +: To have your account removed, you can contact support through the Elasticsearch Service [Support form](https://cloud.elastic.co/support?page=docs&placement=docs-body) or use one of these [alternative contact methods](../../../troubleshoot/troubleshoot/index.md). For details about our data erasure policy, check [Privacy Rights and Choices](https://www.elastic.co/legal/privacy-statement#privacy-rights-and-choices?page=docs&placement=docs-body) in our General Privacy Statement. $$$faq-refund$$$Can I get a refund? -: Charges are non-refundable, but once you delete a deployment we’ll stop charging you for that deployment immediately. You only pay for what you use and you can stop using the service at any time. For any special considerations warranting a potential refund, please use the Elasticsearch Service Console [Support form](https://cloud.elastic.co/support?page=docs&placement=docs-body) to open a support case and select *Billing issue* as the category. 
To ensure quick processing, be sure to provide detail about the reasons for the refund request as well as other matters pertaining to the issue. For other ways to open a Support case, check [Getting help](../../../troubleshoot/troubleshoot/cloud.md). +: Charges are non-refundable, but once you delete a deployment we’ll stop charging you for that deployment immediately. You only pay for what you use and you can stop using the service at any time. For any special considerations warranting a potential refund, please use the Elasticsearch Service Console [Support form](https://cloud.elastic.co/support?page=docs&placement=docs-body) to open a support case and select *Billing issue* as the category. To ensure quick processing, be sure to provide detail about the reasons for the refund request as well as other matters pertaining to the issue. For other ways to open a Support case, check [Contact us](../../../troubleshoot/troubleshoot/index.md). $$$faq-included$$$What is included in my paid Elasticsearch Service deployment? : All subscription tiers for the Elasticsearch Service include the following free allowance: diff --git a/raw-migrated-files/cloud/cloud/ec-about.md b/raw-migrated-files/cloud/cloud/ec-about.md index 51342ef8d..48821e47a 100644 --- a/raw-migrated-files/cloud/cloud/ec-about.md +++ b/raw-migrated-files/cloud/cloud/ec-about.md @@ -7,6 +7,6 @@ The information in this section covers: * [Elasticsearch Service Hardware](https://www.elastic.co/guide/en/cloud/current/ec-reference-hardware.html) * [Elasticsearch Service Regions](https://www.elastic.co/guide/en/cloud/current/ec-reference-regions.html) * [Service Status](../../../deploy-manage/cloud-organization/service-status.md) -* [Getting help](../../../troubleshoot/troubleshoot/cloud.md) +* [Getting help](../../../troubleshoot/troubleshoot/index.md) * [Restrictions and known problems](../../../deploy-manage/deploy/elastic-cloud/restrictions-known-problems.md) diff --git a/raw-migrated-files/cloud/cloud/ec-deployment-no-op.md b/raw-migrated-files/cloud/cloud/ec-deployment-no-op.md index 2dfea43a7..0c090e9f0 100644 --- a/raw-migrated-files/cloud/cloud/ec-deployment-no-op.md +++ b/raw-migrated-files/cloud/cloud/ec-deployment-no-op.md @@ -14,7 +14,7 @@ Re-saving the deployment configuration without making any changes is often all t **Seeing multiple warnings?** -If multiple health warnings appear for one of your deployments, or if your deployment is unhealthy, we recommend [Getting help](../../../troubleshoot/troubleshoot/cloud.md) through the Elastic Support Portal. +If multiple health warnings appear for one of your deployments, or if your deployment is unhealthy, we recommend [Getting help](../../../troubleshoot/troubleshoot/index.md) through the Elastic Support Portal. **Warning about system changes** diff --git a/raw-migrated-files/cloud/cloud/ec-faq-getting-started.md b/raw-migrated-files/cloud/cloud/ec-faq-getting-started.md index 8e6e61784..1a30da32e 100644 --- a/raw-migrated-files/cloud/cloud/ec-faq-getting-started.md +++ b/raw-migrated-files/cloud/cloud/ec-faq-getting-started.md @@ -34,7 +34,7 @@ This frequently-asked-questions list helps you with common questions while you g : Scale your clusters both up and down from the user console, whenever you like. The resizing of the cluster is transparently done in the background, and highly available clusters are resized without any downtime. If you scale your cluster down, make sure that the downsized cluster can handle your {{es}} memory requirements. 
Read more about sizing and memory in [Sizing {{es}}](https://www.elastic.co/blog/found-sizing-elasticsearch).

 $$$faq-subscriptions$$$Do you offer support?
-  : Yes, all subscription levels for Elasticsearch Service include support, handled by email or through the Elastic Support Portal. Different subscription levels include different levels of support. For the Standard subscription level, there is no service-level agreement (SLA) on support response times. Gold and Platinum subscription levels include an SLA on response times to tickets and dedicated resources. To learn more, check [Getting Help](../../../troubleshoot/troubleshoot/cloud.md).
+  : Yes, all subscription levels for Elasticsearch Service include support, handled by email or through the Elastic Support Portal. Different subscription levels include different levels of support. For the Standard subscription level, there is no service-level agreement (SLA) on support response times. Gold and Platinum subscription levels include an SLA on response times to tickets and dedicated resources. To learn more, check [Getting Help](../../../troubleshoot/troubleshoot/index.md).

 $$$faq-where$$$Where is Elasticsearch Service hosted?
     : We host our {{es}} clusters on Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. Check out which [regions we support](https://www.elastic.co/guide/en/cloud/current/ec-reference-regions.html) and what [hardware we use](https://www.elastic.co/guide/en/cloud/current/ec-reference-hardware.html). New data centers are added all the time.
diff --git a/raw-migrated-files/cloud/cloud/ec-monitoring.md b/raw-migrated-files/cloud/cloud/ec-monitoring.md
index 278689b8d..2e9d502d7 100644
--- a/raw-migrated-files/cloud/cloud/ec-monitoring.md
+++ b/raw-migrated-files/cloud/cloud/ec-monitoring.md
@@ -96,7 +96,7 @@ In a production environment, we highly recommend storing your logs and metrics i


## Dedicated logs and metrics [ec-es-health-dedicated]

-In a production environment, it’s important set up dedicated health monitoring in order to retain the logs and metrics that can be used to troubleshoot any health issues in your deployments. In the event of that you need to [contact our support team](../../../troubleshoot/troubleshoot/cloud.md), they can use the retained data to help diagnose any problems that you may encounter.
+In a production environment, it’s important to set up dedicated health monitoring in order to retain the logs and metrics that can be used to troubleshoot any health issues in your deployments. In the event that you need to [contact our support team](../../../troubleshoot/troubleshoot/index.md), they can use the retained data to help diagnose any problems that you may encounter.

You have the option of sending logs and metrics to a separate, specialized monitoring deployment, which ensures that they’re available in the event of a deployment outage. The monitoring deployment also gives you access to Kibana’s stack monitoring features, through which you can view health and performance data for all of your deployment resources.
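In addition to the retained monitoring data, a point-in-time health check is always available from the deployment itself. As a minimal sketch (this is the standard {{es}} cluster health API; how you authenticate depends on your deployment):

```console
GET _cluster/health
```

The response reports the overall `status` (`green`, `yellow`, or `red`) together with node and shard counts, which is useful context when reviewing the retained logs and metrics.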
diff --git a/raw-migrated-files/cloud/cloud/ec-scenario_why_are_shards_unavailable.md b/raw-migrated-files/cloud/cloud/ec-scenario_why_are_shards_unavailable.md index 79bc791da..5cd5ae589 100644 --- a/raw-migrated-files/cloud/cloud/ec-scenario_why_are_shards_unavailable.md +++ b/raw-migrated-files/cloud/cloud/ec-scenario_why_are_shards_unavailable.md @@ -22,7 +22,7 @@ If a cluster has unassigned shards, you might see an error message such as this :alt: Unhealthy deployment error message ::: -If your issue is not addressed here, then [contact Elastic support for help](../../../troubleshoot/troubleshoot/cloud.md). +If your issue is not addressed here, then [contact Elastic support for help](../../../troubleshoot/troubleshoot/index.md). ## Analyze unassigned shards using the {{es}} API [ec-analyze_shards_with-api] @@ -186,7 +186,7 @@ Here’s how to resolve the most common causes of unassigned shards reported by * [Maximum retry times exceeded](../../../troubleshoot/monitoring/unavailable-shards.md#ec-max-retry-exceeded) * [Max shard per node reached the limit](../../../troubleshoot/monitoring/unavailable-shards.md#ec-max-shard-per-node) -If your issue is not addressed here, then [contact Elastic support for help](../../../troubleshoot/troubleshoot/cloud.md). +If your issue is not addressed here, then [contact Elastic support for help](../../../troubleshoot/troubleshoot/index.md). ### Disk is full [ec-disk-full] diff --git a/raw-migrated-files/cloud/cloud/ec-scenario_why_is_my_node_unavailable.md b/raw-migrated-files/cloud/cloud/ec-scenario_why_is_my_node_unavailable.md index aed231d01..33b6861d9 100644 --- a/raw-migrated-files/cloud/cloud/ec-scenario_why_is_my_node_unavailable.md +++ b/raw-migrated-files/cloud/cloud/ec-scenario_why_is_my_node_unavailable.md @@ -23,7 +23,7 @@ Some actions described here, such as stopping indexing or Machine Learning jobs, For production deployments, we recommend setting up a dedicated monitoring cluster to collect metrics and logs, troubleshooting views, and cluster alerts. -If your issue is not addressed here, then [contact Elastic support for help](../../../troubleshoot/troubleshoot/cloud.md). +If your issue is not addressed here, then [contact Elastic support for help](../../../troubleshoot/troubleshoot/index.md). ## Full disk on single-node deployment [ec-single-node-deployment-disk-used] @@ -53,7 +53,7 @@ If your issue is not addressed here, then [contact Elastic support for help](../ * Increase the disk size on your Hot data and Content tier (scale up). ::::{note} -If your {{es}} cluster is unhealthy and reports a status of red, then increasing the disk size of your Hot data and Content tier may fail. You might need to delete some data so the configuration can be edited. If you want to increase your disk size without deleting data, then [reach out to Elastic support](../../../troubleshoot/troubleshoot/cloud.md) and we will assist you with scaling up. +If your {{es}} cluster is unhealthy and reports a status of red, then increasing the disk size of your Hot data and Content tier may fail. You might need to delete some data so the configuration can be edited. If you want to increase your disk size without deleting data, then [reach out to Elastic support](../../../troubleshoot/troubleshoot/index.md) and we will assist you with scaling up. :::: @@ -100,7 +100,7 @@ If your {{es}} cluster is unhealthy and reports a status of red, then increasing * Increase the disk size (scale up). 
::::{note} -If your {{es}} cluster is unhealthy and reports a status of red, the scale up configuration change to increasing disk size on the affected data tiers may fail. You might need to delete some data so the configuration can be edited. If you want to increase your disk size without deleting data, then [reach out to Elastic support](../../../troubleshoot/troubleshoot/cloud.md) and we will assist you with scaling up. +If your {{es}} cluster is unhealthy and reports a status of red, the scale up configuration change to increasing disk size on the affected data tiers may fail. You might need to delete some data so the configuration can be edited. If you want to increase your disk size without deleting data, then [reach out to Elastic support](../../../troubleshoot/troubleshoot/index.md) and we will assist you with scaling up. :::: diff --git a/troubleshoot/toc.yml b/troubleshoot/toc.yml index ef0416e76..4126cdb56 100644 --- a/troubleshoot/toc.yml +++ b/troubleshoot/toc.yml @@ -1,7 +1,6 @@ project: 'Troubleshoot' toc: - file: troubleshoot/index.md - - file: troubleshoot/cloud.md - file: elasticsearch/elasticsearch-reference.md children: - file: elasticsearch/fix-common-cluster-issues.md diff --git a/troubleshoot/troubleshoot/cloud.md b/troubleshoot/troubleshoot/cloud.md deleted file mode 100644 index 8356fd26c..000000000 --- a/troubleshoot/troubleshoot/cloud.md +++ /dev/null @@ -1,65 +0,0 @@ ---- -mapped_pages: - - https://www.elastic.co/guide/en/cloud/current/ec-get-help.html ---- - -# Troubleshoot [ec-get-help] - -With your Elasticsearch Service subscription, you get access to support from the creators of Elasticsearch, Kibana, Beats, Logstash, and much more. We’re here to help! - - -## How do I open a support case? [ec_how_do_i_open_a_support_case] - -All roads lead to the Elastic Support Portal, where you can access to all your cases, subscriptions, and licenses. - -As an Elasticsearch Service customer, you will receive an email with instructions how to log in to the Support Portal, where you can track both current and archived cases. If you are a new customer who just signed up for Elasticsearch Service, it can take a few hours for your Support Portal access to be set up. If you have questions, reach out to us at `support@elastic.co`. - -::::{note} -With the release of the new Support Portal, even if you have an existing account, you might be prompted to update your password. -:::: - - -There are three ways you can get to the portal: - -* Go directly to the Support Portal: [http://support.elastic.co](http://support.elastic.co) -* From the Elasticsearch Service Console: Go to the [Support page](https://cloud.elastic.co/support?page=docs&placement=docs-body) or select the support icon, that looks like a life preserver, on any page in the console. -* Contact us by email: `support@elastic.co` - - If you contact us by email, please use the email address that you registered with, so that we can help you more quickly. If you are using a distribution list as your registered email, you can also register a second email address with us. Just open a case to let us know the name and email address you would like to be added. - - -When opening a case, there are a few things you can do to get help faster: - -* Include the deployment ID that you want help with, especially if you have several deployments. The deployment ID can be found on the overview page for your cluster in the [Elasticsearch Service Console](https://cloud.elastic.co?page=docs&placement=docs-body). -* Describe the problem. 
Include any relevant details, including error messages you encountered, dates and times when the problem occurred, or anything else you think might be helpful. -* Upload any pertinent files. - - -## What level of support can I expect? [ec_what_level_of_support_can_i_expect] - -Support is governed by the [Elasticsearch Service Standard Terms of Service](https://www.elastic.co/legal/terms-of-service/cloud). The level of support you can expect to receive applies to your Elasticsearch Service environment only and depends on your subscription level: - -Elasticsearch Service Standard subscriptions -: Support is provided by email or through the Elastic Support Portal. The main focus of support is to ensure your Elasticsearch Service deployment shows a green status and is available. There is no guaranteed initial or ongoing response time, but we do strive to engage on every issue within three business days. We do not offer weekend coverage, so we respond Monday through Friday only. To learn more, check [Working with Elastic Support Elasticsearch Service Standard](https://www.elastic.co/support/welcome/cloud). - -Elasticsearch Service Gold and Platinum subscriptions -: Support is handled by email or through the Elastic Support Portal. Provides guaranteed response times for support issues, better support coverage hours, and support contacts at Elastic. Also includes support for how-to and development questions. The exact support coverage depends on whether you are a Gold or Platinum customer. To learn more, check [Elasticsearch Service Premium Support Services Policy](https://www.elastic.co/legal/support_policy/cloud_premium). - -::::{note} -If you are in free trial, you are also eligible to get the Elasticsearch Service Standard level support for as long as the trial is active. -:::: - - -If you are on an Elasticsearch Service Standard subscription and you are interested in moving to Gold or Platinum support, please [contact us](https://www.elastic.co/cloud/contact). We also recommend that you read our best practices guide for getting the most out of your support experience: [https://www.elastic.co/support/welcome](https://www.elastic.co/support/welcome). - - -## Join the community forums [ec_join_the_community_forums] - -Elasticsearch, Logstash, and Kibana enjoy the benefit of having vibrant and helpful communities. You have our assurance of high-quality support and single source of truth as an Elasticsearch Service customer, but the Elastic community can also be a useful resource for you whenever you need it. - -::::{tip} -As of May 1, 2017, support for Elasticsearch Service **Standard** customers has moved from the Discuss forum to our link: [Elastic Support Portal](https://support.elastic.co). You should receive login instructions by email. We will also monitor the forum and help you get into the Support Portal, in case you’re unsure where to go. -:::: - - -If you have any technical questions that are not for our Support team, hop on our [Elastic community forums](https://discuss.elastic.co/) and get answers from the experts in the community, including people from Elastic. 
diff --git a/troubleshoot/troubleshoot/index.md b/troubleshoot/troubleshoot/index.md index a98cd75f0..f6b810de7 100644 --- a/troubleshoot/troubleshoot/index.md +++ b/troubleshoot/troubleshoot/index.md @@ -2,6 +2,7 @@ mapped_urls: - https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/get-support-help.html - https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/troubleshooting-and-faqs.html + - https://www.elastic.co/guide/en/cloud/current/ec-get-help.html --- # Troubleshoot @@ -13,4 +14,13 @@ mapped_urls: % Use migrated content from existing pages that map to this page: % - [ ] ./raw-migrated-files/tech-content/starting-with-the-elasticsearch-platform-and-its-solutions/get-support-help.md -% - [ ] ./raw-migrated-files/tech-content/starting-with-the-elasticsearch-platform-and-its-solutions/troubleshooting-and-faqs.md \ No newline at end of file +% - [ ] ./raw-migrated-files/tech-content/starting-with-the-elasticsearch-platform-and-its-solutions/troubleshooting-and-faqs.md +% - next one added by marciw manually for tracking +% - [ ] ./raw-migrated-files/cloud/cloud/ec-get-help.md (especially IDs -- no generated list because page was incorrectly mapped to a new v3 page instead of as a many-to-one for this page) + +% WIP sections added by marciw +% see https://docs.elastic.dev/content-architecture/content-type/troubleshooting/entrypoint +% ## one for each grouping +% ## Additional resources [troubleshoot-additional-resources] +% ## Contact us [troubleshoot-contact-us] +% ### Working with support \ No newline at end of file From 03ba47ed91344ee57bbc45d9b021594b86aa2b41 Mon Sep 17 00:00:00 2001 From: Brandon Morelli Date: Tue, 4 Feb 2025 14:57:45 -0800 Subject: [PATCH 03/15] Rename search-your-data-semantic-search-elser.asciidoc to search-your-data-semantic-search-elser.asciidoc (#318) --- ...sciidoc => search-your-data-semantic-search-elser.asciidoc} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename serverless/pages/{search-your-data-semantic-search-elser.asciidoc => search-your-data-semantic-search-elser.asciidoc} (89%) diff --git a/serverless/pages/search-your-data-semantic-search-elser.asciidoc b/serverless/pages/search-your-data-semantic-search-elser.asciidoc similarity index 89% rename from serverless/pages/search-your-data-semantic-search-elser.asciidoc rename to serverless/pages/search-your-data-semantic-search-elser.asciidoc index 02712bc16..f33b6b592 100644 --- a/serverless/pages/search-your-data-semantic-search-elser.asciidoc +++ b/serverless/pages/search-your-data-semantic-search-elser.asciidoc @@ -6,4 +6,4 @@ // This page is not included in the index file, so it is not visible in the navigation menu anymore. HTTP redirects will be set up. -ℹ️ Refer to <> for an overview of semantic search in {es-serverless}. \ No newline at end of file +ℹ️ Refer to <> for an overview of semantic search in {es-serverless}. 
From 82a66beaec1ac9d801e985fa412389d3633211e5 Mon Sep 17 00:00:00 2001 From: Brandon Morelli Date: Tue, 4 Feb 2025 15:31:58 -0800 Subject: [PATCH 04/15] Reduce number of warnings (#320) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * add external_hosts * Update docset.yml * updates * txts —> txt --- .../configure-host-suse-cloud.md | 25 ++- .../configure-host-suse-onprem.md | 25 ++- .../configure-host-ubuntu-cloud.md | 29 ++- .../configure-host-ubuntu-onprem.md | 29 ++- .../deploy/cloud-on-k8s/logstash-plugins.md | 4 +- .../kibana-task-manager-health-monitoring.md | 13 +- .../ccr-tutorial-initial-setup.md | 16 +- docset.yml | 185 ++++++++++++++++++ .../visualize/maps/maps-connect-to-ems.md | 31 ++- .../transform-enrich/example-parse-logs.md | 2 +- ...bservability-correlate-application-logs.md | 2 +- .../observability-parse-log-data.md | 10 +- .../search-with-synonyms.md | 23 ++- .../semantic-search-inference.md | 18 +- .../observability/apm-common-problems.md | 4 +- .../observability/application-logs.md | 2 +- .../observability/logs-parse.md | 10 +- ...nfigure-apm-agent-central-configuration.md | 5 +- solutions/search/semantic-search/cohere-es.md | 3 +- .../semantic-search/semantic-search-elser.md | 2 +- .../semantic-search-inference.md | 18 +- .../search/vector/sparse-vector-elser.md | 2 +- .../elasticsearch/high-jvm-memory-pressure.md | 2 +- troubleshoot/ingest/fleet/common-problems.md | 2 +- .../apm-agent-go/apm-go-agent.md | 3 +- ...ur-universal-profiling-agent-deployment.md | 23 ++- 26 files changed, 330 insertions(+), 158 deletions(-) diff --git a/deploy-manage/deploy/cloud-enterprise/configure-host-suse-cloud.md b/deploy-manage/deploy/cloud-enterprise/configure-host-suse-cloud.md index e3f6d2651..d397f7a1d 100644 --- a/deploy-manage/deploy/cloud-enterprise/configure-host-suse-cloud.md +++ b/deploy-manage/deploy/cloud-enterprise/configure-host-suse-cloud.md @@ -20,9 +20,9 @@ If you want to install Elastic Cloud Enterprise on your own hosts, the steps for Regardless of which approach you take, the steps in this section need to be performed on every host that you want to use with Elastic Cloud Enterprise. -## Install Docker [ece-install-docker-sles12-cloud] +## Install Docker [ece-install-docker-sles12-cloud] -::::{important} +::::{important} Make sure to use a combination of Linux distribution and Docker version that is supported, following our official [Support matrix](https://www.elastic.co/support/matrix#elastic-cloud-enterprise). Using unsupported combinations can cause multiple issues with you ECE environment, such as failures to create system deployments, to upgrade workload deployments, proxy timeouts, and more. :::: @@ -63,7 +63,7 @@ Make sure to use a combination of Linux distribution and Docker version that is -## Set up OS groups and user [ece_set_up_os_groups_and_user] +## Set up OS groups and user [ece_set_up_os_groups_and_user] 1. If they don’t already exist, create the following OS groups: @@ -80,18 +80,18 @@ Make sure to use a combination of Linux distribution and Docker version that is -## Set up XFS on SLES [ece-xfs-setup-sles12-cloud] +## Set up XFS on SLES [ece-xfs-setup-sles12-cloud] XFS is required to support disk space quotas for Elasticsearch data directories. Some Linux distributions such as RHEL and Rocky Linux already provide XFS as the default file system. On SLES 12 and 15, you need to set up an XFS file system and have quotas enabled. 
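As a minimal sketch of what that setup can look like (assuming `/dev/xvdb1` is a dedicated data partition and `/mnt/data` is its mount point; both names are placeholders, so adjust them for your environment):

```sh
# Format the partition as XFS. This destroys any existing data on the device.
sudo mkfs.xfs /dev/xvdb1

# Mount it with project quotas enabled so per-directory limits can be enforced.
sudo mkdir -p /mnt/data
sudo mount -o pquota /dev/xvdb1 /mnt/data

# Verify that quota accounting is active on the mount.
sudo xfs_quota -x -c 'state' /mnt/data
```

Add a matching entry to `/etc/fstab` so that the quota mount options survive a reboot.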
Disk space quotas set a limit on the amount of disk space an Elasticsearch cluster node can use. Currently, quotas are calculated by a static ratio of 1:32, which means that for every 1 GB of RAM a cluster is given, a cluster node is allowed to consume 32 GB of disk space. -::::{note} +::::{note} Using LVM, `mdadm`, or a combination of the two for block device management is possible, but the configuration is not covered here, nor is it provided as part of supporting Elastic Cloud Enterprise. :::: -::::{important} +::::{important} You must use XFS and have quotas enabled on all allocators, otherwise disk usage won’t display correctly. :::: @@ -124,7 +124,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage -## Update the configurations settings [ece-update-config-sles-cloud] +## Update the configurations settings [ece-update-config-sles-cloud] 1. Stop the Docker service: @@ -162,7 +162,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage EOF ``` - ::::{important} + ::::{important} The `net.ipv4.tcp_retries2` setting applies to all TCP connections and affects the reliability of communication with systems other than Elasticsearch clusters too. If your clusters communicate with external systems over a low quality network then you may need to select a higher value for `net.ipv4.tcp_retries2`. :::: @@ -178,7 +178,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage Add the following configuration values to the `/etc/security/limits.conf` file. These values are derived from our experience with the Elastic Cloud hosted offering and should be used for Elastic Cloud Enterprise as well. - ::::{tip} + ::::{tip} If you are using a user name other than `elastic`, adjust the configuration values accordingly. :::: @@ -243,7 +243,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage -## Configure the Docker daemon [ece-configure-docker-daemon-sles12-cloud] +## Configure the Docker daemon [ece-configure-docker-daemon-sles12-cloud] 1. Edit `/etc/docker/daemon.json`, and make sure that the following configuration values are present:
@@ -321,7 +321,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage 5. Reboot your system to ensure that all configuration changes take effect: - ```literal + ```sh sudo reboot ``` @@ -333,7 +333,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage 7. After rebooting, verify that your Docker settings persist as expected: - ```literal + ```sh sudo docker info | grep Root ``` @@ -342,4 +342,3 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage If the command returns `Docker Root Dir: /var/lib/docker`, then you need to troubleshoot the previous configuration steps until the Docker settings are applied successfully before continuing with the installation process. For more information, check [Custom Docker daemon options](https://docs.docker.com/engine/admin/systemd/#/custom-docker-daemon-options) in the Docker documentation. 8. Repeat these steps on other hosts that you want to use with Elastic Cloud Enterprise or follow the steps in the next section to start installing Elastic Cloud Enterprise. - diff --git a/deploy-manage/deploy/cloud-enterprise/configure-host-suse-onprem.md b/deploy-manage/deploy/cloud-enterprise/configure-host-suse-onprem.md index 632b9a011..cd18a0c70 100644 --- a/deploy-manage/deploy/cloud-enterprise/configure-host-suse-onprem.md +++ b/deploy-manage/deploy/cloud-enterprise/configure-host-suse-onprem.md @@ -20,9 +20,9 @@ If you want to install Elastic Cloud Enterprise on your own hosts, the steps for Regardless of which approach you take, the steps in this section need to be performed on every host that you want to use with Elastic Cloud Enterprise. -## Install Docker [ece-install-docker-sles12-onprem] +## Install Docker [ece-install-docker-sles12-onprem] -::::{important} +::::{important} Make sure to use a combination of Linux distribution and Docker version that is supported, following our official [Support matrix](https://www.elastic.co/support/matrix#elastic-cloud-enterprise). Using unsupported combinations can cause multiple issues with you ECE environment, such as failures to create system deployments, to upgrade workload deployments, proxy timeouts, and more. :::: @@ -63,7 +63,7 @@ Make sure to use a combination of Linux distribution and Docker version that is -## Set up OS groups and user [ece_set_up_os_groups_and_user_2] +## Set up OS groups and user [ece_set_up_os_groups_and_user_2] 1. If they don’t already exist, create the following OS groups: @@ -80,18 +80,18 @@ Make sure to use a combination of Linux distribution and Docker version that is -## Set up XFS on SLES [ece-xfs-setup-sles12-onprem] +## Set up XFS on SLES [ece-xfs-setup-sles12-onprem] XFS is required to support disk space quotas for Elasticsearch data directories. Some Linux distributions such as RHEL and Rocky Linux already provide XFS as the default file system. On SLES 12 and 15, you need to set up an XFS file system and have quotas enabled. Disk space quotas set a limit on the amount of disk space an Elasticsearch cluster node can use. Currently, quotas are calculated by a static ratio of 1:32, which means that for every 1 GB of RAM a cluster is given, a cluster node is allowed to consume 32 GB of disk space. -::::{note} +::::{note} Using LVM, `mdadm`, or a combination of the two for block device management is possible, but the configuration is not covered here, nor is it provided as part of supporting Elastic Cloud Enterprise. 
:::: -::::{important} +::::{important} You must use XFS and have quotas enabled on all allocators, otherwise disk usage won’t display correctly. :::: @@ -124,7 +124,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage -## Update the configurations settings [ece-update-config-sles-onprem] +## Update the configurations settings [ece-update-config-sles-onprem] 1. Stop the Docker service: @@ -162,7 +162,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage EOF ``` - ::::{important} + ::::{important} The `net.ipv4.tcp_retries2` setting applies to all TCP connections and affects the reliability of communication with systems other than Elasticsearch clusters too. If your clusters communicate with external systems over a low quality network then you may need to select a higher value for `net.ipv4.tcp_retries2`. :::: @@ -178,7 +178,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage Add the following configuration values to the `/etc/security/limits.conf` file. These values are derived from our experience with the Elastic Cloud hosted offering and should be used for Elastic Cloud Enterprise as well. - ::::{tip} + ::::{tip} If you are using a user name other than `elastic`, adjust the configuration values accordingly. :::: @@ -243,7 +243,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage -## Configure the Docker daemon [ece-configure-docker-daemon-sles12-onprem] +## Configure the Docker daemon [ece-configure-docker-daemon-sles12-onprem] 1. Edit `/etc/docker/daemon.json`, and make sure that the following configuration values are present:
@@ -321,7 +321,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage 5. Reboot your system to ensure that all configuration changes take effect: - ```literal + ```sh sudo reboot ``` @@ -333,7 +333,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage 7. After rebooting, verify that your Docker settings persist as expected: - ```literal + ```sh sudo docker info | grep Root ``` @@ -342,4 +342,3 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage If the command returns `Docker Root Dir: /var/lib/docker`, then you need to troubleshoot the previous configuration steps until the Docker settings are applied successfully before continuing with the installation process. For more information, check [Custom Docker daemon options](https://docs.docker.com/engine/admin/systemd/#/custom-docker-daemon-options) in the Docker documentation. 8. Repeat these steps on other hosts that you want to use with Elastic Cloud Enterprise or follow the steps in the next section to start installing Elastic Cloud Enterprise. - diff --git a/deploy-manage/deploy/cloud-enterprise/configure-host-ubuntu-cloud.md b/deploy-manage/deploy/cloud-enterprise/configure-host-ubuntu-cloud.md index 28203ca46..86d667482 100644 --- a/deploy-manage/deploy/cloud-enterprise/configure-host-ubuntu-cloud.md +++ b/deploy-manage/deploy/cloud-enterprise/configure-host-ubuntu-cloud.md @@ -13,16 +13,16 @@ The following instructions show you how to prepare your hosts on 20.04 LTS (Foca * [Configure the Docker daemon options](#ece-configure-docker-daemon-ubuntu-cloud) -## Install Docker [ece-install-docker-ubuntu-cloud] +## Install Docker [ece-install-docker-ubuntu-cloud] Install Docker LTS version 24.0 for Ubuntu 20.04 or 22.04. -::::{important} +::::{important} Make sure to use a combination of Linux distribution and Docker version that is supported, following our official [Support matrix](https://www.elastic.co/support/matrix#elastic-cloud-enterprise). Using unsupported combinations can cause multiple issues with you ECE environment, such as failures to create system deployments, to upgrade workload deployments, proxy timeouts, and more. :::: -::::{note} +::::{note} Docker 25 and higher are not compatible with ECE 3.7. :::: @@ -56,18 +56,18 @@ Docker 25 and higher are not compatible with ECE 3.7. -## Set up XFS quotas [ece-xfs-setup-ubuntu-cloud] +## Set up XFS quotas [ece-xfs-setup-ubuntu-cloud] XFS is required to support disk space quotas for Elasticsearch data directories. Some Linux distributions such as RHEL and Rocky Linux already provide XFS as the default file system. On Ubuntu, you need to set up an XFS file system and have quotas enabled. Disk space quotas set a limit on the amount of disk space an Elasticsearch cluster node can use. Currently, quotas are calculated by a static ratio of 1:32, which means that for every 1 GB of RAM a cluster is given, a cluster node is allowed to consume 32 GB of disk space. -::::{note} +::::{note} Using LVM, `mdadm`, or a combination of the two for block device management is possible, but the configuration is not covered here, and it is not supported by Elastic Cloud Enterprise. :::: -::::{important} +::::{important} You must use XFS and have quotas enabled on all allocators, otherwise disk usage won’t display correctly. 
:::: @@ -101,7 +101,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage -## Update the configurations settings [ece-update-config-ubuntu-cloud] +## Update the configurations settings [ece-update-config-ubuntu-cloud] 1. Stop the Docker service: @@ -139,7 +139,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage EOF ``` - ::::{important} + ::::{important} The `net.ipv4.tcp_retries2` setting applies to all TCP connections and affects the reliability of communication with systems other than Elasticsearch clusters too. If your clusters communicate with external systems over a low quality network then you may need to select a higher value for `net.ipv4.tcp_retries2`. :::: @@ -154,7 +154,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage Add the following configuration values to the `/etc/security/limits.conf` file. These values are derived from our experience with the Elastic Cloud hosted offering and should be used for Elastic Cloud Enterprise as well. - ::::{tip} + ::::{tip} If you are using a user name other than `elastic`, adjust the configuration values accordingly. :::: @@ -219,14 +219,14 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage -## Configure the Docker daemon options [ece-configure-docker-daemon-ubuntu-cloud] +## Configure the Docker daemon options [ece-configure-docker-daemon-ubuntu-cloud] -::::{tip} +::::{tip} Docker creates a bridge IP address that can conflict with IP addresses on your internal network. To avoid an IP address conflict, change the `--bip=172.17.42.1/16` parameter in our examples to something that you know will work. If there is no conflict, you can omit the `--bip` parameter. The `--bip` parameter is internal to the host and can be set to the same IP for each host in the cluster. More information on Docker daemon options can be found in the [dockerd command line reference](https://docs.docker.com/engine/reference/commandline/dockerd/). :::: -::::{tip} +::::{tip} You can specify `--log-opt max-size` and `--log-opt max-file` to define the Docker daemon containers log rotation. :::: @@ -292,13 +292,13 @@ You can specify `--log-opt max-size` and `--log-opt max-file` to define the Dock 6. Reboot your system to ensure that all configuration changes take effect: - ```literal + ```sh sudo reboot ``` 7. After rebooting, verify that your Docker settings persist as expected: - ```literal + ```sh sudo docker info | grep Root ``` @@ -307,4 +307,3 @@ You can specify `--log-opt max-size` and `--log-opt max-file` to define the Dock If the command returns `Docker Root Dir: /var/lib/docker`, then you need to troubleshoot the previous configuration steps until the Docker settings are applied successfully before continuing with the installation process. For more information, check [Custom Docker daemon options](https://docs.docker.com/engine/admin/systemd/#/custom-docker-daemon-options) in the Docker documentation. 8. Repeat these steps on other hosts that you want to use with Elastic Cloud Enterprise or follow the steps in the next section to start installing Elastic Cloud Enterprise. 
- diff --git a/deploy-manage/deploy/cloud-enterprise/configure-host-ubuntu-onprem.md b/deploy-manage/deploy/cloud-enterprise/configure-host-ubuntu-onprem.md index 302651ef8..c0a23de7d 100644 --- a/deploy-manage/deploy/cloud-enterprise/configure-host-ubuntu-onprem.md +++ b/deploy-manage/deploy/cloud-enterprise/configure-host-ubuntu-onprem.md @@ -13,16 +13,16 @@ The following instructions show you how to prepare your hosts on 20.04 LTS (Foca * [Configure the Docker daemon options](#ece-configure-docker-daemon-ubuntu-onprem) -## Install Docker [ece-install-docker-ubuntu-onprem] +## Install Docker [ece-install-docker-ubuntu-onprem] Install Docker LTS version 24.0 for Ubuntu 20.04 or 22.04. -::::{important} +::::{important} Make sure to use a combination of Linux distribution and Docker version that is supported, following our official [Support matrix](https://www.elastic.co/support/matrix#elastic-cloud-enterprise). Using unsupported combinations can cause multiple issues with you ECE environment, such as failures to create system deployments, to upgrade workload deployments, proxy timeouts, and more. :::: -::::{note} +::::{note} Docker 25 and higher are not compatible with ECE 3.7. :::: @@ -56,18 +56,18 @@ Docker 25 and higher are not compatible with ECE 3.7. -## Set up XFS quotas [ece-xfs-setup-ubuntu-onprem] +## Set up XFS quotas [ece-xfs-setup-ubuntu-onprem] XFS is required to support disk space quotas for Elasticsearch data directories. Some Linux distributions such as RHEL and Rocky Linux already provide XFS as the default file system. On Ubuntu, you need to set up an XFS file system and have quotas enabled. Disk space quotas set a limit on the amount of disk space an Elasticsearch cluster node can use. Currently, quotas are calculated by a static ratio of 1:32, which means that for every 1 GB of RAM a cluster is given, a cluster node is allowed to consume 32 GB of disk space. -::::{note} +::::{note} Using LVM, `mdadm`, or a combination of the two for block device management is possible, but the configuration is not covered here, and it is not supported by Elastic Cloud Enterprise. :::: -::::{important} +::::{important} You must use XFS and have quotas enabled on all allocators, otherwise disk usage won’t display correctly. :::: @@ -101,7 +101,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage -## Update the configurations settings [ece-update-config-ubuntu-onprem] +## Update the configurations settings [ece-update-config-ubuntu-onprem] 1. Stop the Docker service: @@ -139,7 +139,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage EOF ``` - ::::{important} + ::::{important} The `net.ipv4.tcp_retries2` setting applies to all TCP connections and affects the reliability of communication with systems other than Elasticsearch clusters too. If your clusters communicate with external systems over a low quality network then you may need to select a higher value for `net.ipv4.tcp_retries2`. :::: @@ -154,7 +154,7 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage Add the following configuration values to the `/etc/security/limits.conf` file. These values are derived from our experience with the Elastic Cloud hosted offering and should be used for Elastic Cloud Enterprise as well. - ::::{tip} + ::::{tip} If you are using a user name other than `elastic`, adjust the configuration values accordingly. 
:::: @@ -219,14 +219,14 @@ You must use XFS and have quotas enabled on all allocators, otherwise disk usage -## Configure the Docker daemon options [ece-configure-docker-daemon-ubuntu-onprem] +## Configure the Docker daemon options [ece-configure-docker-daemon-ubuntu-onprem] -::::{tip} +::::{tip} Docker creates a bridge IP address that can conflict with IP addresses on your internal network. To avoid an IP address conflict, change the `--bip=172.17.42.1/16` parameter in our examples to something that you know will work. If there is no conflict, you can omit the `--bip` parameter. The `--bip` parameter is internal to the host and can be set to the same IP for each host in the cluster. More information on Docker daemon options can be found in the [dockerd command line reference](https://docs.docker.com/engine/reference/commandline/dockerd/). :::: -::::{tip} +::::{tip} You can specify `--log-opt max-size` and `--log-opt max-file` to define the Docker daemon containers log rotation. :::: @@ -292,13 +292,13 @@ You can specify `--log-opt max-size` and `--log-opt max-file` to define the Dock 6. Reboot your system to ensure that all configuration changes take effect: - ```literal + ```sh sudo reboot ``` 7. After rebooting, verify that your Docker settings persist as expected: - ```literal + ```sh sudo docker info | grep Root ``` @@ -307,4 +307,3 @@ You can specify `--log-opt max-size` and `--log-opt max-file` to define the Dock If the command returns `Docker Root Dir: /var/lib/docker`, then you need to troubleshoot the previous configuration steps until the Docker settings are applied successfully before continuing with the installation process. For more information, check [Custom Docker daemon options](https://docs.docker.com/engine/admin/systemd/#/custom-docker-daemon-options) in the Docker documentation. 8. Repeat these steps on other hosts that you want to use with Elastic Cloud Enterprise or follow the steps in the next section to start installing Elastic Cloud Enterprise. - diff --git a/deploy-manage/deploy/cloud-on-k8s/logstash-plugins.md b/deploy-manage/deploy/cloud-on-k8s/logstash-plugins.md index cf5820db7..a3736cb6d 100644 --- a/deploy-manage/deploy/cloud-on-k8s/logstash-plugins.md +++ b/deploy-manage/deploy/cloud-on-k8s/logstash-plugins.md @@ -396,7 +396,7 @@ The [`elasticsearch output`](https://www.elastic.co/guide/en/logstash/current/pl You can customize roles in {{es}}. Check out [creating custom roles](../../users-roles/cluster-or-deployment-auth/native.md) -```logstash +```yaml kind: Secret apiVersion: v1 metadata: @@ -418,7 +418,7 @@ stringData: The [`elastic_integration filter`](https://www.elastic.co/guide/en/logstash/current/plugins-filters-elastic_integration.html) plugin allows the use of [`ElasticsearchRef`](configuration-logstash.md#k8s-logstash-esref) and environment variables. -```logstash +```json elastic_integration { pipeline_name => "logstash-pipeline" hosts => [ "${ECK_ES_HOSTS}" ] diff --git a/deploy-manage/monitor/kibana-task-manager-health-monitoring.md b/deploy-manage/monitor/kibana-task-manager-health-monitoring.md index fde3dcb27..6e8e5077c 100644 --- a/deploy-manage/monitor/kibana-task-manager-health-monitoring.md +++ b/deploy-manage/monitor/kibana-task-manager-health-monitoring.md @@ -9,7 +9,7 @@ mapped_pages: # Kibana task manager health monitoring [task-manager-health-monitoring] -::::{warning} +::::{warning} This functionality is in technical preview and may be changed or removed in a future release. 
Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. :::: @@ -27,7 +27,7 @@ $ curl -X GET api/task_manager/_health Monitoring the `_health` endpoint of each {{kib}} instance in the cluster is the recommended method of ensuring confidence in mission critical services such as Alerting, Actions, and Reporting. -## Configuring the monitored health statistics [task-manager-configuring-health-monitoring] +## Configuring the monitored health statistics [task-manager-configuring-health-monitoring] The health monitoring API monitors the performance of Task Manager out of the box. However, certain performance considerations are deployment specific and you can configure them. @@ -53,7 +53,7 @@ xpack.task_manager.monitored_task_execution_thresholds: -## Consuming health stats [task-manager-consuming-health-stats] +## Consuming health stats [task-manager-consuming-health-stats] The health API is best consumed by via the `/api/task_manager/_health` endpoint. @@ -79,14 +79,14 @@ By default, the health API runs at a regular cadence, and each time it runs, it This message looks like: -```log +```txt Detected potential performance issue with Task Manager. Set 'xpack.task_manager.monitored_stats_health_verbose_log.enabled: true' in your Kibana.yml to enable debug logging` ``` If this message appears, set [`xpack.task_manager.monitored_stats_health_verbose_log.enabled`](https://www.elastic.co/guide/en/kibana/current/task-manager-settings-kb.html#task-manager-settings) to `true` in your `kibana.yml`. This will start logging the health metrics at either a `warn` or `error` log level, depending on the detected severity of the potential problem. -## Making sense of Task Manager health stats [making-sense-of-task-manager-health-stats] +## Making sense of Task Manager health stats [making-sense-of-task-manager-health-stats] The health monitoring API exposes three sections: `configuration`, `workload` and `runtime`: @@ -103,7 +103,7 @@ The root `status` indicates the `status` of the system overall. The Runtime `status` indicates whether task executions have exceeded any of the [configured health thresholds](#task-manager-configuring-health-monitoring). An `OK` status means none of the threshold have been exceeded. A `Warning` status means that at least one warning threshold has been exceeded. An `Error` status means that at least one error threshold has been exceeded. -::::{important} +::::{important} Some tasks (such as [connectors](../manage-connectors.md)) will incorrectly report their status as successful even if the task failed. The runtime and workload block will return data about success and failures and will not take this into consideration. To get a better sense of action failures, please refer to the [Event log index](../../explore-analyze/alerts/kibana/event-log-index.md) for more accurate context into failures and successes. @@ -114,4 +114,3 @@ To get a better sense of action failures, please refer to the [Event log index]( The Capacity Estimation `status` indicates the sufficiency of the observed capacity. An `OK` status means capacity is sufficient. A `Warning` status means that capacity is sufficient for the scheduled recurring tasks, but non-recurring tasks often cause the cluster to exceed capacity. An `Error` status means that there is insufficient capacity across all types of tasks. 
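As one way to keep an eye on these statuses from a shell, the following sketch queries the endpoint and extracts only the status fields (it assumes {{kib}} at `localhost:5601`, placeholder credentials, and the `jq` utility; adjust all three for your environment):

```sh
curl -s -u elastic:changeme "http://localhost:5601/api/task_manager/_health" \
  | jq '{overall: .status, sections: (.stats | map_values(.status))}'
```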
By monitoring the `status` of the system overall, and the `status` of specific task types of interest, you can evaluate the health of the {{kib}} Task Management system. - diff --git a/deploy-manage/tools/cross-cluster-replication/ccr-tutorial-initial-setup.md b/deploy-manage/tools/cross-cluster-replication/ccr-tutorial-initial-setup.md index 1bb5907c4..c1030f66a 100644 --- a/deploy-manage/tools/cross-cluster-replication/ccr-tutorial-initial-setup.md +++ b/deploy-manage/tools/cross-cluster-replication/ccr-tutorial-initial-setup.md @@ -70,22 +70,22 @@ mapped_pages: } ``` - ::::{important} + ::::{important} Existing data on the cluster will not be replicated by `_ccr/auto_follow` even though the patterns may match. This function will only replicate newly created backing indices (as part of the data stream). :::: - ::::{important} + ::::{important} Use `leader_index_exclusion_patterns` to avoid recursion. :::: - ::::{tip} + ::::{tip} `follow_index_pattern` allows lowercase characters only. :::: - ::::{tip} + ::::{tip} This step cannot be executed via the {{kib}} UI due to the lack of an exclusion pattern in the UI. Use the API in this step. :::: @@ -93,7 +93,7 @@ mapped_pages: This example uses the input generator to demonstrate the document count in the clusters. Reconfigure this section to suit your own use case. - ```logstash + ```json ### On Logstash server ### ### This is a logstash config file ### input { @@ -111,12 +111,12 @@ mapped_pages: } ``` - ::::{important} + ::::{important} The key point is that when `cluster A` is down, all traffic will be automatically redirected to `cluster B`. Once `cluster A` comes back, traffic is automatically redirected back to `cluster A` again. This is achieved by the option `hosts` where multiple ES cluster endpoints are specified in the array `[clusterA, clusterB]`. :::: - ::::{tip} + ::::{tip} Set up the same password for the same user on both clusters to use this load-balancing feature. :::: @@ -148,5 +148,3 @@ mapped_pages: ```console GET logs*/_search?size=0 ``` - - diff --git a/docset.yml b/docset.yml index 838501ee0..05de5870c 100644 --- a/docset.yml +++ b/docset.yml @@ -493,3 +493,188 @@ subs: icon-bug: "pass:[]" icon-checkInCircleFilled: "pass:[]" icon-warningFilled: "pass:[]" + +external_hosts: + - 50.10 + - aka.ms + - aliyun.com + - amazon.com + - amazonaws.com + - amp.dev + - android.com + - ansible.com + - anthropic.com + - apache.org + - apple.com + - arxiv.org + - atlassian.com + - azure.com + - bouncycastle.org + - cbor.io + - census.gov + - cert-manager.io + - chromium.org + - cisa.gov + - cisecurity.org + - cmu.edu + - cncf.io + - co. 
+ - codesandbox.io + - cohere.com + - columbia.edu + - concourse-ci.org + - contentstack.io + - curl.se + - dbeaver.io + - dbvis.com + - deque.com + - die.net + - digitalocean.com + - direnv.net + - dnschecker.org + - docker.com + - dso.mil + - eicar.org + - ela.st + - elastic-cloud.com + - elasticsearch.org + - elstc.co + - epsg.org + - example.com + - falco.org + - freedesktop.org + - gdal.org + - gin-gonic.com + - git-lfs.com + - github.io + - githubusercontent.com + - go.dev + - godoc.org + - golang.org + - google.com + - google.dev + - googleapis.com + - googleblog.com + - gorillatoolkit.org + - gradle.org + - handlebarsjs.com + - haxx.se + - helm.io + - helm.sh + - heroku.com + - huggingface.co + - ibm.com + - ietf.org + - ijmlc.org + - istio.io + - jaegertracing.io + - java.net + - javadoc.io + - javalin.io + - jenkins.io + - jina.ai + - json.org + - kernel.org + - kubernetes.io + - letsencrypt.org + - linkerd.io + - lmstudio.ai + - loft.sh + - man7.org + - mariadb.org + - markdownguide.org + - maven.org + - maxmind.com + - metacpan.org + - micrometer.io + - microsoft.com + - microstrategy.com + - min.io + - minio.io + - mistral.ai + - mit.edu + - mitre.org + - momentjs.com + - mozilla.org + - mvnrepository.com + - mysql.com + - navattic.com + - nginx.com + - nginx.org + - ngrok.com + - nist.gov + - nlog-project.org + - nodejs.dev + - nodejs.org + - npmjs.com + - ntp.org + - nuget.org + - numeraljs.com + - oasis-open.org + - office.com + - okta.com + - openai.com + - openebs.io + - opengroup.org + - openid.net + - openjdk.org + - openmaptiles.org + - openpolicyagent.org + - openshift.com + - openssl.org + - openstreetmap.org + - opentelemetry.io + - openweathermap.org + - operatorhub.io + - oracle.com + - osquery.io + - outlook.com + - owasp.org + - pagerduty.com + - palletsprojects.com + - pastebin.com + - playwright.dev + - podman.io + - postgresql.org + - pypi.org + - python.org + - qlik.com + - readthedocs.io + - recurly.com + - redhat.com + - rust-lang.org + - salesforce.com + - scikit-learn.org + - sdkman.io + - searchkit.co + - semver.org + - serilog.net + - sigstore.dev + - slack.com + - snowballstem.org + - sonatype.org + - sourceforge.net + - sourcemaps.info + - spring.io + - sql-workbench.eu + - stackexchange.com + - stunnel.org + - swiftype.com + - tableau.com + - talosintelligence.com + - telerik.com + - terraform.io + - trimet.org + - umd.edu + - urlencoder.org + - vaultproject.io + - victorops.com + - virustotal.com + - w3.org + - web.dev + - webhook.site + - wikipedia.org + - wolfi.dev + - wttr.in + - yaml.org + - youtube.com diff --git a/explore-analyze/visualize/maps/maps-connect-to-ems.md b/explore-analyze/visualize/maps/maps-connect-to-ems.md index 524ff0511..988d64ae6 100644 --- a/explore-analyze/visualize/maps/maps-connect-to-ems.md +++ b/explore-analyze/visualize/maps/maps-connect-to-ems.md @@ -47,7 +47,7 @@ curl -I 'https://tiles.maps.elastic.co/v9.0/manifest?elastic_tile_service_tos=ag Server response -```regex +```txt HTTP/2 200 server: BaseHTTP/0.6 Python/3.11.4 date: Mon, 20 Nov 2023 15:08:46 GMT @@ -71,7 +71,7 @@ alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000 :::::: ::::::{tab-item} Request -```regex +```txt Host: tiles.maps.elastic.co User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/119.0 Accept: */* @@ -90,7 +90,7 @@ TE: trailers :::::: ::::::{tab-item} Response -```regex +```txt server: BaseHTTP/0.6 Python/3.11.4 date: Mon, 20 Nov 2023 17:53:10 GMT content-type: application/json; charset=utf-8 @@ -127,7 
+127,7 @@ $ curl -I 'https://tiles.maps.elastic.co/data/v3/1/1/0.pbf?elastic_tile_service_ Server response -```regex +```txt HTTP/2 200 content-encoding: gzip content-length: 144075 @@ -153,7 +153,7 @@ alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000 :::::: ::::::{tab-item} Request -```regex +```txt Host: tiles.maps.elastic.co User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/119.0 Accept: */* @@ -170,7 +170,7 @@ TE: trailers :::::: ::::::{tab-item} Response -```regex +```txt content-encoding: gzip content-length: 101691 access-control-allow-origin: * @@ -208,7 +208,7 @@ curl -I 'https://tiles.maps.elastic.co/styles/osm-bright-desaturated/sprite.png' Server response -```regex +```txt HTTP/2 200 content-length: 17181 access-control-allow-origin: * @@ -231,7 +231,7 @@ alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000 :::::: ::::::{tab-item} Request -```regex +```txt Host: tiles.maps.elastic.co User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/119.0 Accept: image/avif,image/webp,*/* @@ -250,7 +250,7 @@ TE: trailers :::::: ::::::{tab-item} Response -```regex +```txt content-length: 17181 access-control-allow-origin: * access-control-allow-methods: GET, OPTIONS, HEAD @@ -290,7 +290,7 @@ curl -I 'https://vector.maps.elastic.co/v9.0/manifest?elastic_tile_service_tos=a Server response -```regex +```txt HTTP/2 200 x-guploader-uploadid: ABPtcPp_BvMdBDO5jVlutETVHmvpOachwjilw4AkIKwMrOQJ4exR9Eln4g0LkW3V_LLSEpvjYLtUtFmO0Uwr61XXUhoP_A x-goog-generation: 1689593295246576 @@ -320,7 +320,7 @@ alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000 :::::: ::::::{tab-item} Request -```regex +```txt Host: vector.maps.elastic.co User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/119.0 Accept: */* @@ -338,7 +338,7 @@ Cache-Control: no-cache :::::: ::::::{tab-item} Response -```regex +```txt x-guploader-uploadid: ABPtcPoUFrCmjBeebnfRxSZp44ZHsZ-_iQg7794RU1Z7Lb2cNNxXsMRkIDa5s7VBEfyehvo-_9rcm1A3HfYW8geguUxKrw x-goog-generation: 1689593295246576 x-goog-metageneration: 1 @@ -381,7 +381,7 @@ curl -I 'https://vector.maps.elastic.co/files/world_countries_v7.topo.json?elast Server response -```regex +```txt HTTP/2 200 x-guploader-uploadid: ABPtcPpmMffchVgfHIr-SSC00WORo145oV-1q0asjqRvjLV_7cIgyfLRfofXV-BG7huMYABFypblcgdgXRBARhpo2c88ow x-goog-generation: 1689593325442971 @@ -411,7 +411,7 @@ alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000 :::::: ::::::{tab-item} Request -```regex +```txt Host: vector.maps.elastic.co User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/119.0 Accept: */* @@ -429,7 +429,7 @@ Cache-Control: no-cache :::::: ::::::{tab-item} Response -```regex +```txt x-guploader-uploadid: ABPtcPqIDSg5tyavvwwtJQa8a8iycoXOCkHBp_2YJbJJnQgb5XMD7nFwRUogg00Ou27VFIs95v7L99OMnvXR1bcb9RW-xQ x-goog-generation: 1689593325442971 x-goog-metageneration: 1 @@ -631,4 +631,3 @@ With {{hosted-ems}} running, add the `map.emsUrl` configuration key in your [kib ### Logging [elastic-maps-server-logging] Logs are generated in [ECS JSON format](https://www.elastic.co/guide/en/ecs/{{ecs_version}}) and emitted to the standard output and to `/var/log/elastic-maps-server/elastic-maps-server.log`. The server won’t rotate the logs automatically but the `logrotate` tool is installed in the image. Mount `/dev/null` to the default log path if you want to disable the output to that file. 
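For example, a minimal sketch of that configuration key in `kibana.yml` (the host name and port below are placeholders for wherever your {{hosted-ems}} instance is reachable):

```yaml
# kibana.yml: point {{kib}} at a self-hosted Elastic Maps Server
map.emsUrl: "https://ems.example.org:8080"
```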
- diff --git a/manage-data/ingest/transform-enrich/example-parse-logs.md b/manage-data/ingest/transform-enrich/example-parse-logs.md index 84550859e..14f3c06e9 100644 --- a/manage-data/ingest/transform-enrich/example-parse-logs.md +++ b/manage-data/ingest/transform-enrich/example-parse-logs.md @@ -13,7 +13,7 @@ In this example tutorial, you’ll use an [ingest pipeline](ingest-pipelines.md) The logs you want to parse look similar to this: -```log +```txt 212.87.37.154 - - [05/May/2099:16:21:15 +0000] "GET /favicon.ico HTTP/1.1" 200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36" ``` diff --git a/raw-migrated-files/docs-content/serverless/observability-correlate-application-logs.md b/raw-migrated-files/docs-content/serverless/observability-correlate-application-logs.md index e5c4fd0bf..2c0359b32 100644 --- a/raw-migrated-files/docs-content/serverless/observability-correlate-application-logs.md +++ b/raw-migrated-files/docs-content/serverless/observability-correlate-application-logs.md @@ -9,7 +9,7 @@ The format of your logs (structured or plaintext) influences your log ingestion Logs are typically produced as either plaintext or structured. Plaintext logs contain only text and have no special formatting, for example: -```log +```txt 2019-08-06T12:09:12.375Z INFO:spring-petclinic: Tomcat started on port(s): 8080 (http) with context path, org.springframework.boot.web.embedded.tomcat.TomcatWebServer 2019-08-06T12:09:12.379Z INFO:spring-petclinic: Started PetClinicApplication in 7.095 seconds (JVM running for 9.082), org.springframework.samples.petclinic.PetClinicApplication 2019-08-06T14:08:40.199Z DEBUG:spring-petclinic: init find form, org.springframework.samples.petclinic.owner.OwnerController diff --git a/raw-migrated-files/docs-content/serverless/observability-parse-log-data.md b/raw-migrated-files/docs-content/serverless/observability-parse-log-data.md index 6bae3cc8f..120cc51a7 100644 --- a/raw-migrated-files/docs-content/serverless/observability-parse-log-data.md +++ b/raw-migrated-files/docs-content/serverless/observability-parse-log-data.md @@ -24,7 +24,7 @@ Make your logs more useful by extracting structured fields from your unstructure Follow the steps below to see how the following unstructured log data is indexed by default: -```log +```txt 2023-08-08T13:45:12.123Z WARN 192.168.1.101 Disk usage exceeds 90%. ``` @@ -306,7 +306,7 @@ Check the following common issues and solutions with timestamps: Extracting the `log.level` field lets you filter by severity and focus on critical issues. This section shows you how to extract the `log.level` field from this example log: -```log +```txt 2023-08-08T13:45:12.123Z WARN 192.168.1.101 Disk usage exceeds 90%. ``` @@ -393,7 +393,7 @@ Once you’ve extracted the `log.level` field, you can query for high-severity l Let’s say you have the following logs with varying severities: -```log +```txt 2023-08-08T13:45:12.123Z WARN 192.168.1.101 Disk usage exceeds 90%. 2023-08-08T13:45:14.003Z ERROR 192.168.1.103 Database connection failed. 2023-08-08T13:45:15.004Z DEBUG 192.168.1.104 Debugging connection issue. @@ -474,7 +474,7 @@ The `host.ip` field is part of the [Elastic Common Schema (ECS)](https://www.ela This section shows you how to extract the `host.ip` field from the following example logs and query based on the extracted fields: -```log +```txt 2023-08-08T13:45:12.123Z WARN 192.168.1.101 Disk usage exceeds 90%. 
2023-08-08T13:45:14.003Z ERROR 192.168.1.103 Database connection failed. 2023-08-08T13:45:15.004Z DEBUG 192.168.1.104 Debugging connection issue. @@ -742,7 +742,7 @@ By default, an ingest pipeline sends your log data to a single data stream. To s This section shows you how to use a reroute processor to send the high-severity logs (`WARN` or `ERROR`) from the following example logs to a specific data stream and keep the regular logs (`DEBUG` and `INFO`) in the default data stream: -```log +```txt 2023-08-08T13:45:12.123Z WARN 192.168.1.101 Disk usage exceeds 90%. 2023-08-08T13:45:14.003Z ERROR 192.168.1.103 Database connection failed. 2023-08-08T13:45:15.004Z DEBUG 192.168.1.104 Debugging connection issue. diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/search-with-synonyms.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/search-with-synonyms.md index fe904aa7d..22d786b35 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/search-with-synonyms.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/search-with-synonyms.md @@ -16,19 +16,19 @@ In order to use synonyms sets in {{es}}, you need to: * [Configure synonyms token filters and analyzers](../../../solutions/search/full-text/search-with-synonyms.md#synonyms-synonym-token-filters) -## Store your synonyms set [synonyms-store-synonyms] +## Store your synonyms set [synonyms-store-synonyms] Your synonyms sets need to be stored in {{es}} so your analyzers can refer to them. There are three ways to store your synonyms sets: -### Synonyms API [synonyms-store-synonyms-api] +### Synonyms API [synonyms-store-synonyms-api] You can use the [synonyms APIs](https://www.elastic.co/guide/en/elasticsearch/reference/current/synonyms-apis.html) to manage synonyms sets. This is the most flexible approach, as it allows to dynamically define and modify synonyms sets. Changes in your synonyms sets will automatically reload the associated analyzers. -### Synonyms File [synonyms-store-synonyms-file] +### Synonyms File [synonyms-store-synonyms-file] You can store your synonyms set in a file. @@ -36,7 +36,7 @@ A synonyms set file needs to be uploaded to all your cluster nodes, and be locat An example synonyms file: -```synonyms +```markdown # Blank lines and lines starting with pound are comments. # Explicit mappings match any token sequence on the left hand side of "=>" @@ -77,28 +77,28 @@ When a synonyms set is updated, search analyzers that use it need to be refreshe This manual syncing and reloading makes this approach less flexible than using the [synonyms API](../../../solutions/search/full-text/search-with-synonyms.md#synonyms-store-synonyms-api). -### Inline [synonyms-store-synonyms-inline] +### Inline [synonyms-store-synonyms-inline] You can test your synonyms by adding them directly inline in your token filter definition. -::::{warning} +::::{warning} Inline synonyms are not recommended for production usage. A large number of inline synonyms increases cluster size unnecessarily and can lead to performance issues. :::: -### Configure synonyms token filters and analyzers [synonyms-synonym-token-filters] +### Configure synonyms token filters and analyzers [synonyms-synonym-token-filters] Once your synonyms sets are created, you can start configuring your token filters and analyzers to use them. -::::{warning} +::::{warning} Synonyms sets must exist before they can be added to indices. 
If an index is created referencing a nonexistent synonyms set, the index will remain in a partially created and inoperable state. The only way to recover from this scenario is to ensure the synonyms set exists then either delete and re-create the index, or close and re-open the index. :::: -::::{warning} +::::{warning} Invalid synonym rules can cause errors when applying analyzer changes. For reloadable analyzers, this prevents reloading and applying changes. You must correct errors in the synonym rules and reload the analyzer. An index with invalid synonym rules cannot be reopened, making it inoperable when: @@ -118,7 +118,7 @@ An index with invalid synonym rules cannot be reopened, making it inoperable whe Check each synonym token filter documentation for configuration details and instructions on adding it to an analyzer. -### Test your analyzer [synonyms-test-analyzer] +### Test your analyzer [synonyms-test-analyzer] You can test an analyzer configuration without modifying your index settings. Use the [analyze API](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html) to test your analyzer chain: @@ -138,7 +138,7 @@ GET /_analyze ``` -### Apply synonyms at index or search time [synonyms-apply-synonyms] +### Apply synonyms at index or search time [synonyms-apply-synonyms] Analyzers can be applied at [index time or search time](../../../manage-data/data-store/text-analysis/index-search-analysis.md). @@ -184,4 +184,3 @@ The following example adds `my_analyzer` as a search analyzer to the `title` fie } } ``` - diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search-inference.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search-inference.md index d6201c012..ca386c997 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search-inference.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/semantic-search-inference.md @@ -1082,7 +1082,7 @@ GET cohere-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `cohere-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "cohere-embeddings", @@ -1145,7 +1145,7 @@ GET elser-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `cohere-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "elser-embeddings", @@ -1203,7 +1203,7 @@ GET hugging-face-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `hugging-face-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "hugging-face-embeddings", @@ -1270,7 +1270,7 @@ GET openai-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `openai-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "openai-embeddings", @@ -1328,7 +1328,7 @@ GET azure-openai-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `azure-openai-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "azure-openai-embeddings", @@ -1386,7 +1386,7 @@ GET azure-ai-studio-embeddings/_search As a result, you 
receive the top 10 documents that are closest in meaning to the query from the `azure-ai-studio-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "azure-ai-studio-embeddings", @@ -1502,7 +1502,7 @@ GET mistral-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `mistral-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "mistral-embeddings", @@ -1560,7 +1560,7 @@ GET amazon-bedrock-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `amazon-bedrock-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "amazon-bedrock-embeddings", @@ -1618,7 +1618,7 @@ GET alibabacloud-ai-search-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `alibabacloud-ai-search-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "alibabacloud-ai-search-embeddings", diff --git a/raw-migrated-files/observability-docs/observability/apm-common-problems.md b/raw-migrated-files/observability-docs/observability/apm-common-problems.md index 7e2e411ae..20d18876e 100644 --- a/raw-migrated-files/observability-docs/observability/apm-common-problems.md +++ b/raw-migrated-files/observability-docs/observability/apm-common-problems.md @@ -127,7 +127,7 @@ I/O Timeouts can occur when your timeout settings across the stack are not confi You may see an error like the one below in the {{apm-agent}} logs, and/or a similar error on the APM Server side: -```logs +```txt [ElasticAPM] APM Server responded with an error: "read tcp 123.34.22.313:8200->123.34.22.40:41602: i/o timeout" ``` @@ -156,7 +156,7 @@ The symptom of a mapping explosion is that transactions and spans are not indexe In the agent logs, you won’t see a sign of failures as the APM server asynchronously sends the data it received from the agents to {{es}}. However, the APM server and {{es}} log a warning like this: -```logs +```txt {\"type\":\"illegal_argument_exception\",\"reason\":\"Limit of total fields [1000] in [INDEX_NAME] has been exceeded\"} ``` diff --git a/raw-migrated-files/observability-docs/observability/application-logs.md b/raw-migrated-files/observability-docs/observability/application-logs.md index f20c7f73a..81df8cc23 100644 --- a/raw-migrated-files/observability-docs/observability/application-logs.md +++ b/raw-migrated-files/observability-docs/observability/application-logs.md @@ -9,7 +9,7 @@ The format of your logs (structured or plaintext) influences your log ingestion Logs are typically produced as either plaintext or structured. 
Plaintext logs contain only text and have no special formatting, for example: -```log +```txt 2019-08-06T12:09:12.375Z INFO:spring-petclinic: Tomcat started on port(s): 8080 (http) with context path, org.springframework.boot.web.embedded.tomcat.TomcatWebServer 2019-08-06T12:09:12.379Z INFO:spring-petclinic: Started PetClinicApplication in 7.095 seconds (JVM running for 9.082), org.springframework.samples.petclinic.PetClinicApplication 2019-08-06T14:08:40.199Z DEBUG:spring-petclinic: init find form, org.springframework.samples.petclinic.owner.OwnerController diff --git a/raw-migrated-files/observability-docs/observability/logs-parse.md b/raw-migrated-files/observability-docs/observability/logs-parse.md index 0802eef08..0cffc9899 100644 --- a/raw-migrated-files/observability-docs/observability/logs-parse.md +++ b/raw-migrated-files/observability-docs/observability/logs-parse.md @@ -16,7 +16,7 @@ Make your logs more useful by extracting structured fields from your unstructure Follow the steps below to see how the following unstructured log data is indexed by default: -```log +```txt 2023-08-08T13:45:12.123Z WARN 192.168.1.101 Disk usage exceeds 90%. ``` @@ -291,7 +291,7 @@ Check the following common issues and solutions with timestamps: Extracting the `log.level` field lets you filter by severity and focus on critical issues. This section shows you how to extract the `log.level` field from this example log: -```log +```txt 2023-08-08T13:45:12.123Z WARN 192.168.1.101 Disk usage exceeds 90%. ``` @@ -378,7 +378,7 @@ Once you’ve extracted the `log.level` field, you can query for high-severity l Let’s say you have the following logs with varying severities: -```log +```txt 2023-08-08T13:45:12.123Z WARN 192.168.1.101 Disk usage exceeds 90%. 2023-08-08T13:45:14.003Z ERROR 192.168.1.103 Database connection failed. 2023-08-08T13:45:15.004Z DEBUG 192.168.1.104 Debugging connection issue. @@ -459,7 +459,7 @@ The `host.ip` field is part of the [Elastic Common Schema (ECS)](https://www.ela This section shows you how to extract the `host.ip` field from the following example logs and query based on the extracted fields: -```log +```txt 2023-08-08T13:45:12.123Z WARN 192.168.1.101 Disk usage exceeds 90%. 2023-08-08T13:45:14.003Z ERROR 192.168.1.103 Database connection failed. 2023-08-08T13:45:15.004Z DEBUG 192.168.1.104 Debugging connection issue. @@ -727,7 +727,7 @@ By default, an ingest pipeline sends your log data to a single data stream. To s This section shows you how to use a reroute processor to send the high-severity logs (`WARN` or `ERROR`) from the following example logs to a specific data stream and keep the regular logs (`DEBUG` and `INFO`) in the default data stream: -```log +```txt 2023-08-08T13:45:12.123Z WARN 192.168.1.101 Disk usage exceeds 90%. 2023-08-08T13:45:14.003Z ERROR 192.168.1.103 Database connection failed. 2023-08-08T13:45:15.004Z DEBUG 192.168.1.104 Debugging connection issue. diff --git a/solutions/observability/apps/configure-apm-agent-central-configuration.md b/solutions/observability/apps/configure-apm-agent-central-configuration.md index 492b4acc1..834532d52 100644 --- a/solutions/observability/apps/configure-apm-agent-central-configuration.md +++ b/solutions/observability/apps/configure-apm-agent-central-configuration.md @@ -58,13 +58,13 @@ You may see either of the following HTTP 403 errors from APM Server when it atte APM agent log: -```log +```txt "Your Elasticsearch configuration does not support agent config queries. 
Check your configurations at `output.elasticsearch` or `apm-server.agent.config.elasticsearch`." ``` APM Server log: -```log +```txt rejecting fetch request: no valid elasticsearch config ``` @@ -76,4 +76,3 @@ To fix this error, ensure that APM Server has all the required privileges. For m #### HTTP 401 errors [_http_401_errors] If you get an HTTP 401 error from APM Server, make sure that you’re using an API key that is configured to **Beats**. For details on how to create and configure a compatible API key, refer to [Create an API key for writing events](grant-access-using-api-keys.md#apm-beats-api-key-publish). - diff --git a/solutions/search/semantic-search/cohere-es.md b/solutions/search/semantic-search/cohere-es.md index 74e3b5ddf..f2d434f53 100644 --- a/solutions/search/semantic-search/cohere-es.md +++ b/solutions/search/semantic-search/cohere-es.md @@ -309,10 +309,9 @@ for document in response.documents: The response will look similar to this: -```consol-result +```console-result Query: What is biosimilarity? Response: Biosimilarity is based on the comparability concept, which has been used successfully for several decades to ensure close similarity of a biological product before and after a manufacturing change. Over the last 10 years, experience with biosimilars has shown that even complex biotechnology-derived proteins can be copied successfully. Sources: Interchangeability of Biosimilars: A European Perspective: (...) ``` - diff --git a/solutions/search/semantic-search/semantic-search-elser.md b/solutions/search/semantic-search/semantic-search-elser.md index b05ea6c16..565ee0c10 100644 --- a/solutions/search/semantic-search/semantic-search-elser.md +++ b/solutions/search/semantic-search/semantic-search-elser.md @@ -164,7 +164,7 @@ GET my-index/_search The result is the top 10 documents that are closest in meaning to your query text from the `my-index` index sorted by their relevancy. The result also contains the extracted tokens for each of the relevant search results with their weights. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). It is possible to exclude tokens from source, refer to [this section](../vector/sparse-vector-elser.md#save-space) to learn more. 
-```consol-result +```console-result "hits": { "total": { "value": 10000, diff --git a/solutions/search/semantic-search/semantic-search-inference.md b/solutions/search/semantic-search/semantic-search-inference.md index 5928409d9..e79753328 100644 --- a/solutions/search/semantic-search/semantic-search-inference.md +++ b/solutions/search/semantic-search/semantic-search-inference.md @@ -1086,7 +1086,7 @@ GET cohere-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `cohere-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "cohere-embeddings", @@ -1149,7 +1149,7 @@ GET elser-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `cohere-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "elser-embeddings", @@ -1207,7 +1207,7 @@ GET hugging-face-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `hugging-face-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "hugging-face-embeddings", @@ -1274,7 +1274,7 @@ GET openai-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `openai-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "openai-embeddings", @@ -1332,7 +1332,7 @@ GET azure-openai-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `azure-openai-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "azure-openai-embeddings", @@ -1390,7 +1390,7 @@ GET azure-ai-studio-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `azure-ai-studio-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "azure-ai-studio-embeddings", @@ -1506,7 +1506,7 @@ GET mistral-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `mistral-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "mistral-embeddings", @@ -1564,7 +1564,7 @@ GET amazon-bedrock-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `amazon-bedrock-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "amazon-bedrock-embeddings", @@ -1622,7 +1622,7 @@ GET alibabacloud-ai-search-embeddings/_search As a result, you receive the top 10 documents that are closest in meaning to the query from the `alibabacloud-ai-search-embeddings` index sorted by their proximity to the query: -```consol-result +```console-result "hits": [ { "_index": "alibabacloud-ai-search-embeddings", diff --git a/solutions/search/vector/sparse-vector-elser.md b/solutions/search/vector/sparse-vector-elser.md index 5ce91d5fd..e523f751d 100644 --- a/solutions/search/vector/sparse-vector-elser.md +++ b/solutions/search/vector/sparse-vector-elser.md @@ -164,7 +164,7 @@ GET my-index/_search The result is the top 10 documents that are closest in meaning to your query text from the 
`my-index` index sorted by their relevancy. The result also contains the extracted tokens for each of the relevant search results with their weights. Tokens are learned associations capturing relevance, they are not synonyms. To learn more about what tokens are, refer to [this page](../../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md#elser-tokens). It is possible to exclude tokens from source, refer to [this section](#save-space) to learn more. -```consol-result +```console-result "hits": { "total": { "value": 10000, diff --git a/troubleshoot/elasticsearch/high-jvm-memory-pressure.md b/troubleshoot/elasticsearch/high-jvm-memory-pressure.md index 3ada700bf..fa0dc6e8e 100644 --- a/troubleshoot/elasticsearch/high-jvm-memory-pressure.md +++ b/troubleshoot/elasticsearch/high-jvm-memory-pressure.md @@ -51,7 +51,7 @@ JVM Memory Pressure = `used_in_bytes` / `max_in_bytes` As memory usage increases, garbage collection becomes more frequent and takes longer. You can track the frequency and length of garbage collection events in [`elasticsearch.log`](../../deploy-manage/monitor/logging-configuration/elasticsearch-log4j-configuration-self-managed.md). For example, the following event states {{es}} spent more than 50% (21 seconds) of the last 40 seconds performing garbage collection. -```log +```txt [timestamp_short_interval_from_last][INFO ][o.e.m.j.JvmGcMonitorService] [node_id] [gc][number] overhead, spent [21s] collecting in the last [40s] ``` diff --git a/troubleshoot/ingest/fleet/common-problems.md b/troubleshoot/ingest/fleet/common-problems.md index faed3c5d7..2870141a9 100644 --- a/troubleshoot/ingest/fleet/common-problems.md +++ b/troubleshoot/ingest/fleet/common-problems.md @@ -248,7 +248,7 @@ You will also need to set `ssl.verification_mode: none` in the Output settings i To enroll in {{fleet}}, {{agent}} must connect to the {{fleet-server}} instance. If the agent is unable to connect, you see the following failure: -```output +```txt fail to enroll: fail to execute request to {fleet-server}:Post http://fleet-server:8220/api/fleet/agents/enroll?: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers) ``` diff --git a/troubleshoot/observability/apm-agent-go/apm-go-agent.md b/troubleshoot/observability/apm-agent-go/apm-go-agent.md index c42db51fd..3fb0301a1 100644 --- a/troubleshoot/observability/apm-agent-go/apm-go-agent.md +++ b/troubleshoot/observability/apm-agent-go/apm-go-agent.md @@ -28,7 +28,7 @@ With logging enabled, use [`ELASTIC_APM_LOG_LEVEL`](https://www.elastic.co/guide Be sure to execute a few requests to your application before posting your log files. Each request should add lines similar to these in the logs: -```log +```txt {"level":"debug","time":"2020-07-23T11:46:32+08:00","message":"sent request with 100 transactions, 0 spans, 0 errors, 0 metricsets"} ``` @@ -42,4 +42,3 @@ In the unlikely event the agent causes disruptions to a production application, If you have access to [dynamic configuration](https://www.elastic.co/guide/en/apm/agent/go/current/configuration.html#dynamic-configuration), you can disable the recording of events by setting [`ELASTIC_APM_RECORDING`](https://www.elastic.co/guide/en/apm/agent/go/current/configuration.html#config-recording) to `false`. When changed at runtime from a supported source, there’s no need to restart your application. 
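For example, a minimal sketch assuming the instrumented service reads its agent settings from the environment at startup (`my-service` is a placeholder for your application binary):

```sh
# Stop the agent from recording events; when set through a dynamic
# configuration source instead, this takes effect without a restart.
export ELASTIC_APM_RECORDING=false
./my-service   # placeholder
```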
If that doesn’t work, or you don’t have access to dynamic configuration, you can disable the agent by setting [`ELASTIC_APM_ACTIVE`](https://www.elastic.co/guide/en/apm/agent/go/current/configuration.html#config-active) to `false`. Restart your application for the changes to apply. - diff --git a/troubleshoot/observability/troubleshoot-your-universal-profiling-agent-deployment.md b/troubleshoot/observability/troubleshoot-your-universal-profiling-agent-deployment.md index fff4b465d..49c901a61 100644 --- a/troubleshoot/observability/troubleshoot-your-universal-profiling-agent-deployment.md +++ b/troubleshoot/observability/troubleshoot-your-universal-profiling-agent-deployment.md @@ -13,7 +13,7 @@ You can use the Universal Profiling Agent logs to find errors. The following is an example of a *healthy* Universal Profiling Agent output: -```logs +```txt time="..." level=info msg="Starting Prodfiler Host Agent v2.4.0 (revision develop-5cce978a, build timestamp 12345678910)" time="..." level=info msg="Interpreter tracers: perl,php,python,hotspot,ruby,v8" time="..." level=info msg="Automatically determining environment and machine ID ..." @@ -34,7 +34,7 @@ time="..." level=info msg="Attached sched monitor" A Universal Profiling Agent deployment is working if the output of the following command is empty: -```logs +```txt head host-agent.log -n 15 | grep "level=error" ``` @@ -44,25 +44,25 @@ If running this command outputs error-level logs, the following are possible cau If the Universal Profiling Agent is running on an unsupported kernel version, the following is logged: - ```logs + ```txt Universal Profiling Agent requires kernel version 4.19 or newer but got 3.10.0 ``` If eBPF features are not available in the kernel, the Universal Profiling Agent fails to start, and one of the following is logged: - ```logs + ```txt Failed to probe eBPF syscall ``` or - ```logs + ```txt Failed to probe tracepoint ``` * The Universal Profiling Agent is not able to connect to {{ecloud}}. In this case, a similar message to the following is logged: - ```logs + ```txt Failed to setup gRPC connection (retrying...): context deadline exceeded ``` @@ -70,13 +70,13 @@ If running this command outputs error-level logs, the following are possible cau * The secret token is not valid, or it has been changed. In this case, the Universal Profiling Agent gent shuts down, and logs a similar message to the following: - ```logs + ```txt rpc error: code = Unauthenticated desc = authentication failed ``` * The Universal Profiling Agent is unable to send data to your deployment. In this case, a similar message to the following is logged: - ```logs + ```txt Failed to report hostinfo (retrying...): rpc error: code = Unimplemented desc = unknown service collectionagent.CollectionAgent" ``` @@ -84,7 +84,7 @@ If running this command outputs error-level logs, the following are possible cau * The collector (part of the backend in {{ecloud}} that receives data from the Universal Profiling Agent) ran out of memory. In this case, a similar message to the following is logged: - ```logs + ```txt Error: failed to invoke XXX(): Unavailable rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 502 (Bad Gateway); transport: received unexpected content-type "application/json; charset=UTF-8" ``` @@ -94,7 +94,7 @@ If running this command outputs error-level logs, the following are possible cau * The Universal Profiling Agent is incompatible with the {{stack}} version. 
In this case, the following message is logged:

-    ```logs
+    ```txt
     rpc error: code = FailedPrecondition desc= HostAgent version is unsupported, please upgrade to the latest version
     ```

@@ -102,7 +102,7 @@ If running this command outputs error-level logs, the following are possible cau

* You are using a Universal Profiling Agent from a newer {{stack}} version, configured to connect to an older {{stack}} version cluster. In this case, the following message is logged:

-    ```logs
+    ```txt
     rpc error: code = FailedPrecondition desc= Backend is incompatible with HostAgent, please check your configuration
     ```

@@ -230,4 +230,3 @@ In the support request, specify if your issue deals with the Universal Profiling

 ## Send feedback [profiling-send-feedback]

 If troubleshooting and support are not fixing your issues, or you have any other feedback that you want to share about the product, send the Universal Profiling team an email at `profiling-feedback@elastic.co`.
-

From 486c20ae561e3be8493f97ff33d6508810ea21a8 Mon Sep 17 00:00:00 2001
From: florent-leborgne
Date: Wed, 5 Feb 2025 11:22:46 +0100
Subject: [PATCH 05/15] [E&A] Refine more pages in querying section (#313)

* refine querying section

* remove empty front matter

* remove deleted file from toc

* Apply suggestions from code review

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

---------

Co-authored-by: Brandon Morelli
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
---
 explore-analyze/query-filter/languages.md     |   5 -
 ...g_concepts_across_sql_and_elasticsearch.md |  22 --
 .../query-filter/languages/sql-concepts.md    |  21 ++
 .../languages/sql-functions-datetime.md       |   2 +-
 .../languages/sql-getting-started.md          |   1 +
 .../{_api_usage.md => sql-jdbc-api-usage.md}  |   0
 explore-analyze/query-filter/tools/console.md | 229 +++++++++++++++++-
 .../query-filter/tools/saved-queries.md       |  29 ++-
 .../query-filter/tools/search-profiler.md     |   2 +-
 .../scripting/modules-scripting-security.md   |   2 +-
 explore-analyze/toc.yml                       |   4 +-
 .../visualize/maps/maps-connect-to-ems.md     |   2 +-
 .../maps/reverse-geocoding-tutorial.md        |   2 +-
 .../kibana/kibana/save-load-delete-query.md   |  27 ---
 raw-migrated-files/toc.yml                    |   1 -
 15 files changed, 281 insertions(+), 68 deletions(-)
 delete mode 100644 explore-analyze/query-filter/languages/_mapping_concepts_across_sql_and_elasticsearch.md
 rename explore-analyze/query-filter/languages/{_api_usage.md => sql-jdbc-api-usage.md} (100%)
 delete mode 100644 raw-migrated-files/kibana/kibana/save-load-delete-query.md

diff --git a/explore-analyze/query-filter/languages.md b/explore-analyze/query-filter/languages.md
index 832fe2ebb..f70fa8c96 100644
--- a/explore-analyze/query-filter/languages.md
+++ b/explore-analyze/query-filter/languages.md
@@ -1,8 +1,3 @@
----
-mapped_pages:
-  - https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyze.html
----
-
 # Query languages [search-analyze-query-languages]

 {{es}} provides a number of query languages for interacting with your data.
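For instance, the same simple search can be written as a Query DSL request or as an ES|QL query sent to the `_query` endpoint; the index and field names below are hypothetical:

```console
# Query DSL: a full JSON search request
GET my-index/_search
{
  "query": {
    "match": { "message": "timeout" }
  }
}

# ES|QL: an equivalent search as a piped query string
POST /_query
{
  "query": "FROM my-index | WHERE message LIKE \"*timeout*\" | LIMIT 10"
}
```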
diff --git a/explore-analyze/query-filter/languages/_mapping_concepts_across_sql_and_elasticsearch.md b/explore-analyze/query-filter/languages/_mapping_concepts_across_sql_and_elasticsearch.md deleted file mode 100644 index 1d9d1bb2c..000000000 --- a/explore-analyze/query-filter/languages/_mapping_concepts_across_sql_and_elasticsearch.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -mapped_pages: - - https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html ---- - -# Mapping concepts across SQL and Elasticsearch [_mapping_concepts_across_sql_and_es] - -While SQL and {{es}} have different terms for the way the data is organized (and different semantics), essentially their purpose is the same. - -So let’s start from the bottom; these roughly are: - -| SQL | {{es}} | Description | -| --- | --- | --- | -| `column` | `field` | In both cases, at the lowest level, data is stored in *named* entries, of a variety of [data types](sql-data-types.md), containing *one* value. SQL calls such an entry a *column* while {{es}} a *field*.Notice that in {{es}} a field can contain *multiple* values of the same type (essentially a list) while in SQL, a *column* can contain *exactly* one value of said type.Elasticsearch SQL will do its best to preserve the SQL semantic and, depending on the query, reject those that return fields with more than one value. | -| `row` | `document` | `Column`s and `field`s do *not* exist by themselves; they are part of a `row` or a `document`. The two have slightly different semantics: a `row` tends to be *strict* (and have more enforcements) while a `document` tends to be a bit more flexible or loose (while still having a structure). | -| `table` | `index` | The target against which queries, whether in SQL or {{es}} get executed against. | -| `schema` | *implicit* | In RDBMS, `schema` is mainly a namespace of tables and typically used as a security boundary. {{es}} does not provide an equivalent concept for it. However when security is enabled, {{es}} automatically applies the security enforcement so that a role sees only the data it is allowed to (in SQL jargon, its *schema*). | -| `catalog` or `database` | `cluster` instance | In SQL, `catalog` or `database` are used interchangeably and represent a set of schemas that is, a number of tables.In {{es}} the set of indices available are grouped in a `cluster`. The semantics also differ a bit; a `database` is essentially yet another namespace (which can have some implications on the way data is stored) while an {{es}} `cluster` is a runtime instance, or rather a set of at least one {{es}} instance (typically running distributed).In practice this means that while in SQL one can potentially have multiple catalogs inside an instance, in {{es}} one is restricted to only *one*. | -| `cluster` | `cluster` (federated) | Traditionally in SQL, *cluster* refers to a single RDBMS instance which contains a number of `catalog`s or `database`s (see above). The same word can be reused inside {{es}} as well however its semantic clarified a bit.
While RDBMS tend to have only one running instance, on a single machine (*not* distributed), {{es}} goes the opposite way and by default, is distributed and multi-instance.
Further more, an {{es}} `cluster` can be connected to other `cluster`s in a *federated* fashion thus `cluster` means:
single cluster::Multiple {{es}} instances typically distributed across machines, running within the same namespace.multiple clusters::Multiple clusters, each with its own namespace, connected to each other in a federated setup (see [{{ccs-cap}}](../../../solutions/search/cross-cluster-search.md)). | - -As one can see while the mapping between the concepts are not exactly one to one and the semantics somewhat different, there are more things in common than differences. In fact, thanks to SQL declarative nature, many concepts can move across {{es}} transparently and the terminology of the two likely to be used interchangeably throughout the rest of the material. - diff --git a/explore-analyze/query-filter/languages/sql-concepts.md b/explore-analyze/query-filter/languages/sql-concepts.md index b7d1081eb..ac6888d16 100644 --- a/explore-analyze/query-filter/languages/sql-concepts.md +++ b/explore-analyze/query-filter/languages/sql-concepts.md @@ -1,6 +1,8 @@ --- +navigation_title: Conventions mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-concepts.html + - https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html --- # Conventions and Terminology [sql-concepts] @@ -14,4 +16,23 @@ This documentation while trying to be complete, does assume the reader has *basi As a general rule, Elasticsearch SQL as the name indicates provides a SQL interface to {{es}}. As such, it follows the SQL terminology and conventions first, whenever possible. However the backing engine itself is {{es}} for which Elasticsearch SQL was purposely created hence why features or concepts that are not available, or cannot be mapped correctly, in SQL appear in Elasticsearch SQL. Last but not least, Elasticsearch SQL tries to obey the [principle of least surprise](https://en.wikipedia.org/wiki/Principle_of_least_astonishment), though as all things in the world, everything is relative. +## Mapping concepts across SQL and Elasticsearch [_mapping_concepts_across_sql_and_es] + +While SQL and {{es}} have different terms for the way the data is organized (and different semantics), essentially their purpose is the same. + +So let’s start from the bottom; these roughly are: + +| SQL | {{es}} | Description | +| --- | --- | --- | +| `column` | `field` | In both cases, at the lowest level, data is stored in *named* entries, of a variety of [data types](sql-data-types.md), containing *one* value. SQL calls such an entry a *column* while {{es}} a *field*.Notice that in {{es}} a field can contain *multiple* values of the same type (essentially a list) while in SQL, a *column* can contain *exactly* one value of said type.Elasticsearch SQL will do its best to preserve the SQL semantic and, depending on the query, reject those that return fields with more than one value. | +| `row` | `document` | `Column`s and `field`s do *not* exist by themselves; they are part of a `row` or a `document`. The two have slightly different semantics: a `row` tends to be *strict* (and have more enforcements) while a `document` tends to be a bit more flexible or loose (while still having a structure). | +| `table` | `index` | The target against which queries, whether in SQL or {{es}} get executed against. | +| `schema` | *implicit* | In RDBMS, `schema` is mainly a namespace of tables and typically used as a security boundary. {{es}} does not provide an equivalent concept for it. 
However, when security is enabled, {{es}} automatically applies the security enforcement so that a role sees only the data it is allowed to (in SQL jargon, its *schema*). |
+| `catalog` or `database` | `cluster` instance | In SQL, `catalog` or `database` are used interchangeably and represent a set of schemas, that is, a number of tables. In {{es}} the set of indices available is grouped in a `cluster`. The semantics also differ a bit; a `database` is essentially yet another namespace (which can have some implications on the way data is stored) while an {{es}} `cluster` is a runtime instance, or rather a set of at least one {{es}} instance (typically running distributed). In practice this means that while in SQL one can potentially have multiple catalogs inside an instance, in {{es}} one is restricted to only *one*. |
+| `cluster` | `cluster` (federated) | Traditionally in SQL, *cluster* refers to a single RDBMS instance which contains a number of `catalog`s or `database`s (see above). The same word can be reused inside {{es}} as well, however its semantics are clarified a bit. While RDBMS tend to have only one running instance, on a single machine (*not* distributed), {{es}} goes the opposite way and, by default, is distributed and multi-instance. Furthermore, an {{es}} `cluster` can be connected to other `cluster`s in a *federated* fashion, so `cluster` means: single cluster:: Multiple {{es}} instances typically distributed across machines, running within the same namespace. multiple clusters:: Multiple clusters, each with its own namespace, connected to each other in a federated setup (see [{{ccs-cap}}](../../../solutions/search/cross-cluster-search.md)). |
+
+As one can see, while the mapping between the concepts is not exactly one to one and the semantics somewhat different, there are more things in common than differences. In fact, thanks to SQL's declarative nature, many concepts can move across {{es}} transparently and the terminology of the two is likely to be used interchangeably throughout the rest of the material.
+
+
+
diff --git a/explore-analyze/query-filter/languages/sql-functions-datetime.md b/explore-analyze/query-filter/languages/sql-functions-datetime.md
index 0873da698..574b50d07 100644
--- a/explore-analyze/query-filter/languages/sql-functions-datetime.md
+++ b/explore-analyze/query-filter/languages/sql-functions-datetime.md
@@ -1038,7 +1038,7 @@ TO_CHAR(

 **Output**: string

-**Description**: Returns the date/datetime/time as a string using the format specified in the 2nd argument. The formatting pattern conforms to [PostgreSQL Template Patterns for Date/Time Formatting](https://www.postgresql.org/docs/13/functions-formatting.md).
+**Description**: Returns the date/datetime/time as a string using the format specified in the 2nd argument. The formatting pattern conforms to [PostgreSQL Template Patterns for Date/Time Formatting](https://www.postgresql.org/docs/13/functions-formatting.html).

 ::::{note}
 If the 1st argument is of type `time`, then the pattern specified by the 2nd argument cannot contain date-related units (e.g. *dd*, *MM*, *YYYY*, etc.). If it contains such units, an error is returned.
The result of the patterns `TZ` and `tz` (time zone abbreviations) in some cases differs from the results returned by `TO_CHAR` in PostgreSQL. The reason is that the time zone abbreviations specified by the JDK are different from the ones specified by PostgreSQL. This function might show an actual time zone abbreviation instead of the generic `LMT` or empty string or offset returned by the PostgreSQL implementation. The summer/daylight markers might also differ between the two implementations (e.g. it will show `HT` instead of `HST` for Hawaii).
The `FX`, `TM`, `SP` pattern modifiers are not supported and will show up as `FX`, `TM`, `SP` literals in the output. diff --git a/explore-analyze/query-filter/languages/sql-getting-started.md b/explore-analyze/query-filter/languages/sql-getting-started.md index 068dc56ad..e6974afe8 100644 --- a/explore-analyze/query-filter/languages/sql-getting-started.md +++ b/explore-analyze/query-filter/languages/sql-getting-started.md @@ -1,4 +1,5 @@ --- +navigation_title: Getting started mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-getting-started.html --- diff --git a/explore-analyze/query-filter/languages/_api_usage.md b/explore-analyze/query-filter/languages/sql-jdbc-api-usage.md similarity index 100% rename from explore-analyze/query-filter/languages/_api_usage.md rename to explore-analyze/query-filter/languages/sql-jdbc-api-usage.md diff --git a/explore-analyze/query-filter/tools/console.md b/explore-analyze/query-filter/tools/console.md index d1130186a..180b3b196 100644 --- a/explore-analyze/query-filter/tools/console.md +++ b/explore-analyze/query-filter/tools/console.md @@ -1,10 +1,11 @@ --- +navigation_title: Console mapped_urls: - https://www.elastic.co/guide/en/kibana/current/console-kibana.html - https://www.elastic.co/guide/en/cloud-enterprise/current/ece-api-console.html --- -# Console +# Run API requests with Console [console-kibana] % What needs to be done: Refine @@ -19,4 +20,228 @@ mapped_urls: $$$configuring-console$$$ -$$$import-export-console-requests$$$ \ No newline at end of file +$$$import-export-console-requests$$$ + + +**Console** is an interactive UI for sending requests to [{{es}} APIs](https://www.elastic.co/guide/en/elasticsearch/reference/current/rest-apis.html) and [{{kib}} APIs](https://www.elastic.co/docs/api) and viewing their responses. + +:::{image} ../../../images/kibana-console.png +:alt: Console +:class: screenshot +::: + +To go to **Console**, find **Dev Tools** in the navigation menu or use the [global search bar](../../../get-started/the-stack.md#kibana-navigation-search). + +You can also find Console directly on certain Search solution and Elasticsearch serverless project pages, where you can expand it from the footer. This Console, called **Persistent Console**, has the same capabilities and shares the same history as the Console in **Dev Tools**. + +:::{image} ../../../images/kibana-persistent-console.png +:alt: Console +:class: screenshot +::: + + +## Write requests [console-api] + +**Console** accepts commands in a simplified HTTP request syntax. For example, the following `GET` request calls the {es} `_search` API: + +```js +GET /_search +{ + "query": { + "match_all": {} + } +} +``` + +Here is the equivalent command in cURL: + +```bash +curl -XGET "http://localhost:9200/_search" -d' +{ + "query": { + "match_all": {} + } +}' +``` + +Prepend requests to a {{kib}} API endpoint with `kbn:` + +```bash +GET kbn:/api/index_management/indices +``` + + +### Autocomplete [console-autocomplete] + +When you’re typing a command, **Console** makes context-sensitive suggestions. These suggestions show you the parameters for each API and speed up your typing. + +You can configure your preferences for autocomplete in the [Console settings](../../../explore-analyze/query-filter/tools/console.md#configuring-console). + + +### Comments [console-comments] + +You can write comments or temporarily disable parts of a request by using double forward slashes or pound signs to create single-line comments. 
+ +```js +# This request searches all of your indices. +GET /_search +{ + // The query parameter indicates query context. + "query": { + "match_all": {} // Matches all documents. + } +} +``` + +You can also use a forward slash followed by an asterisk to mark the beginning of multi-line comments. An asterisk followed by a forward slash marks the end. + +```js +GET /_search +{ + "query": { + /*"match_all": { + "boost": 1.2 + }*/ + "match_none": {} + } +} +``` + + +### Variables [console-variables] + +Click **Variables** to create, edit, and delete variables. + +:::{image} ../../../images/kibana-variables.png +:alt: Variables +:class: screenshot +::: + +You can refer to these variables in the paths and bodies of your requests. Each variable can be referenced multiple times. + +```js +GET ${pathVariable} +{ + "query": { + "match": { + "${bodyNameVariable}": "${bodyValueVariable}" + } + } +} +``` + +By default, variables in the body may be substituted as a boolean, number, array, or object by removing nearby quotes instead of a string with surrounding quotes. Triple quotes overwrite this default behavior and enforce simple replacement as a string. + +```js +GET /locations/_search +{ + "query": { + "bool": { + "must": { + "match": { + // ${shopName} shall be replaced as a string if the variable exists. + "shop.name": """${shopName}""" + } + }, + "filter": { + "geo_distance": { + "distance": "12km", + // "${pinLocation}" may be substituted with an array such as [-70, 40]. + "pin.location": "${pinLocation}" + } + } + } + } +} +``` + + +### Auto-formatting [auto-formatting] + +The auto-formatting capability can help you format requests to be more readable. Select one or more requests that you want to format, open the contextual menu, and then select **Auto indent**. + + +### Keyboard shortcuts [keyboard-shortcuts] + +Go to line number +: `Ctrl/Cmd` + `L` + +Auto-indent current request +: `Ctrl/Cmd` + `I` + +Jump to next request end +: `Ctrl/Cmd` + `↓` + +Jump to previous request end +: `Ctrl/Cmd` + `↑` + +Open documentation for current request +: `Ctrl/Cmd` + `/` + +Run current request +: `Ctrl/Cmd` + `Enter` + +Apply current or topmost term in autocomplete menu +: `Enter` or `Tab` + +Close autocomplete menu +: `Esc` + +Navigate items in autocomplete menu +: `↓` + `↑` + + +### View API docs [console-view-api] + +To view the documentation for an API endpoint, select the request, then open the contextual menu and select **Open API reference**. + + +## Run requests [console-request] + +When you’re ready to run a request, select the request, and click the play button. + +The result of the request execution is displayed in the response panel, where you can see: + +* the JSON response +* the HTTP status code corresponding to the request +* The execution time, in ms. + +::::{tip} +You can select multiple requests and submit them together. **Console** executes the requests one by one. Submitting multiple requests is helpful when you’re debugging an issue or trying query combinations in multiple scenarios. +:::: + + + +## Import and export requests [import-export-console-requests] + +You can export requests: + +* **to a TXT file**, by using the **Export requests** button. When using this method, all content of the input panel is copied, including comments, requests, and payloads. All of the formatting is preserved and allows you to re-import the file later, or to a different environment, using the **Import requests** button. 
+
+    ::::{tip}
+    When importing a TXT file containing Console requests, the current content of the input panel is replaced. Export it first if you don’t want to lose it, or find it in the **History** tab if you already ran the requests.
+    ::::
+
+* by copying them individually as **curl**, **JavaScript**, or **Python**. To do this, select a request, then open the contextual menu and select **Copy as**. When using this action, requests are copied individually to your clipboard. You can save your favorite language to make the copy action faster the next time you use it.
+
+    When running copied requests from an external environment, you’ll need to add [authentication information](https://www.elastic.co/docs/api/doc/kibana/authentication) to the request.
+
+
+
+## Get your request history [console-history]
+
+**Console** maintains a list of the last 500 requests that you tried to execute. To view them, open the **History** tab.
+
+You can run a request from your history again by selecting the request and clicking **Add and run**. If you want to add it back to the Console input panel without running it yet, click **Add** instead. It is added to the editor at the current cursor position.
+
+
+## Configure Console settings [configuring-console]
+
+Go to the **Config** tab of **Console** to customize its display, autocomplete, and accessibility settings.
+
+
+## Disable Console [disable-console]
+
+If you don’t want to use **Console**, you can disable it by setting `console.ui.enabled` to `false` in your `kibana.yml` configuration file. Changing this setting causes the server to regenerate assets on the next startup, which might cause a delay before pages start being served.
+
+You can also choose to only disable the persistent console that shows in the footer of several Kibana pages. To do that, go to **Stack Management** > **Advanced Settings**, and turn off the `devTools:enablePersistentConsole` setting.
diff --git a/explore-analyze/query-filter/tools/saved-queries.md b/explore-analyze/query-filter/tools/saved-queries.md
index 2d95506ff..3faca9c41 100644
--- a/explore-analyze/query-filter/tools/saved-queries.md
+++ b/explore-analyze/query-filter/tools/saved-queries.md
@@ -1,14 +1,37 @@
 ---
 mapped_urls:
-  - https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyze.html
   - https://www.elastic.co/guide/en/kibana/current/save-load-delete-query.html
 ---

-# Saved queries
+# Saved queries [save-load-delete-query]

 % What needs to be done: Refine

 % Use migrated content from existing pages that map to this page:

 % - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/search-analyze.md
-% - [ ] ./raw-migrated-files/kibana/kibana/save-load-delete-query.md
\ No newline at end of file
+% - [ ] ./raw-migrated-files/kibana/kibana/save-load-delete-query.md
+
+Have you ever built a query that you wanted to reuse? With saved queries, you can save your query text, filters, and time range for reuse anywhere a query bar is present.
+
+For example, suppose you’re in **Discover**, and you’ve put time into building a query that includes query input text, multiple filters, and a specific time range. Save this query, and you can embed the search results in dashboards, use them as a foundation for building a visualization, and share them in a link or CSV form.
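As a concrete illustration, a KQL expression like the following, saved together with its filters and a *Last 7 days* time range, can be reloaded with one click wherever the query bar appears (the field names are hypothetical ECS-style examples):

```txt
http.response.status_code >= 500 and service.name : "checkout"
```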
You can also choose to only disable the persistent console that shows in the footer of several Kibana pages. To do that, go to **Stack Management** > **Advanced Settings**, and turn off the `devTools:enablePersistentConsole` setting.
diff --git a/explore-analyze/query-filter/tools/saved-queries.md b/explore-analyze/query-filter/tools/saved-queries.md
index 2d95506ff..3faca9c41 100644
--- a/explore-analyze/query-filter/tools/saved-queries.md
+++ b/explore-analyze/query-filter/tools/saved-queries.md
@@ -1,14 +1,37 @@
 ---
 mapped_urls:
-  - https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyze.html
   - https://www.elastic.co/guide/en/kibana/current/save-load-delete-query.html
 ---
 
-# Saved queries
+# Saved queries [save-load-delete-query]
 
 % What needs to be done: Refine
 
 % Use migrated content from existing pages that map to this page:
 
 % - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/search-analyze.md
-% - [ ] ./raw-migrated-files/kibana/kibana/save-load-delete-query.md
\ No newline at end of file
+% - [ ] ./raw-migrated-files/kibana/kibana/save-load-delete-query.md
+
+Have you ever built a query that you wanted to reuse? With saved queries, you can save your query text, filters, and time range for reuse anywhere a query bar is present.
+
+For example, suppose you’re in **Discover**, and you’ve put time into building a query that includes query input text, multiple filters, and a specific time range. Save this query, and you can embed the search results in dashboards, use them as a foundation for building a visualization, and share them in a link or CSV form.
+
+Saved queries are different from [saved Discover sessions](/explore-analyze/discover/save-open-search.md), which include the **Discover** configuration—selected columns in the document table, sort order, and {{data-source}}—in addition to the query. Discover sessions are primarily used for adding search results to a dashboard.
+
+## Saved query access [_saved_query_access]
+
+If you have insufficient privileges to manage saved queries, you will be unable to load or save queries from the saved query management popover. For more information, see [Granting access to Kibana](../../../deploy-manage/users-roles/cluster-or-deployment-auth/built-in-roles.md).
+
+
+## Save a query [_save_a_query]
+
+1. Once you’ve built a query worth saving, click the save query icon ![save query icon](../../../images/kibana-saved-query-icon.png "").
+2. In the menu, select the item to save the query.
+3. Enter a unique name.
+4. Choose whether to include or exclude filters and a time range. By default, filters are automatically included, but the time filter is not.
+5. Save the query.
+6. To load a saved query, select it in the **Saved query** menu.
+
+    The query text, filters, and time range are updated and your data is refreshed. If you’re loading a saved query that did not include the filters or time range, those components remain as-is.
+
+7. To add filters and clear saved queries, use the **Saved query** menu.
\ No newline at end of file
diff --git a/explore-analyze/query-filter/tools/search-profiler.md b/explore-analyze/query-filter/tools/search-profiler.md
index 610ec2124..655493279 100644
--- a/explore-analyze/query-filter/tools/search-profiler.md
+++ b/explore-analyze/query-filter/tools/search-profiler.md
@@ -12,7 +12,7 @@ The **{{searchprofiler}}** tool can transform this JSON output into a visualizat
 
 ## Get started [search-profiler-getting-started]
 
-1. Find the **{{searchprofiler}}** by navigating to the **Developer tools** page using the navigation menu or the [global search field](../../../get-started/the-stack.md#kibana-navigation-search).
+Find the **{{searchprofiler}}** by navigating to the **Developer tools** page using the navigation menu or the [global search field](../../../get-started/the-stack.md#kibana-navigation-search).
 
 **{{searchprofiler}}** displays the names of the indices searched, the shards in each index, and how long it took for the query to complete. To try it out, replace the default `match_all` query with the query you want to profile, and then click **Profile**.
diff --git a/explore-analyze/scripting/modules-scripting-security.md b/explore-analyze/scripting/modules-scripting-security.md
index bb2cfe71c..47040180c 100644
--- a/explore-analyze/scripting/modules-scripting-security.md
+++ b/explore-analyze/scripting/modules-scripting-security.md
@@ -9,7 +9,7 @@ Painless and {{es}} implement layers of security to build a defense in depth str
 
 Painless uses a fine-grained allowlist. Anything that is not part of the allowlist results in a compilation error. This capability is the first layer of security in a defense in depth strategy for scripting.
 
-The second layer of security is the [Java Security Manager](https://www.oracle.com/java/technologies/javase/seccodeguide.md). As part of its startup sequence, {{es}} enables the Java Security Manager to limit the actions that portions of the code can take. [Painless](modules-scripting-painless.md) uses the Java Security Manager as an additional layer of defense to prevent scripts from doing things like writing files and listening to sockets.
+The second layer of security is the [Java Security Manager](https://www.oracle.com/java/technologies/javase/seccodeguide.html). As part of its startup sequence, {{es}} enables the Java Security Manager to limit the actions that portions of the code can take. [Painless](modules-scripting-painless.md) uses the Java Security Manager as an additional layer of defense to prevent scripts from doing things like writing files and listening to sockets.
 
 {{es}} uses [seccomp](https://en.wikipedia.org/wiki/Seccomp) in Linux, [Seatbelt](https://www.chromium.org/developers/design-documents/sandbox/osx-sandboxing-design) in macOS, and [ActiveProcessLimit](https://msdn.microsoft.com/en-us/library/windows/desktop/ms684147) on Windows as additional security layers to prevent {{es}} from forking or running other processes.
 
diff --git a/explore-analyze/toc.yml b/explore-analyze/toc.yml
index 63bbfd13b..2ad008c3c 100644
--- a/explore-analyze/toc.yml
+++ b/explore-analyze/toc.yml
@@ -24,8 +24,6 @@ toc:
      - file: query-filter/languages/sql-overview.md
      - file: query-filter/languages/sql-getting-started.md
      - file: query-filter/languages/sql-concepts.md
-        children:
-          - file: query-filter/languages/_mapping_concepts_across_sql_and_elasticsearch.md
      - file: query-filter/languages/sql-security.md
      - file: query-filter/languages/sql-rest.md
        children:
@@ -41,7 +39,7 @@ toc:
      - file: query-filter/languages/sql-cli.md
      - file: query-filter/languages/sql-jdbc.md
        children:
-          - file: query-filter/languages/_api_usage.md
+          - file: query-filter/languages/sql-jdbc-api-usage.md
      - file: query-filter/languages/sql-odbc.md
        children:
          - file: query-filter/languages/sql-odbc-installation.md
diff --git a/explore-analyze/visualize/maps/maps-connect-to-ems.md b/explore-analyze/visualize/maps/maps-connect-to-ems.md
index 988d64ae6..8617d56c7 100644
--- a/explore-analyze/visualize/maps/maps-connect-to-ems.md
+++ b/explore-analyze/visualize/maps/maps-connect-to-ems.md
@@ -556,7 +556,7 @@ If you cannot connect to Elastic Maps Service from the {{kib}} server or browser
 | `ssl.certificateAuthorities` | Paths to one or more PEM-encoded X.509 certificate authority (CA) certificates that make up a trusted certificate chain for {{hosted-ems}}. This chain is used by the {{hosted-ems}} to establish trust when receiving inbound SSL/TLS connections from end users. [Equivalent {{kib}} setting](../../../deploy-manage/deploy/self-managed/configure.md#server-ssl-certificateAuthorities). |
 | `ssl.key`, `ssl.certificate`, and `ssl.keyPassphrase` | Location of your SSL key and certificate files and the password that decrypts the private key that is specified via `ssl.key`. This password is optional, as the key may not be encrypted. [Equivalent {{kib}} setting](../../../deploy-manage/deploy/self-managed/configure.md#server-ssl-cert-key). |
 | `ssl.supportedProtocols` | An array of supported protocols with versions. Valid protocols: `TLSv1`, `TLSv1.1`, `TLSv1.2`. **Default: `TLSv1.1`, `TLSv1.2`**. [Equivalent {{kib}} setting](../../../deploy-manage/deploy/self-managed/configure.md#server-ssl-supportedProtocols). 
|
-| `ssl.cipherSuites` | Details on the format, and the valid options, are available via the[OpenSSL cipher list format documentation](https://www.openssl.org/docs/man1.1.1/man1/ciphers.md#CIPHER-LIST-FORMAT).**Default: `TLS_AES_256_GCM_SHA384 TLS_CHACHA20_POLY1305_SHA256 TLS_AES_128_GCM_SHA256 ECDHE-RSA-AES128-GCM-SHA256, ECDHE-ECDSA-AES128-GCM-SHA256, ECDHE-RSA-AES256-GCM-SHA384, ECDHE-ECDSA-AES256-GCM-SHA384, DHE-RSA-AES128-GCM-SHA256, ECDHE-RSA-AES128-SHA256, DHE-RSA-AES128-SHA256, ECDHE-RSA-AES256-SHA384, DHE-RSA-AES256-SHA384, ECDHE-RSA-AES256-SHA256, DHE-RSA-AES256-SHA256, HIGH,!aNULL, !eNULL, !EXPORT, !DES, !RC4, !MD5, !PSK, !SRP, !CAMELLIA`**. [Equivalent {{kib}} setting](../../../deploy-manage/deploy/self-managed/configure.md#server-ssl-cipherSuites). |
+| `ssl.cipherSuites` | Details on the format, and the valid options, are available via the [OpenSSL cipher list format documentation](https://www.openssl.org/docs/man1.1.1/man1/ciphers.html#CIPHER-LIST-FORMAT). **Default: `TLS_AES_256_GCM_SHA384 TLS_CHACHA20_POLY1305_SHA256 TLS_AES_128_GCM_SHA256 ECDHE-RSA-AES128-GCM-SHA256, ECDHE-ECDSA-AES128-GCM-SHA256, ECDHE-RSA-AES256-GCM-SHA384, ECDHE-ECDSA-AES256-GCM-SHA384, DHE-RSA-AES128-GCM-SHA256, ECDHE-RSA-AES128-SHA256, DHE-RSA-AES128-SHA256, ECDHE-RSA-AES256-SHA384, DHE-RSA-AES256-SHA384, ECDHE-RSA-AES256-SHA256, DHE-RSA-AES256-SHA256, HIGH,!aNULL, !eNULL, !EXPORT, !DES, !RC4, !MD5, !PSK, !SRP, !CAMELLIA`**. [Equivalent {{kib}} setting](../../../deploy-manage/deploy/self-managed/configure.md#server-ssl-cipherSuites). |
 
 
 #### Bind-mounted configuration [elastic-maps-server-bind-mount-config]
diff --git a/explore-analyze/visualize/maps/reverse-geocoding-tutorial.md b/explore-analyze/visualize/maps/reverse-geocoding-tutorial.md
index c93ab7720..6d7dfac37 100644
--- a/explore-analyze/visualize/maps/reverse-geocoding-tutorial.md
+++ b/explore-analyze/visualize/maps/reverse-geocoding-tutorial.md
@@ -44,7 +44,7 @@ CSAs generally share the same telecom providers and ad networks. New fast food f
 To get the CSA boundary data:
 
-1. Go to the [Census Bureau’s website](https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.md) and download the `cb_2018_us_csa_500k.zip` file.
+1. Go to the [Census Bureau’s website](https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html) and download the `cb_2018_us_csa_500k.zip` file.
 2. Uncompress the zip file.
 3. In Kibana, go to **Maps**.
 4. Click **Create map**.
diff --git a/raw-migrated-files/kibana/kibana/save-load-delete-query.md b/raw-migrated-files/kibana/kibana/save-load-delete-query.md
deleted file mode 100644
index 4f9469a20..000000000
--- a/raw-migrated-files/kibana/kibana/save-load-delete-query.md
+++ /dev/null
@@ -1,27 +0,0 @@
-# Save a query [save-load-delete-query]
-
-Have you ever built a query that you wanted to reuse? With saved queries, you can save your query text, filters, and time range for reuse anywhere a query bar is present.
-
-For example, suppose you’re in **Discover**, and you’ve put time into building a query that includes query input text, multiple filters, and a specific time range. Save this query, and you can embed the search results in dashboards, use them as a foundation for building a visualization, and share them in a link or CVS form. 
- -Saved queries are different than [saved Discover sessions](../../../explore-analyze/discover/save-open-search.md), which include the **Discover** configuration—selected columns in the document table, sort order, and {{data-source}}—in addition to the query. Discover sessions are primarily used for adding search results to a dashboard. - -## Saved query access [_saved_query_access] - -If you have insufficient privileges to manage saved queries, you will be unable to load or save queries from the saved query management popover. For more information, see [Granting access to Kibana](../../../deploy-manage/users-roles/cluster-or-deployment-auth/built-in-roles.md) - - -## Save a query [_save_a_query] - -1. Once you’ve built a query worth saving, click the save query icon ![save query icon](../../../images/kibana-saved-query-icon.png ""). -2. In the menu, select the item to save the query. -3. Enter a unique name. -4. Choose whether to include or exclude filters and a time range. By default, filters are automatically included, but the time filter is not. -5. Save the query. -6. To load a saved query, select it in the **Saved query** menu. - - The query text, filters, and time range are updated and your data refreshed. If you’re loading a saved query that did not include the filters or time range, those components remain as-is. - -7. To add filters and clear saved queries, use the **Saved query** menu. - - diff --git a/raw-migrated-files/toc.yml b/raw-migrated-files/toc.yml index 67f4bb0ac..ab854a034 100644 --- a/raw-migrated-files/toc.yml +++ b/raw-migrated-files/toc.yml @@ -705,7 +705,6 @@ toc: - file: kibana/kibana/reporting-production-considerations.md - file: kibana/kibana/role-mappings.md - file: kibana/kibana/sample-data.md - - file: kibana/kibana/save-load-delete-query.md - file: kibana/kibana/saved-object-ids.md - file: kibana/kibana/search-ai-assistant.md - file: kibana/kibana/secure-reporting.md From b6fb95e8b937e67790743e44d81352da5e27841a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Edu=20Gonz=C3=A1lez=20de=20la=20Herr=C3=A1n?= <25320357+eedugon@users.noreply.github.com> Date: Wed, 5 Feb 2025 11:37:33 +0100 Subject: [PATCH 06/15] Deploy and manage -> Monitoring post migration initial cleaning (#314) * applies added to all monitoring files * renaming monitoing files stage 1 * fixed frontmatter in two files * extra files renamed in monitoring * heroku documents deleted * heroku URLs added to relevant mappings * cleanup * fix anchor * ga identifiers for versions pre 9.x removed * duplicate overview resolved * some navigation titles updated --------- Co-authored-by: Brandon Morelli Co-authored-by: shainaraskas --- ...ud-hosted-deployment-billing-dimensions.md | 2 +- .../ece-configuring-ece-create-templates.md | 2 +- .../system-deployments-configuration.md | 2 +- .../cloud-on-k8s/manage-compute-resources.md | 2 +- .../elastic-cloud/azure-native-isv-service.md | 2 +- .../ec-customize-deployment-components.md | 2 +- .../ech-customize-deployment-components.md | 2 +- .../deploy/elastic-cloud/ech-restrictions.md | 4 +- .../elastic-cloud/manage-deployments.md | 2 +- .../restrictions-known-problems.md | 2 +- deploy-manage/deploy/self-managed/access.md | 2 +- .../deploy/self-managed/configure.md | 6 +- deploy-manage/monitor.md | 6 + deploy-manage/monitor/autoops.md | 2 + .../autoops/ec-autoops-deployment-view.md | 2 + .../autoops/ec-autoops-dismiss-event.md | 2 + .../autoops/ec-autoops-event-settings.md | 2 + .../monitor/autoops/ec-autoops-events.md | 2 + .../monitor/autoops/ec-autoops-faq.md | 2 + 
.../autoops/ec-autoops-how-to-access.md | 2 + .../monitor/autoops/ec-autoops-index-view.md | 2 + .../monitor/autoops/ec-autoops-nodes-view.md | 2 + .../ec-autoops-notifications-settings.md | 2 + .../autoops/ec-autoops-overview-view.md | 2 + .../monitor/autoops/ec-autoops-regions.md | 2 + .../monitor/autoops/ec-autoops-shards-view.md | 2 + .../autoops/ec-autoops-template-optimizer.md | 2 + .../kibana-task-manager-health-monitoring.md | 4 +- .../monitor/logging-configuration.md | 7 + .../auditing-search-queries.md | 5 + ...elating-kibana-elasticsearch-audit-logs.md | 5 + .../elasticsearch-audit-events.md | 5 + .../elasticsearch-deprecation-logs.md | 5 + ...search-log4j-configuration-self-managed.md | 4 +- ...-audit-logs-in-orchestrated-deployments.md | 4 + .../enabling-elasticsearch-audit-logs.md | 5 + .../enabling-kibana-audit-logs.md | 5 + ...les.md => kibana-log-settings-examples.md} | 2 + ...md => kibana-logging-cli-configuration.md} | 2 + .../logging-configuration/kibana-logging.md | 7 +- .../logfile-audit-events-ignore-policies.md | 5 + .../logfile-audit-output.md | 7 + .../security-event-audit-logging.md | 7 + .../update-elasticsearch-logging-levels.md | 5 + deploy-manage/monitor/monitoring-data.md | 11 +- ...ss-performance-metrics-on-elastic-cloud.md | 2 + .../monitor/monitoring-data/beats-page.md | 5 + ...g-monitoring-data-streams-elastic-agent.md | 5 + ...ig-monitoring-data-streams-metricbeat-8.md | 5 + ...ndices-metricbeat-7-internal-collection.md | 5 + .../configure-stack-monitoring-alerts.md | 7 +- ...ring-data-streamsindices-for-monitoring.md | 5 + .../monitoring-data/ec-memory-pressure.md | 4 + .../ec-saas-metrics-accessing.md | 7 +- .../monitoring-data/ec-vcpu-boost-instance.md | 3 + .../monitoring-data/ech-memory-pressure.md | 35 ----- .../ech-saas-metrics-accessing.md | 127 ------------------ .../ech-vcpu-boost-instance.md | 49 ------- .../monitoring-data/elasticsearch-metrics.md | 5 + .../monitor/monitoring-data/kibana-alerts.md | 7 + .../monitor/monitoring-data/kibana-page.md | 5 + .../monitor/monitoring-data/logstash-page.md | 5 + .../monitor-troubleshooting.md | 3 + .../visualizing-monitoring-data.md | 5 + deploy-manage/monitor/monitoring-overview.md | 7 - deploy-manage/monitor/orchestrators.md | 6 + .../ece-monitoring-ece-access.md | 2 + .../ece-monitoring-ece-set-retention.md | 2 + .../orchestrators/ece-platform-monitoring.md | 4 +- .../orchestrators/ece-proxy-log-fields.md | 2 + .../eck-metrics-configuration.md | 2 + .../k8s-enabling-metrics-endpoint.md | 2 + .../k8s-prometheus-requirements.md | 2 + .../k8s-securing-metrics-endpoint.md | 2 + deploy-manage/monitor/stack-monitoring.md | 5 + .../collecting-log-data-with-filebeat.md | 4 +- ...ting-monitoring-data-with-elastic-agent.md | 6 +- ...lecting-monitoring-data-with-metricbeat.md | 6 +- .../ece-restrictions-monitoring.md | 2 + ...deployments.md => ece-stack-monitoring.md} | 9 +- ...deployments.md => eck-stack-monitoring.md} | 3 + ...s.md => elastic-cloud-stack-monitoring.md} | 3 + .../elasticsearch-monitoring-self-managed.md | 3 + .../{http-exporter.md => es-http-exporter.md} | 2 + ...ods.md => es-legacy-collection-methods.md} | 6 +- ...local-exporter.md => es-local-exporter.md} | 2 + .../es-monitoring-collectors.md | 2 + .../es-monitoring-exporters.md | 8 +- .../{pause-export.md => es-pause-export.md} | 2 + .../stack-monitoring/k8s_audit_logging.md | 2 + ...ternal_monitoring_elasticsearch_cluster.md | 3 + .../stack-monitoring/k8s_how_it_works.md | 2 + .../k8s_override_the_beats_pod_template.md | 2 + 
.../stack-monitoring/k8s_when_to_use_it.md | 2 + ...ring-data.md => kibana-monitoring-data.md} | 8 +- ....md => kibana-monitoring-elastic-agent.md} | 6 +- ...-kibana.md => kibana-monitoring-legacy.md} | 6 +- ...eat.md => kibana-monitoring-metricbeat.md} | 6 +- .../kibana-monitoring-self-managed.md | 12 +- deploy-manage/toc.yml | 31 ++--- .../manage-users-roles.md | 2 +- .../cloud-on-k8s/k8s-advanced-topics.md | 2 +- .../ece-monitoring-deployments.md | 2 +- .../cloud-heroku/ech-add-user-settings.md | 2 +- .../cloud-heroku/ech-config-change-errors.md | 4 +- .../cloud/cloud-heroku/ech-configure.md | 2 +- .../ech-enable-logging-and-monitoring.md | 10 +- .../cloud-heroku/ech-manage-apm-settings.md | 2 +- .../ech-manage-kibana-settings.md | 2 +- .../cloud-heroku/ech-monitoring-setup.md | 4 +- .../cloud/cloud-heroku/ech-monitoring.md | 18 +-- .../cloud/cloud-heroku/ech-planning.md | 2 +- .../ech-saas-metrics-accessing.md | 4 +- .../echscenario_why_is_my_node_unavailable.md | 2 +- .../cloud/cloud/ec-add-user-settings.md | 2 +- .../cloud/cloud/ec-config-change-errors.md | 2 +- .../cloud/ec-enable-logging-and-monitoring.md | 10 +- .../cloud/cloud/ec-manage-apm-settings.md | 2 +- .../cloud/cloud/ec-manage-kibana-settings.md | 2 +- .../cloud/cloud/ec-monitoring-setup.md | 2 +- .../cloud/cloud/ec-monitoring.md | 2 +- .../cloud/cloud/ec-prepare-production.md | 2 +- .../cloud/cloud/ec-saas-metrics-accessing.md | 4 +- .../ec-scenario_why_are_shards_unavailable.md | 2 +- .../ec-scenario_why_is_my_node_unavailable.md | 2 +- .../how-monitoring-works.md | 2 +- .../monitor-elasticsearch-cluster.md | 2 +- .../monitoring-production.md | 10 +- .../kibana/kibana/logging-settings.md | 6 +- ...upgrade-elastic-stack-for-elastic-cloud.md | 2 +- .../observability/apps/monitor-apm-server.md | 2 +- .../apps/monitor-fleet-managed-apm-server.md | 2 +- ...rnal-collection-to-send-monitoring-data.md | 2 +- .../use-metricbeat-to-send-monitoring-data.md | 2 +- troubleshoot/elasticsearch/high-cpu-usage.md | 2 +- troubleshoot/kibana/access.md | 2 +- troubleshoot/kibana/error-server-not-ready.md | 2 +- 137 files changed, 401 insertions(+), 347 deletions(-) rename deploy-manage/monitor/logging-configuration/{log-settings-examples.md => kibana-log-settings-examples.md} (99%) rename deploy-manage/monitor/logging-configuration/{_cli_configuration.md => kibana-logging-cli-configuration.md} (97%) delete mode 100644 deploy-manage/monitor/monitoring-data/ech-memory-pressure.md delete mode 100644 deploy-manage/monitor/monitoring-data/ech-saas-metrics-accessing.md delete mode 100644 deploy-manage/monitor/monitoring-data/ech-vcpu-boost-instance.md delete mode 100644 deploy-manage/monitor/monitoring-overview.md rename deploy-manage/monitor/stack-monitoring/{enable-stack-monitoring-on-ece-deployments.md => ece-stack-monitoring.md} (95%) rename deploy-manage/monitor/stack-monitoring/{enable-stack-monitoring-on-eck-deployments.md => eck-stack-monitoring.md} (97%) rename deploy-manage/monitor/stack-monitoring/{stack-monitoring-on-elastic-cloud-deployments.md => elastic-cloud-stack-monitoring.md} (96%) rename deploy-manage/monitor/stack-monitoring/{http-exporter.md => es-http-exporter.md} (99%) rename deploy-manage/monitor/stack-monitoring/{legacy-collection-methods.md => es-legacy-collection-methods.md} (97%) rename deploy-manage/monitor/stack-monitoring/{local-exporter.md => es-local-exporter.md} (99%) rename deploy-manage/monitor/stack-monitoring/{pause-export.md => es-pause-export.md} (97%) rename 
deploy-manage/monitor/stack-monitoring/{monitoring-data.md => kibana-monitoring-data.md} (96%) rename deploy-manage/monitor/stack-monitoring/{monitoring-elastic-agent.md => kibana-monitoring-elastic-agent.md} (90%) rename deploy-manage/monitor/stack-monitoring/{monitoring-kibana.md => kibana-monitoring-legacy.md} (94%) rename deploy-manage/monitor/stack-monitoring/{monitoring-metricbeat.md => kibana-monitoring-metricbeat.md} (97%) diff --git a/deploy-manage/cloud-organization/billing/cloud-hosted-deployment-billing-dimensions.md b/deploy-manage/cloud-organization/billing/cloud-hosted-deployment-billing-dimensions.md index df2eb4eb0..23a8c7369 100644 --- a/deploy-manage/cloud-organization/billing/cloud-hosted-deployment-billing-dimensions.md +++ b/deploy-manage/cloud-organization/billing/cloud-hosted-deployment-billing-dimensions.md @@ -57,7 +57,7 @@ Data transfer out of deployments and between nodes of the cluster is hard to con The largest contributor to inter-node data transfer is usually shard movement between nodes in a cluster. The only way to prevent shard movement is by having a single node in a single availability zone. This solution is only possible for clusters up to 64GB RAM and is not recommended as it creates a risk of data loss. [Oversharding](https://www.elastic.co/guide/en/elasticsearch/reference/current/avoid-oversharding.html) can cause excessive shard movement. Avoiding oversharding can also help control costs and improve performance. Note that creating snapshots generates inter-node data transfer. The *storage* cost of snapshots is detailed later in this document. -The exact root cause of unusual data transfer is not always something we can identify as it can have many causes, some of which are out of our control and not associated with Cloud configuration changes. It may help to [enable monitoring](../../monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) and examine index and shard activity on your cluster. +The exact root cause of unusual data transfer is not always something we can identify as it can have many causes, some of which are out of our control and not associated with Cloud configuration changes. It may help to [enable monitoring](../../monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) and examine index and shard activity on your cluster. ## Storage [storage] diff --git a/deploy-manage/deploy/cloud-enterprise/ece-configuring-ece-create-templates.md b/deploy-manage/deploy/cloud-enterprise/ece-configuring-ece-create-templates.md index 6924741d5..dfa651cea 100644 --- a/deploy-manage/deploy/cloud-enterprise/ece-configuring-ece-create-templates.md +++ b/deploy-manage/deploy/cloud-enterprise/ece-configuring-ece-create-templates.md @@ -72,7 +72,7 @@ Before you start creating your own deployment templates, you should have: [tagge 9. On this page you can [configure index management](ece-configure-templates-index-management.md) by assigning attributes to each of the data nodes in the deployment template. In Kibana, you can configure an index lifecycle management (ILM) policy, based on the node attributes, to control how data moves across the nodes in your deployment. 10. Select **Stack features**. 11. You can select a [snapshot repository](../../tools/snapshot-and-restore/cloud-enterprise.md) to be used by default for deployment backups. -12. 
You can choose to [enable logging and monitoring](../../monitor/stack-monitoring/enable-stack-monitoring-on-ece-deployments.md) by default, so that deployment logs and metrics are send to a dedicated monitoring deployment, and so that additional log types, retention options, and Kibana visualizations are available on all deployments created using this template.
+12. You can choose to [enable logging and monitoring](../../monitor/stack-monitoring/ece-stack-monitoring.md) by default, so that deployment logs and metrics are sent to a dedicated monitoring deployment, and so that additional log types, retention options, and Kibana visualizations are available on all deployments created using this template.
 13. Select **Extensions**.
 14. Select any Elasticsearch extensions that you would like to be available automatically to all deployments created using the template.
 15. Select **Save and create template**.
diff --git a/deploy-manage/deploy/cloud-enterprise/system-deployments-configuration.md b/deploy-manage/deploy/cloud-enterprise/system-deployments-configuration.md
index 7c45c2c23..d76a85d1a 100644
--- a/deploy-manage/deploy/cloud-enterprise/system-deployments-configuration.md
+++ b/deploy-manage/deploy/cloud-enterprise/system-deployments-configuration.md
@@ -70,7 +70,7 @@ In the case of the `admin-console-elasticsearch` and `security` system deployments,
 
 The `logging-and-metrics` cluster is different since, as an ECE admin, you likely want to provide users with access to the cluster in order to troubleshoot issues without your assistance, for example. In order to manage access to that cluster, you can configure roles that will provide access to the relevant indices, map those to users, and manage access to Kibana by leveraging the Elastic security integration with your authentication provider, such as LDAP, SAML, or AD. To configure one of those security realms, check [LDAP](../../users-roles/cluster-or-deployment-auth/ldap.md), [Active Directory](../../users-roles/cluster-or-deployment-auth/active-directory.md) or [SAML](../../users-roles/cluster-or-deployment-auth/saml.md).
 
 ::::{note}
-The `logging-and-metrics` cluster is only intended for troubleshooting ECE deployment issues. If your use case involves modifying or normalizing logs from {{es}} or {{kib}}, use a separate [dedicated monitoring deployment](../../monitor/stack-monitoring/enable-stack-monitoring-on-ece-deployments.md) instead.
+The `logging-and-metrics` cluster is only intended for troubleshooting ECE deployment issues. If your use case involves modifying or normalizing logs from {{es}} or {{kib}}, use a separate [dedicated monitoring deployment](../../monitor/stack-monitoring/ece-stack-monitoring.md) instead.
 ::::
 
diff --git a/deploy-manage/deploy/cloud-on-k8s/manage-compute-resources.md b/deploy-manage/deploy/cloud-on-k8s/manage-compute-resources.md
index 1bf9a189d..544cb4ac2 100644
--- a/deploy-manage/deploy/cloud-on-k8s/manage-compute-resources.md
+++ b/deploy-manage/deploy/cloud-on-k8s/manage-compute-resources.md
@@ -329,7 +329,7 @@ To avoid this, explicitly define the requests and limits mandated by your environment.
 
 #### Monitoring Elasticsearch CPU using Stack Monitoring [k8s-monitor-compute-resources-stack-monitoring]
 
-If [Stack Monitoring](../../monitor/stack-monitoring/enable-stack-monitoring-on-eck-deployments.md) is enabled, the pressure applied by the CPU cgroup controller to an Elasticsearch node can be evaluated from the **Stack Monitoring** page in Kibana. 
+If [Stack Monitoring](../../monitor/stack-monitoring/eck-stack-monitoring.md) is enabled, the pressure applied by the CPU cgroup controller to an Elasticsearch node can be evaluated from the **Stack Monitoring** page in Kibana. 1. On the **Stack Monitoring** page select the Elasticsearch node you want to monitor. 2. Select the **Advanced** tab. diff --git a/deploy-manage/deploy/elastic-cloud/azure-native-isv-service.md b/deploy-manage/deploy/elastic-cloud/azure-native-isv-service.md index abbd39da9..a1b18bcaf 100644 --- a/deploy-manage/deploy/elastic-cloud/azure-native-isv-service.md +++ b/deploy-manage/deploy/elastic-cloud/azure-native-isv-service.md @@ -324,7 +324,7 @@ $$$azure-integration-modify-deployment$$$How can I modify my {{ecloud}} deployme * [Update {{stack}} user settings](edit-stack-settings.md) in the component YML files. * [Add or remove custom plugins](add-plugins-extensions.md). * [Configure IP filtering](../../security/traffic-filtering.md). - * [Monitor your {{ecloud}} deployment](../../monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) to ensure it remains healthy. + * [Monitor your {{ecloud}} deployment](../../monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) to ensure it remains healthy. * Add or remove API keys to use the [REST API](https://www.elastic.co/guide/en/cloud/current/ec-restful-api.html). * [And more](cloud-hosted.md) diff --git a/deploy-manage/deploy/elastic-cloud/ec-customize-deployment-components.md b/deploy-manage/deploy/elastic-cloud/ec-customize-deployment-components.md index 01765f415..dca388774 100644 --- a/deploy-manage/deploy/elastic-cloud/ec-customize-deployment-components.md +++ b/deploy-manage/deploy/elastic-cloud/ec-customize-deployment-components.md @@ -15,7 +15,7 @@ Autoscaling reduces some of the manual effort required to manage a deployment by ## {{es}} [ec-cluster-size] -Depending upon how much data you have and what queries you plan to run, you need to select a cluster size that fits your needs. There is no silver bullet for deciding how much memory you need other than simply testing it. The [cluster performance metrics](../../monitor/stack-monitoring.md) in the [Elasticsearch Service Console](https://cloud.elastic.co?page=docs&placement=docs-body) can tell you if your cluster is sized appropriately. You can also [enable deployment monitoring](../../monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) for more detailed performance metrics. Fortunately, you can change the amount of memory allocated to the cluster later without any downtime for HA deployments. +Depending upon how much data you have and what queries you plan to run, you need to select a cluster size that fits your needs. There is no silver bullet for deciding how much memory you need other than simply testing it. The [cluster performance metrics](../../monitor/stack-monitoring.md) in the [Elasticsearch Service Console](https://cloud.elastic.co?page=docs&placement=docs-body) can tell you if your cluster is sized appropriately. You can also [enable deployment monitoring](../../monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) for more detailed performance metrics. Fortunately, you can change the amount of memory allocated to the cluster later without any downtime for HA deployments. To change a cluster’s topology, from deployment management, select **Edit deployment** from the **Actions** dropdown. Next, select a storage and RAM setting from the **Size per zone** drop-down list, and save your changes. 
When downsizing the cluster, make sure to have enough resources to handle the current load, otherwise your cluster will be under stress. diff --git a/deploy-manage/deploy/elastic-cloud/ech-customize-deployment-components.md b/deploy-manage/deploy/elastic-cloud/ech-customize-deployment-components.md index ddb9f7663..7022c679e 100644 --- a/deploy-manage/deploy/elastic-cloud/ech-customize-deployment-components.md +++ b/deploy-manage/deploy/elastic-cloud/ech-customize-deployment-components.md @@ -15,7 +15,7 @@ Autoscaling reduces some of the manual effort required to manage a deployment by ### {{es}} [ech-cluster-size] -Depending upon how much data you have and what queries you plan to run, you need to select a cluster size that fits your needs. There is no silver bullet for deciding how much memory you need other than simply testing it. The [cluster performance metrics](../../monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) in the [Elasticsearch Add-On for Heroku console](https://cloud.elastic.co?page=docs&placement=docs-body) can tell you if your cluster is sized appropriately. You can also [enable deployment monitoring](../../monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) for more detailed performance metrics. Fortunately, you can change the amount of memory allocated to the cluster later without any downtime for HA deployments. +Depending upon how much data you have and what queries you plan to run, you need to select a cluster size that fits your needs. There is no silver bullet for deciding how much memory you need other than simply testing it. The [cluster performance metrics](../../monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) in the [Elasticsearch Add-On for Heroku console](https://cloud.elastic.co?page=docs&placement=docs-body) can tell you if your cluster is sized appropriately. You can also [enable deployment monitoring](../../monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) for more detailed performance metrics. Fortunately, you can change the amount of memory allocated to the cluster later without any downtime for HA deployments. To change a cluster’s topology, from deployment management, select **Edit deployment** from the **Actions** dropdown. Next, select a storage and RAM setting from the **Size per zone** drop-down list, and save your changes. When downsizing the cluster, make sure to have enough resources to handle the current load, otherwise your cluster will be under stress. diff --git a/deploy-manage/deploy/elastic-cloud/ech-restrictions.md b/deploy-manage/deploy/elastic-cloud/ech-restrictions.md index 316c84efd..6c14ecf9b 100644 --- a/deploy-manage/deploy/elastic-cloud/ech-restrictions.md +++ b/deploy-manage/deploy/elastic-cloud/ech-restrictions.md @@ -19,7 +19,7 @@ When using Elasticsearch Add-On for Heroku, there are some limitations you shoul * [Regions and Availability Zones](#ech-regions-and-availability-zone) * [Known problems](#ech-known-problems) -For limitations related to logging and monitoring, check the [Restrictions and limitations](../../monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) section of the logging and monitoring page. +For limitations related to logging and monitoring, check the [Restrictions and limitations](../../monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) section of the logging and monitoring page. 
Occasionally, we also publish information about [Known problems](#ech-known-problems) with our Elasticsearch Add-On for Heroku or the Elastic Stack. @@ -53,7 +53,7 @@ Generally, if a feature is shown as available in the [Elasticsearch Add-On for H * Elasticsearch plugins, are not enabled by default for security purposes. Please reach out to support if you would like to enable Elasticsearch plugins support on your account. * Some Elasticsearch plugins do not apply to Elasticsearch Add-On for Heroku. For example, you won’t ever need to change discovery, as Elasticsearch Add-On for Heroku handles how nodes discover one another. * In Elasticsearch 5.0 and later, site plugins are no longer supported. This change does not affect the site plugins Elasticsearch Add-On for Heroku might provide out of the box, such as Kopf or Head, since these site plugins are serviced by our proxies and not Elasticsearch itself. -* In Elasticsearch 5.0 and later, site plugins such as Kopf and Paramedic are no longer provided. We recommend that you use our [cluster performance metrics](../../monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md), [X-Pack monitoring features](../../monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) and Kibana’s (6.3+) [Index Management UI](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-mgmt.html) if you want more detailed information or perform index management actions. +* In Elasticsearch 5.0 and later, site plugins such as Kopf and Paramedic are no longer provided. We recommend that you use our [cluster performance metrics](../../monitor/stack-monitoring/elastic-cloud-stack-monitoring.md), [X-Pack monitoring features](../../monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) and Kibana’s (6.3+) [Index Management UI](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-mgmt.html) if you want more detailed information or perform index management actions. ## Private Link and SSO to Kibana URLs [ech-restrictions-traffic-filters-kibana-sso] diff --git a/deploy-manage/deploy/elastic-cloud/manage-deployments.md b/deploy-manage/deploy/elastic-cloud/manage-deployments.md index 2bf452927..735a8e32d 100644 --- a/deploy-manage/deploy/elastic-cloud/manage-deployments.md +++ b/deploy-manage/deploy/elastic-cloud/manage-deployments.md @@ -8,7 +8,7 @@ mapped_pages: Sometimes you might need to make changes to the entire deployment, a specific component, or just a single data tier. * Make adjustments to specific deployment components, such as an [Integrations Server](manage-integrations-server.md), [APM & Fleet Server](switch-from-apm-to-integrations-server-payload.md#ec-manage-apm-and-fleet), [Enterprise Search](https://www.elastic.co/guide/en/cloud/current/ec-enable-enterprise-search.html), [Watcher](../../../explore-analyze/alerts/watcher.md), or [Kibana](access-kibana.md#ec-enable-kibana2). -* [Enable logging and monitoring](../../monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) of the deployment performance. +* [Enable logging and monitoring](../../monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) of the deployment performance. * [Disable a data tier](../../../manage-data/lifecycle/index-lifecycle-management.md). * [Restart](../../maintenance/start-stop-services/restart-cloud-hosted-deployment.md), [stop routing](../../maintenance/ece/start-stop-routing-requests.md), or [delete your deployment](../../uninstall/delete-a-cloud-deployment.md). 
* [Upgrade the Elastic Stack version](../../upgrade/deployment-or-cluster.md) for the deployment. diff --git a/deploy-manage/deploy/elastic-cloud/restrictions-known-problems.md b/deploy-manage/deploy/elastic-cloud/restrictions-known-problems.md index 50a541168..c238c7b68 100644 --- a/deploy-manage/deploy/elastic-cloud/restrictions-known-problems.md +++ b/deploy-manage/deploy/elastic-cloud/restrictions-known-problems.md @@ -21,7 +21,7 @@ When using Elasticsearch Service, there are some limitations you should be aware * [Regions and Availability Zones](#ec-regions-and-availability-zone) * [Known problems](#ec-known-problems) -For limitations related to logging and monitoring, check the [Restrictions and limitations](../../monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ec-restrictions-monitoring) section of the logging and monitoring page. +For limitations related to logging and monitoring, check the [Restrictions and limitations](../../monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ec-restrictions-monitoring) section of the logging and monitoring page. Occasionally, we also publish information about [Known problems](#ec-known-problems) with our Elasticsearch Service or the Elastic Stack. diff --git a/deploy-manage/deploy/self-managed/access.md b/deploy-manage/deploy/self-managed/access.md index fe6bf6e92..a493ffe37 100644 --- a/deploy-manage/deploy/self-managed/access.md +++ b/deploy-manage/deploy/self-managed/access.md @@ -67,7 +67,7 @@ Troubleshoot the `Kibana Server is not Ready yet` error. These {{kib}}-backing indices must also not have [index settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-settings.html) flagging `read_only_allow_delete` or `write` [index blocks](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-blocks.html). 3. [Shut down all {{kib}} nodes](../../maintenance/start-stop-services/start-stop-kibana.md). -4. Choose any {{kib}} node, then update the config to set the [debug logging](../../monitor/logging-configuration/log-settings-examples.md#change-overall-log-level). +4. Choose any {{kib}} node, then update the config to set the [debug logging](../../monitor/logging-configuration/kibana-log-settings-examples.md#change-overall-log-level). 5. [Start the node](../../maintenance/start-stop-services/start-stop-kibana.md), then check the start-up debug logs for `ERROR` messages or other start-up issues. For example: diff --git a/deploy-manage/deploy/self-managed/configure.md b/deploy-manage/deploy/self-managed/configure.md index 501069c3a..0aa97dbc9 100644 --- a/deploy-manage/deploy/self-managed/configure.md +++ b/deploy-manage/deploy/self-managed/configure.md @@ -196,10 +196,10 @@ $$$logging-root-appenders$$$ `logging.root.appenders` : A list of logging appenders to forward the root level logger instance to. By default `root` is configured with the `default` appender that logs to stdout with a `pattern` layout. This is the configuration that all custom loggers will use unless they’re re-configured explicitly. You can override the default behavior by configuring a different [appender](../../monitor/logging-configuration/kibana-logging.md#logging-appenders) to apply to `root`. $$$logging-root-level$$$ `logging.root.level` ![logo cloud](https://doc-icons.s3.us-east-2.amazonaws.com/logo_cloud.svg "Supported on {{ess}}") -: Level at which a log record should be logged. Supported levels are: *all*, *fatal*, *error*, *warn*, *info*, *debug*, *trace*, *off*. 
Levels are ordered from *all* (highest) to *off* and a log record will be logged it its level is higher than or equal to the level of its logger, otherwise the log record is ignored. Use this value to [change the overall log level](../../monitor/logging-configuration/log-settings-examples.md#change-overall-log-level). **Default: `info`**.
+: Level at which a log record should be logged. Supported levels are: *all*, *fatal*, *error*, *warn*, *info*, *debug*, *trace*, *off*. Levels are ordered from *all* (highest) to *off* and a log record will be logged if its level is higher than or equal to the level of its logger, otherwise the log record is ignored. Use this value to [change the overall log level](../../monitor/logging-configuration/kibana-log-settings-examples.md#change-overall-log-level). **Default: `info`**.
 
 ::::{tip}
-    Set to `all` to log all events, including system usage information and all requests. Set to `off` to silence all logs. You can also use the logging [cli commands](../../monitor/logging-configuration/_cli_configuration.md#logging-cli-migration) to set log level to `verbose` or silence all logs.
+    Set to `all` to log all events, including system usage information and all requests. Set to `off` to silence all logs. You can also use the logging [cli commands](../../monitor/logging-configuration/kibana-logging-cli-configuration.md#logging-cli-migration) to set log level to `verbose` or silence all logs.
 ::::
 
@@ -220,7 +220,7 @@ $$$logging-root-level$$$ `logging.root.level` ![logo cloud](https://doc-icons.s3
 
 $$$logging-loggers$$$ `logging.loggers[]`
-: Allows you to [customize a specific logger instance](../../monitor/logging-configuration/log-settings-examples.md#customize-specific-log-records).
+: Allows you to [customize a specific logger instance](../../monitor/logging-configuration/kibana-log-settings-examples.md#customize-specific-log-records).
 
 `logging.appenders[]`
 : [Appenders](../../monitor/logging-configuration/kibana-logging.md#logging-appenders) define how and where log messages are displayed (e.g. **stdout** or console) and stored (e.g. file on the disk). 
diff --git a/deploy-manage/monitor.md b/deploy-manage/monitor.md index 1a607aa46..a2d231bcc 100644 --- a/deploy-manage/monitor.md +++ b/deploy-manage/monitor.md @@ -2,6 +2,12 @@ mapped_urls: - https://www.elastic.co/guide/en/elasticsearch/reference/current/monitor-elasticsearch-cluster.html - https://www.elastic.co/guide/en/elasticsearch/reference/current/secure-monitoring.html +applies: + serverless: all + hosted: all + ece: all + eck: all + stack: all --- # Monitoring diff --git a/deploy-manage/monitor/autoops.md b/deploy-manage/monitor/autoops.md index 63e19c082..0c8f17f71 100644 --- a/deploy-manage/monitor/autoops.md +++ b/deploy-manage/monitor/autoops.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops.html +applies: + hosted: all --- # AutoOps [ec-autoops] diff --git a/deploy-manage/monitor/autoops/ec-autoops-deployment-view.md b/deploy-manage/monitor/autoops/ec-autoops-deployment-view.md index 9f15ae799..26ac5ec0d 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-deployment-view.md +++ b/deploy-manage/monitor/autoops/ec-autoops-deployment-view.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-deployment-view.html +applies: + hosted: all --- # Deployment [ec-autoops-deployment-view] diff --git a/deploy-manage/monitor/autoops/ec-autoops-dismiss-event.md b/deploy-manage/monitor/autoops/ec-autoops-dismiss-event.md index 9f0c308e3..ddff08d80 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-dismiss-event.md +++ b/deploy-manage/monitor/autoops/ec-autoops-dismiss-event.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-dismiss-event.html +applies: + hosted: all --- # Dismiss Events [ec-autoops-dismiss-event] diff --git a/deploy-manage/monitor/autoops/ec-autoops-event-settings.md b/deploy-manage/monitor/autoops/ec-autoops-event-settings.md index 32f4e1437..baa32c7a6 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-event-settings.md +++ b/deploy-manage/monitor/autoops/ec-autoops-event-settings.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-event-settings.html +applies: + hosted: all --- # Events Settings [ec-autoops-event-settings] diff --git a/deploy-manage/monitor/autoops/ec-autoops-events.md b/deploy-manage/monitor/autoops/ec-autoops-events.md index 313c62bff..66f390d73 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-events.md +++ b/deploy-manage/monitor/autoops/ec-autoops-events.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-events.html +applies: + hosted: all --- # AutoOps events [ec-autoops-events] diff --git a/deploy-manage/monitor/autoops/ec-autoops-faq.md b/deploy-manage/monitor/autoops/ec-autoops-faq.md index 0cf1eff85..b568084df 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-faq.md +++ b/deploy-manage/monitor/autoops/ec-autoops-faq.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-faq.html +applies: + hosted: all --- # AutoOps FAQ [ec-autoops-faq] diff --git a/deploy-manage/monitor/autoops/ec-autoops-how-to-access.md b/deploy-manage/monitor/autoops/ec-autoops-how-to-access.md index fd0bf60c2..ffa3d70a6 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-how-to-access.md +++ b/deploy-manage/monitor/autoops/ec-autoops-how-to-access.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-how-to-access.html +applies: + hosted: all --- # How to 
access AutoOps [ec-autoops-how-to-access] diff --git a/deploy-manage/monitor/autoops/ec-autoops-index-view.md b/deploy-manage/monitor/autoops/ec-autoops-index-view.md index c2f10acef..d6d477d43 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-index-view.md +++ b/deploy-manage/monitor/autoops/ec-autoops-index-view.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-index-view.html +applies: + hosted: all --- # Indices [ec-autoops-index-view] diff --git a/deploy-manage/monitor/autoops/ec-autoops-nodes-view.md b/deploy-manage/monitor/autoops/ec-autoops-nodes-view.md index 47b4ec3d3..ebd0d0f02 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-nodes-view.md +++ b/deploy-manage/monitor/autoops/ec-autoops-nodes-view.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-nodes-view.html +applies: + hosted: all --- # Nodes [ec-autoops-nodes-view] diff --git a/deploy-manage/monitor/autoops/ec-autoops-notifications-settings.md b/deploy-manage/monitor/autoops/ec-autoops-notifications-settings.md index 0143922b1..a98231d86 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-notifications-settings.md +++ b/deploy-manage/monitor/autoops/ec-autoops-notifications-settings.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-notifications-settings.html +applies: + hosted: all --- # Notifications settings [ec-autoops-notifications-settings] diff --git a/deploy-manage/monitor/autoops/ec-autoops-overview-view.md b/deploy-manage/monitor/autoops/ec-autoops-overview-view.md index 1479e4c1e..99e9037b9 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-overview-view.md +++ b/deploy-manage/monitor/autoops/ec-autoops-overview-view.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-overview-view.html +applies: + hosted: all --- # Overview [ec-autoops-overview-view] diff --git a/deploy-manage/monitor/autoops/ec-autoops-regions.md b/deploy-manage/monitor/autoops/ec-autoops-regions.md index ee1eeecba..eae21ab9a 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-regions.md +++ b/deploy-manage/monitor/autoops/ec-autoops-regions.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-regions.html +applies: + hosted: all --- # AutoOps regions [ec-autoops-regions] diff --git a/deploy-manage/monitor/autoops/ec-autoops-shards-view.md b/deploy-manage/monitor/autoops/ec-autoops-shards-view.md index 64b898316..1b5421a90 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-shards-view.md +++ b/deploy-manage/monitor/autoops/ec-autoops-shards-view.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-shards-view.html +applies: + hosted: all --- # Shards [ec-autoops-shards-view] diff --git a/deploy-manage/monitor/autoops/ec-autoops-template-optimizer.md b/deploy-manage/monitor/autoops/ec-autoops-template-optimizer.md index 75e939acc..036541b2a 100644 --- a/deploy-manage/monitor/autoops/ec-autoops-template-optimizer.md +++ b/deploy-manage/monitor/autoops/ec-autoops-template-optimizer.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-autoops-template-optimizer.html +applies: + hosted: all --- # Template Optimizer [ec-autoops-template-optimizer] diff --git a/deploy-manage/monitor/kibana-task-manager-health-monitoring.md b/deploy-manage/monitor/kibana-task-manager-health-monitoring.md index 6e8e5077c..0e15d3849 100644 --- 
a/deploy-manage/monitor/kibana-task-manager-health-monitoring.md +++ b/deploy-manage/monitor/kibana-task-manager-health-monitoring.md @@ -1,7 +1,9 @@ --- -navigation_title: "Health monitoring" +navigation_title: "Kibana task manager monitoring" mapped_pages: - https://www.elastic.co/guide/en/kibana/current/task-manager-health-monitoring.html +applies: + stack: preview --- diff --git a/deploy-manage/monitor/logging-configuration.md b/deploy-manage/monitor/logging-configuration.md index 6d2a21122..80a04a2b2 100644 --- a/deploy-manage/monitor/logging-configuration.md +++ b/deploy-manage/monitor/logging-configuration.md @@ -1,3 +1,10 @@ +--- +applies: + hosted: all + ece: all + eck: all + stack: all +--- # Logging configuration % What needs to be done: Write from scratch diff --git a/deploy-manage/monitor/logging-configuration/auditing-search-queries.md b/deploy-manage/monitor/logging-configuration/auditing-search-queries.md index 03aec6569..f91abe8fb 100644 --- a/deploy-manage/monitor/logging-configuration/auditing-search-queries.md +++ b/deploy-manage/monitor/logging-configuration/auditing-search-queries.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/auditing-search-queries.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Auditing search queries [auditing-search-queries] diff --git a/deploy-manage/monitor/logging-configuration/correlating-kibana-elasticsearch-audit-logs.md b/deploy-manage/monitor/logging-configuration/correlating-kibana-elasticsearch-audit-logs.md index 4cdc2c8ff..467d76aa5 100644 --- a/deploy-manage/monitor/logging-configuration/correlating-kibana-elasticsearch-audit-logs.md +++ b/deploy-manage/monitor/logging-configuration/correlating-kibana-elasticsearch-audit-logs.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/kibana/current/xpack-security-audit-logging.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Correlating Kibana and Elasticsearch audit logs [xpack-security-audit-logging] diff --git a/deploy-manage/monitor/logging-configuration/elasticsearch-audit-events.md b/deploy-manage/monitor/logging-configuration/elasticsearch-audit-events.md index 1e4da56ad..cbca6ce14 100644 --- a/deploy-manage/monitor/logging-configuration/elasticsearch-audit-events.md +++ b/deploy-manage/monitor/logging-configuration/elasticsearch-audit-events.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/audit-event-types.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Elasticsearch audit events [audit-event-types] diff --git a/deploy-manage/monitor/logging-configuration/elasticsearch-deprecation-logs.md b/deploy-manage/monitor/logging-configuration/elasticsearch-deprecation-logs.md index d86fe96e1..e79f2bfc5 100644 --- a/deploy-manage/monitor/logging-configuration/elasticsearch-deprecation-logs.md +++ b/deploy-manage/monitor/logging-configuration/elasticsearch-deprecation-logs.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/logging.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Elasticsearch deprecation logs [logging] diff --git a/deploy-manage/monitor/logging-configuration/elasticsearch-log4j-configuration-self-managed.md b/deploy-manage/monitor/logging-configuration/elasticsearch-log4j-configuration-self-managed.md index 5138d9d19..bafe2b82c 100644 --- 
a/deploy-manage/monitor/logging-configuration/elasticsearch-log4j-configuration-self-managed.md +++ b/deploy-manage/monitor/logging-configuration/elasticsearch-log4j-configuration-self-managed.md @@ -1,9 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/logging.html +applies: + stack: all --- -# Elasticsearch log4j configuration (self-managed) [logging] +# Elasticsearch log4j configuration [logging] You can use {{es}}'s application logs to monitor your cluster and diagnose issues. If you run {{es}} as a service, the default location of the logs varies based on your platform and installation method: diff --git a/deploy-manage/monitor/logging-configuration/enabling-audit-logs-in-orchestrated-deployments.md b/deploy-manage/monitor/logging-configuration/enabling-audit-logs-in-orchestrated-deployments.md index b64e75d28..06e2ffd2d 100644 --- a/deploy-manage/monitor/logging-configuration/enabling-audit-logs-in-orchestrated-deployments.md +++ b/deploy-manage/monitor/logging-configuration/enabling-audit-logs-in-orchestrated-deployments.md @@ -3,6 +3,10 @@ mapped_urls: - https://www.elastic.co/guide/en/cloud-enterprise/current/ece-enable-auditing.html - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s_audit_logging.html - https://www.elastic.co/guide/en/cloud/current/ec-enable-logging-and-monitoring.html#ec-enable-audit-logs +applies: + hosted: all + ece: all + eck: all --- # Enabling audit logs in orchestrated deployments diff --git a/deploy-manage/monitor/logging-configuration/enabling-elasticsearch-audit-logs.md b/deploy-manage/monitor/logging-configuration/enabling-elasticsearch-audit-logs.md index 6ca3ed0ec..462a6ef47 100644 --- a/deploy-manage/monitor/logging-configuration/enabling-elasticsearch-audit-logs.md +++ b/deploy-manage/monitor/logging-configuration/enabling-elasticsearch-audit-logs.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/enable-audit-logging.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Enabling elasticsearch audit logs [enable-audit-logging] diff --git a/deploy-manage/monitor/logging-configuration/enabling-kibana-audit-logs.md b/deploy-manage/monitor/logging-configuration/enabling-kibana-audit-logs.md index 3d91280a5..e3ace4b82 100644 --- a/deploy-manage/monitor/logging-configuration/enabling-kibana-audit-logs.md +++ b/deploy-manage/monitor/logging-configuration/enabling-kibana-audit-logs.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/kibana/current/xpack-security-audit-logging.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Enabling Kibana audit logs [xpack-security-audit-logging] diff --git a/deploy-manage/monitor/logging-configuration/log-settings-examples.md b/deploy-manage/monitor/logging-configuration/kibana-log-settings-examples.md similarity index 99% rename from deploy-manage/monitor/logging-configuration/log-settings-examples.md rename to deploy-manage/monitor/logging-configuration/kibana-log-settings-examples.md index aa7dfc035..226346121 100644 --- a/deploy-manage/monitor/logging-configuration/log-settings-examples.md +++ b/deploy-manage/monitor/logging-configuration/kibana-log-settings-examples.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/kibana/current/log-settings-examples.html +applies: + stack: all --- # Examples [log-settings-examples] diff --git a/deploy-manage/monitor/logging-configuration/_cli_configuration.md 
b/deploy-manage/monitor/logging-configuration/kibana-logging-cli-configuration.md similarity index 97% rename from deploy-manage/monitor/logging-configuration/_cli_configuration.md rename to deploy-manage/monitor/logging-configuration/kibana-logging-cli-configuration.md index f16638fa4..b207bd7eb 100644 --- a/deploy-manage/monitor/logging-configuration/_cli_configuration.md +++ b/deploy-manage/monitor/logging-configuration/kibana-logging-cli-configuration.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/kibana/current/_cli_configuration.html +applies: + stack: all --- # CLI configuration [_cli_configuration] diff --git a/deploy-manage/monitor/logging-configuration/kibana-logging.md b/deploy-manage/monitor/logging-configuration/kibana-logging.md index deca01a3a..8118fcab5 100644 --- a/deploy-manage/monitor/logging-configuration/kibana-logging.md +++ b/deploy-manage/monitor/logging-configuration/kibana-logging.md @@ -1,8 +1,13 @@ --- mapped_pages: - https://www.elastic.co/guide/en/kibana/current/logging-configuration.html +applies: + stack: all --- +% This might not be valid for all deployment types; needs review. +% Certain topics, like LEVELS, are valid for all deployment types, but not all. + # Kibana logging [logging-configuration] The {{kib}} logging system has three main components: *loggers*, *appenders* and *layouts*. These components allow us to log messages according to message type and level, to control how these messages are formatted and where the final logs will be displayed or stored. @@ -32,7 +37,7 @@ A log record will be logged by the logger if its level is higher than or equal t Logging set at a plugin level is always respected, regardless of the `root` logger level. In other words, if the root logger is set to `fatal` and pluginA logging is set to `debug`, debug logs are only shown for pluginA, with other logs only reporting on `fatal`. -The *all* and *off* levels can only be used in configuration and are handy shortcuts that allow you to log every log record or disable logging entirely for a specific logger. These levels can also be specified using [cli arguments](_cli_configuration.md#logging-cli-migration). +The *all* and *off* levels can only be used in configuration and are handy shortcuts that allow you to log every log record or disable logging entirely for a specific logger. These levels can also be specified using [CLI arguments](kibana-logging-cli-configuration.md#logging-cli-migration). 
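The per-logger behavior described in the kibana-logging.md hunk above is driven by `kibana.yml`. A minimal sketch, assuming the `logging.root`/`logging.loggers` schema that current {{kib}} releases document; `plugins.pluginA` is a hypothetical logger name that only mirrors the pluginA example in the text:

```yaml
# kibana.yml — minimal sketch; the plugin logger name is hypothetical
logging:
  root:
    level: fatal            # everything else only reports fatal records
  loggers:
    - name: plugins.pluginA
      level: debug          # pluginA still emits debug logs despite the root level
```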
## Layouts [logging-layouts] diff --git a/deploy-manage/monitor/logging-configuration/logfile-audit-events-ignore-policies.md b/deploy-manage/monitor/logging-configuration/logfile-audit-events-ignore-policies.md index 19ab65fc3..f7a164d4e 100644 --- a/deploy-manage/monitor/logging-configuration/logfile-audit-events-ignore-policies.md +++ b/deploy-manage/monitor/logging-configuration/logfile-audit-events-ignore-policies.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/audit-log-ignore-policy.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Logfile audit events ignore policies [audit-log-ignore-policy] diff --git a/deploy-manage/monitor/logging-configuration/logfile-audit-output.md b/deploy-manage/monitor/logging-configuration/logfile-audit-output.md index 68f98eff1..6ee7bbf23 100644 --- a/deploy-manage/monitor/logging-configuration/logfile-audit-output.md +++ b/deploy-manage/monitor/logging-configuration/logfile-audit-output.md @@ -1,8 +1,15 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/audit-log-output.html +applies: + hosted: all + ece: all + eck: all + stack: all --- +% evaluate the applies section + # Logfile audit output [audit-log-output] The `logfile` audit output is the only output for auditing. It writes data to the `<clustername>_audit.json` file in the logs directory. diff --git a/deploy-manage/monitor/logging-configuration/security-event-audit-logging.md b/deploy-manage/monitor/logging-configuration/security-event-audit-logging.md index 9685480bc..7d12c21f7 100644 --- a/deploy-manage/monitor/logging-configuration/security-event-audit-logging.md +++ b/deploy-manage/monitor/logging-configuration/security-event-audit-logging.md @@ -1,3 +1,10 @@ +--- +applies: + hosted: all + ece: all + eck: all + stack: all +--- # Security event audit logging % What needs to be done: Write from scratch diff --git a/deploy-manage/monitor/logging-configuration/update-elasticsearch-logging-levels.md b/deploy-manage/monitor/logging-configuration/update-elasticsearch-logging-levels.md index e25780af6..f7ae66e00 100644 --- a/deploy-manage/monitor/logging-configuration/update-elasticsearch-logging-levels.md +++ b/deploy-manage/monitor/logging-configuration/update-elasticsearch-logging-levels.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/logging.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Update Elasticsearch logging levels [logging] diff --git a/deploy-manage/monitor/monitoring-data.md b/deploy-manage/monitor/monitoring-data.md index 8413ab1f7..240512af5 100644 --- a/deploy-manage/monitor/monitoring-data.md +++ b/deploy-manage/monitor/monitoring-data.md @@ -1,7 +1,16 @@ +--- +applies: + hosted: all + ece: all + eck: all + stack: all +--- # Managing monitoring data +% Probably ALL THIS NEEDS TO BE UNDER STACK MONITORING + % What needs to be done: Write from scratch % GitHub issue: https://github.com/elastic/docs-projects/issues/350 -% Scope notes: we can review the name of this section... \ No newline at end of file +% Scope notes: we can review the name of this section... 
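For the update-elasticsearch-logging-levels.md page touched above, the usual way to raise a single {{es}} logger's verbosity is a static `logger.*` setting. A minimal sketch, assuming the standard setting names; the `discovery` package is chosen only for illustration:

```yaml
# elasticsearch.yml — minimal sketch; the package is an example only
logger.org.elasticsearch.discovery: DEBUG
```

The same `logger.*` keys can also be set dynamically through the cluster settings API, which avoids restarting the node.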
diff --git a/deploy-manage/monitor/monitoring-data/access-performance-metrics-on-elastic-cloud.md b/deploy-manage/monitor/monitoring-data/access-performance-metrics-on-elastic-cloud.md index e62052455..7b56e1240 100644 --- a/deploy-manage/monitor/monitoring-data/access-performance-metrics-on-elastic-cloud.md +++ b/deploy-manage/monitor/monitoring-data/access-performance-metrics-on-elastic-cloud.md @@ -2,6 +2,8 @@ mapped_urls: - https://www.elastic.co/guide/en/cloud/current/ec-saas-metrics-accessing.html - https://www.elastic.co/guide/en/cloud-heroku/current/ech-saas-metrics-accessing.html +applies: + hosted: all --- # Access performance metrics on Elastic Cloud diff --git a/deploy-manage/monitor/monitoring-data/beats-page.md b/deploy-manage/monitor/monitoring-data/beats-page.md index 0db60e043..c65f2bd2d 100644 --- a/deploy-manage/monitor/monitoring-data/beats-page.md +++ b/deploy-manage/monitor/monitoring-data/beats-page.md @@ -2,6 +2,11 @@ navigation_title: "Beats Metrics" mapped_pages: - https://www.elastic.co/guide/en/kibana/current/beats-page.html +applies: + hosted: all + ece: all + eck: all + stack: all --- diff --git a/deploy-manage/monitor/monitoring-data/config-monitoring-data-streams-elastic-agent.md b/deploy-manage/monitor/monitoring-data/config-monitoring-data-streams-elastic-agent.md index ed96335eb..e228fb578 100644 --- a/deploy-manage/monitor/monitoring-data/config-monitoring-data-streams-elastic-agent.md +++ b/deploy-manage/monitor/monitoring-data/config-monitoring-data-streams-elastic-agent.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/config-monitoring-data-streams-elastic-agent.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Configuring data streams created by Elastic Agent [config-monitoring-data-streams-elastic-agent] diff --git a/deploy-manage/monitor/monitoring-data/config-monitoring-data-streams-metricbeat-8.md b/deploy-manage/monitor/monitoring-data/config-monitoring-data-streams-metricbeat-8.md index 4f6262800..d267e42bf 100644 --- a/deploy-manage/monitor/monitoring-data/config-monitoring-data-streams-metricbeat-8.md +++ b/deploy-manage/monitor/monitoring-data/config-monitoring-data-streams-metricbeat-8.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/config-monitoring-data-streams-metricbeat-8.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Configuring data streams created by Metricbeat 8 [config-monitoring-data-streams-metricbeat-8] diff --git a/deploy-manage/monitor/monitoring-data/config-monitoring-indices-metricbeat-7-internal-collection.md b/deploy-manage/monitor/monitoring-data/config-monitoring-indices-metricbeat-7-internal-collection.md index c380d1807..81c4589d2 100644 --- a/deploy-manage/monitor/monitoring-data/config-monitoring-indices-metricbeat-7-internal-collection.md +++ b/deploy-manage/monitor/monitoring-data/config-monitoring-indices-metricbeat-7-internal-collection.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/config-monitoring-indices-metricbeat-7-internal-collection.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Configuring indices created by Metricbeat 7 or internal collection [config-monitoring-indices-metricbeat-7-internal-collection] diff --git a/deploy-manage/monitor/monitoring-data/configure-stack-monitoring-alerts.md b/deploy-manage/monitor/monitoring-data/configure-stack-monitoring-alerts.md index 
30fa91112..1709caff0 100644 --- a/deploy-manage/monitor/monitoring-data/configure-stack-monitoring-alerts.md +++ b/deploy-manage/monitor/monitoring-data/configure-stack-monitoring-alerts.md @@ -1,13 +1,18 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-cluster-health-notifications.html +applies: + hosted: all --- +% NEEDS MERGING WITH kibana-alerts.md +% this one is written for Elastic Cloud but needs to be generic, except if it's really about Elastic cloud. + # Configure Stack monitoring alerts [ec-cluster-health-notifications] You can configure Stack monitoring alerts to be sent to you by email when health related events occur in your deployments. To set up email notifications: -1. [Enable logging and monitoring](../stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) on deployments for which you want to receive notifications. You need to enable only metrics data being shipped for the notifications to work. +1. [Enable logging and monitoring](../stack-monitoring/elastic-cloud-stack-monitoring.md) on deployments for which you want to receive notifications. You need to enable only metrics data being shipped for the notifications to work. 2. In Kibana, configure the email connector to [send email from Elastic Cloud](https://www.elastic.co/guide/en/kibana/current/email-action-type.html#elasticcloud). If you want to use the preconfigured `Elastic-Cloud-SMTP` connector in Elastic Cloud, then you can skip this step. 3. From the Kibana main menu, go to **Stack Monitoring**. On this page you can find a summary of monitoring metrics for your deployment as well as any alerts. 4. Select **Enter setup mode**. diff --git a/deploy-manage/monitor/monitoring-data/configuring-data-streamsindices-for-monitoring.md b/deploy-manage/monitor/monitoring-data/configuring-data-streamsindices-for-monitoring.md index 6a3bd103a..e43b9c0a3 100644 --- a/deploy-manage/monitor/monitoring-data/configuring-data-streamsindices-for-monitoring.md +++ b/deploy-manage/monitor/monitoring-data/configuring-data-streamsindices-for-monitoring.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/config-monitoring-indices.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Configuring data streams/indices for monitoring [config-monitoring-indices] diff --git a/deploy-manage/monitor/monitoring-data/ec-memory-pressure.md b/deploy-manage/monitor/monitoring-data/ec-memory-pressure.md index 4a3b62fbf..5e3d94bf0 100644 --- a/deploy-manage/monitor/monitoring-data/ec-memory-pressure.md +++ b/deploy-manage/monitor/monitoring-data/ec-memory-pressure.md @@ -1,6 +1,10 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-memory-pressure.html + - https://www.elastic.co/guide/en/cloud-heroku/current/ech-memory-pressure.html +applies: + hosted: all + ece: all --- # JVM memory pressure indicator [ec-memory-pressure] diff --git a/deploy-manage/monitor/monitoring-data/ec-saas-metrics-accessing.md b/deploy-manage/monitor/monitoring-data/ec-saas-metrics-accessing.md index 26d58bb8b..4990f942b 100644 --- a/deploy-manage/monitor/monitoring-data/ec-saas-metrics-accessing.md +++ b/deploy-manage/monitor/monitoring-data/ec-saas-metrics-accessing.md @@ -1,13 +1,16 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-saas-metrics-accessing.html + - https://www.elastic.co/guide/en/cloud-heroku/current/ech-saas-metrics-accessing.html +applies: + hosted: all --- # Access performance metrics [ec-saas-metrics-accessing] 
Cluster performance metrics are available directly in the [Elasticsearch Service Console](https://cloud.elastic.co?page=docs&placement=docs-body). The graphs on this page include a subset of Elasticsearch Service-specific performance metrics. -For advanced views or production monitoring, [enable logging and monitoring](../stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). The monitoring application provides more advanced views for Elasticsearch and JVM metrics, and includes a configurable retention period. +For advanced views or production monitoring, [enable logging and monitoring](../stack-monitoring/elastic-cloud-stack-monitoring.md). The monitoring application provides more advanced views for Elasticsearch and JVM metrics, and includes a configurable retention period. To access cluster performance metrics: @@ -30,7 +33,7 @@ The following metrics are available: Shows the maximum usage of the CPU resources assigned to your Elasticsearch cluster, as a percentage. CPU resources are relative to the size of your cluster, so that a cluster with 32GB of RAM gets assigned twice as many CPU resources as a cluster with 16GB of RAM. All clusters are guaranteed their share of CPU resources, as Elasticsearch Service infrastructure does not overcommit any resources. CPU credits permit boosting the performance of smaller clusters temporarily, so that CPU usage can exceed 100%. ::::{tip} -This chart reports the maximum CPU values over the sampling period. [Logs and Metrics](../stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) ingested into [Stack Monitoring](visualizing-monitoring-data.md)'s "CPU Usage" instead reflects the average CPU over the sampling period. Therefore, you should not expect the two graphs to look exactly the same. When investigating [CPU-related performance issues](../../../troubleshoot/monitoring/performance.md), you should default to [Stack Monitoring](visualizing-monitoring-data.md). +This chart reports the maximum CPU values over the sampling period. [Logs and Metrics](../stack-monitoring/elastic-cloud-stack-monitoring.md) ingested into [Stack Monitoring](visualizing-monitoring-data.md)'s "CPU Usage" instead reflect the average CPU over the sampling period. Therefore, you should not expect the two graphs to look exactly the same. When investigating [CPU-related performance issues](../../../troubleshoot/monitoring/performance.md), you should default to [Stack Monitoring](visualizing-monitoring-data.md). 
:::: diff --git a/deploy-manage/monitor/monitoring-data/ec-vcpu-boost-instance.md b/deploy-manage/monitor/monitoring-data/ec-vcpu-boost-instance.md index 162e660e6..b72326108 100644 --- a/deploy-manage/monitor/monitoring-data/ec-vcpu-boost-instance.md +++ b/deploy-manage/monitor/monitoring-data/ec-vcpu-boost-instance.md @@ -1,6 +1,9 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud/current/ec-vcpu-boost-instance.html + - https://www.elastic.co/guide/en/cloud-heroku/current/ech-vcpu-boost-instance.html +applies: + hosted: all --- # vCPU boosting and credits [ec-vcpu-boost-instance] diff --git a/deploy-manage/monitor/monitoring-data/ech-memory-pressure.md b/deploy-manage/monitor/monitoring-data/ech-memory-pressure.md deleted file mode 100644 index 93636ef70..000000000 --- a/deploy-manage/monitor/monitoring-data/ech-memory-pressure.md +++ /dev/null @@ -1,35 +0,0 @@ ---- -mapped_pages: - - https://www.elastic.co/guide/en/cloud-heroku/current/ech-memory-pressure.html ---- - -# JVM memory pressure indicator [ech-memory-pressure] - -In addition to the more detailed [cluster performance metrics](../stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md), the [Elasticsearch Add-On for Heroku console](https://cloud.elastic.co?page=docs&placement=docs-body) also includes a JVM memory pressure indicator for each node in your cluster. This indicator can help you to determine when you need to upgrade to a larger cluster. - -The percentage number used in the JVM memory pressure indicator is actually the fill rate of the old generation pool. For a detailed explanation of why this metric is used, check [Understanding Memory Pressure](https://www.elastic.co/blog/found-understanding-memory-pressure-indicator/). - -:::{image} ../../../images/cloud-heroku-memory-pressure-indicator.png -:alt: Memory pressure indicator -::: - - -## JVM memory pressure levels [ech-memory-pressure-levels] - -When the JVM memory pressure reaches 75%, the indicator turns red. At this level, garbage collection becomes more frequent as the memory usage increases, potentially impacting the performance of your cluster. As long as the cluster performance suits your needs, JVM memory pressure above 75% is not a problem in itself, but there is not much spare memory capacity. Review the [common causes of high JVM memory usage](#ech-memory-pressure-causes) to determine your best course of action. - -When the JVM memory pressure indicator rises above 95%, {{es}}'s [real memory circuit breaker](https://www.elastic.co/guide/en/elasticsearch/reference/current/circuit-breaker.html#parent-circuit-breaker) triggers to prevent your instance from running out of memory. This situation can reduce the stability of your cluster and the integrity of your data. Unless you expect the load to drop soon, we recommend that you resize to a larger cluster before you reach this level of memory pressure. Even if you’re planning to optimize your memory usage, it is best to resize the cluster first. Resizing the cluster to increase capacity can give you more time to apply other changes, and also provides the cluster with more resource for when those changes are applied. - - -## Common causes of high JVM memory usage [ech-memory-pressure-causes] - -The two most common reasons for a high JVM memory pressure reading are: - -**1. Having too many shards per node** - -If JVM memory pressure above 75% is a frequent occurrence, the cause is often having too many shards per node relative to the amount of available memory. 
You can lower the JVM memory pressure by reducing the number of shards or upgrading to a larger cluster. For guidelines, check [How to size your shards](https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html). - -**2. Running expensive queries** - -If JVM memory pressure above 75% happens only occasionally, this is often due to expensive queries. Queries that have a very large request size, that involve aggregations with a large volume of buckets, or that involve sorting on a non-optimized field, can all cause temporary spikes in JVM memory usage. To resolve this problem, consider optimizing your queries or upgrading to a larger cluster. - diff --git a/deploy-manage/monitor/monitoring-data/ech-saas-metrics-accessing.md b/deploy-manage/monitor/monitoring-data/ech-saas-metrics-accessing.md deleted file mode 100644 index ad331febd..000000000 --- a/deploy-manage/monitor/monitoring-data/ech-saas-metrics-accessing.md +++ /dev/null @@ -1,127 +0,0 @@ ---- -mapped_pages: - - https://www.elastic.co/guide/en/cloud-heroku/current/ech-saas-metrics-accessing.html ---- - -# Access performance metrics [ech-saas-metrics-accessing] - -Cluster performance metrics are available directly in the [Elasticsearch Add-On for Heroku console](https://cloud.elastic.co?page=docs&placement=docs-body). The graphs on this page include a subset of Elasticsearch Add-On for Heroku-specific performance metrics. - -For advanced views or production monitoring, [enable logging and monitoring](../stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). The monitoring application provides more advanced views for Elasticsearch and JVM metrics, and includes a configurable retention period. - -To access cluster performance metrics: - -1. Log in to the [Elasticsearch Add-On for Heroku console](https://cloud.elastic.co?page=docs&placement=docs-body). -2. On the deployments page, select your deployment. - - Narrow your deployments by name, ID, or choose from several other filters. To customize your view, use a combination of filters, or change the format from a grid to a list. For example, you might want to select **Is unhealthy** and **Has master problems** to get a short list of deployments that need attention. - -3. From your deployment menu, go to the **Performance** page. - -The following metrics are available: - - -### CPU usage [echcpu_usage] - -:::{image} ../../../images/cloud-heroku-metrics-cpu-usage.png -:alt: Graph showing CPU usage -::: - -Shows the maximum usage of the CPU resources assigned to your Elasticsearch cluster, as a percentage. CPU resources are relative to the size of your cluster, so that a cluster with 32GB of RAM gets assigned twice as many CPU resources as a cluster with 16GB of RAM. All clusters are guaranteed their share of CPU resources, as Elasticsearch Add-On for Heroku infrastructure does not overcommit any resources. CPU credits permit boosting the performance of smaller clusters temporarily, so that CPU usage can exceed 100%. - - -### CPU credits [echcpu_credits] - -:::{image} ../../../images/cloud-heroku-metrics-cpu-credits.png -:alt: Graph showing available CPU credits -::: - -Shows your remaining CPU credits, measured in seconds of CPU time. CPU credits enable the boosting of CPU resources assigned to your cluster to improve performance temporarily when it is needed most. For more details check [How to use vCPU to boost your instance](ech-vcpu-boost-instance.md). 
- - -### Number of requests [echnumber_of_requests] - -:::{image} ../../../images/cloud-heroku-metrics-number-of-requests.png -:alt: Graph showing the number of requests -::: - -Shows the number of requests that your cluster receives per second, separated into search requests and requests to index documents. This metric provides a good indicator of the volume of work that your cluster typically handles over time which, together with other performance metrics, helps you determine if your cluster is sized correctly. Also lets you check if there is a sudden increase in the volume of user requests that might explain an increase in response times. - - -### Search response times [echsearch_response_times] - -:::{image} ../../../images/cloud-heroku-metrics-search-response-times.png -:alt: Graph showing search response times -::: - -Indicates the amount of time that it takes for your Elasticsearch cluster to complete a search query, in milliseconds. Response times won’t tell you about the cause of a performance issue, but they are often a first indicator that something is amiss with the performance of your Elasticsearch cluster. - - -### Index response times [echindex_response_times] - -:::{image} ../../../images/cloud-heroku-metrics-index-response-times.png -:alt: Graph showing index response times -::: - -Indicates the amount of time that it takes for your Elasticsearch cluster to complete an indexing operation, in milliseconds. Response times won’t tell you about the cause of a performance issue, but they are often a first indicator that something is amiss with the performance of your Elasticsearch cluster. - - -### Memory pressure per node [echmemory_pressure_per_node] - -:::{image} ../../../images/cloud-heroku-metrics-memory-pressure-per-node.png -:alt: Graph showing memory pressure per node -::: - -Indicates the total memory used by the JVM heap over time. We’ve configured {{es}}'s garbage collector to keep memory usage below 75% for heaps of 8GB or larger. For heaps smaller than 8GB, the threshold is 85%. If memory pressure consistently remains above this threshold, you might need to resize your cluster or reduce memory consumption. Check [how high memory pressure can cause performance issues](../../../troubleshoot/monitoring/high-memory-pressure.md). - - -### GC overhead per node [echgc_overhead_per_node] - -:::{image} ../../../images/cloud-heroku-metrics-gc-overhead-per-node.png -:alt: Graph showing the garbage collection overhead per node -::: - -Indicates the overhead involved in JVM garbage collection to reclaim memory. - - -## Tips for working with performance metrics [echtips_for_working_with_performance_metrics] - -Performance correlates directly with resources assigned to your cluster, and many of these metrics will show some sort of correlation with each other when you are trying to determine the cause of a performance issue. Take a look at some of the scenarios included in this section to learn how you can determine the cause of performance issues. - -It is not uncommon for performance issues on Elasticsearch Add-On for Heroku to be caused by an undersized cluster that cannot cope with the workload it is being asked to handle. If your cluster performance metrics often shows high CPU usage or excessive memory pressure, consider increasing the size of your cluster soon to improve performance. 
This is especially true for clusters that regularly reach 100% of CPU usage or that suffer out-of-memory failures; it is better to resize your cluster early when it is not yet maxed out than to have to resize a cluster that is already overwhelmed. [Changing the configuration of your cluster](../../deploy/elastic-cloud/cloud-hosted.md) may add some overhead if data needs to be migrated to the new nodes, which can increase the load on a cluster further and delay configuration changes. - -To help diagnose high CPU usage you can also use the Elasticsearch [nodes hot threads API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html), which identifies the threads on each node that have the highest CPU usage or that have been executing for a longer than normal period of time. - -::::{tip} -Got an overwhelmed cluster that needs to be upsized? [Try enabling maintenance mode first](https://www.elastic.co/guide/en/cloud-heroku/current/ech-upgrading-v5.html#ech-maintenance-mode-routing). It will likely help with configuration changes. -:::: - - -Work with the metrics shown in **Cluster Performance Metrics** section to help you find the information you need: - -* Hover on any part of a graph to get additional information. For example, hovering on a section of a graph that shows response times reveals the percentile that responses fall into at that point in time: - - :::{image} ../../../images/cloud-heroku-metrics-hover.png - :alt: Hover over the metric graph - ::: - -* Zoom in on a graph by drawing a rectangle to select a specific time window. As you zoom in one metric, other performance metrics change to show data for the same time window. - - :::{image} ../../../images/cloud-heroku-metrics-zoom.png - :alt: Zoom the metric graph - ::: - -* Pan around with ![Pan in a metric graph](../../../images/cloud-heroku-metrics-pan.png "") to make sure that you can get the right parts of a metric graph as you zoom in. -* Reset the metric graph axes with ![Reset the metric graph](../../../images/cloud-heroku-metrics-reset.png ""), which returns the graphs to their original scale. - -Cluster performance metrics are shown per node and are color-coded to indicate which running Elasticsearch instance they belong to. - - -## Cluster restarts after out-of-memory failures [echcluster_restarts_after_out_of_memory_failures] - -For clusters that suffer out-of-memory failures, it can be difficult to determine whether the clusters are in a completely healthy state afterwards. For this reason, Elasticsearch Add-On for Heroku automatically reboots clusters that suffer out-of-memory failures. - -You will receive an email notification to let you know that a restart occurred. For repeated alerts, the emails are aggregated so that you do not receive an excessive number of notifications. Either [resizing your cluster to reduce memory pressure](../../deploy/elastic-cloud/ech-customize-deployment-components.md#ech-cluster-size) or reducing the workload that a cluster is being asked to handle can help avoid these cluster restarts. 
- - - diff --git a/deploy-manage/monitor/monitoring-data/ech-vcpu-boost-instance.md b/deploy-manage/monitor/monitoring-data/ech-vcpu-boost-instance.md deleted file mode 100644 index 895ec40af..000000000 --- a/deploy-manage/monitor/monitoring-data/ech-vcpu-boost-instance.md +++ /dev/null @@ -1,49 +0,0 @@ ---- -mapped_pages: - - https://www.elastic.co/guide/en/cloud-heroku/current/ech-vcpu-boost-instance.html ---- - -# vCPU boosting and credits [ech-vcpu-boost-instance] - -Elastic Cloud allows smaller instance sizes to get temporarily boosted vCPU when under heavy load. vCPU boosting is governed by vCPU credits that instances can earn over time when vCPU usage is less than the assigned amount. - - -## How does vCPU boosting work? [echhow_does_vcpu_boosting_work] - -Based on the instance size, the vCPU resources assigned to your instance can be boosted to improve performance temporarily, by using vCPU credits. If credits are available, Elastic Cloud will automatically boost your instance when under heavy load. Boosting is available depending on the instance size: - -* Instance sizes up to and including 12 GB of RAM get boosted. The boosted vCPU value is `16 * vCPU ratio`, the vCPU ratios are dependent on the [hardware profile](../../deploy/elastic-cloud/ech-reference-hardware.md#ech-getting-started-configurations) selected. If an instance is eligible for boosting, the Elastic Cloud console will display **Up to 2.5 vCPU**, depending on the hardware profile selected. The baseline, or unboosted, vCPU value is calculated as: `RAM size * vCPU ratio`. -* Instance sizes bigger than 12 GB of RAM do not get boosted. The vCPU value is displayed in the Elastic Cloud console and calculated as follows: `RAM size * vCPU ratio`. - - -## What are vCPU credits? [echwhat_are_vcpu_credits] - -[vCPU](https://www.elastic.co/guide/en/elastic-stack-glossary/current/terms.html#glossary-vcpu) credits enable a smaller instance to perform as if it were assigned the vCPU resources of a larger instance, but only for a limited time. vCPU credits are available only on smaller instances up to and including 8 GB of RAM. - -vCPU credits persist through cluster restarts, but they are tied to your existing instance nodes. Operations that create new instance nodes will lose existing vCPU credits. This happens when you resize your instance, or if Elastic performs system maintenance on your nodes. - - -## How to earn vCPU credits? [echhow_to_earn_vcpu_credits] - -When you initially create an instance, you receive a credit of 60 seconds worth of vCPU time. You can accumulate additional credits when your vCPU usage is less than what your instance is assigned. At most, you can accumulate one hour worth of additional vCPU time per GB of RAM for your instance. - -For example: An instance with 4 GB of RAM, can at most accumulate four hours worth of additional vCPU time and can consume all of these vCPU credits within four hours when loaded heavily with requests. - -If you observe declining performance on a smaller instance over time, you might have depleted your vCPU credits. In this case, increase the size of your cluster to handle the workload with consistent performance. - -For more information, check [Elasticsearch Service default provider instance configurations](../../deploy/elastic-cloud/ech-reference-hardware.md#ech-getting-started-configurations). - - -## Where to check vCPU credits status? 
[echwhere_to_check_vcpu_credits_status] - -You can check the **Monitoring > Performance > CPU Credits** section of the [Elasticsearch Add-On for Heroku console](https://cloud.elastic.co?page=docs&placement=docs-body), and find the related metrics: - -:::{image} ../../../images/cloud-heroku-metrics-credits.png -:alt: CPU usage versus CPU credits over time -::: - - -## What to do if my vCPU credits get depleted constantly? [echwhat_to_do_if_my_vcpu_credits_get_depleted_constantly] - -If you need your cluster to be able to sustain a certain level of performance, you cannot rely on CPU boosting to handle the workload except temporarily. To ensure that performance can be sustained, consider increasing the size of your cluster. Read [this page](../../../troubleshoot/monitoring/performance.md) for more guidance. - diff --git a/deploy-manage/monitor/monitoring-data/elasticsearch-metrics.md b/deploy-manage/monitor/monitoring-data/elasticsearch-metrics.md index 29d869a8e..bb53190d6 100644 --- a/deploy-manage/monitor/monitoring-data/elasticsearch-metrics.md +++ b/deploy-manage/monitor/monitoring-data/elasticsearch-metrics.md @@ -2,6 +2,11 @@ navigation_title: "{{es}} Metrics" mapped_pages: - https://www.elastic.co/guide/en/kibana/current/elasticsearch-metrics.html +applies: + hosted: all + ece: all + eck: all + stack: all --- diff --git a/deploy-manage/monitor/monitoring-data/kibana-alerts.md b/deploy-manage/monitor/monitoring-data/kibana-alerts.md index fcd729e92..5aa523890 100644 --- a/deploy-manage/monitor/monitoring-data/kibana-alerts.md +++ b/deploy-manage/monitor/monitoring-data/kibana-alerts.md @@ -1,8 +1,15 @@ --- mapped_pages: - https://www.elastic.co/guide/en/kibana/current/kibana-alerts.html +applies: + hosted: all + ece: all + eck: all + stack: all --- +% NEEDS TO BE MERGED WITH configure-stack-monitoring-alerts.md + # Kibana alerts [kibana-alerts] The {{stack}} {monitor-features} provide [Alerting rules](../../../explore-analyze/alerts/kibana.md) out of the box to notify you of potential issues in the {{stack}}. These rules are preconfigured based on the best practices recommended by Elastic. However, you can tailor them to meet your specific needs. 
diff --git a/deploy-manage/monitor/monitoring-data/kibana-page.md b/deploy-manage/monitor/monitoring-data/kibana-page.md index 4cc9cdb26..79b0a8bb9 100644 --- a/deploy-manage/monitor/monitoring-data/kibana-page.md +++ b/deploy-manage/monitor/monitoring-data/kibana-page.md @@ -2,6 +2,11 @@ navigation_title: "{{kib}} Metrics" mapped_pages: - https://www.elastic.co/guide/en/kibana/current/kibana-page.html +applies: + hosted: all + ece: all + eck: all + stack: all --- diff --git a/deploy-manage/monitor/monitoring-data/logstash-page.md b/deploy-manage/monitor/monitoring-data/logstash-page.md index 4a896750b..440228333 100644 --- a/deploy-manage/monitor/monitoring-data/logstash-page.md +++ b/deploy-manage/monitor/monitoring-data/logstash-page.md @@ -2,6 +2,11 @@ navigation_title: "Logstash Metrics" mapped_pages: - https://www.elastic.co/guide/en/kibana/current/logstash-page.html +applies: + hosted: all + ece: all + eck: all + stack: all --- diff --git a/deploy-manage/monitor/monitoring-data/monitor-troubleshooting.md b/deploy-manage/monitor/monitoring-data/monitor-troubleshooting.md index 010b0611a..fbd94e37b 100644 --- a/deploy-manage/monitor/monitoring-data/monitor-troubleshooting.md +++ b/deploy-manage/monitor/monitoring-data/monitor-troubleshooting.md @@ -2,8 +2,11 @@ navigation_title: "Troubleshooting" mapped_pages: - https://www.elastic.co/guide/en/kibana/current/monitor-troubleshooting.html +applies: + stack: all --- +% this page probably needs to be moved # Troubleshooting [monitor-troubleshooting] diff --git a/deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md b/deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md index 5cac99ea2..364b99dfa 100644 --- a/deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md +++ b/deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md @@ -1,6 +1,11 @@ --- mapped_pages: - https://www.elastic.co/guide/en/kibana/current/xpack-monitoring.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Visualizing monitoring data [xpack-monitoring] diff --git a/deploy-manage/monitor/monitoring-overview.md b/deploy-manage/monitor/monitoring-overview.md deleted file mode 100644 index dd77e09a5..000000000 --- a/deploy-manage/monitor/monitoring-overview.md +++ /dev/null @@ -1,7 +0,0 @@ -# Monitoring overview - -% What needs to be done: Write from scratch - -% GitHub issue: https://github.com/elastic/docs-projects/issues/350 - -% Scope notes: Write an overview about monitoring (maybe this is similar than the landing page). 
\ No newline at end of file diff --git a/deploy-manage/monitor/orchestrators.md b/deploy-manage/monitor/orchestrators.md index 9f6e51709..2d671dbd2 100644 --- a/deploy-manage/monitor/orchestrators.md +++ b/deploy-manage/monitor/orchestrators.md @@ -1,3 +1,9 @@ +--- +applies: + ece: all + eck: all +--- + # Monitoring Orchestrators % What needs to be done: Write from scratch diff --git a/deploy-manage/monitor/orchestrators/ece-monitoring-ece-access.md b/deploy-manage/monitor/orchestrators/ece-monitoring-ece-access.md index 3e06693b6..b369d1369 100644 --- a/deploy-manage/monitor/orchestrators/ece-monitoring-ece-access.md +++ b/deploy-manage/monitor/orchestrators/ece-monitoring-ece-access.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-enterprise/current/ece-monitoring-ece-access.html +applies: + ece: all --- # Access logs and metrics [ece-monitoring-ece-access] diff --git a/deploy-manage/monitor/orchestrators/ece-monitoring-ece-set-retention.md b/deploy-manage/monitor/orchestrators/ece-monitoring-ece-set-retention.md index 7105de040..af3a7f194 100644 --- a/deploy-manage/monitor/orchestrators/ece-monitoring-ece-set-retention.md +++ b/deploy-manage/monitor/orchestrators/ece-monitoring-ece-set-retention.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-enterprise/current/ece-monitoring-ece-set-retention.html +applies: + ece: all --- # Set the retention period for logging and metrics indices [ece-monitoring-ece-set-retention] diff --git a/deploy-manage/monitor/orchestrators/ece-platform-monitoring.md b/deploy-manage/monitor/orchestrators/ece-platform-monitoring.md index a73d3e077..c2a69e58f 100644 --- a/deploy-manage/monitor/orchestrators/ece-platform-monitoring.md +++ b/deploy-manage/monitor/orchestrators/ece-platform-monitoring.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-enterprise/current/ece-monitoring-ece.html +applies: + ece: all --- # ECE platform monitoring [ece-monitoring-ece] @@ -10,7 +12,7 @@ Elastic Cloud Enterprise by default collects monitoring data for your installati * Logs for all core services that are a part of Elastic Cloud Enterprise and monitoring metrics for core Elastic Cloud Enterprise services, system processes on the host, and any third-party software * Logs and monitoring metrics for Elasticsearch clusters and for Kibana instances -These monitoring indices are collected in addition to the [monitoring you might have enabled for specific clusters](../stack-monitoring/enable-stack-monitoring-on-ece-deployments.md), which also provides monitoring metrics that you can access in Kibana (note that the `logging-and-metrics` deployment is used for monitoring data from system deployments only; for non-system deployments, monitoring data must be sent to a deployment other than `logging-and-metrics`). +These monitoring indices are collected in addition to the [monitoring you might have enabled for specific clusters](../stack-monitoring/ece-stack-monitoring.md), which also provides monitoring metrics that you can access in Kibana (note that the `logging-and-metrics` deployment is used for monitoring data from system deployments only; for non-system deployments, monitoring data must be sent to a deployment other than `logging-and-metrics`). 
In this section: diff --git a/deploy-manage/monitor/orchestrators/ece-proxy-log-fields.md b/deploy-manage/monitor/orchestrators/ece-proxy-log-fields.md index 5fde6bc67..d80f84bbe 100644 --- a/deploy-manage/monitor/orchestrators/ece-proxy-log-fields.md +++ b/deploy-manage/monitor/orchestrators/ece-proxy-log-fields.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-enterprise/current/ece-proxy-log-fields.html +applies: + ece: all --- # Proxy Log Fields [ece-proxy-log-fields] diff --git a/deploy-manage/monitor/orchestrators/eck-metrics-configuration.md b/deploy-manage/monitor/orchestrators/eck-metrics-configuration.md index 223a4743c..ee351b7ad 100644 --- a/deploy-manage/monitor/orchestrators/eck-metrics-configuration.md +++ b/deploy-manage/monitor/orchestrators/eck-metrics-configuration.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-configure-operator-metrics.html +applies: + eck: all --- # ECK metrics configuration [k8s-configure-operator-metrics] diff --git a/deploy-manage/monitor/orchestrators/k8s-enabling-metrics-endpoint.md b/deploy-manage/monitor/orchestrators/k8s-enabling-metrics-endpoint.md index e11e4b23f..2c8ac4285 100644 --- a/deploy-manage/monitor/orchestrators/k8s-enabling-metrics-endpoint.md +++ b/deploy-manage/monitor/orchestrators/k8s-enabling-metrics-endpoint.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-enabling-the-metrics-endpoint.html +applies: + eck: all --- # Enabling the metrics endpoint [k8s-enabling-the-metrics-endpoint] diff --git a/deploy-manage/monitor/orchestrators/k8s-prometheus-requirements.md b/deploy-manage/monitor/orchestrators/k8s-prometheus-requirements.md index 964e60919..3e9bb2d39 100644 --- a/deploy-manage/monitor/orchestrators/k8s-prometheus-requirements.md +++ b/deploy-manage/monitor/orchestrators/k8s-prometheus-requirements.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-prometheus-requirements.html +applies: + eck: all --- # Prometheus requirements [k8s-prometheus-requirements] diff --git a/deploy-manage/monitor/orchestrators/k8s-securing-metrics-endpoint.md b/deploy-manage/monitor/orchestrators/k8s-securing-metrics-endpoint.md index da923c22e..8177a7697 100644 --- a/deploy-manage/monitor/orchestrators/k8s-securing-metrics-endpoint.md +++ b/deploy-manage/monitor/orchestrators/k8s-securing-metrics-endpoint.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-securing-the-metrics-endpoint.html +applies: + eck: all --- # Securing the metrics endpoint [k8s-securing-the-metrics-endpoint] diff --git a/deploy-manage/monitor/stack-monitoring.md b/deploy-manage/monitor/stack-monitoring.md index 17f524092..1a8e02612 100644 --- a/deploy-manage/monitor/stack-monitoring.md +++ b/deploy-manage/monitor/stack-monitoring.md @@ -3,6 +3,11 @@ mapped_urls: - https://www.elastic.co/guide/en/elasticsearch/reference/current/monitoring-overview.html - https://www.elastic.co/guide/en/elasticsearch/reference/current/how-monitoring-works.html - https://www.elastic.co/guide/en/cloud/current/ec-monitoring.html +applies: + hosted: all + ece: all + eck: all + stack: all --- # Stack Monitoring diff --git a/deploy-manage/monitor/stack-monitoring/collecting-log-data-with-filebeat.md b/deploy-manage/monitor/stack-monitoring/collecting-log-data-with-filebeat.md index cf06e7425..d703bb6df 100644 --- 
a/deploy-manage/monitor/stack-monitoring/collecting-log-data-with-filebeat.md +++ b/deploy-manage/monitor/stack-monitoring/collecting-log-data-with-filebeat.md @@ -2,6 +2,8 @@ navigation_title: "Collecting log data with {{filebeat}}" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/configuring-filebeat.html +applies: + stack: all --- @@ -116,5 +118,5 @@ If you’re using {{agent}}, do not deploy {{filebeat}} for log collection. Inst If you want to use the **Monitoring** UI in {{kib}}, there must also be `.monitoring-*` indices. Those indices are generated when you collect metrics about {{stack}} products. For example, see [Collecting monitoring data with {{metricbeat}}](collecting-monitoring-data-with-metricbeat.md). :::: -10. [View the monitoring data in {{kib}}](monitoring-data.md). +10. [View the monitoring data in {{kib}}](kibana-monitoring-data.md). diff --git a/deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-elastic-agent.md b/deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-elastic-agent.md index 2db0f2e4e..d84d877df 100644 --- a/deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-elastic-agent.md +++ b/deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-elastic-agent.md @@ -2,6 +2,8 @@ navigation_title: "Collecting monitoring data with {{agent}}" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/configuring-elastic-agent.html +applies: + stack: all --- @@ -9,7 +11,7 @@ mapped_pages: # Collecting monitoring data with Elastic Agent [configuring-elastic-agent] -In 8.5 and later, you can use {{agent}} to collect data about {{es}} and ship it to the monitoring cluster, rather than [using {{metricbeat}}](collecting-monitoring-data-with-metricbeat.md) or routing it through exporters as described in [Legacy collection methods](legacy-collection-methods.md). +In 8.5 and later, you can use {{agent}} to collect data about {{es}} and ship it to the monitoring cluster, rather than [using {{metricbeat}}](collecting-monitoring-data-with-metricbeat.md) or routing it through exporters as described in [Legacy collection methods](es-legacy-collection-methods.md). ## Prerequisites [_prerequisites_11] @@ -49,4 +51,4 @@ To collect {{es}} monitoring data, add an {{es}} integration to an {{agent}} and 2. Follow the steps in the **Add agent** flyout to download, install, and enroll the {{agent}}. Make sure you choose the agent policy you created earlier. 9. Wait a minute or two until incoming data is confirmed. -10. [View the monitoring data in {{kib}}](monitoring-data.md). +10. [View the monitoring data in {{kib}}](kibana-monitoring-data.md). 
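For the {{metricbeat}} route patched in the next hunk, the collection side boils down to a small block in `metricbeat.yml`. A minimal sketch, assuming the documented `elasticsearch` module with its `xpack.enabled` flag; the host and credentials are placeholders:

```yaml
# metricbeat.yml — minimal sketch; host and credentials are placeholders
metricbeat.modules:
  - module: elasticsearch
    xpack.enabled: true               # ship metrics in stack-monitoring format
    period: 10s
    hosts: ["http://localhost:9200"]
    username: remote_monitoring_user
    password: changeme
```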
diff --git a/deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-metricbeat.md b/deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-metricbeat.md index 16f2113b4..753fefcfe 100644 --- a/deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-metricbeat.md +++ b/deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-metricbeat.md @@ -2,6 +2,8 @@ navigation_title: "Collecting monitoring data with {{metricbeat}}" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/configuring-metricbeat.html +applies: + stack: all --- @@ -9,7 +11,7 @@ mapped_pages: # Collecting monitoring data with Metricbeat [configuring-metricbeat] -In 6.5 and later, you can use {{metricbeat}} to collect data about {{es}} and ship it to the monitoring cluster, rather than routing it through exporters as described in [Legacy collection methods](legacy-collection-methods.md). +In 6.5 and later, you can use {{metricbeat}} to collect data about {{es}} and ship it to the monitoring cluster, rather than routing it through exporters as described in [Legacy collection methods](es-legacy-collection-methods.md). Want to use {{agent}} instead? Refer to [Collecting monitoring data with {{agent}}](collecting-monitoring-data-with-elastic-agent.md). @@ -103,5 +105,5 @@ Want to use {{agent}} instead? Refer to [Collecting monitoring data with {{agent For more information about these configuration options, see [Configure the {{es}} output](https://www.elastic.co/guide/en/beats/metricbeat/current/elasticsearch-output.html). 6. [Start {{metricbeat}}](https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-starting.html) on each node. -7. [View the monitoring data in {{kib}}](monitoring-data.md). +7. [View the monitoring data in {{kib}}](kibana-monitoring-data.md). diff --git a/deploy-manage/monitor/stack-monitoring/ece-restrictions-monitoring.md b/deploy-manage/monitor/stack-monitoring/ece-restrictions-monitoring.md index a11e1d8de..06391bd4e 100644 --- a/deploy-manage/monitor/stack-monitoring/ece-restrictions-monitoring.md +++ b/deploy-manage/monitor/stack-monitoring/ece-restrictions-monitoring.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-enterprise/current/ece-restrictions-monitoring.html +applies: + ece: all --- # Restrictions and limitations [ece-restrictions-monitoring] diff --git a/deploy-manage/monitor/stack-monitoring/enable-stack-monitoring-on-ece-deployments.md b/deploy-manage/monitor/stack-monitoring/ece-stack-monitoring.md similarity index 95% rename from deploy-manage/monitor/stack-monitoring/enable-stack-monitoring-on-ece-deployments.md rename to deploy-manage/monitor/stack-monitoring/ece-stack-monitoring.md index d6c6f0be2..5aa4f0c63 100644 --- a/deploy-manage/monitor/stack-monitoring/enable-stack-monitoring-on-ece-deployments.md +++ b/deploy-manage/monitor/stack-monitoring/ece-stack-monitoring.md @@ -1,11 +1,14 @@ --- +navigation_title: "Elastic Cloud Enterprise (ECE)" mapped_pages: - https://www.elastic.co/guide/en/cloud-enterprise/current/ece-enable-logging-and-monitoring.html +applies: + ece: all --- # Enable stack monitoring on ECE deployments [ece-enable-logging-and-monitoring] -The deployment logging and monitoring feature lets you monitor your deployment in [Kibana](../../../get-started/the-stack.md) by shipping logs and metrics to a monitoring deployment. 
You can: +The deployment logging and monitoring feature lets you monitor your deployment in [Kibana](/get-started/the-stack.md) by shipping logs and metrics to a monitoring deployment. You can: * View your deployment’s health and performance in real time and analyze past cluster, index, and node metrics. * View your deployment’s logs to debug issues, discover slow queries, surface deprecations, and analyze access to your deployment. @@ -53,7 +56,7 @@ When you enable monitoring in Elastic Cloud Enterprise, your monitoring indices $$$ece-logging-and-monitoring-retention-7$$$ When you enable self-monitoring in Elastic Cloud Enterprise, your monitoring indices are retained for a certain period by default. After the retention period has passed, the monitoring indices are deleted automatically. Monitoring data is retained for three days by default or as specified by the [`xpack.monitoring.history.duration` user setting](https://www.elastic.co/guide/en/cloud-enterprise/current/ece-change-user-settings-examples.html#xpack-monitoring-history-duration). -To retain monitoring indices as is without deleting them automatically, you must disable the [cleaner service](local-exporter.md#local-exporter-cleaner) by adding a disabled local exporter in your cluster settings. +To retain monitoring indices as is without deleting them automatically, you must disable the [cleaner service](es-local-exporter.md#local-exporter-cleaner) by adding a disabled local exporter in your cluster settings. For example @@ -110,7 +113,7 @@ An ILM policy is pre-configured to manage log retention. The policy can be adjus ### Index management [ece-logging-and-monitoring-index-management-ilm] -When sending monitoring data to a deployment, you can configure [Index Lifecycle Management (ILM)](../../../manage-data/lifecycle/index-lifecycle-management.md) to manage retention of your monitoring and logging indices. When sending logs to a deployment, an ILM policy is pre-configured to manage log retention and the policy can be customized to your needs. +When sending monitoring data to a deployment, you can configure [Index Lifecycle Management (ILM)](/manage-data/lifecycle/index-lifecycle-management.md) to manage retention of your monitoring and logging indices. When sending logs to a deployment, an ILM policy is pre-configured to manage log retention and the policy can be customized to your needs. 
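A disabled `local` exporter of the kind described above can be sketched as follows, assuming the documented `xpack.monitoring.exporters` settings; the exporter name is arbitrary:

```yaml
# Deployment user settings — sketch only; the exporter name is arbitrary
xpack.monitoring.exporters.__no-default-local__:
  type: local
  enabled: false    # disables the default local exporter and its cleaner service
```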
### Enable logging and monitoring [ece-enable-logging-and-monitoring-steps] diff --git a/deploy-manage/monitor/stack-monitoring/enable-stack-monitoring-on-eck-deployments.md b/deploy-manage/monitor/stack-monitoring/eck-stack-monitoring.md similarity index 97% rename from deploy-manage/monitor/stack-monitoring/enable-stack-monitoring-on-eck-deployments.md rename to deploy-manage/monitor/stack-monitoring/eck-stack-monitoring.md index defe7cfe8..5bbeafacd 100644 --- a/deploy-manage/monitor/stack-monitoring/enable-stack-monitoring-on-eck-deployments.md +++ b/deploy-manage/monitor/stack-monitoring/eck-stack-monitoring.md @@ -1,6 +1,9 @@ --- +navigation_title: "Elastic Cloud on Kubernetes (ECK)" mapped_pages: - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-stack-monitoring.html +applies: + eck: all --- # Enable stack monitoring on ECK deployments [k8s-stack-monitoring] diff --git a/deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md b/deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md similarity index 96% rename from deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md rename to deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md index c383459fc..d75c8fff2 100644 --- a/deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md +++ b/deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md @@ -1,4 +1,5 @@ --- +navigation_title: "Elastic Cloud Hosted (ECH)" mapped_urls: - https://www.elastic.co/guide/en/cloud-heroku/current/ech-monitoring.html - https://www.elastic.co/guide/en/cloud/current/ec-monitoring-setup.html @@ -6,6 +7,8 @@ mapped_urls: - https://www.elastic.co/guide/en/cloud-heroku/current/ech-enable-logging-and-monitoring.html - https://www.elastic.co/guide/en/cloud-heroku/current/ech-monitoring-setup.html - https://www.elastic.co/guide/en/cloud-heroku/current/ech-restrictions-monitoring.html +applies: + hosted: all --- # Stack Monitoring on Elastic Cloud deployments diff --git a/deploy-manage/monitor/stack-monitoring/elasticsearch-monitoring-self-managed.md b/deploy-manage/monitor/stack-monitoring/elasticsearch-monitoring-self-managed.md index 59d38d2ea..5d7c38ad5 100644 --- a/deploy-manage/monitor/stack-monitoring/elasticsearch-monitoring-self-managed.md +++ b/deploy-manage/monitor/stack-monitoring/elasticsearch-monitoring-self-managed.md @@ -1,7 +1,10 @@ --- +navigation_title: "Elasticsearch self-managed" mapped_urls: - https://www.elastic.co/guide/en/elasticsearch/reference/current/monitoring-production.html - https://www.elastic.co/guide/en/elasticsearch/reference/current/secure-monitoring.html +applies: + stack: all --- # Elasticsearch monitoring self-managed diff --git a/deploy-manage/monitor/stack-monitoring/http-exporter.md b/deploy-manage/monitor/stack-monitoring/es-http-exporter.md similarity index 99% rename from deploy-manage/monitor/stack-monitoring/http-exporter.md rename to deploy-manage/monitor/stack-monitoring/es-http-exporter.md index 05c168b4c..5d2bef6d3 100644 --- a/deploy-manage/monitor/stack-monitoring/http-exporter.md +++ b/deploy-manage/monitor/stack-monitoring/es-http-exporter.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/http-exporter.html +applies: + stack: deprecated 7.16.0 --- # HTTP exporters [http-exporter] diff --git a/deploy-manage/monitor/stack-monitoring/legacy-collection-methods.md 
b/deploy-manage/monitor/stack-monitoring/es-legacy-collection-methods.md similarity index 97% rename from deploy-manage/monitor/stack-monitoring/legacy-collection-methods.md rename to deploy-manage/monitor/stack-monitoring/es-legacy-collection-methods.md index b53a15917..8ec9f98e9 100644 --- a/deploy-manage/monitor/stack-monitoring/legacy-collection-methods.md +++ b/deploy-manage/monitor/stack-monitoring/es-legacy-collection-methods.md @@ -2,6 +2,8 @@ navigation_title: "Legacy collection methods" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/collecting-monitoring-data.html +applies: + stack: deprecated 7.16.0 --- @@ -73,7 +75,7 @@ To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). 2. Identify where to store monitoring data. - By default, the data is stored on the same cluster by using a [`local` exporter](local-exporter.md). Alternatively, you can use an [`http` exporter](http-exporter.md) to send data to a separate *monitoring cluster*. + By default, the data is stored on the same cluster by using a [`local` exporter](es-local-exporter.md). Alternatively, you can use an [`http` exporter](es-http-exporter.md) to send data to a separate *monitoring cluster*. ::::{important} The {{es}} {monitor-features} use ingest pipelines, therefore the cluster that stores the monitoring data must have at least one [ingest node](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md). @@ -148,7 +150,7 @@ To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). :::: 6. Optional: [Configure the indices that store the monitoring data](../monitoring-data/configuring-data-streamsindices-for-monitoring.md). -7. [View the monitoring data in {{kib}}](monitoring-data.md). +7. [View the monitoring data in {{kib}}](kibana-monitoring-data.md). 
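A legacy `http` exporter of the kind referenced above, which routes monitoring data to a separate monitoring cluster, might look like the following sketch; the setting names follow the documented `xpack.monitoring.exporters` namespace, and the host and user are placeholders:

```yaml
# elasticsearch.yml — minimal sketch of a deprecated http exporter
xpack.monitoring.exporters:
  my_monitoring:                      # exporter name is arbitrary
    type: http
    host: ["https://monitoring.example.com:9200"]
    auth.username: remote_monitoring_user
    # the password lives in the keystore under
    # xpack.monitoring.exporters.my_monitoring.auth.secure_password
```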
diff --git a/deploy-manage/monitor/stack-monitoring/local-exporter.md b/deploy-manage/monitor/stack-monitoring/es-local-exporter.md similarity index 99% rename from deploy-manage/monitor/stack-monitoring/local-exporter.md rename to deploy-manage/monitor/stack-monitoring/es-local-exporter.md index 895eb9490..89c175e7f 100644 --- a/deploy-manage/monitor/stack-monitoring/local-exporter.md +++ b/deploy-manage/monitor/stack-monitoring/es-local-exporter.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/local-exporter.html +applies: + stack: deprecated 7.16.0 --- # Local exporters [local-exporter] diff --git a/deploy-manage/monitor/stack-monitoring/es-monitoring-collectors.md b/deploy-manage/monitor/stack-monitoring/es-monitoring-collectors.md index b07ba8717..70e521348 100644 --- a/deploy-manage/monitor/stack-monitoring/es-monitoring-collectors.md +++ b/deploy-manage/monitor/stack-monitoring/es-monitoring-collectors.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/es-monitoring-collectors.html +applies: + stack: deprecated 7.16.0 --- # Collectors [es-monitoring-collectors] diff --git a/deploy-manage/monitor/stack-monitoring/es-monitoring-exporters.md b/deploy-manage/monitor/stack-monitoring/es-monitoring-exporters.md index 7aea5b922..afaf502c4 100644 --- a/deploy-manage/monitor/stack-monitoring/es-monitoring-exporters.md +++ b/deploy-manage/monitor/stack-monitoring/es-monitoring-exporters.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/es-monitoring-exporters.html +applies: + stack: deprecated 7.16.0 --- # Exporters [es-monitoring-exporters] @@ -18,10 +20,10 @@ The purpose of exporters is to take data collected from any Elastic Stack source There are two types of exporters in {{es}}: `local` -: The default exporter used by {{es}} {monitor-features}. This exporter routes data back into the *same* cluster. See [Local exporters](local-exporter.md). +: The default exporter used by {{es}} {monitor-features}. This exporter routes data back into the *same* cluster. See [Local exporters](es-local-exporter.md). `http` -: The preferred exporter, which you can use to route data into any supported {{es}} cluster accessible via HTTP. Production environments should always use a separate monitoring cluster. See [HTTP exporters](http-exporter.md). +: The preferred exporter, which you can use to route data into any supported {{es}} cluster accessible via HTTP. Production environments should always use a separate monitoring cluster. See [HTTP exporters](es-http-exporter.md). Both exporters serve the same purpose: to set up the monitoring cluster and route monitoring data. However, they perform these tasks in very different ways. Even though things happen differently, both exporters are capable of sending all of the same data. @@ -37,7 +39,7 @@ When the exporters route monitoring data into the monitoring cluster, they use ` Routing monitoring data involves indexing it into the appropriate monitoring indices. Once the data is indexed, it exists in a monitoring index that, by default, is named with a daily index pattern. For {{es}} monitoring data, this is an index that matches `.monitoring-es-6-*`. From there, the data lives inside the monitoring cluster and must be curated or cleaned up as necessary. If you do not curate the monitoring data, it eventually fills up the nodes and the cluster might fail due to lack of disk space. 
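One concrete handle on that cleanup, assuming the deprecated exporters described here are in use, is the cleaner service's retention window, controlled by the dynamic `xpack.monitoring.history.duration` setting that also appears later in this patch (the `3d` value below is only an example):

```console
PUT _cluster/settings
{
  "persistent": {
    "xpack.monitoring.history.duration": "3d"
  }
}
```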
::::{tip} -You are strongly recommended to manage the curation of indices and particularly the monitoring indices. To do so, you can take advantage of the [cleaner service](local-exporter.md#local-exporter-cleaner) or [Elastic Curator](https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html). +We strongly recommend that you manage the curation of indices, particularly the monitoring indices. To do so, you can take advantage of the [cleaner service](es-local-exporter.md#local-exporter-cleaner) or [Elastic Curator](https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html). :::: diff --git a/deploy-manage/monitor/stack-monitoring/pause-export.md b/deploy-manage/monitor/stack-monitoring/es-pause-export.md similarity index 97% rename from deploy-manage/monitor/stack-monitoring/pause-export.md rename to deploy-manage/monitor/stack-monitoring/es-pause-export.md index 29c140aaa..ce63d4008 100644 --- a/deploy-manage/monitor/stack-monitoring/pause-export.md +++ b/deploy-manage/monitor/stack-monitoring/es-pause-export.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/pause-export.html +applies: + stack: deprecated 7.16.0 --- # Pausing data collection [pause-export] diff --git a/deploy-manage/monitor/stack-monitoring/k8s_audit_logging.md b/deploy-manage/monitor/stack-monitoring/k8s_audit_logging.md index 819713023..cf4d89394 100644 --- a/deploy-manage/monitor/stack-monitoring/k8s_audit_logging.md +++ b/deploy-manage/monitor/stack-monitoring/k8s_audit_logging.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s_audit_logging.html +applies: + eck: all --- # Audit logging [k8s_audit_logging] diff --git a/deploy-manage/monitor/stack-monitoring/k8s_connect_to_an_external_monitoring_elasticsearch_cluster.md b/deploy-manage/monitor/stack-monitoring/k8s_connect_to_an_external_monitoring_elasticsearch_cluster.md index 800a8334c..a730bb4a7 100644 --- a/deploy-manage/monitor/stack-monitoring/k8s_connect_to_an_external_monitoring_elasticsearch_cluster.md +++ b/deploy-manage/monitor/stack-monitoring/k8s_connect_to_an_external_monitoring_elasticsearch_cluster.md @@ -1,6 +1,9 @@ --- +navigation_title: "Connect to an external cluster" mapped_pages: - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s_connect_to_an_external_monitoring_elasticsearch_cluster.html +applies: + eck: all --- # Connect to an external monitoring Elasticsearch cluster [k8s_connect_to_an_external_monitoring_elasticsearch_cluster] diff --git a/deploy-manage/monitor/stack-monitoring/k8s_how_it_works.md b/deploy-manage/monitor/stack-monitoring/k8s_how_it_works.md index 777b2440d..38ea5faaf 100644 --- a/deploy-manage/monitor/stack-monitoring/k8s_how_it_works.md +++ b/deploy-manage/monitor/stack-monitoring/k8s_how_it_works.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s_how_it_works.html +applies: + eck: all --- # How it works [k8s_how_it_works] diff --git a/deploy-manage/monitor/stack-monitoring/k8s_override_the_beats_pod_template.md b/deploy-manage/monitor/stack-monitoring/k8s_override_the_beats_pod_template.md index 62e7daa88..0a43e3f70 100644 --- a/deploy-manage/monitor/stack-monitoring/k8s_override_the_beats_pod_template.md +++ b/deploy-manage/monitor/stack-monitoring/k8s_override_the_beats_pod_template.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s_override_the_beats_pod_template.html +applies: +
eck: all --- # Override the Beats Pod Template [k8s_override_the_beats_pod_template] diff --git a/deploy-manage/monitor/stack-monitoring/k8s_when_to_use_it.md b/deploy-manage/monitor/stack-monitoring/k8s_when_to_use_it.md index 1f880569c..5f67296ac 100644 --- a/deploy-manage/monitor/stack-monitoring/k8s_when_to_use_it.md +++ b/deploy-manage/monitor/stack-monitoring/k8s_when_to_use_it.md @@ -1,6 +1,8 @@ --- mapped_pages: - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s_when_to_use_it.html +applies: + eck: all --- # When to use it [k8s_when_to_use_it] diff --git a/deploy-manage/monitor/stack-monitoring/monitoring-data.md b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-data.md similarity index 96% rename from deploy-manage/monitor/stack-monitoring/monitoring-data.md rename to deploy-manage/monitor/stack-monitoring/kibana-monitoring-data.md index 98eabc7e2..24d26275a 100644 --- a/deploy-manage/monitor/stack-monitoring/monitoring-data.md +++ b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-data.md @@ -2,9 +2,15 @@ navigation_title: "View monitoring data" mapped_pages: - https://www.elastic.co/guide/en/kibana/current/monitoring-data.html +applies: + hosted: all + ece: all + eck: all + stack: all --- - + # View monitoring data [monitoring-data] diff --git a/deploy-manage/monitor/stack-monitoring/monitoring-elastic-agent.md b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-elastic-agent.md similarity index 90% rename from deploy-manage/monitor/stack-monitoring/monitoring-elastic-agent.md rename to deploy-manage/monitor/stack-monitoring/kibana-monitoring-elastic-agent.md index 264f36a2b..83264edff 100644 --- a/deploy-manage/monitor/stack-monitoring/monitoring-elastic-agent.md +++ b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-elastic-agent.md @@ -2,6 +2,8 @@ navigation_title: "Collect monitoring data with {{agent}}" mapped_pages: - https://www.elastic.co/guide/en/kibana/current/monitoring-elastic-agent.html +applies: + stack: all --- @@ -9,7 +11,7 @@ mapped_pages: # Collect monitoring data with Elastic Agent [monitoring-elastic-agent] -In 8.5 and later, you can use {{agent}} to collect data about {{kib}} and ship it to the monitoring cluster, rather than [using {{metricbeat}}](monitoring-metricbeat.md) or routing data through the production cluster as described in [Legacy collection methods](monitoring-kibana.md). +In 8.5 and later, you can use {{agent}} to collect data about {{kib}} and ship it to the monitoring cluster, rather than [using {{metricbeat}}](/deploy-manage/monitor/stack-monitoring/kibana-monitoring-metricbeat.md) or routing data through the production cluster as described in [Legacy collection methods](/deploy-manage/monitor/stack-monitoring/kibana-monitoring-legacy.md). To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). @@ -52,4 +54,4 @@ To collect {{kib}} monitoring data, add a {{kib}} integration to an {{agent}} an 2. Follow the steps in the **Add agent** flyout to download, install, and enroll the {{agent}}. Make sure you choose the agent policy you created earlier. 9. Wait a minute or two until incoming data is confirmed. -10. [View the monitoring data in {{kib}}](monitoring-data.md). +10. [View the monitoring data in {{kib}}](/deploy-manage/monitor/monitoring-data.md).
diff --git a/deploy-manage/monitor/stack-monitoring/monitoring-kibana.md b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-legacy.md similarity index 94% rename from deploy-manage/monitor/stack-monitoring/monitoring-kibana.md rename to deploy-manage/monitor/stack-monitoring/kibana-monitoring-legacy.md index ecea504b4..289e25a12 100644 --- a/deploy-manage/monitor/stack-monitoring/monitoring-kibana.md +++ b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-legacy.md @@ -2,6 +2,8 @@ navigation_title: "Legacy collection methods" mapped_pages: - https://www.elastic.co/guide/en/kibana/current/monitoring-kibana.html +applies: + stack: deprecated 7.16.0 --- @@ -16,7 +18,7 @@ If you enable the Elastic {{monitor-features}} in your cluster, you can optional If you have previously configured legacy collection methods, you should migrate to using {{agent}} or {{metricbeat}} collection. Do not use legacy collection alongside other collection methods. -For more information, refer to [Collect monitoring data with {{agent}}](monitoring-elastic-agent.md) and [Collect monitoring data with {{metricbeat}}](monitoring-metricbeat.md). +For more information, refer to [Collect monitoring data with {{agent}}](kibana-monitoring-elastic-agent.md) and [Collect monitoring data with {{metricbeat}}](kibana-monitoring-metricbeat.md). :::: @@ -75,5 +77,5 @@ To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). 2. [Configure encryption for traffic between {{kib}} and {{es}}](https://www.elastic.co/guide/en/kibana/current/configuring-tls.html#configuring-tls-kib-es). 5. [Start {{kib}}](../../maintenance/start-stop-services/start-stop-kibana.md). -6. [View the monitoring data in {{kib}}](monitoring-data.md). +6. [View the monitoring data in {{kib}}](kibana-monitoring-data.md). diff --git a/deploy-manage/monitor/stack-monitoring/monitoring-metricbeat.md b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-metricbeat.md similarity index 97% rename from deploy-manage/monitor/stack-monitoring/monitoring-metricbeat.md rename to deploy-manage/monitor/stack-monitoring/kibana-monitoring-metricbeat.md index 0a9039b3c..ec6840dbc 100644 --- a/deploy-manage/monitor/stack-monitoring/monitoring-metricbeat.md +++ b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-metricbeat.md @@ -2,6 +2,8 @@ navigation_title: "Collect monitoring data with {{metricbeat}}" mapped_pages: - https://www.elastic.co/guide/en/kibana/current/monitoring-metricbeat.html +applies: + stack: all --- @@ -9,7 +11,7 @@ mapped_pages: # Collect monitoring data with Metricbeat [monitoring-metricbeat] -In 6.4 and later, you can use {{metricbeat}} to collect data about {{kib}} and ship it to the monitoring cluster, rather than routing it through the production cluster as described in [Legacy collection methods](monitoring-kibana.md). +In 6.4 and later, you can use {{metricbeat}} to collect data about {{kib}} and ship it to the monitoring cluster, rather than routing it through the production cluster as described in [Legacy collection methods](/deploy-manage/monitor/stack-monitoring/kibana-monitoring-legacy.md). :::{image} ../../../images/kibana-metricbeat.png :alt: Example monitoring architecture @@ -142,5 +144,5 @@ To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). For more information about these configuration options, see [Configure the {{es}} output](https://www.elastic.co/guide/en/beats/metricbeat/current/elasticsearch-output.html). 9. 
[Start {{metricbeat}}](https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-starting.html). -10. [View the monitoring data in {{kib}}](monitoring-data.md). +10. [View the monitoring data in {{kib}}](/deploy-manage/monitor/monitoring-data.md). diff --git a/deploy-manage/monitor/stack-monitoring/kibana-monitoring-self-managed.md b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-self-managed.md index f50125f5a..62b1e8e93 100644 --- a/deploy-manage/monitor/stack-monitoring/kibana-monitoring-self-managed.md +++ b/deploy-manage/monitor/stack-monitoring/kibana-monitoring-self-managed.md @@ -1,7 +1,9 @@ --- -navigation_title: "Configure monitoring" +navigation_title: "Kibana self-managed" mapped_pages: - https://www.elastic.co/guide/en/kibana/current/configuring-monitoring.html +applies: + stack: all --- @@ -11,11 +13,11 @@ mapped_pages: If you enable the {{monitor-features}} in your cluster, there are a few methods available to collect metrics about {{kib}}: -* [{{agent}} collection](monitoring-elastic-agent.md): Uses a single agent to gather logs and metrics. Can be managed from a central location in {{fleet}}. -* [{{metricbeat}} collection](monitoring-metricbeat.md): Uses a lightweight {{beats}} shipper to gather metrics. May be preferred if you have an existing investment in {{beats}} or are not yet ready to use {{agent}}. -* [Legacy collection](monitoring-kibana.md): Uses internal collectors to gather metrics. Not recommended. If you have previously configured legacy collection methods, you should migrate to using {{agent}} or {{metricbeat}}. +* [{{agent}} collection](kibana-monitoring-elastic-agent.md): Uses a single agent to gather logs and metrics. Can be managed from a central location in {{fleet}}. +* [{{metricbeat}} collection](kibana-monitoring-metricbeat.md): Uses a lightweight {{beats}} shipper to gather metrics. May be preferred if you have an existing investment in {{beats}} or are not yet ready to use {{agent}}. +* [Legacy collection](/deploy-manage/monitor/stack-monitoring/kibana-monitoring-legacy.md): Uses internal collectors to gather metrics. Not recommended. If you have previously configured legacy collection methods, you should migrate to using {{agent}} or {{metricbeat}}. -You can also use {{kib}} to [visualize monitoring data from across the {{stack}}](monitoring-data.md). +You can also use {{kib}} to [visualize monitoring data from across the {{stack}}](kibana-monitoring-data.md). To learn about monitoring in general, see [Monitor a cluster](../../monitor.md). 
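As a sketch of the {{metricbeat}} option, the Kibana module configuration (typically `modules.d/kibana-xpack.yml`) looks roughly like the following; the host and credentials are placeholders for your environment:

```yaml
# Sketch of a Metricbeat Kibana module config for stack monitoring.
- module: kibana
  metricsets:
    - stats
  period: 10s
  hosts: ["http://localhost:5601"]
  #username: "user"
  #password: "secret"
  xpack.enabled: true
```

The output section of `metricbeat.yml` then points at the monitoring cluster rather than the production cluster.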
diff --git a/deploy-manage/toc.yml b/deploy-manage/toc.yml index 66a6f7a5b..610287092 100644 --- a/deploy-manage/toc.yml +++ b/deploy-manage/toc.yml @@ -687,7 +687,6 @@ toc: - file: manage-connectors.md - file: monitor.md children: - - file: monitor/monitoring-overview.md - file: monitor/autoops.md children: - file: monitor/autoops/ec-autoops-how-to-access.md @@ -705,15 +704,15 @@ toc: - file: monitor/autoops/ec-autoops-faq.md - file: monitor/stack-monitoring.md children: - - file: monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md - - file: monitor/stack-monitoring/enable-stack-monitoring-on-eck-deployments.md + - file: monitor/stack-monitoring/elastic-cloud-stack-monitoring.md + - file: monitor/stack-monitoring/eck-stack-monitoring.md children: - file: monitor/stack-monitoring/k8s_connect_to_an_external_monitoring_elasticsearch_cluster.md - file: monitor/stack-monitoring/k8s_when_to_use_it.md - file: monitor/stack-monitoring/k8s_how_it_works.md - file: monitor/stack-monitoring/k8s_audit_logging.md - file: monitor/stack-monitoring/k8s_override_the_beats_pod_template.md - - file: monitor/stack-monitoring/enable-stack-monitoring-on-ece-deployments.md + - file: monitor/stack-monitoring/ece-stack-monitoring.md children: - file: monitor/stack-monitoring/ece-restrictions-monitoring.md - file: monitor/stack-monitoring/elasticsearch-monitoring-self-managed.md @@ -721,19 +720,19 @@ toc: - file: monitor/stack-monitoring/collecting-monitoring-data-with-elastic-agent.md - file: monitor/stack-monitoring/collecting-monitoring-data-with-metricbeat.md - file: monitor/stack-monitoring/collecting-log-data-with-filebeat.md - - file: monitor/stack-monitoring/legacy-collection-methods.md + - file: monitor/stack-monitoring/es-legacy-collection-methods.md children: - file: monitor/stack-monitoring/es-monitoring-collectors.md - file: monitor/stack-monitoring/es-monitoring-exporters.md - - file: monitor/stack-monitoring/local-exporter.md - - file: monitor/stack-monitoring/http-exporter.md - - file: monitor/stack-monitoring/pause-export.md + - file: monitor/stack-monitoring/es-local-exporter.md + - file: monitor/stack-monitoring/es-http-exporter.md + - file: monitor/stack-monitoring/es-pause-export.md - file: monitor/stack-monitoring/kibana-monitoring-self-managed.md children: - - file: monitor/stack-monitoring/monitoring-elastic-agent.md - - file: monitor/stack-monitoring/monitoring-metricbeat.md - - file: monitor/stack-monitoring/monitoring-data.md - - file: monitor/stack-monitoring/monitoring-kibana.md + - file: monitor/stack-monitoring/kibana-monitoring-elastic-agent.md + - file: monitor/stack-monitoring/kibana-monitoring-metricbeat.md + - file: monitor/stack-monitoring/kibana-monitoring-data.md + - file: monitor/stack-monitoring/kibana-monitoring-legacy.md - file: monitor/orchestrators.md children: - file: monitor/orchestrators/eck-metrics-configuration.md @@ -762,10 +761,6 @@ toc: children: - file: monitor/monitoring-data/ec-memory-pressure.md - file: monitor/monitoring-data/ec-vcpu-boost-instance.md - - file: monitor/monitoring-data/ech-saas-metrics-accessing.md - children: - - file: monitor/monitoring-data/ech-memory-pressure.md - - file: monitor/monitoring-data/ech-vcpu-boost-instance.md - file: monitor/monitoring-data/configure-stack-monitoring-alerts.md - file: monitor/monitoring-data/configuring-data-streamsindices-for-monitoring.md children: @@ -780,8 +775,8 @@ toc: - file: monitor/logging-configuration/elasticsearch-deprecation-logs.md - file: 
monitor/logging-configuration/kibana-logging.md children: - - file: monitor/logging-configuration/log-settings-examples.md - - file: monitor/logging-configuration/_cli_configuration.md + - file: monitor/logging-configuration/kibana-log-settings-examples.md + - file: monitor/logging-configuration/kibana-logging-cli-configuration.md - file: monitor/logging-configuration/security-event-audit-logging.md children: - file: monitor/logging-configuration/enabling-elasticsearch-audit-logs.md diff --git a/deploy-manage/users-roles/cloud-enterprise-orchestrator/manage-users-roles.md b/deploy-manage/users-roles/cloud-enterprise-orchestrator/manage-users-roles.md index 91c55f16e..68097cad4 100644 --- a/deploy-manage/users-roles/cloud-enterprise-orchestrator/manage-users-roles.md +++ b/deploy-manage/users-roles/cloud-enterprise-orchestrator/manage-users-roles.md @@ -62,7 +62,7 @@ We strongly recommend using three availability zones with at least 1 GB Elastics 1. [Log into the Cloud UI](../../deploy/cloud-enterprise/log-into-cloud-ui.md). 2. Go to **Deployments** and select the **security-cluster**. 3. Configure regular snapshots of the security deployment. This is critical if you plan to create any native users. -4. Optional: [Enable monitoring](../../monitor/stack-monitoring/enable-stack-monitoring-on-ece-deployments.md) on the security deployment to a dedicated monitoring deployment. +4. Optional: [Enable monitoring](../../monitor/stack-monitoring/ece-stack-monitoring.md) on the security deployment, shipping data to a dedicated monitoring deployment. If you have authentication issues, you can check out the security deployment Elasticsearch logs. diff --git a/raw-migrated-files/cloud-on-k8s/cloud-on-k8s/k8s-advanced-topics.md b/raw-migrated-files/cloud-on-k8s/cloud-on-k8s/k8s-advanced-topics.md index 81570d702..227a31684 100644 --- a/raw-migrated-files/cloud-on-k8s/cloud-on-k8s/k8s-advanced-topics.md +++ b/raw-migrated-files/cloud-on-k8s/cloud-on-k8s/k8s-advanced-topics.md @@ -7,6 +7,6 @@ * [*Traffic Splitting*](../../../deploy-manage/deploy/cloud-on-k8s/requests-routing-to-elasticsearch-nodes.md) * [*Network policies*](../../../deploy-manage/deploy/cloud-on-k8s/network-policies.md) * [*Webhook namespace selectors*](../../../deploy-manage/deploy/cloud-on-k8s/webhook-namespace-selectors.md) -* [*Stack Monitoring*](../../../deploy-manage/monitor/stack-monitoring/enable-stack-monitoring-on-eck-deployments.md) +* [*Stack Monitoring*](../../../deploy-manage/monitor/stack-monitoring/eck-stack-monitoring.md) * [*Deploy a FIPS compatible version of ECK*](../../../deploy-manage/deploy/cloud-on-k8s/deploy-fips-compatible-version-of-eck.md) diff --git a/raw-migrated-files/cloud/cloud-enterprise/ece-monitoring-deployments.md b/raw-migrated-files/cloud/cloud-enterprise/ece-monitoring-deployments.md index a9a365cd7..023d560f6 100644 --- a/raw-migrated-files/cloud/cloud-enterprise/ece-monitoring-deployments.md +++ b/raw-migrated-files/cloud/cloud-enterprise/ece-monitoring-deployments.md @@ -4,7 +4,7 @@ Elastic Cloud Enterprise monitors many aspects of your installation, but some is * [Find clusters](../../../troubleshoot/deployments/cloud-enterprise/finding-deployments-finding-problems.md) that have issues. * [Move affected nodes off an allocator](../../../deploy-manage/maintenance/ece/move-nodes-instances-from-allocators.md), if the allocator fails.
-* [Enable deployment logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/enable-stack-monitoring-on-ece-deployments.md) to keep an eye on the performance of deployments and debug stack and solution issues. +* [Enable deployment logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/ece-stack-monitoring.md) to keep an eye on the performance of deployments and debug stack and solution issues. In addition to the monitoring of clusters that is described here, don’t forget that Elastic Cloud Enterprise also provides [monitoring information for your entire installation](../../../deploy-manage/monitor/orchestrators/ece-platform-monitoring.md). You can also monitor the physical host machines on which Elastic Cloud Enterprise is installed. diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-add-user-settings.md b/raw-migrated-files/cloud/cloud-heroku/ech-add-user-settings.md index 9e47a524c..1d18b41e1 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-add-user-settings.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-add-user-settings.md @@ -275,7 +275,7 @@ The following audit settings are supported: : A list of action names or wildcards. The specified policy will not print audit events for actions matching these values. ::::{note} -To enable auditing you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). +To enable auditing you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). :::: diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-config-change-errors.md b/raw-migrated-files/cloud/cloud-heroku/ech-config-change-errors.md index 43954439a..c07453399 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-config-change-errors.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-config-change-errors.md @@ -6,7 +6,7 @@ When you attempt to apply a configuration change to a deployment, the attempt ma :alt: A screen capture of the deployment page showing an error: Latest change to {{es}} configuration failed. ::: -To help diagnose these and any other types of issues in your deployments, we recommend [setting up monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). Then, you can easily view your deployment health and access log files to troubleshoot this configuration failure. +To help diagnose these and any other types of issues in your deployments, we recommend [setting up monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). Then, you can easily view your deployment health and access log files to troubleshoot this configuration failure. To confirm if your Elasticsearch cluster is bootlooping, you can check the most recent plan under your [Deployment Activity page](../../../deploy-manage/deploy/elastic-cloud/keep-track-of-deployment-activity.md) for the error: @@ -114,7 +114,7 @@ To view any added plugins or bundles: Configuration change errors can occur when there is insufficient RAM configured for a data tier. In this case, the cluster typically also shows OOM (out of memory) errors. To resolve these, you need to increase the amount of heap memory, which is half of the amount of memory allocated to a cluster.
You might also detect OOM in plan changes via their [related exit codes](https://www.elastic.co/guide/en/elasticsearch/reference/current/stopping-elasticsearch.html#fatal-errors) `127`, `137`, and `158`. -Check the [{{es}} cluster size](../../../deploy-manage/deploy/elastic-cloud/ech-customize-deployment-components.md#ech-cluster-size) and the [JVM memory pressure indicator](../../../deploy-manage/monitor/monitoring-data/ech-memory-pressure.md) documentation to learn more. +Check the [{{es}} cluster size](../../../deploy-manage/deploy/elastic-cloud/ech-customize-deployment-components.md#ech-cluster-size) and the [JVM memory pressure indicator](/deploy-manage/monitor/monitoring-data/ec-memory-pressure.md) documentation to learn more. As well, you can read our detailed blog [Managing and troubleshooting {{es}} memory](https://www.elastic.co/blog/managing-and-troubleshooting-elasticsearch-memory). diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-configure.md b/raw-migrated-files/cloud/cloud-heroku/ech-configure.md index 983733279..87dadfc13 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-configure.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-configure.md @@ -5,7 +5,7 @@ The information in this section covers: * [Plan for production](../../../deploy-manage/production-guidance/plan-for-production-elastic-cloud.md) - Plan for a highly available and scalable deployment. * [Configure your deployment](../../../deploy-manage/deploy/elastic-cloud/ech-configure-settings.md) - Customize your cluster through a full list of settings. * [Enable Kibana](../../../deploy-manage/deploy/elastic-cloud/access-kibana.md) - Explore your data with the Elastic Stack visualization platform. -* [Enable Logging and Monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) - Monitor your cluster’s health and performance and ingest your deployment’s logs. +* [Enable Logging and Monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) - Monitor your cluster’s health and performance and ingest your deployment’s logs. * [Upgrade versions](../../../deploy-manage/upgrade/deployment-or-cluster.md) - Stay current with the latest Elastic Stack versions. * [Delete your deployment](../../../deploy-manage/uninstall/delete-a-cloud-deployment.md) - No undo. Data is lost and billing stops. diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-enable-logging-and-monitoring.md b/raw-migrated-files/cloud/cloud-heroku/ech-enable-logging-and-monitoring.md index c1132d338..6dc86b960 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-enable-logging-and-monitoring.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-enable-logging-and-monitoring.md @@ -15,7 +15,7 @@ The steps in this section cover only the enablement of the monitoring and loggin ### Before you begin [ech-logging-and-monitoring-limitations] -Some limitations apply when you use monitoring on Elasticsearch Add-On for Heroku. To learn more, check the monitoring [restrictions and limitations](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). +Some limitations apply when you use monitoring on Elasticsearch Add-On for Heroku. To learn more, check the monitoring [restrictions and limitations](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). 
### Monitoring for production use [ech-logging-and-monitoring-production] @@ -32,7 +32,7 @@ How many monitoring deployments you use depends on your requirements: * If you need to silo {{es}} data for different business departments. Deployments that have been configured to ship logs and metrics to a target monitoring deployment have access to indexing data and can manage monitoring index templates, which is addressed by creating separate monitoring deployments. -Logs and metrics that get sent to a dedicated monitoring {{es}} deployment [may not be cleaned up automatically](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ech-logging-and-monitoring-retention) and might require some additional steps to remove excess data periodically. +Logs and metrics that get sent to a dedicated monitoring {{es}} deployment [may not be cleaned up automatically](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-logging-and-monitoring-retention) and might require some additional steps to remove excess data periodically. ### Retention of monitoring daily indices [ech-logging-and-monitoring-retention] @@ -48,7 +48,7 @@ When you enable monitoring in Elasticsearch Add-On for Heroku, your monitoring i $$$ech-logging-and-monitoring-retention-7$$$ When you enable self-monitoring in Elasticsearch Add-On for Heroku, your monitoring indices are retained for a certain period by default. After the retention period has passed, the monitoring indices are deleted automatically. Monitoring data is retained for three days by default or as specified by the [`xpack.monitoring.history.duration` user setting](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md#xpack-monitoring-history-duration). -To retain monitoring indices as is without deleting them automatically, you must disable the [cleaner service](../../../deploy-manage/monitor/stack-monitoring/local-exporter.md#local-exporter-cleaner) by adding a disabled local exporter in your cluster settings. +To retain monitoring indices as is without deleting them automatically, you must disable the [cleaner service](../../../deploy-manage/monitor/stack-monitoring/es-local-exporter.md#local-exporter-cleaner) by adding a disabled local exporter in your cluster settings. For example @@ -67,9 +67,9 @@ PUT /_cluster/settings ### Sending monitoring data to a dedicated monitoring deployment [ech-logging-and-monitoring-retention-dedicated-monitoring] -When [monitoring for production use](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ech-logging-and-monitoring-production), where you configure your deployments **to send monitoring data to a dedicated monitoring deployment** for indexing, this retention period does not apply. Monitoring indices on a dedicated monitoring deployment are retained until you remove them. There are three options open to you: +When [monitoring for production use](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-logging-and-monitoring-production), where you configure your deployments **to send monitoring data to a dedicated monitoring deployment** for indexing, this retention period does not apply. Monitoring indices on a dedicated monitoring deployment are retained until you remove them. 
There are three options open to you: -* To enable the automatic deletion of monitoring indices from dedicated monitoring deployments, [enable monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ech-enable-logging-and-monitoring-steps) on your dedicated monitoring deployment in Elasticsearch Add-On for Heroku to send monitoring data to itself. When an {{es}} deployment sends monitoring data to itself, all monitoring indices are deleted automatically after the retention period, regardless of the origin of the monitoring data. +* To enable the automatic deletion of monitoring indices from dedicated monitoring deployments, [enable monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-enable-logging-and-monitoring-steps) on your dedicated monitoring deployment in Elasticsearch Add-On for Heroku to send monitoring data to itself. When an {{es}} deployment sends monitoring data to itself, all monitoring indices are deleted automatically after the retention period, regardless of the origin of the monitoring data. * Alternatively, you can enable the cleaner service on the monitoring deployment by creating a local exporter. You can define the retention period at the same time. For example diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-manage-apm-settings.md b/raw-migrated-files/cloud/cloud-heroku/ech-manage-apm-settings.md index ef8af03c9..8c60608b7 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-manage-apm-settings.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-manage-apm-settings.md @@ -361,7 +361,7 @@ Allow anonymous access only for specified agents and/or services. This is primar : The period after which to log the internal metrics. Defaults to *30s*. ::::{note} -To change logging settings you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). +To change logging settings you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). :::: diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-manage-kibana-settings.md b/raw-migrated-files/cloud/cloud-heroku/ech-manage-kibana-settings.md index 265c07d36..d1530f630 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-manage-kibana-settings.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-manage-kibana-settings.md @@ -784,7 +784,7 @@ If search latency in {{es}} is sufficiently high, such as if you are using cross ## Logging and audit settings [echlogging_and_audit_settings] ::::{note} -To change logging settings or to enable auditing you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). +To change logging settings or to enable auditing you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). :::: diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-monitoring-setup.md b/raw-migrated-files/cloud/cloud-heroku/ech-monitoring-setup.md index 2e5da2114..6012ad4ec 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-monitoring-setup.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-monitoring-setup.md @@ -22,7 +22,7 @@ After you have created a new deployment, you should enable shipping logs and met 4. Choose where to send your logs and metrics. 
::::{important} - Anything used for production should go to a separate deployment you create only for monitoring. For development or testing, you can send monitoring data to the same deployment. Check [Enable logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ech-logging-and-monitoring-production). + Anything used for production should go to a separate deployment you create only for monitoring. For development or testing, you can send monitoring data to the same deployment. Check [Enable logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-logging-and-monitoring-production). :::: 5. Select **Save**. @@ -59,7 +59,7 @@ To learn more about what [Elasticsearch monitoring metrics](https://www.elastic. :alt: Node tab in Kibana under Stack Monitoring ::: -Some [performance metrics](../../../deploy-manage/monitor/monitoring-data/ech-saas-metrics-accessing.md) are also available directly in the [Elasticsearch Add-On for Heroku console](https://cloud.elastic.co?page=docs&placement=docs-body) and don’t require looking at your monitoring deployment. If you’re ever in a rush to determine if there is a performance problem, you can get a quick overview by going to the **Performance** page from your deployment menu: +Some [performance metrics](/deploy-manage/monitor/monitoring-data/access-performance-metrics-on-elastic-cloud.md) are also available directly in the [Elasticsearch Add-On for Heroku console](https://cloud.elastic.co?page=docs&placement=docs-body) and don’t require looking at your monitoring deployment. If you’re ever in a rush to determine if there is a performance problem, you can get a quick overview by going to the **Performance** page from your deployment menu: :::{image} ../../../images/cloud-heroku-ec-ce-monitoring-performance.png :alt: Performance page of the Elastic Cloud console diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-monitoring.md b/raw-migrated-files/cloud/cloud-heroku/ech-monitoring.md index 93c9d77dc..5672d0de2 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-monitoring.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-monitoring.md @@ -8,12 +8,12 @@ The most important of these is the {{es}} cluster, because it is the heart of th This section provides some best practices to help you monitor and understand the ongoing state of your deployments and their resources. 
-* [{{es}} cluster health](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ech-es-cluster-health) -* [{{es}} cluster performance](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ech-es-cluster-performance) -* [Health warnings](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ech-es-health-warnings) -* [Preconfigured logs and metrics](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ech-es-health-preconfigured) -* [Dedicated logs and metrics](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ech-es-health-dedicated) -* [Understanding deployment health](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ech-health-best-practices) +* [{{es}} cluster health](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-es-cluster-health) +* [{{es}} cluster performance](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-es-cluster-performance) +* [Health warnings](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-es-health-warnings) +* [Preconfigured logs and metrics](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-es-health-preconfigured) +* [Dedicated logs and metrics](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-es-health-dedicated) +* [Understanding deployment health](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-health-best-practices) ## {{es}} cluster health [ech-es-cluster-health] @@ -56,7 +56,7 @@ For each issue you can either use a troubleshooting link or get a suggestion to ## {{es}} cluster performance [ech-es-cluster-performance] -The deployment **Health** page does not include information on cluster performance. If you observe issues on search and ingest operations in terms of increased latency or throughput for queries, these might not be directly reported on the **Health** page, unless they are related to shard health or master node availability. The performance page and the out-of-the-box logs allow you to monitor your cluster performance, but for production applications we strongly recommend setting up a dedicated monitoring cluster. Check [Understanding deployment health](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ech-health-best-practices), for more guidelines on how to monitor you cluster performance. +The deployment **Health** page does not include information on cluster performance. If you observe issues on search and ingest operations in terms of increased latency or throughput for queries, these might not be directly reported on the **Health** page, unless they are related to shard health or master node availability. The performance page and the out-of-the-box logs allow you to monitor your cluster performance, but for production applications we strongly recommend setting up a dedicated monitoring cluster. Check [Understanding deployment health](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ech-health-best-practices) for more guidelines on how to monitor your cluster performance.
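Until a dedicated monitoring cluster is in place, a couple of stock {{es}} APIs give a quick point-in-time view of cluster health and per-node resource usage; this is a spot check, not a substitute for stack monitoring:

```console
GET _cluster/health

GET _nodes/stats/os,process,jvm
```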
## Health warnings [ech-es-health-warnings] @@ -75,7 +75,7 @@ Configuration change failures Out of memory errors : Out of memory errors (OOMs) may occur during your deployment’s normal operations, and these can have a very negative impact on performance. Common causes of memory shortages are oversharding, data retention oversights, and the overall request volume. - On your deployment page, you can check the [JVM memory pressure indicator](../../../deploy-manage/monitor/monitoring-data/ech-memory-pressure.md) to get the current memory usage of each node of your deployment. You can also review the [common causes of high JVM memory usage](../../../deploy-manage/monitor/monitoring-data/ech-memory-pressure.md#ech-memory-pressure-causes) to help diagnose the source of unexpectedly high memory pressure levels. To learn more, check [How does high memory pressure affect performance?](../../../troubleshoot/monitoring/high-memory-pressure.md). + On your deployment page, you can check the [JVM memory pressure indicator](/deploy-manage/monitor/monitoring-data/ec-memory-pressure.md) to get the current memory usage of each node of your deployment. You can also review the [common causes of high JVM memory usage](/deploy-manage/monitor/monitoring-data/ec-memory-pressure.md#ec-memory-pressure-causes) to help diagnose the source of unexpectedly high memory pressure levels. To learn more, check [How does high memory pressure affect performance?](../../../troubleshoot/monitoring/high-memory-pressure.md). @@ -94,7 +94,7 @@ In a production environment, it’s important set up dedicated health monitoring You have the option of sending logs and metrics to a separate, specialized monitoring deployment, which ensures that they’re available in the event of a deployment outage. The monitoring deployment also gives you access to Kibana’s stack monitoring features, through which you can view health and performance data for all of your deployment resources. -Check the guide on [how to set up monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) to learn more. +Check the guide on [how to set up monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) to learn more. ## Understanding deployment health [ech-health-best-practices] diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-planning.md b/raw-migrated-files/cloud/cloud-heroku/ech-planning.md index 1ced65ff9..c55030dec 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-planning.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-planning.md @@ -44,7 +44,7 @@ Clusters that only have one master node are not highly available and are at risk ## Do you know when to scale? [ech-workloads] -Knowing how to scale your deployment is critical, especially when unexpected workloads hits. Don’t forget to [check your performance metrics](../../../deploy-manage/monitor/monitoring-data/ech-saas-metrics-accessing.md) to make sure your deployments are healthy and can cope with your workloads. +Knowing how to scale your deployment is critical, especially when unexpected workloads hit. Don’t forget to [check your performance metrics](/deploy-manage/monitor/monitoring-data/access-performance-metrics-on-elastic-cloud.md) to make sure your deployments are healthy and can cope with your workloads.
Scaling with Elasticsearch Add-On for Heroku is easy: diff --git a/raw-migrated-files/cloud/cloud-heroku/ech-saas-metrics-accessing.md b/raw-migrated-files/cloud/cloud-heroku/ech-saas-metrics-accessing.md index d71b5a9a4..44d91f857 100644 --- a/raw-migrated-files/cloud/cloud-heroku/ech-saas-metrics-accessing.md +++ b/raw-migrated-files/cloud/cloud-heroku/ech-saas-metrics-accessing.md @@ -2,7 +2,7 @@ Cluster performance metrics are available directly in the [Elasticsearch Add-On for Heroku console](https://cloud.elastic.co?page=docs&placement=docs-body). The graphs on this page include a subset of Elasticsearch Add-On for Heroku-specific performance metrics. -For advanced views or production monitoring, [enable logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). The monitoring application provides more advanced views for Elasticsearch and JVM metrics, and includes a configurable retention period. +For advanced views or production monitoring, [enable logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). The monitoring application provides more advanced views for Elasticsearch and JVM metrics, and includes a configurable retention period. To access cluster performance metrics: @@ -31,7 +31,7 @@ Shows the maximum usage of the CPU resources assigned to your Elasticsearch clus :alt: Graph showing available CPU credits ::: -Shows your remaining CPU credits, measured in seconds of CPU time. CPU credits enable the boosting of CPU resources assigned to your cluster to improve performance temporarily when it is needed most. For more details check [How to use vCPU to boost your instance](../../../deploy-manage/monitor/monitoring-data/ech-vcpu-boost-instance.md). +Shows your remaining CPU credits, measured in seconds of CPU time. CPU credits enable the boosting of CPU resources assigned to your cluster to improve performance temporarily when it is needed most. For more details check [How to use vCPU to boost your instance](/deploy-manage/monitor/monitoring-data/ec-vcpu-boost-instance.md). ### Number of requests [echnumber_of_requests] diff --git a/raw-migrated-files/cloud/cloud-heroku/echscenario_why_is_my_node_unavailable.md b/raw-migrated-files/cloud/cloud-heroku/echscenario_why_is_my_node_unavailable.md index baa9f6601..dc76eadea 100644 --- a/raw-migrated-files/cloud/cloud-heroku/echscenario_why_is_my_node_unavailable.md +++ b/raw-migrated-files/cloud/cloud-heroku/echscenario_why_is_my_node_unavailable.md @@ -1,6 +1,6 @@ # Diagnose unavailable nodes [echscenario_why_is_my_node_unavailable] -This section provides a list of common symptoms and possible actions that you can take to resolve issues when one or more nodes become unhealthy or unavailable. This guide is particularly useful if you are not [shipping your logs and metrics](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) to a dedicated monitoring cluster. +This section provides a list of common symptoms and possible actions that you can take to resolve issues when one or more nodes become unhealthy or unavailable. This guide is particularly useful if you are not [shipping your logs and metrics](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) to a dedicated monitoring cluster. 
**What are the symptoms?** diff --git a/raw-migrated-files/cloud/cloud/ec-add-user-settings.md b/raw-migrated-files/cloud/cloud/ec-add-user-settings.md index c8db3ae68..2a9c70422 100644 --- a/raw-migrated-files/cloud/cloud/ec-add-user-settings.md +++ b/raw-migrated-files/cloud/cloud/ec-add-user-settings.md @@ -275,7 +275,7 @@ The following audit settings are supported: : A list of action names or wildcards. The specified policy will not print audit events for actions matching these values. ::::{note} -To enable auditing you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). +To enable auditing you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). :::: diff --git a/raw-migrated-files/cloud/cloud/ec-config-change-errors.md b/raw-migrated-files/cloud/cloud/ec-config-change-errors.md index 6cc173b76..2edce8d83 100644 --- a/raw-migrated-files/cloud/cloud/ec-config-change-errors.md +++ b/raw-migrated-files/cloud/cloud/ec-config-change-errors.md @@ -6,7 +6,7 @@ When you attempt to apply a configuration change to a deployment, the attempt ma :alt: A screen capture of the deployment page showing an error: Latest change to {{es}} configuration failed. ::: -To help diagnose these and any other types of issues in your deployments, we recommend [setting up monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). Then, you can easily view your deployment health and access log files to troubleshoot this configuration failure. +To help diagnose these and any other types of issues in your deployments, we recommend [setting up monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). Then, you can easily view your deployment health and access log files to troubleshoot this configuration failure. To confirm if your Elasticsearch cluster is bootlooping, you can check the most recent plan under your [Deployment Activity page](../../../deploy-manage/deploy/elastic-cloud/keep-track-of-deployment-activity.md) for the error: diff --git a/raw-migrated-files/cloud/cloud/ec-enable-logging-and-monitoring.md b/raw-migrated-files/cloud/cloud/ec-enable-logging-and-monitoring.md index ffe0d7002..f30761a88 100644 --- a/raw-migrated-files/cloud/cloud/ec-enable-logging-and-monitoring.md +++ b/raw-migrated-files/cloud/cloud/ec-enable-logging-and-monitoring.md @@ -15,7 +15,7 @@ The steps in this section cover only the enablement of the monitoring and loggin ### Before you begin [ec-logging-and-monitoring-limitations] -Some limitations apply when you use monitoring on Elasticsearch Service. To learn more, check the monitoring [restrictions and limitations](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ec-restrictions-monitoring). +Some limitations apply when you use monitoring on Elasticsearch Service. To learn more, check the monitoring [restrictions and limitations](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ec-restrictions-monitoring). ### Monitoring for production use [ec-logging-and-monitoring-production] @@ -32,7 +32,7 @@ How many monitoring deployments you use depends on your requirements: * If you need to silo {{es}} data for different business departments. 
Deployments that have been configured to ship logs and metrics to a target monitoring deployment have access to indexing data and can manage monitoring index templates, which is addressed by creating separate monitoring deployments. -Logs and metrics that get sent to a dedicated monitoring {{es}} deployment [may not be cleaned up automatically](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ec-logging-and-monitoring-retention) and might require some additional steps to remove excess data periodically. +Logs and metrics that get sent to a dedicated monitoring {{es}} deployment [may not be cleaned up automatically](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ec-logging-and-monitoring-retention) and might require some additional steps to remove excess data periodically. ### Retention of monitoring daily indices [ec-logging-and-monitoring-retention] @@ -48,7 +48,7 @@ When you enable monitoring in Elasticsearch Service, your monitoring indices are $$$ec-logging-and-monitoring-retention-7$$$ When you enable self-monitoring in Elasticsearch Service, your monitoring indices are retained for a certain period by default. After the retention period has passed, the monitoring indices are deleted automatically. Monitoring data is retained for three days by default or as specified by the [`xpack.monitoring.history.duration` user setting](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md#xpack-monitoring-history-duration). -To retain monitoring indices as is without deleting them automatically, you must disable the [cleaner service](../../../deploy-manage/monitor/stack-monitoring/local-exporter.md#local-exporter-cleaner) by adding a disabled local exporter in your cluster settings. +To retain monitoring indices as is without deleting them automatically, you must disable the [cleaner service](../../../deploy-manage/monitor/stack-monitoring/es-local-exporter.md#local-exporter-cleaner) by adding a disabled local exporter in your cluster settings. For example @@ -67,9 +67,9 @@ PUT /_cluster/settings ### Sending monitoring data to a dedicated monitoring deployment [ec-logging-and-monitoring-retention-dedicated-monitoring] -When [monitoring for production use](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ec-logging-and-monitoring-production), where you configure your deployments **to send monitoring data to a dedicated monitoring deployment** for indexing, this retention period does not apply. Monitoring indices on a dedicated monitoring deployment are retained until you remove them. There are three options open to you: +When [monitoring for production use](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ec-logging-and-monitoring-production), where you configure your deployments **to send monitoring data to a dedicated monitoring deployment** for indexing, this retention period does not apply. Monitoring indices on a dedicated monitoring deployment are retained until you remove them. There are three options open to you: -* To enable the automatic deletion of monitoring indices from dedicated monitoring deployments, [enable monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ec-enable-logging-and-monitoring-steps) on your dedicated monitoring deployment in Elasticsearch Service to send monitoring data to itself. 
When an {{es}} deployment sends monitoring data to itself, all monitoring indices are deleted automatically after the retention period, regardless of the origin of the monitoring data. +* To enable the automatic deletion of monitoring indices from dedicated monitoring deployments, [enable monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ec-enable-logging-and-monitoring-steps) on your dedicated monitoring deployment in Elasticsearch Service to send monitoring data to itself. When an {{es}} deployment sends monitoring data to itself, all monitoring indices are deleted automatically after the retention period, regardless of the origin of the monitoring data. * Alternatively, you can enable the cleaner service on the monitoring deployment by creating a local exporter. You can define the retention period at the same time. For example diff --git a/raw-migrated-files/cloud/cloud/ec-manage-apm-settings.md b/raw-migrated-files/cloud/cloud/ec-manage-apm-settings.md index 3833f6291..482d9bc59 100644 --- a/raw-migrated-files/cloud/cloud/ec-manage-apm-settings.md +++ b/raw-migrated-files/cloud/cloud/ec-manage-apm-settings.md @@ -361,7 +361,7 @@ Allow anonymous access only for specified agents and/or services. This is primar : The period after which to log the internal metrics. Defaults to *30s*. ::::{note} -To change logging settings you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). +To change logging settings you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). :::: diff --git a/raw-migrated-files/cloud/cloud/ec-manage-kibana-settings.md b/raw-migrated-files/cloud/cloud/ec-manage-kibana-settings.md index dd71202fa..3b52d8ee2 100644 --- a/raw-migrated-files/cloud/cloud/ec-manage-kibana-settings.md +++ b/raw-migrated-files/cloud/cloud/ec-manage-kibana-settings.md @@ -784,7 +784,7 @@ If search latency in {{es}} is sufficiently high, such as if you are using cross ## Logging and audit settings [ec_logging_and_audit_settings] ::::{note} -To change logging settings or to enable auditing you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). +To change logging settings or to enable auditing you must first [enable deployment logging](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). :::: diff --git a/raw-migrated-files/cloud/cloud/ec-monitoring-setup.md b/raw-migrated-files/cloud/cloud/ec-monitoring-setup.md index 5819719d6..2f2b33093 100644 --- a/raw-migrated-files/cloud/cloud/ec-monitoring-setup.md +++ b/raw-migrated-files/cloud/cloud/ec-monitoring-setup.md @@ -22,7 +22,7 @@ After you have created a new deployment, you should enable shipping logs and met 4. Choose where to send your logs and metrics. ::::{important} - Anything used for production should go to a separate deployment you create only for monitoring. For development or testing, you can send monitoring data to the same deployment. Check [Enable logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ec-logging-and-monitoring-production). + Anything used for production should go to a separate deployment you create only for monitoring. For development or testing, you can send monitoring data to the same deployment. 
Check [Enable logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ec-logging-and-monitoring-production). :::: 5. Select **Save**. diff --git a/raw-migrated-files/cloud/cloud/ec-monitoring.md b/raw-migrated-files/cloud/cloud/ec-monitoring.md index 2e9d502d7..e4f239d3e 100644 --- a/raw-migrated-files/cloud/cloud/ec-monitoring.md +++ b/raw-migrated-files/cloud/cloud/ec-monitoring.md @@ -102,7 +102,7 @@ You have the option of sending logs and metrics to a separate, specialized monit As part of health monitoring, it’s also a best practice to [configure alerting](../../../deploy-manage/monitor/monitoring-data/configure-stack-monitoring-alerts.md), so that you can be notified right away about any deployment health issues. -Check the guide on [how to set up monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) to learn more. +Check the guide on [how to set up monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) to learn more. ## Understanding deployment health [ec-health-best-practices] diff --git a/raw-migrated-files/cloud/cloud/ec-prepare-production.md b/raw-migrated-files/cloud/cloud/ec-prepare-production.md index 838d72f4d..00c9d4226 100644 --- a/raw-migrated-files/cloud/cloud/ec-prepare-production.md +++ b/raw-migrated-files/cloud/cloud/ec-prepare-production.md @@ -8,5 +8,5 @@ To make sure you’re all set for production, consider the following actions: * [Add extensions and plugins](../../../deploy-manage/deploy/elastic-cloud/add-plugins-extensions.md) to use Elastic supported extensions or add your own custom dictionaries and scripts. * [Edit settings and defaults](../../../deploy-manage/deploy/elastic-cloud/edit-stack-settings.md) to fine tune the performance of specific features. * [Manage your deployment](../../../deploy-manage/deploy/elastic-cloud/manage-deployments.md) as a whole to restart, upgrade, stop routing, or delete. -* [Set up monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) to learn how to configure your deployments for observability, which includes metric and log collection, troubleshooting views, and cluster alerts to automate performance monitoring. +* [Set up monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) to learn how to configure your deployments for observability, which includes metric and log collection, troubleshooting views, and cluster alerts to automate performance monitoring. diff --git a/raw-migrated-files/cloud/cloud/ec-saas-metrics-accessing.md b/raw-migrated-files/cloud/cloud/ec-saas-metrics-accessing.md index 3f47b0edb..8b5743c35 100644 --- a/raw-migrated-files/cloud/cloud/ec-saas-metrics-accessing.md +++ b/raw-migrated-files/cloud/cloud/ec-saas-metrics-accessing.md @@ -2,7 +2,7 @@ Cluster performance metrics are available directly in the [Elasticsearch Service Console](https://cloud.elastic.co?page=docs&placement=docs-body). The graphs on this page include a subset of Elasticsearch Service-specific performance metrics. -For advanced views or production monitoring, [enable logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). The monitoring application provides more advanced views for Elasticsearch and JVM metrics, and includes a configurable retention period. 
+For advanced views or production monitoring, [enable logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). The monitoring application provides more advanced views for Elasticsearch and JVM metrics, and includes a configurable retention period. To access cluster performance metrics: @@ -25,7 +25,7 @@ The following metrics are available: Shows the maximum usage of the CPU resources assigned to your Elasticsearch cluster, as a percentage. CPU resources are relative to the size of your cluster, so that a cluster with 32GB of RAM gets assigned twice as many CPU resources as a cluster with 16GB of RAM. All clusters are guaranteed their share of CPU resources, as Elasticsearch Service infrastructure does not overcommit any resources. CPU credits permit boosting the performance of smaller clusters temporarily, so that CPU usage can exceed 100%. ::::{tip} -This chart reports the maximum CPU values over the sampling period. [Logs and Metrics](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) ingested into [Stack Monitoring](../../../deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md)'s "CPU Usage" instead reflects the average CPU over the sampling period. Therefore, you should not expect the two graphs to look exactly the same. When investigating [CPU-related performance issues](../../../troubleshoot/monitoring/performance.md), you should default to [Stack Monitoring](../../../deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md). +This chart reports the maximum CPU values over the sampling period. [Logs and Metrics](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) ingested into [Stack Monitoring](../../../deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md)'s "CPU Usage" instead reflects the average CPU over the sampling period. Therefore, you should not expect the two graphs to look exactly the same. When investigating [CPU-related performance issues](../../../troubleshoot/monitoring/performance.md), you should default to [Stack Monitoring](../../../deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md). :::: diff --git a/raw-migrated-files/cloud/cloud/ec-scenario_why_are_shards_unavailable.md b/raw-migrated-files/cloud/cloud/ec-scenario_why_are_shards_unavailable.md index 5cd5ae589..33421363b 100644 --- a/raw-migrated-files/cloud/cloud/ec-scenario_why_are_shards_unavailable.md +++ b/raw-migrated-files/cloud/cloud/ec-scenario_why_are_shards_unavailable.md @@ -150,7 +150,7 @@ The response is as follows: #### Check {{es}} cluster logs [ec-check-es-cluster-logs] -To determine the allocation issue, you can [check the logs](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md#ec-check-logs). This is easier if you have set up a dedicated monitoring deployment. +To determine the allocation issue, you can [check the logs](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md#ec-check-logs). This is easier if you have set up a dedicated monitoring deployment. 
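In addition to checking the logs, the cluster allocation explain API surfaces the same diagnosis directly — a minimal sketch, where `my-index`, shard `0`, and `primary: true` are illustrative placeholders:

```console
GET /_cluster/allocation/explain
{
  "index": "my-index",
  "shard": 0,
  "primary": true
}
```

Called with no request body, the API instead picks an arbitrary unassigned shard and explains why it cannot be allocated, which is often the fastest starting point.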
## Analyze unassigned shards using the Kibana UI [ec-analyze_shards_with-kibana] diff --git a/raw-migrated-files/cloud/cloud/ec-scenario_why_is_my_node_unavailable.md b/raw-migrated-files/cloud/cloud/ec-scenario_why_is_my_node_unavailable.md index 33b6861d9..4e80c77c1 100644 --- a/raw-migrated-files/cloud/cloud/ec-scenario_why_is_my_node_unavailable.md +++ b/raw-migrated-files/cloud/cloud/ec-scenario_why_is_my_node_unavailable.md @@ -1,6 +1,6 @@ # Diagnose unavailable nodes [ec-scenario_why_is_my_node_unavailable] -This section provides a list of common symptoms and possible actions that you can take to resolve issues when one or more nodes become unhealthy or unavailable. This guide is particularly useful if you are not [shipping your logs and metrics](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md) to a dedicated monitoring cluster. +This section provides a list of common symptoms and possible actions that you can take to resolve issues when one or more nodes become unhealthy or unavailable. This guide is particularly useful if you are not [shipping your logs and metrics](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md) to a dedicated monitoring cluster. **What are the symptoms?** diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/how-monitoring-works.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/how-monitoring-works.md index 4ba7a71b8..eca7ff05a 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/how-monitoring-works.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/how-monitoring-works.md @@ -17,7 +17,7 @@ To learn how to collect monitoring data, refer to: * [Collecting monitoring data with {{agent}}](../../../deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-elastic-agent.md): Uses a single agent to gather logs and metrics. Can be managed from a central location in {{fleet}}. * [Collecting monitoring data with {{metricbeat}}](../../../deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-metricbeat.md): Uses a lightweight {{beats}} shipper to gather metrics. May be preferred if you have an existing investment in {{beats}} or are not yet ready to use {{agent}}. - * [Legacy collection methods](../../../deploy-manage/monitor/stack-monitoring/legacy-collection-methods.md): Uses internal exporters to gather metrics. Not recommended. If you have previously configured legacy collection methods, you should migrate to using {{agent}} or {{metricbeat}}. + * [Legacy collection methods](../../../deploy-manage/monitor/stack-monitoring/es-legacy-collection-methods.md): Uses internal exporters to gather metrics. Not recommended. If you have previously configured legacy collection methods, you should migrate to using {{agent}} or {{metricbeat}}. 
* [Monitoring {{kib}}](../../../deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md) * [Monitoring {{ls}}](https://www.elastic.co/guide/en/logstash/current/configuring-logstash.html) diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/monitor-elasticsearch-cluster.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/monitor-elasticsearch-cluster.md index 7489d0a26..55247fd47 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/monitor-elasticsearch-cluster.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/monitor-elasticsearch-cluster.md @@ -10,6 +10,6 @@ The {{stack}} {monitor-features} provide a way to keep a pulse on the health and * [Collecting monitoring data with {{metricbeat}}](../../../deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-metricbeat.md) * [Collecting log data with {{filebeat}}](../../../deploy-manage/monitor/stack-monitoring/collecting-log-data-with-filebeat.md) * [*Configuring data streams/indices for monitoring*](../../../deploy-manage/monitor/monitoring-data/configuring-data-streamsindices-for-monitoring.md) -* [Legacy collection methods](../../../deploy-manage/monitor/stack-monitoring/legacy-collection-methods.md) +* [Legacy collection methods](../../../deploy-manage/monitor/stack-monitoring/es-legacy-collection-methods.md) * [*Troubleshooting monitoring*](../../../troubleshoot/elasticsearch/monitoring-troubleshooting.md) diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/monitoring-production.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/monitoring-production.md index 614576d6e..32129be86 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/monitoring-production.md +++ b/raw-migrated-files/elasticsearch/elasticsearch-reference/monitoring-production.md @@ -66,7 +66,7 @@ To store monitoring data in a separate cluster: * [{{agent}} collection methods](../../../deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-elastic-agent.md) * [{{metricbeat}} collection methods](../../../deploy-manage/monitor/stack-monitoring/collecting-monitoring-data-with-metricbeat.md) - * [Legacy collection methods](../../../deploy-manage/monitor/stack-monitoring/legacy-collection-methods.md) + * [Legacy collection methods](../../../deploy-manage/monitor/stack-monitoring/es-legacy-collection-methods.md) 3. (Optional) [Configure {{ls}} to collect data and send it to the monitoring cluster](https://www.elastic.co/guide/en/logstash/current/configuring-logstash.html). 4. (Optional) [Configure {{ents}} monitoring](https://www.elastic.co/guide/en/enterprise-search/current/monitoring.html). @@ -82,9 +82,9 @@ To store monitoring data in a separate cluster: 6. (Optional) [Configure APM Server monitoring](https://www.elastic.co/guide/en/apm/guide/current/monitor-apm.html) 7. 
(Optional) Configure {{kib}} to collect data and send it to the monitoring cluster: - * [{{agent}} collection methods](../../../deploy-manage/monitor/stack-monitoring/monitoring-elastic-agent.md) - * [{{metricbeat}} collection methods](../../../deploy-manage/monitor/stack-monitoring/monitoring-metricbeat.md) - * [Legacy collection methods](../../../deploy-manage/monitor/stack-monitoring/monitoring-kibana.md) + * [{{agent}} collection methods](../../../deploy-manage/monitor/stack-monitoring/kibana-monitoring-elastic-agent.md) + * [{{metricbeat}} collection methods](../../../deploy-manage/monitor/stack-monitoring/kibana-monitoring-metricbeat.md) + * [Legacy collection methods](../../../deploy-manage/monitor/stack-monitoring/kibana-monitoring-legacy.md) 8. (Optional) Create a dedicated {{kib}} instance for monitoring, rather than using a single {{kib}} instance to access both your production cluster and monitoring cluster. @@ -95,5 +95,5 @@ To store monitoring data in a separate cluster: 1. (Optional) Disable the collection of monitoring data in this {{kib}} instance. Set the `xpack.monitoring.kibana.collection.enabled` setting to `false` in the `kibana.yml` file. For more information about this setting, see [Monitoring settings in {{kib}}](https://www.elastic.co/guide/en/kibana/current/monitoring-settings-kb.html). -9. [Configure {{kib}} to retrieve and display the monitoring data](../../../deploy-manage/monitor/stack-monitoring/monitoring-data.md). +9. [Configure {{kib}} to retrieve and display the monitoring data](../../../deploy-manage/monitor/stack-monitoring/kibana-monitoring-data.md). diff --git a/raw-migrated-files/kibana/kibana/logging-settings.md b/raw-migrated-files/kibana/kibana/logging-settings.md index e1bf51107..60237f131 100644 --- a/raw-migrated-files/kibana/kibana/logging-settings.md +++ b/raw-migrated-files/kibana/kibana/logging-settings.md @@ -20,13 +20,13 @@ The logging configuration is validated against the predefined schema and if ther * Loggers define what logging settings, such as the level of verbosity and the appenders, to apply to a particular context. Each log entry context provides information about the service or plugin that emits it and any of its sub-parts, for example, `metrics.ops` or `elasticsearch.query`. * Root is a logger that applies to all the log entries in {{kib}}. -The following table serves as a quick reference for different logging configuration keys. Note that these are not stand-alone settings and may require additional logging configuration. See the [Configure Logging in {{kib}}](../../../deploy-manage/monitor/logging-configuration/kibana-logging.md) guide and complete [examples](../../../deploy-manage/monitor/logging-configuration/log-settings-examples.md) for common configuration use cases. +The following table serves as a quick reference for different logging configuration keys. Note that these are not stand-alone settings and may require additional logging configuration. See the [Configure Logging in {{kib}}](../../../deploy-manage/monitor/logging-configuration/kibana-logging.md) guide and complete [examples](../../../deploy-manage/monitor/logging-configuration/kibana-log-settings-examples.md) for common configuration use cases. | | | | --- | --- | | `logging.appenders[].` | Unique appender identifier. | -| `logging.appenders[].console:` | Appender to use for logging records to **stdout**. By default, uses the `[%date][%level][%logger] %message` **pattern*** layout. 
To use a ***json**, set the [layout type to `json`](../../../deploy-manage/monitor/logging-configuration/log-settings-examples.md#log-in-json-ECS-example). | -| `logging.appenders[].file:` | Allows you to specify a fileName to write log records to disk. To write [all log records to file](../../../deploy-manage/monitor/logging-configuration/log-settings-examples.md#log-to-file-example), add the file appender to `root.appenders`. If configured, you also need to specify [`logging.appenders.file.pathName`](../../../deploy-manage/monitor/logging-configuration/log-settings-examples.md#log-to-file-example). | +| `logging.appenders[].console:` | Appender to use for logging records to **stdout**. By default, uses the `[%date][%level][%logger] %message` **pattern** layout. To use a **json** layout, set the [layout type to `json`](../../../deploy-manage/monitor/logging-configuration/kibana-log-settings-examples.md#log-in-json-ECS-example). | +| `logging.appenders[].file:` | Allows you to specify a fileName to write log records to disk. To write [all log records to file](../../../deploy-manage/monitor/logging-configuration/kibana-log-settings-examples.md#log-to-file-example), add the file appender to `root.appenders`. If configured, you also need to specify [`logging.appenders.file.pathName`](../../../deploy-manage/monitor/logging-configuration/kibana-log-settings-examples.md#log-to-file-example). | | `logging.appenders[].rolling-file:` | Similar to [Log4j’s](https://logging.apache.org/log4j/2.x/) `RollingFileAppender`, this appender will log to a file and rotate it following a rolling strategy when the configured policy triggers. There are currently two policies supported: [`size-limit`](../../../deploy-manage/monitor/logging-configuration/kibana-logging.md#size-limit-triggering-policy) and [`time-interval`](../../../deploy-manage/monitor/logging-configuration/kibana-logging.md#time-interval-triggering-policy). | | `logging.appenders[]..type` | The appender type determines where the log messages are sent. Options are `console`, `file`, `rewrite`, `rolling-file`. Required. | | `logging.appenders[]..fileName` | Determines the filepath where the log messages are written to for file and rolling-file appender types. Required for appenders that write to file. | diff --git a/raw-migrated-files/stack-docs/elastic-stack/upgrade-elastic-stack-for-elastic-cloud.md b/raw-migrated-files/stack-docs/elastic-stack/upgrade-elastic-stack-for-elastic-cloud.md index a328fc015..62fe191df 100644 --- a/raw-migrated-files/stack-docs/elastic-stack/upgrade-elastic-stack-for-elastic-cloud.md +++ b/raw-migrated-files/stack-docs/elastic-stack/upgrade-elastic-stack-for-elastic-cloud.md @@ -6,7 +6,7 @@ Minor version upgrades, upgrades from 8.17 to 9.0.0-beta1, and cluster configura {{ess}} and {{ece}} do not support the ability to upgrade to or from release candidate builds, such as 8.0.0-rc1. -If you use a separate [monitoring deployment](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md), you should upgrade the monitoring deployment before the production deployment. In general, the monitoring deployment and the deployments being monitored should be running the same version of the Elastic Stack. A monitoring deployment cannot monitor production deployments running newer versions of the stack. If necessary, the monitoring deployment can monitor production deployments running the latest release of the previous major version.
+If you use a separate [monitoring deployment](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md), you should upgrade the monitoring deployment before the production deployment. In general, the monitoring deployment and the deployments being monitored should be running the same version of the Elastic Stack. A monitoring deployment cannot monitor production deployments running newer versions of the stack. If necessary, the monitoring deployment can monitor production deployments running the latest release of the previous major version. ::::{important} Although it’s simple to upgrade an Elastic Cloud deployment, the new version might include breaking changes that affect your application. Make sure you review the deprecation logs, make any necessary changes, and test against the new version before upgrading your production deployment. diff --git a/solutions/observability/apps/monitor-apm-server.md b/solutions/observability/apps/monitor-apm-server.md index 25aaf3753..cbe40aa57 100644 --- a/solutions/observability/apps/monitor-apm-server.md +++ b/solutions/observability/apps/monitor-apm-server.md @@ -18,5 +18,5 @@ Select your deployment method to get started: {{ecloud}} manages the installation and configuration of a monitoring agent for you — so all you have to do is flip a switch and watch the data pour in. -* **{{ess}}** user? See [ESS: Enable logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). +* **{{ess}}** user? See [ESS: Enable logging and monitoring](../../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). * **{{ece}}** user? See [ECE: Enable logging and monitoring](https://www.elastic.co/guide/en/cloud-enterprise/{{ece-version-link}}/ece-enable-logging-and-monitoring.html). diff --git a/solutions/observability/apps/monitor-fleet-managed-apm-server.md b/solutions/observability/apps/monitor-fleet-managed-apm-server.md index 59a5d5f8b..fc11df528 100644 --- a/solutions/observability/apps/monitor-fleet-managed-apm-server.md +++ b/solutions/observability/apps/monitor-fleet-managed-apm-server.md @@ -166,4 +166,4 @@ See the [{{agent}} command reference](https://www.elastic.co/guide/en/fleet/curr For more information about these configuration options, see [Configure the {{es}} output](https://www.elastic.co/guide/en/beats/metricbeat/current/elasticsearch-output.html). 6. [Start {{metricbeat}}](https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-starting.html) to begin collecting APM monitoring data. -7. [View the monitoring data in {{kib}}](../../../deploy-manage/monitor/stack-monitoring/monitoring-data.md). +7. [View the monitoring data in {{kib}}](../../../deploy-manage/monitor/stack-monitoring/kibana-monitoring-data.md). diff --git a/solutions/observability/apps/use-internal-collection-to-send-monitoring-data.md b/solutions/observability/apps/use-internal-collection-to-send-monitoring-data.md index f912e513f..37a2118a1 100644 --- a/solutions/observability/apps/use-internal-collection-to-send-monitoring-data.md +++ b/solutions/observability/apps/use-internal-collection-to-send-monitoring-data.md @@ -70,7 +70,7 @@ Use internal collectors to send {{beats}} monitoring data directly to your monit You must specify the `username` as `""` explicitly so that the username from the client certificate (`CN`) is used. See [SSL/TLS output settings](ssltls-output-settings.md) for more information about SSL settings. 3. Start APM Server. -4. 
[View the monitoring data in {{kib}}](../../../deploy-manage/monitor/stack-monitoring/monitoring-data.md). +4. [View the monitoring data in {{kib}}](../../../deploy-manage/monitor/stack-monitoring/kibana-monitoring-data.md). ## Settings for internal collection [apm-configuration-monitor] diff --git a/solutions/observability/apps/use-metricbeat-to-send-monitoring-data.md b/solutions/observability/apps/use-metricbeat-to-send-monitoring-data.md index 16f6c9888..d38a21ace 100644 --- a/solutions/observability/apps/use-metricbeat-to-send-monitoring-data.md +++ b/solutions/observability/apps/use-metricbeat-to-send-monitoring-data.md @@ -161,5 +161,5 @@ To collect and ship monitoring data: For more information about these configuration options, see [Configure the {{es}} output](https://www.elastic.co/guide/en/beats/metricbeat/current/elasticsearch-output.html). 6. [Start {{metricbeat}}](https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-starting.html) to begin collecting monitoring data. -7. [View the monitoring data in {{kib}}](../../../deploy-manage/monitor/stack-monitoring/monitoring-data.md). +7. [View the monitoring data in {{kib}}](../../../deploy-manage/monitor/stack-monitoring/kibana-monitoring-data.md). diff --git a/troubleshoot/elasticsearch/high-cpu-usage.md b/troubleshoot/elasticsearch/high-cpu-usage.md index 3c6f02aa1..1420ed7df 100644 --- a/troubleshoot/elasticsearch/high-cpu-usage.md +++ b/troubleshoot/elasticsearch/high-cpu-usage.md @@ -35,7 +35,7 @@ To track CPU usage over time, we recommend enabling monitoring: :::::::{tab-set} ::::::{tab-item} Elasticsearch Service -* (Recommended) Enable [logs and metrics](../../deploy-manage/monitor/stack-monitoring/stack-monitoring-on-elastic-cloud-deployments.md). When logs and metrics are enabled, monitoring information is visible on {{kib}}'s [Stack Monitoring](../../deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md) page. +* (Recommended) Enable [logs and metrics](../../deploy-manage/monitor/stack-monitoring/elastic-cloud-stack-monitoring.md). When logs and metrics are enabled, monitoring information is visible on {{kib}}'s [Stack Monitoring](../../deploy-manage/monitor/monitoring-data/visualizing-monitoring-data.md) page. You can also enable the [CPU usage threshold alert](../../deploy-manage/monitor/monitoring-data/kibana-alerts.md) to be notified about potential issues through email. diff --git a/troubleshoot/kibana/access.md b/troubleshoot/kibana/access.md index f6581c78c..daf4585e9 100644 --- a/troubleshoot/kibana/access.md +++ b/troubleshoot/kibana/access.md @@ -67,7 +67,7 @@ Troubleshoot the `Kibana Server is not Ready yet` error. These {{kib}}-backing indices must also not have [index settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-settings.html) flagging `read_only_allow_delete` or `write` [index blocks](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-blocks.html). 3. [Shut down all {{kib}} nodes](../../deploy-manage/maintenance/start-stop-services/start-stop-kibana.md). -4. Choose any {{kib}} node, then update the config to set the [debug logging](../../deploy-manage/monitor/logging-configuration/log-settings-examples.md#change-overall-log-level). +4. Choose any {{kib}} node, then update the config to set the [debug logging](../../deploy-manage/monitor/logging-configuration/kibana-log-settings-examples.md#change-overall-log-level). 5. 
[Start the node](../../deploy-manage/maintenance/start-stop-services/start-stop-kibana.md), then check the start-up debug logs for `ERROR` messages or other start-up issues. For example: diff --git a/troubleshoot/kibana/error-server-not-ready.md b/troubleshoot/kibana/error-server-not-ready.md index 42df91c89..2855617e5 100644 --- a/troubleshoot/kibana/error-server-not-ready.md +++ b/troubleshoot/kibana/error-server-not-ready.md @@ -67,7 +67,7 @@ Troubleshoot the `Kibana Server is not Ready yet` error. These {{kib}}-backing indices must also not have [index settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-settings.html) flagging `read_only_allow_delete` or `write` [index blocks](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-blocks.html). 3. [Shut down all {{kib}} nodes](../../deploy-manage/maintenance/start-stop-services/start-stop-kibana.md). -4. Choose any {{kib}} node, then update the config to set the [debug logging](../../deploy-manage/monitor/logging-configuration/log-settings-examples.md#change-overall-log-level). +4. Choose any {{kib}} node, then update the config to set the [debug logging](../../deploy-manage/monitor/logging-configuration/kibana-log-settings-examples.md#change-overall-log-level). 5. [Start the node](../../deploy-manage/maintenance/start-stop-services/start-stop-kibana.md), then check the start-up debug logs for `ERROR` messages or other start-up issues. For example: From d2e48ea9627efa3c70a1ad1df140e7721ccb7e96 Mon Sep 17 00:00:00 2001 From: florent-leborgne Date: Wed, 5 Feb 2025 11:42:35 +0100 Subject: [PATCH 07/15] [E&A] Fix 3 * formatting in Explore and Analyze (#324) * fix 3 * formatting in Explore and Analyze * Update explore-analyze/visualize/maps/maps-connect-to-ems.md --- .../alerts/kibana/alerting-setup.md | 2 +- .../dashboards/_import_dashboards.md | 4 +-- explore-analyze/dashboards/add-controls.md | 4 +-- ...ashboard-of-panels-with-web-server-data.md | 2 +- .../dashboards/create-dashboard.md | 4 +-- explore-analyze/dashboards/open-dashboard.md | 2 +- explore-analyze/dashboards/sharing.md | 2 +- .../discover/discover-get-started.md | 12 +++---- explore-analyze/discover/document-explorer.md | 2 +- explore-analyze/discover/try-esql.md | 2 +- ...bservability-aiops-detect-change-points.md | 2 +- .../anomaly-detection/ml-getting-started.md | 2 +- .../ml-dfa-custom-urls.md | 2 +- .../machine-learning/nlp/ml-nlp-e5.md | 2 +- .../machine-learning/nlp/ml-nlp-elser.md | 2 +- .../machine-learning/nlp/ml-nlp-inference.md | 2 +- .../sql-client-apps-tableau-server.md | 2 +- .../query-filter/languages/sql-data-types.md | 2 +- .../automating-report-generation.md | 2 +- .../reporting-troubleshooting-csv.md | 2 +- explore-analyze/visualize.md | 2 +- explore-analyze/visualize/esorql.md | 2 +- explore-analyze/visualize/field-statistics.md | 2 +- explore-analyze/visualize/lens.md | 8 ++--- explore-analyze/visualize/link-panels.md | 8 ++--- .../visualize/maps/maps-getting-started.md | 32 +++++++++---------- .../maps/maps-vector-style-properties.md | 4 +-- 27 files changed, 57 insertions(+), 57 deletions(-) diff --git a/explore-analyze/alerts/kibana/alerting-setup.md b/explore-analyze/alerts/kibana/alerting-setup.md index a248a9f03..361f8a063 100644 --- a/explore-analyze/alerts/kibana/alerting-setup.md +++ b/explore-analyze/alerts/kibana/alerting-setup.md @@ -104,7 +104,7 @@ When you create a rule in {{kib}}, an API key is created that captures a snapsho When you disable a rule, it retains the associated API 
key which is reused when the rule is enabled. If the API key is missing when you enable the rule, a new key is generated that has your current security privileges. When you import a rule, you must enable it before you can use it and a new API key is generated at that time. -You can generate a new API key at any time in **{{stack-manage-app}} > {{rules-ui}}*** or in the rule details page by selecting ***Update API key** in the actions menu. +You can generate a new API key at any time in **{{stack-manage-app}} > {{rules-ui}}** or in the rule details page by selecting **Update API key** in the actions menu. If you manage your rules by using {{kib}} APIs, they support support both key- and token-based authentication as described in [Authentication](https://www.elastic.co/guide/en/kibana/current/api.html#api-authentication). To use key-based authentication, create API keys and use them in the header of your API calls as described in [API Keys](../../../deploy-manage/api-keys/elasticsearch-api-keys.md). To use token-based authentication, provide a username and password; an API key that matches the current privileges of the user is created automatically. In both cases, the API key is subsequently associated with the rule and used when it runs. diff --git a/explore-analyze/dashboards/_import_dashboards.md b/explore-analyze/dashboards/_import_dashboards.md index 124abf943..f0c3b0cd1 100644 --- a/explore-analyze/dashboards/_import_dashboards.md +++ b/explore-analyze/dashboards/_import_dashboards.md @@ -5,11 +5,11 @@ mapped_pages: # Import dashboards [_import_dashboards] -You can import dashboards from the **Saved Objects*** page under ***Stack Management**. Refer to [Manage saved objects](../find-and-organize/saved-objects.md). +You can import dashboards from the **Saved Objects** page under **Stack Management**. Refer to [Manage saved objects](../find-and-organize/saved-objects.md). When importing dashboards, you also import their related objects, such as data views and visualizations. Import options allow you to define how the import should behave with these related objects. -* **Check for existing objects***: When selected, objects are not imported when another object with the same ID already exists in this space or cluster. For example, if you import a dashboard that uses a data view which already exists, the data view is not imported and the dashboard uses the existing data view instead. You can also chose to select manually which of the imported or the existing objects are kept by selecting ***Request action on conflict**. +* **Check for existing objects**: When selected, objects are not imported when another object with the same ID already exists in this space or cluster. For example, if you import a dashboard that uses a data view which already exists, the data view is not imported and the dashboard uses the existing data view instead. You can also chose to select manually which of the imported or the existing objects are kept by selecting **Request action on conflict**. * **Create new objects with random IDs**: All related objects are imported and are assigned a new ID to avoid conflicts. ![Import panel](../../images/kibana-dashboard-import-saved-object.png "") diff --git a/explore-analyze/dashboards/add-controls.md b/explore-analyze/dashboards/add-controls.md index 0e8bf76e9..77177bcd9 100644 --- a/explore-analyze/dashboards/add-controls.md +++ b/explore-analyze/dashboards/add-controls.md @@ -52,7 +52,7 @@ To add interactive Options list and Range slider controls, create the controls, 3. 
On the **Create control** flyout, from the **Data view** dropdown, select the data view that contains the field you want to use for the **Control**. 4. In the **Field** list, select the field you want to filter on. -5. Under **Control type**, select whether the control should be an **Options list*** or a ***Range slider**. +5. Under **Control type**, select whether the control should be an **Options list** or a **Range slider**. ::::{tip} Range sliders are for Number type fields only. @@ -127,7 +127,7 @@ Several settings that apply to all controls of the same dashboard are available. * **Validate user selections** — When selected, any selected option that results in no data is ignored. * **Chain controls** — When selected, controls are applied sequentially from left to right, and line by line. Any selected options in one control narrows the available options in the next control. - * **Apply selections automatically*** — The dashboard is updated dynamically when options are selected in controls. When this option is disabled, users first need to ***Apply** their control selection before they are applied to the dashboard. + * **Apply selections automatically** — The dashboard is updated dynamically when options are selected in controls. When this option is disabled, users first need to **Apply** their control selection before they are applied to the dashboard. * To remove all controls from the dashboard, click **Delete all**. diff --git a/explore-analyze/dashboards/create-dashboard-of-panels-with-web-server-data.md b/explore-analyze/dashboards/create-dashboard-of-panels-with-web-server-data.md index 0c41d8a46..a3723dca9 100644 --- a/explore-analyze/dashboards/create-dashboard-of-panels-with-web-server-data.md +++ b/explore-analyze/dashboards/create-dashboard-of-panels-with-web-server-data.md @@ -356,7 +356,7 @@ Now that you have a complete overview of your web server data, save the dashboar 1. In the toolbar, click **Save**. 2. On the **Save dashboard** window, enter `Logs dashboard` in the **Title** field. 3. Select **Store time with dashboard**. -4. Click **Save**. You will be identified as the **creator*** of the dashboard. If you or another user edit the dashboard, you can also view the ***last editor** when checking the dashboard information. +4. Click **Save**. You will be identified as the **creator** of the dashboard. If you or another user edit the dashboard, you can also view the **last editor** when checking the dashboard information. :::{image} ../../images/kibana-dashboard-creator-editor.png :alt: Information panel of a dashboard showing its creator and last editor diff --git a/explore-analyze/dashboards/create-dashboard.md b/explore-analyze/dashboards/create-dashboard.md index 7870c8cd2..1be1b20c9 100644 --- a/explore-analyze/dashboards/create-dashboard.md +++ b/explore-analyze/dashboards/create-dashboard.md @@ -12,9 +12,9 @@ mapped_pages: 3. Add content to the dashboard. You have several options covered in more detail in the [Visualizations section](../visualize.md#panels-editors): - * [**Create visualization***](../visualize/lens.md). This option is a shortcut to create a chart using ***Lens**, the default visualization editor in {{kib}}. + * [**Create visualization**](../visualize/lens.md). This option is a shortcut to create a chart using **Lens**, the default visualization editor in {{kib}}. * [**Add panel**](../visualize.md#panels-editors). Choose one of the available panels to add and configure content to your dashboard. - * **Add from library***. 
Select existing content that has already been configured and saved to the ***Visualize Library**. + * **Add from library**. Select existing content that has already been configured and saved to the **Visualize Library**. * [**Controls**](add-controls.md). Add controls to help filter the content of your dashboard. :::{image} images/add_content_to_dashboard_8.15.0.png diff --git a/explore-analyze/dashboards/open-dashboard.md b/explore-analyze/dashboards/open-dashboard.md index cf082ad5c..2973b5d32 100644 --- a/explore-analyze/dashboards/open-dashboard.md +++ b/explore-analyze/dashboards/open-dashboard.md @@ -13,7 +13,7 @@ mapped_pages: :::: 3. Click the dashboard **Title** you want to open. -4. Make sure that you are in **Edit*** mode to be able to make changes to the dashboard. You can switch between ***Edit*** and ***View** modes from the toolbar. +4. Make sure that you are in **Edit** mode to be able to make changes to the dashboard. You can switch between **Edit** and **View** modes from the toolbar. :::{image} https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt619b284e92c2be27/6750f3a512a5eae780936fe3/switch-to-view-mode-8.17.0.gif :alt: Switch between Edit and View modes diff --git a/explore-analyze/dashboards/sharing.md b/explore-analyze/dashboards/sharing.md index f8e30579c..7ce6059b9 100644 --- a/explore-analyze/dashboards/sharing.md +++ b/explore-analyze/dashboards/sharing.md @@ -19,7 +19,7 @@ When sharing a dashboard with a link while a panel is in maximized view, the gen ## Export dashboards [export-dashboards] -You can export dashboards from **Stack Management*** > ***Saved Objects**. To configure and start the export: +You can export dashboards from **Stack Management** > **Saved Objects**. To configure and start the export: 1. Select the dashboard that you want, then click **Export**. 2. Enable **Include related objects** if you want the objects associated with the selected dashboard, such as data views and visualizations, to also be exported. This option is enabled by default and recommended if you plan to import that dashboard again in a different space or cluster. diff --git a/explore-analyze/discover/discover-get-started.md b/explore-analyze/discover/discover-get-started.md index d758354ab..095ce25d0 100644 --- a/explore-analyze/discover/discover-get-started.md +++ b/explore-analyze/discover/discover-get-started.md @@ -7,7 +7,7 @@ mapped_pages: Learn how to use **Discover** to: -* **Select*** and ***filter** your {{es}} data. +* **Select** and **filter** your {{es}} data. * **Explore** the fields and content of your data in depth. * **Present** your findings in a visualization. @@ -132,7 +132,7 @@ In the following example, we’re adding 2 fields: A simple "Hello world" field, ### Visualize aggregated fields [_visualize_aggregated_fields] -If a field can be [aggregated](../aggregations.md), you can quickly visualize it in detail by opening it in **Lens*** from ***Discover***. ***Lens** is the default visualization editor in {{kib}}. +If a field can be [aggregated](../aggregations.md), you can quickly visualize it in detail by opening it in **Lens** from **Discover**. **Lens** is the default visualization editor in {{kib}}. 1. In the list of fields, find an aggregatable field. For example, with the sample data, you can look for `day_of_week`. @@ -158,7 +158,7 @@ For geo point fields (![Geo point field icon](../../images/kibana-geoip-icon.png You can use **Discover** to compare and diff the field values of multiple results or documents in the table. 1.
Select the results you want to compare from the Documents or Results tab in Discover. -2. From the **Selected*** menu in the table toolbar, choose ***Compare selected**. The comparison view opens and shows the selected results next to each other. +2. From the **Selected** menu in the table toolbar, choose **Compare selected**. The comparison view opens and shows the selected results next to each other. 3. Compare the values of each field. By default the first result selected shows as the reference for displaying differences in the other results. When the value remains the same for a given field, it’s displayed in green. When the value differs, it’s displayed in red. ::::{tip} @@ -177,7 +177,7 @@ You can use **Discover** to compare and diff the field values of multiple result You can quickly copy the content currently displayed in the table for one or several results to your clipboard. 1. Select the results you want to copy. -2. Open the **Selected*** menu in the table toolbar, and select ***Copy selection as text*** or ***Copy documents as JSON**. +2. Open the **Selected** menu in the table toolbar, and select **Copy selection as text** or **Copy documents as JSON**. The content is copied to your clipboard in the selected format. Fields that are not currently added to the table are ignored. @@ -198,7 +198,7 @@ Dive into an individual document to view its fields and the documents that occur * You can pin some fields by clicking the left column to keep them displayed even if you filter the table. ::::{tip} - You can restrict the fields listed in the detailed view to just the fields that you explicitly added to the **Discover*** table, using the ***Selected only** toggle. In ES|QL mode, you also have an option to hide fields with null values. + You can restrict the fields listed in the detailed view to just the fields that you explicitly added to the **Discover** table, using the **Selected only** toggle. In ES|QL mode, you also have an option to hide fields with null values. :::: 3. To navigate to a view of the document that you can bookmark and share, select ** View single document**. @@ -248,7 +248,7 @@ You can use **Discover** with the Elasticsearch Query Language, ES|QL. When usin You can switch to the ES|QL mode of Discover from the application menu bar. -Note that in ES|QL mode, the **Documents*** tab is named ***Results**. +Note that in ES|QL mode, the **Documents** tab is named **Results**. Learn more about how to use ES|QL queries in [Using ES|QL](try-esql.md). diff --git a/explore-analyze/discover/document-explorer.md b/explore-analyze/discover/document-explorer.md index c2067ff8c..5d637b21e 100644 --- a/explore-analyze/discover/document-explorer.md +++ b/explore-analyze/discover/document-explorer.md @@ -50,7 +50,7 @@ You can define different settings for the header row and body rows. ### Limit the sample size [document-explorer-sample-size] -When the number of results returned by your search query (displayed at the top of the **Documents*** or ***Results*** tab) is greater than the value of [`discover:sampleSize`](https://www.elastic.co/guide/en/kibana/current/advanced-options.html#kibana-discover-settings), the number of results displayed in the table is limited to the configured value by default. You can adjust the initial sample size for searches to any number between 10 and `discover:sampleSize` from the ***Display options** located in the table toolbar. 
+When the number of results returned by your search query (displayed at the top of the **Documents** or **Results** tab) is greater than the value of [`discover:sampleSize`](https://www.elastic.co/guide/en/kibana/current/advanced-options.html#kibana-discover-settings), the number of results displayed in the table is limited to the configured value by default. You can adjust the initial sample size for searches to any number between 10 and `discover:sampleSize` from the **Display options** located in the table toolbar. On the last page of the table, a message indicates that you’ve reached the end of the loaded search results. From that message, you can choose to load more results to continue exploring. diff --git a/explore-analyze/discover/try-esql.md b/explore-analyze/discover/try-esql.md index 91b7a75a0..0fe49d71f 100644 --- a/explore-analyze/discover/try-esql.md +++ b/explore-analyze/discover/try-esql.md @@ -18,7 +18,7 @@ For the complete {{esql}} documentation, including tutorials, examples and the f ## Prerequisite [prerequisite] -To view the {{esql}} option in **Discover***, the `enableESQL` setting must be enabled from Kibana’s ***Advanced Settings**. It is enabled by default. +To view the {{esql}} option in **Discover**, the `enableESQL` setting must be enabled from Kibana’s **Advanced Settings**. It is enabled by default. ## Use {{esql}} [tutorial-try-esql] diff --git a/explore-analyze/machine-learning/aiops-labs/observability-aiops-detect-change-points.md b/explore-analyze/machine-learning/aiops-labs/observability-aiops-detect-change-points.md index 82bd3c9db..427cbddae 100644 --- a/explore-analyze/machine-learning/aiops-labs/observability-aiops-detect-change-points.md +++ b/explore-analyze/machine-learning/aiops-labs/observability-aiops-detect-change-points.md @@ -18,7 +18,7 @@ To detect change points: 1. In your {{obs-serverless}} project, go to **Machine learning** → **Change point detection**. 2. Choose a data view or saved search to access the data you want to analyze. -3. Select a function: **avg**, **max***, ***min**, or **sum**. +3. Select a function: **avg**, **max**, **min**, or **sum**. 4. In the time filter, specify a time range over which you want to detect change points. 5. From the **Metric field** list, select a field you want to check for change points. 6. (Optional) From the **Split field** list, select a field to split the data by. If the cardinality of the split field exceeds 10,000, only the first 10,000 values, sorted by document count, are analyzed. Use this option when you want to investigate the change point across multiple instances, pods, clusters, and so on. For example, you may want to view CPU utilization split across multiple instances without having to jump across multiple dashboards and visualizations. diff --git a/explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md b/explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md index 19a807f9d..65a1036b5 100644 --- a/explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md +++ b/explore-analyze/machine-learning/anomaly-detection/ml-getting-started.md @@ -266,7 +266,7 @@ In addition to detecting anomalous behavior in your data, you can use the {{ml-f To create a forecast in {{kib}}: -1. View your job results (for example, for the `low_request_rate` job) in the **Single Metric Viewer**. To find that view, click the **View series*** button in the ***Actions** column on the **Anomaly Detection** page. +1. 
View your job results (for example, for the `low_request_rate` job) in the **Single Metric Viewer**. To find that view, click the **View series** button in the **Actions** column on the **Anomaly Detection** page. 2. Click **Forecast**. :::{image} ../../../images/machine-learning-ml-gs-forecast.png diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-custom-urls.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-custom-urls.md index 6db53d6c8..7c4fa2778 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-custom-urls.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-custom-urls.md @@ -5,7 +5,7 @@ mapped_pages: # Adding custom URLs to data frame analytics jobs [ml-dfa-custom-urls] -You can optionally attach one or more custom URLs to your {{dfanalytics-jobs}}. These links can direct you to dashboards, the **Discover** app, or external websites. For example, you can define a custom URL that provides a way for users to drill down to the source data from a {{regression}} job. You can create a custom URL during job creation under **Additional settings** in the **Job details*** step. Alternatively, you can edit or add new custom URLs in the job list by clicking ***Edit** in the **Actions** menu. +You can optionally attach one or more custom URLs to your {{dfanalytics-jobs}}. These links can direct you to dashboards, the **Discover** app, or external websites. For example, you can define a custom URL that provides a way for users to drill down to the source data from a {{regression}} job. You can create a custom URL during job creation under **Additional settings** in the **Job details** step. Alternatively, you can edit or add new custom URLs in the job list by clicking **Edit** in the **Actions** menu. :::{image} ../../../images/machine-learning-ml-dfa-custom-url.png :alt: Creating a custom URL during job creation diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-e5.md b/explore-analyze/machine-learning/nlp/ml-nlp-e5.md index cd4fa6f1d..fe269a514 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-e5.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-e5.md @@ -54,7 +54,7 @@ After you created the E5 {{infer}} endpoint, it’s ready to be used for semanti ### Alternative methods to download and deploy E5 [alternative-download-deploy-e5] -You can also download and deploy the E5 model either from **{{ml-app}}** > **Trained Models***, from ***Search** > **Indices**, or by using the trained models API in Dev Console. +You can also download and deploy the E5 model either from **{{ml-app}}** > **Trained Models**, from **Search** > **Indices**, or by using the trained models API in Dev Console. ::::{note} For most cases, the preferred version is the **Intel and Linux optimized** model, it is recommended to download and deploy that version. diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md index b1ecf424e..750fc1c2f 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md @@ -89,7 +89,7 @@ After you created the ELSER {{infer}} endpoint, it’s ready to be used for sema ### Alternative methods to download and deploy ELSER [alternative-download-deploy] -You can also download and deploy ELSER either from **{{ml-app}}** > **Trained Models***, from ***Search** > **Indices**, or by using the trained models API in Dev Console. 
+You can also download and deploy ELSER either from **{{ml-app}}** > **Trained Models**, from **Search** > **Indices**, or by using the trained models API in Dev Console. ::::{note} * For most cases, the preferred version is the **Intel and Linux optimized** model, it is recommended to download and deploy that version. diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-inference.md b/explore-analyze/machine-learning/nlp/ml-nlp-inference.md index d9ea5022e..b66dbf4b4 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-inference.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-inference.md @@ -15,7 +15,7 @@ After you [deploy a trained model in your cluster](ml-nlp-deploy-models.md), you ## Add an {{infer}} processor to an ingest pipeline [ml-nlp-inference-processor] -In {{kib}}, you can create and edit pipelines in **{{stack-manage-app}}** > **Ingest Pipelines***. To open ***Ingest Pipelines**, find **{{stack-manage-app}}** in the main menu, or use the [global search field](../../overview/kibana-quickstart.md#_finding_your_apps_and_objects). +In {{kib}}, you can create and edit pipelines in **{{stack-manage-app}}** > **Ingest Pipelines**. To open **Ingest Pipelines**, find **{{stack-manage-app}}** in the main menu, or use the [global search field](../../overview/kibana-quickstart.md#_finding_your_apps_and_objects). :::{image} ../../../images/machine-learning-ml-nlp-pipeline-lang.png :alt: Creating a pipeline in the Stack Management app diff --git a/explore-analyze/query-filter/languages/sql-client-apps-tableau-server.md b/explore-analyze/query-filter/languages/sql-client-apps-tableau-server.md index 63a64e767..840c70f1d 100644 --- a/explore-analyze/query-filter/languages/sql-client-apps-tableau-server.md +++ b/explore-analyze/query-filter/languages/sql-client-apps-tableau-server.md @@ -30,7 +30,7 @@ Move the {{es}} Connector for Tableau to the Tableau Server connectors directory Restart Tableau Server. -To load data into a workbook, add a **New Data Source** from the **Data*** menu or using the icon. In the ***Connectors*** tab of the ***Connect to Data** modal, select **Elasticsearch by Elastic**. +To load data into a workbook, add a **New Data Source** from the **Data** menu or using the icon. In the **Connectors** tab of the **Connect to Data** modal, select **Elasticsearch by Elastic**. $$$apps_tableau_server_from_connector$$$ ![Select Elasticsearch as the data source](../../../images/elasticsearch-reference-apps_tableau_server_from_connector.png "") diff --git a/explore-analyze/query-filter/languages/sql-data-types.md b/explore-analyze/query-filter/languages/sql-data-types.md index d4dea95e2..fe142b785 100644 --- a/explore-analyze/query-filter/languages/sql-data-types.md +++ b/explore-analyze/query-filter/languages/sql-data-types.md @@ -33,7 +33,7 @@ mapped_pages: | *types not mentioned above* | `unsupported` | OTHER | 0 | ::::{note} -Most of {{es}} [data types](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html) are available in Elasticsearch SQL, as indicated above. As one can see, all of {{es}} [data types](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html) are mapped to the data type with the same name in Elasticsearch SQL, with the exception of **date** data type which is mapped to **datetime*** in Elasticsearch SQL. 
This is to avoid confusion with the ANSI SQL types ***DATE** (date only) and **TIME** (time only), which are also supported by Elasticsearch SQL in queries (with the use of [`CAST`](sql-functions-type-conversion.md#sql-functions-type-conversion-cast)/[`CONVERT`](sql-functions-type-conversion.md#sql-functions-type-conversion-convert)), but don’t correspond to an actual mapping in {{es}} (see the [`table`](#es-sql-only-types) below). +Most of {{es}} [data types](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html) are available in Elasticsearch SQL, as indicated above. As one can see, all of {{es}} [data types](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html) are mapped to the data type with the same name in Elasticsearch SQL, with the exception of the **date** data type, which is mapped to **datetime** in Elasticsearch SQL. This is to avoid confusion with the ANSI SQL types **DATE** (date only) and **TIME** (time only), which are also supported by Elasticsearch SQL in queries (with the use of [`CAST`](sql-functions-type-conversion.md#sql-functions-type-conversion-cast)/[`CONVERT`](sql-functions-type-conversion.md#sql-functions-type-conversion-convert)), but don’t correspond to an actual mapping in {{es}} (see the [`table`](#es-sql-only-types) below). :::: diff --git a/explore-analyze/report-and-share/automating-report-generation.md b/explore-analyze/report-and-share/automating-report-generation.md index 18a6749ad..2cb21e89a 100644 --- a/explore-analyze/report-and-share/automating-report-generation.md +++ b/explore-analyze/report-and-share/automating-report-generation.md @@ -140,7 +140,7 @@ The response payload of a request to generate a report includes the path to down * **`400` (Bad Request)**: When sending requests to the POST URL, if you don’t use `POST` as the HTTP method, or if your request is missing the `kbn-xsrf` header, Kibana will return a code `400` status response for the request. * **`503` (Service Unavailable)**: When using the `path` to request the download, you will get a `503` status response if report generation hasn’t completed yet. The response will include a `Retry-After` header. You can set the script to wait the number of seconds in the `Retry-After` header, and then repeat if needed, until the report is complete. -* **`500` (Internal Server Error)***: When using the `path` to request the download, you will get a `500` status response if the report isn’t available due to an error when generating the report. More information is available at ***Management > Kibana > Reporting**. +* **`500` (Internal Server Error)**: When using the `path` to request the download, you will get a `500` status response if the report isn’t available due to an error when generating the report. More information is available at **Management > Kibana > Reporting**. ## Deprecated report URLs [deprecated-report-urls] diff --git a/explore-analyze/report-and-share/reporting-troubleshooting-csv.md b/explore-analyze/report-and-share/reporting-troubleshooting-csv.md index f821d34bf..b8a289966 100644 --- a/explore-analyze/report-and-share/reporting-troubleshooting-csv.md +++ b/explore-analyze/report-and-share/reporting-troubleshooting-csv.md @@ -72,7 +72,7 @@ The listing of reports in **Stack Management > Reporting** allows you to inspect 1. Go to **Stack Management > Reporting** and click the info icon next to a report. 2. In the footer of the report flyout, click **Actions**. -3. Click **Inspect query in Console*** in the ***Actions** menu.
+3. Click **Inspect query in Console** in the **Actions** menu. 4. This will open the **Console** application, pre-filled with the queries used to generate the CSV export. :::{image} https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt4758e67aaec715d9/67897d0be92e090a6dc626a8/inspect-query-from-csv-export.gif diff --git a/explore-analyze/visualize.md b/explore-analyze/visualize.md index 17470025e..6775f0c79 100644 --- a/explore-analyze/visualize.md +++ b/explore-analyze/visualize.md @@ -11,7 +11,7 @@ Use one of the editors to create visualizations of your data. Each editor offers $$$panels-editors$$$ -| **Content*** | ***Panel type*** | ***Description** | +| **Content** | **Panel type** | **Description** | | --- | --- | --- | | Visualizations | [Lens](visualize/lens.md) | The default editor for creating powerful [charts](visualize/supported-chart-types.md) in {{kib}} | | [ES|QL](https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-kibana.md) | Create visualizations from ES|QL queries | diff --git a/explore-analyze/visualize/esorql.md b/explore-analyze/visualize/esorql.md index e9240f7f9..ddf5945ba 100644 --- a/explore-analyze/visualize/esorql.md +++ b/explore-analyze/visualize/esorql.md @@ -18,7 +18,7 @@ You can then **Save** and add it to an existing or a new dashboard using the sav ## Create from dashboard [_create_from_dashboard] 1. From your dashboard, select **Add panel**. -2. Choose **ES|QL*** under ***Visualizations***. An ES|QL editor appears and lets you configure your query and its associated visualization. The ***Suggestions** panel can help you find alternative ways to configure the visualization. +2. Choose **ES|QL** under **Visualizations**. An ES|QL editor appears and lets you configure your query and its associated visualization. The **Suggestions** panel can help you find alternative ways to configure the visualization. ::::{tip} Check the [ES|QL reference](https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-language.html) to get familiar with the syntax and optimize your query. diff --git a/explore-analyze/visualize/field-statistics.md b/explore-analyze/visualize/field-statistics.md index ab92faa76..2513d13fd 100644 --- a/explore-analyze/visualize/field-statistics.md +++ b/explore-analyze/visualize/field-statistics.md @@ -8,7 +8,7 @@ mapped_pages: **Field statistics** panels allow you to display a table with additional field information in your dashboards, such as document count, values, and distribution. 1. From your dashboard, select **Add panel**. -2. Choose **Field statistics*** under ***Visualizations**. An ES|QL editor appears and lets you configure your query with the fields and information that you want to show. +2. Choose **Field statistics** under **Visualizations**. An ES|QL editor appears and lets you configure your query with the fields and information that you want to show. ::::{tip} Check the [ES|QL reference](https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-language.html) to get familiar with the syntax and optimize your query. diff --git a/explore-analyze/visualize/lens.md b/explore-analyze/visualize/lens.md index f1aaf2f74..c5003774b 100644 --- a/explore-analyze/visualize/lens.md +++ b/explore-analyze/visualize/lens.md @@ -48,7 +48,7 @@ Edit and delete. 6. Click **Apply and close**. ::::{tip} -Use the **Edit visualization*** flyout to make edits without having to leave the dashboard, or click ***Edit in Lens** in the flyout to make edits using the Lens application. 
+Use the **Edit visualization** flyout to make edits without having to leave the dashboard, or click **Edit in Lens** in the flyout to make edits using the Lens application. :::: @@ -362,7 +362,7 @@ The options available can vary based on the type of chart you’re setting up. F **Change the legend’s display** -With the **Visibility***, ***Position***, and ***Width** options, you can adjust the way the legend appears in or next to the visualization. +With the **Visibility**, **Position**, and **Width** options, you can adjust the way the legend appears in or next to the visualization. **Truncate long labels** @@ -370,9 +370,9 @@ With the **Label truncation** option, you can keep your legend minimal in case o **Show additional statistics for time series charts** -To make your legends as informative as possible, you can show some additional **Statistics*** for charts with a timestamp on one of the axes, and add a ***Series header**. +To make your legends as informative as possible, you can show some additional **Statistics** for charts with a timestamp on one of the axes, and add a **Series header**. -**Bar***, ***Line*** and ***Area** charts can show the following values: +**Bar**, **Line** and **Area** charts can show the following values: * **Average**: Average value considering all data points in the chart * **Median**: Median value considering all data points in the chart diff --git a/explore-analyze/visualize/link-panels.md b/explore-analyze/visualize/link-panels.md index 0b5a8b415..fb4c6ac02 100644 --- a/explore-analyze/visualize/link-panels.md +++ b/explore-analyze/visualize/link-panels.md @@ -5,7 +5,7 @@ mapped_pages: # Link panels [dashboard-links] -You can use **Links*** panels to create links to other dashboards or external websites. When creating links to other dashboards, you have the option to carry the time range, query, and filters to apply over to the linked dashboard. Links to external websites follow the [`externalUrl.policy`](https://www.elastic.co/guide/en/kibana/current/url-drilldown-settings-kb.html#external-URL-policy) settings. ***Links** panels support vertical and horizontal layouts and may be saved to the **Library** for use in other dashboards. +You can use **Links** panels to create links to other dashboards or external websites. When creating links to other dashboards, you have the option to carry the time range, query, and filters to apply over to the linked dashboard. Links to external websites follow the [`externalUrl.policy`](https://www.elastic.co/guide/en/kibana/current/url-drilldown-settings-kb.html#external-URL-policy) settings. **Links** panels support vertical and horizontal layouts and may be saved to the **Library** for use in other dashboards. :::{image} ../../images/kibana-dashboard_links_panel.png :alt: A screenshot displaying the new links panel @@ -22,11 +22,11 @@ You can use **Links*** panels to create links to other dashboards or external we To add a links panel to your dashboard: 1. From your dashboard, select **Add panel**. -2. In the **Add panel*** flyout, select ***Links***. The ***Create links panel** flyout appears and lets you add the link you want to display. +2. In the **Add panel** flyout, select **Links**. The **Create links panel** flyout appears and lets you add the link you want to display. 3. Choose between the panel displaying vertically or horizontally on your dashboard and add your link. 4. Specify the following: - * **Go to*** - Select **Dashboard** to link to another dashboard, or ***URL** to link to an external website. 
+ * **Go to** - Select **Dashboard** to link to another dashboard, or **URL** to link to an external website. * **Choose destination** - Use the dropdown to select another dashboard or enter an external URL. * **Text** - Enter text for the link, which displays in the panel. * **Options** - When linking to another dashboard, use the sliders to use the filters and queries from the original dashboard, use the date range from the original dashboard, or open the dashboard in a new tab. When linking to an external URL, use the sliders to open the URL in a new tab, or encode the URL. @@ -40,7 +40,7 @@ To add a links panel to your dashboard: To add a previously saved links panel to another dashboard: 1. From your dashboard, select **Add from library**. -2. In the **Add from library*** flyout, select ***Links*** from the ***Types** dropdown and then select the Links panel you want to add. +2. In the **Add from library** flyout, select **Links** from the **Types** dropdown and then select the Links panel you want to add. 3. Click **Save**. diff --git a/explore-analyze/visualize/maps/maps-getting-started.md b/explore-analyze/visualize/maps/maps-getting-started.md index 40fb19eea..e9248e805 100644 --- a/explore-analyze/visualize/maps/maps-getting-started.md +++ b/explore-analyze/visualize/maps/maps-getting-started.md @@ -41,12 +41,12 @@ When you complete this tutorial, you’ll have a map that looks like this: The first layer you’ll add is a choropleth layer to shade world countries by web log traffic. Darker shades will symbolize countries with more web log traffic, and lighter shades will symbolize countries with less traffic. -1. Click **Add layer***, and then click ***Choropleth**. -2. From the **EMS boundaries*** dropdown menu, select ***World Countries**. +1. Click **Add layer**, and then click **Choropleth**. +2. From the **EMS boundaries** dropdown menu, select **World Countries**. 3. In **Statistics source**, set: - * **Data view*** to ***kibana_sample_data_logs** - * **Join field*** to ***geo.dest** + * **Data view** to **kibana_sample_data_logs** + * **Join field** to **geo.dest** 4. Click **Add and continue**. 5. In **Layer settings**, set: @@ -64,7 +64,7 @@ The first layer you’ll add is a choropleth layer to shade world countries by w * Set **Fill color > As number** to the grey color ramp. * Set **Border color** to white. - * Under **Label***, change ***By value*** to ***Fixed**. + * Under **Label**, change **By value** to **Fixed**. 8. Click **Keep changes**. @@ -86,8 +86,8 @@ To avoid overwhelming the user with too much data at once, you’ll add two laye This layer displays web log documents as points. The layer is only visible when users zoom in. -1. Click **Add layer***, and then click ***Documents**. -2. Set **Data view*** to ***kibana_sample_data_logs**. +1. Click **Add layer**, and then click **Documents**. +2. Set **Data view** to **kibana_sample_data_logs**. 3. Click **Add and continue**. 4. In **Layer settings**, set: @@ -95,9 +95,9 @@ This layer displays web log documents as points. The layer is only visible when * **Visibility** to the range [9, 24] * **Opacity** to 100% -5. Add a tooltip field and select **agent***, ***bytes***, ***clientip***, ***host***, ***machine.os***, ***request***, ***response***, and ***timestamp**. +5. Add a tooltip field and select **agent**, **bytes**, **clientip**, **host**, **machine.os**, **request**, **response**, and **timestamp**. 6. In **Scaling**, enable **Limit results to 10,000.** -7. 
In **Layer style***, set ***Fill color*** to ***#2200FF**. +7. In **Layer style**, set **Fill color** to **#2200FF**. 8. Click **Keep changes**. Your map will look like this from zoom level 9 to 24: @@ -113,8 +113,8 @@ This layer displays web log documents as points. The layer is only visible when You’ll create a layer for [aggregated data](../../aggregations.md) and make it visible only when the map is zoomed out. Darker colors will symbolize grids with more web log traffic, and lighter colors will symbolize grids with less traffic. Larger circles will symbolize grids with more total bytes transferred, and smaller circles will symbolize grids with less bytes transferred. -1. Click **Add layer***, and select ***Clusters**. -2. Set **Data view*** to ***kibana_sample_data_logs**. +1. Click **Add layer**, and select **Clusters**. +2. Set **Data view** to **kibana_sample_data_logs**. 3. Click **Add and continue**. 4. In **Layer settings**, set: @@ -124,11 +124,11 @@ You’ll create a layer for [aggregated data](../../aggregations.md) and make it 5. In **Metrics**: - * Set **Aggregation*** to ***Count**. + * Set **Aggregation** to **Count**. * Click **Add metric**. - * Set **Aggregation*** to ***Sum*** with ***Field*** set to ***bytes**. + * Set **Aggregation** to **Sum** with **Field** set to **bytes**. -6. In **Layer style***, change ***Symbol size**: +6. In **Layer style**, change **Symbol size**: * Set **By value** to **sum bytes**. * Set the min size to 7 and the max size to 25 px. @@ -156,7 +156,7 @@ Now that your map is complete, save it and return to the dashboard. View your geospatial data alongside a heat map and pie chart, and then filter the data. When you apply a filter in one panel, it is applied to all panels on the dashboard. 1. Click **Add from library** to open a list of panels that you can add to the dashboard. -2. Add **[Logs] Unique Destination Heatmap*** and ***[Logs] Bytes distribution** to the dashboard. +2. Add **[Logs] Unique Destination Heatmap** and **[Logs] Bytes distribution** to the dashboard. :::{image} ../../../images/kibana-gs_dashboard_with_map.png :alt: Map in a dashboard with 2 other panels @@ -168,7 +168,7 @@ View your geospatial data alongside a heat map and pie chart, and then filter th 5. Set a filter from the map: 1. Open a tooltip by clicking anywhere in the United States vector. - 2. To show only documents where **geo.src*** is ***US***, click the filter icon ![filter icon](../../../images/kibana-gs-filter-icon.png "")in the row for ***ISO 3066-1 alpha-2**. + 2. To show only documents where **geo.src** is **US**, click the filter icon ![filter icon](../../../images/kibana-gs-filter-icon.png "")in the row for **ISO 3066-1 alpha-2**. :::{image} ../../../images/kibana-gs_tooltip_filter.png :alt: Tooltip on map diff --git a/explore-analyze/visualize/maps/maps-vector-style-properties.md b/explore-analyze/visualize/maps/maps-vector-style-properties.md index 1b29c68f6..ff75ee2c4 100644 --- a/explore-analyze/visualize/maps/maps-vector-style-properties.md +++ b/explore-analyze/visualize/maps/maps-vector-style-properties.md @@ -52,9 +52,9 @@ Available icons Custom Icons -You can also use your own SVG icon to style Point features in your map. In **Layer settings*** open the **icon** dropdown, and click the ***Add custom icon** button. For best results, your SVG icon should be monochrome and have limited details. +You can also use your own SVG icon to style Point features in your map. 
In **Layer settings**, open the **icon** dropdown, and click the **Add custom icon** button. For best results, your SVG icon should be monochrome and have limited details.

-Dynamic styling in **Elastic Maps*** requires rendering SVG icons as PNGs using a [signed distance function](https://en.wikipedia.org/wiki/Signed_distance_function). As a result, sharp corners and intricate details may not render correctly. Modifying the settings under ***Advanced Options*** in the ***Add custom icon** modal may improve rendering.
+Dynamic styling in **Elastic Maps** requires rendering SVG icons as PNGs using a [signed distance function](https://en.wikipedia.org/wiki/Signed_distance_function). As a result, sharp corners and intricate details may not render correctly. Modifying the settings under **Advanced Options** in the **Add custom icon** modal may improve rendering.

Manage your custom icons in [settings](maps-settings.md).


From badb295abb8e46fc7242c37c827617c8bfbccc13 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Wed, 5 Feb 2025 11:46:59 +0100
Subject: [PATCH 08/15] [E&A] Refines outlier detection, classification, and regression pages (#323)

* [E&A] Refines data frame analytics doc set.
---
 docset.yml | 12 +-
 .../machine-learning/data-frame-analytics.md | 17 +-
 .../ml-dfa-classification.md | 133 ++++++----------
 .../ml-dfa-finding-outliers.md | 147 ++++++++----------
 .../data-frame-analytics/ml-dfa-overview.md | 13 +-
 .../data-frame-analytics/ml-dfa-regression.md | 128 ++++++---------
 .../kibana/kibana/xpack-ml-dfanalytics.md | 13 --
 .../stack-docs/machine-learning/index.md | 3 -
 .../machine-learning/ml-dfanalytics.md | 18 ---
 raw-migrated-files/toc.yml | 4 -
 10 files changed, 177 insertions(+), 311 deletions(-)
 delete mode 100644 raw-migrated-files/kibana/kibana/xpack-ml-dfanalytics.md
 delete mode 100644 raw-migrated-files/stack-docs/machine-learning/index.md
 delete mode 100644 raw-migrated-files/stack-docs/machine-learning/ml-dfanalytics.md

diff --git a/docset.yml b/docset.yml
index 05de5870c..235e093ad 100644
--- a/docset.yml
+++ b/docset.yml
@@ -370,10 +370,10 @@ subs:
   dataframe-transforms-cap: "Transforms"
   dfanalytics-cap: "Data frame analytics"
   dfanalytics: "data frame analytics"
-  dataframe-analytics-config: "'{dataframe} analytics config'"
-  dfanalytics-job: "'{dataframe} analytics job'"
-  dfanalytics-jobs: "'{dataframe} analytics jobs'"
-  dfanalytics-jobs-cap: "'{dataframe-cap} analytics jobs'"
+  dataframe-analytics-config: "data frame analytics config"
+  dfanalytics-job: "data frame analytics job"
+  dfanalytics-jobs: "data frame analytics jobs"
+  dfanalytics-jobs-cap: "Data frame analytics jobs"
   cdataframe: "continuous data frame"
   cdataframes: "continuous data frames"
   cdataframe-cap: "Continuous data frame"
@@ -390,8 +390,8 @@ subs:
   olscore: "outlier score"
   olscores: "outlier scores"
   fiscore: "feature influence score"
-  evaluatedf-api: "evaluate {dataframe} analytics API"
-  evaluatedf-api-cap: "Evaluate {dataframe} analytics API"
+  evaluatedf-api: "evaluate data frame analytics API"
+  evaluatedf-api-cap: "Evaluate data frame analytics API"
   binarysc: "binary soft classification"
   binarysc-cap: "Binary soft classification"
   regression: "regression"
diff --git a/explore-analyze/machine-learning/data-frame-analytics.md b/explore-analyze/machine-learning/data-frame-analytics.md
index e0e5ef374..adfa29518 100644
--- a/explore-analyze/machine-learning/data-frame-analytics.md
+++ 
b/explore-analyze/machine-learning/data-frame-analytics.md @@ -4,11 +4,18 @@ mapped_urls: - https://www.elastic.co/guide/en/kibana/current/xpack-ml-dfanalytics.html --- -# Data frame analytics +# Data frame analytics [ml-dfanalytics] -% What needs to be done: Lift-and-shift +::::{important} +Using {{dfanalytics}} requires source data to be structured as a two dimensional "tabular" data structure, in other words a {{dataframe}}. [{{transforms-cap}}](../transforms.md) enable you to create {{dataframes}} which can be used as the source for {{dfanalytics}}. +:::: -% Use migrated content from existing pages that map to this page: +{{dfanalytics-cap}} enable you to perform different analyses of your data and annotate it with the results. Consult [Setup and security](setting-up-machine-learning.md) to learn more about the license and the security privileges that are required to use {{dfanalytics}}. -% - [ ] ./raw-migrated-files/stack-docs/machine-learning/ml-dfanalytics.md -% - [ ] ./raw-migrated-files/kibana/kibana/xpack-ml-dfanalytics.md \ No newline at end of file +* [Overview](data-frame-analytics/ml-dfa-overview.md) +* [*Finding outliers*](data-frame-analytics/ml-dfa-finding-outliers.md) +* [*Predicting numerical values with {{regression}}*](data-frame-analytics/ml-dfa-regression.md) +* [*Predicting classes with {{classification}}*](data-frame-analytics/ml-dfa-classification.md) +* [*Advanced concepts*](data-frame-analytics/ml-dfa-concepts.md) +* [*API quick reference*](data-frame-analytics/ml-dfanalytics-apis.md) +* [*Resources*](data-frame-analytics/ml-dfa-resources.md) diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-classification.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-classification.md index 9f164ca9b..4ae0f76d5 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-classification.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-classification.md @@ -15,22 +15,18 @@ In reality, {{classification}} problems are more complex, such as classifying ma When you create a {{classification}} job, you must specify which field contains the classes that you want to predict. This field is known as the *{{depvar}}*. It can contain maximum 100 classes. By default, all other [supported fields](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-dfanalytics.html#dfa-supported-fields) are included in the analysis and are known as *{{feature-vars}}*. You can optionally include or exclude fields. For more information about field selection, refer to the [explain data frame analytics API](https://www.elastic.co/guide/en/elasticsearch/reference/current/explain-dfanalytics.html). - ## {{classification-cap}} algorithms [dfa-classification-algorithm] {{classanalysis-cap}} uses an ensemble algorithm that is similar to extreme gradient boosting (XGBoost) which combines multiple weak models into a composite one. It uses decision trees to learn to predict the probability that a data point belongs to a certain class. XGBoost trains a sequence of decision trees and every decision tree learns from the mistakes of the forest so far. In each iteration, the trees added to the forest improve the decision quality of the combined decision forest. The classification algorithm optimizes for a loss function called cross-entropy loss. - ## 1. Define the problem [dfa-classification-problem] {{classification-cap}} can be useful in cases where discrete, categorical values needs to be predicted. 
If your use case requires predicting such values, then {{classification}} might be the suitable choice for you. - ## 2. Set up the environment [dfa-classification-environment] Before you can use the {{stack-ml-features}}, there are some configuration requirements (such as security privileges) that must be addressed. Refer to [Setup and security](../setting-up-machine-learning.md). - ## 3. Prepare and transform data [dfa-classification-prepare-data] {{classification-cap}} is a supervised {{ml}} method, which means you need to supply a labeled training data set. This data set must have values for the {{feature-vars}} and the {{depvar}} which are used to train the model. The training process uses this information to learn the relationships between the classes and the {{feature-vars}}. This labeled data set also plays a critical role in model evaluation. @@ -41,7 +37,6 @@ You might also need to [{{transform}}](../../transforms.md) your data to create To learn more about how to prepare your data, refer to [the relevant section](ml-dfa-overview.md#prepare-transform-data) of the supervised learning overview. - ## 4. Create a job [dfa-classification-create-job] {{dfanalytics-jobs-cap}} contain the configuration information and metadata necessary to perform an analytics task. You can create {{dfanalytics-jobs}} via {{kib}} or using the [create {{dfanalytics-jobs}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-dfanalytics.html). @@ -52,10 +47,8 @@ Select {{classification}} as the analytics type, then select the field that you You can view the statistics of the selectable fields in the {{dfanalytics}} wizard. The field statistics displayed in a flyout provide more meaningful context to help you select relevant fields. :::: - To improve performance, consider using a small `training_percent` value to train the model more quickly. It is a good strategy to make progress iteratively: run the analysis with a small training percentage, then evaluate the performance. Based on the results, you can decide if it is necessary to increase the `training_percent` value. - ## 5. Start the job [dfa-classification-start] You can start the job via {{kib}} or using the [start {{dfanalytics-jobs}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-dfanalytics.html) API. A {{classification}} job has the following phases: @@ -75,8 +68,6 @@ After the last phase is finished, the job stops and the results are ready for ev When you create a {{dfanalytics-job}}, the inference step of the process might fail if the model is too large to fit into JVM. For a workaround, refer to [this GitHub issue](https://github.com/elastic/elasticsearch/issues/76093). :::: - - ## 6. Evaluate and interpret the result [ml-dfanalytics-classification-evaluation] Using the {{dfanalytics}} features to gain insights from a data set is an iterative process. After you defined the problem you want to solve, and chose the analytics type that can help you to do so, you need to produce a high-quality data set and create the appropriate {{dfanalytics-job}}. You might need to experiment with different configurations, parameters, and ways to transform data before you arrive at a result that satisfies your use case. A valuable companion to this process is the [{{evaluatedf-api}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/evaluate-dfanalytics.html), which enables you to evaluate the {{dfanalytics}} performance. 
It helps you understand error distributions and identifies the points where the {{dfanalytics}} model performs well or less trustworthily. @@ -90,11 +81,10 @@ You can measure how well the model has performed on your training data set by us The following metrics helps you interpret the analysis results: -* {feat-imp} +* {{feat-imp}} * `class_probability` * `class_score` - ### Multiclass confusion matrix [ml-dfanalytics-mccm] The multiclass confusion matrix provides a summary of the performance of the {{classanalysis}}. It contains the number of occurrences where the analysis classified data points correctly with their actual class as well as the number of occurrences where it misclassified them. @@ -115,7 +105,6 @@ As the number of classes increases, the confusion matrix becomes more complex: This matrix contains the actual labels on the left side while the predicted labels are on the top. The proportion of correct and incorrect predictions is broken down for each class. This enables you to examine how the {{classanalysis}} confused the different classes while it made its predictions. - ### Area under the curve of receiver operating characteristic (AUC ROC) [ml-dfanalytics-class-aucroc] The receiver operating characteristic (ROC) curve is a plot that represents the performance of the {{classification}} process at different predicted probability thresholds. It compares the true positive rate for a specific class against the rate of all the other classes combined ("one versus all" strategy) at the different threshold levels to create the curve. @@ -128,18 +117,14 @@ From this plot, you can compute the area under the curve (AUC) value, which is a To use this evaluation method, you must set `num_top_classes` to `-1` or a value greater than or equal to the total number of classes when you create the {{dfanalytics-job}}. :::: - - ### {{feat-imp-cap}} [dfa-classification-feature-importance] {{feat-imp-cap}} provides further information about the results of an analysis and helps to interpret the results in a more subtle way. If you want to learn more about {{feat-imp}}, refer to [{{feat-imp-cap}}](ml-feature-importance.md). - ### `class_probability` [dfa-classification-class-probability] The `class_probability` is a value between 0 and 1, which indicates how likely it is that a given data point belongs to a certain class. The higher the number, the higher the probability that the data point belongs to the named class. This information is stored in the `top_classes` array for each document in the destination index. - ### `class_score` [dfa-classification-class-score] The `class_score` is a function of the `class_probability` and has a value that is greater than or equal to zero. It takes into consideration your objective (as defined in the `class_assignment_objective` job configuration option): *accuracy* or *recall*. @@ -155,7 +140,6 @@ If your objective is to maximize accuracy, the scores are weighted to maximize t If there is an imbalanced class distribution in your training data, focusing on accuracy can decrease your model’s sensitivity to incorrect predictions in the under-represented classes. :::: - By default, {{classanalysis}} jobs accept a slight degradation of the overall accuracy in return for greater sensitivity to classes that are predicted incorrectly. That is to say, their objective is to maximize the minimum recall. 
For example, in the context of a multi-class confusion matrix, the predictions of interest are in each row: :::{image} ../../../images/machine-learning-confusion-matrix-multiclass-recall.jpg @@ -167,7 +151,6 @@ For each class, the recall is calculated as the number of correct predictions di To learn more about choosing the class assignment objective that fits your goal, refer to this [Jupyter notebook](https://github.com/elastic/examples/blob/master/Machine%20Learning/Class%20Assigment%20Objectives/classification-class-assignment-objective.ipynb). - ## 7. Deploy the model [dfa-classification-deploy] The model that you created is stored as {{es}} documents in internal indices. In other words, the characteristics of your trained model are saved and ready to be deployed and used as functions. @@ -175,24 +158,24 @@ The model that you created is stored as {{es}} documents in internal indices. In 1. To deploy {{dfanalytics}} model in a pipeline, navigate to **Machine Learning** > **Model Management** > **Trained models** in the main menu, or use the [global search field](../../overview/kibana-quickstart.md#_finding_your_apps_and_objects) in {{kib}}. 2. Find the model you want to deploy in the list and click **Deploy model** in the **Actions** menu. - :::{image} ../../../images/machine-learning-ml-dfa-trained-models-ui.png - :alt: The trained models UI in {kib} - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-ml-dfa-trained-models-ui.png +:alt: The trained models UI in {kib} +:class: screenshot +::: 3. Create an {{infer}} pipeline to be able to use the model against new data through the pipeline. Add a name and a description or use the default values. - :::{image} ../../../images/machine-learning-ml-dfa-inference-pipeline.png - :alt: Creating an inference pipeline - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-ml-dfa-inference-pipeline.png +:alt: Creating an inference pipeline +:class: screenshot +::: 4. Configure the pipeline processors or use the default settings. - :::{image} ../../../images/machine-learning-ml-dfa-inference-processor.png - :alt: Configuring an inference processor - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-ml-dfa-inference-processor.png +:alt: Configuring an inference processor +:class: screenshot +::: 5. Configure to handle ingest failures or use the default settings. 6. (Optional) Test your pipeline by running a simulation of the pipeline to confirm it produces the anticipated results. @@ -200,21 +183,18 @@ The model that you created is stored as {{es}} documents in internal indices. In The model is deployed and ready to use through the {{infer}} pipeline. - ### {{infer-cap}} [ml-inference-class] {{infer-cap}} enables you to use [trained {{ml}} models](ml-trained-models.md) against incoming data in a continuous fashion. For instance, suppose you have an online service and you would like to predict whether a customer is likely to churn. You have an index with historical data – information on the customer behavior throughout the years in your business – and a {{classification}} model that is trained on this data. The new information comes into a destination index of a {{ctransform}}. With {{infer}}, you can perform the {{classanalysis}} against the new data with the same input fields that you’ve trained the model on, and get a prediction. 
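As a concrete illustration, here is a minimal sketch of wiring a trained {{classification}} model into an ingest pipeline through an {{infer}} processor. The pipeline name and `model_id` are hypothetical placeholders rather than values produced by this example:

```console
// Hypothetical pipeline and model IDs, shown only to illustrate the request shape
PUT _ingest/pipeline/my-churn-pipeline
{
  "description": "Annotate incoming customer documents with a churn prediction",
  "processors": [
    {
      "inference": {
        "model_id": "my-churn-classification-model",
        "target_field": "ml.inference.churn",
        "inference_config": {
          "classification": {
            "num_top_classes": 2
          }
        }
      }
    }
  ]
}
```

Documents indexed with `?pipeline=my-churn-pipeline` are then annotated with the predicted class and its probability under the configured `target_field`.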
- #### {{infer-cap}} processor [ml-inference-processor-class] {{infer-cap}} can be used as a processor specified in an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md). It uses a trained model to infer against the data that is being ingested in the pipeline. The model is used on the ingest node. {{infer-cap}} pre-processes the data by using the model and provides a prediction. After the process, the pipeline continues executing (if there is any other processor in the pipeline), finally the new data together with the results are indexed into the destination index. Check the [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) and [the {{ml}} {dfanalytics} API documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-df-analytics-apis.html) to learn more. - #### {{infer-cap}} aggregation [ml-inference-aggregation-class] {{infer-cap}} can also be used as a pipeline aggregation. You can reference a trained model in the aggregation to infer on the result field of the parent bucket aggregation. The {{infer}} aggregation uses the model on the results to provide a prediction. This aggregation enables you to run {{classification}} or {{reganalysis}} at search time. If you want to perform the analysis on a small set of data, this aggregation enables you to generate predictions without the need to set up a processor in the ingest pipeline. @@ -225,8 +205,6 @@ Check the [{{infer}} bucket aggregation](https://www.elastic.co/guide/en/elastic If you use trained model aliases to reference your trained model in an {{infer}} processor or {{infer}} aggregation, you can replace your trained model with a new one without the need of updating the processor or the aggregation. Reassign the alias you used to a new trained model ID by using the [Create or update trained model aliases API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-trained-models-aliases.html). The new trained model needs to use the same type of {{dfanalytics}} as the old one. :::: - - ## Performing {{classanalysis}} in the sample flight data set [performing-classification] Let’s try to predict whether a flight will be delayed or not by using the [sample flight data](../../overview/kibana-quickstart.md#gs-get-data-into-kibana). The data set contains information such as weather conditions, carrier, flight distance, origin, destination, and whether or not the flight was delayed. The {{classification}} model learns the relationships between the fields in your data to predict the value of the *dependent variable*, which in this case is the boolean `FlightDelay` field. @@ -235,8 +213,6 @@ Let’s try to predict whether a flight will be delayed or not by using the [sam If you want to view this example in a Jupyter notebook, [click here](https://github.com/elastic/examples/tree/master/Machine%20Learning/Analytics%20Jupyter%20Notebooks). :::: - - ### Preparing your data [flightdata-classification-data] Each document in the sample flight data set contains details for a single flight, so the data is ready for analysis; it is already in a two-dimensional entity-based data structure. In general, you often need to [transform](../../transforms.md) the data into an entity-centric index before you can analyze it. @@ -293,13 +269,10 @@ In order to be analyzed, a document must contain at least one field with a suppo :::: - ::::{tip} The sample flight data set is used in this example because it is easily accessible. 
However, the data has been manually created and contains some inconsistencies. For example, a flight can be both delayed and canceled. This is a good reminder that the quality of your input data affects the quality of your results. :::: - - ### Creating a {{classification}} model [flightdata-classification-model] To predict whether a specific flight is delayed: @@ -308,10 +281,10 @@ To predict whether a specific flight is delayed: You can use the wizard on the **{{ml-app}}** > **Data Frame Analytics** tab in {{kib}} or the [create {{dfanalytics-jobs}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-dfanalytics.html) API. - :::{image} ../../../images/machine-learning-flights-classification-job-1.jpg - :alt: Creating a {{dfanalytics-job}} in {kib} - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-flights-classification-job-1.jpg +:alt: Creating a {{dfanalytics-job}} in {kib} +:class: screenshot +::: 1. Choose `kibana_sample_data_flights` as the source index. 2. Choose `classification` as the job type. @@ -320,10 +293,10 @@ To predict whether a specific flight is delayed: The wizard includes a scatterplot matrix, which enables you to explore the relationships between the numeric fields. The color of each point is affected by the value of the {{depvar}} for that document, as shown in the legend. You can highlight an area in one of the charts and the corresponding area is also highlighted in the rest of the charts. You can use this matrix to help you decide which fields to include or exclude. - :::{image} ../../../images/machine-learning-flights-classification-scatterplot.png - :alt: A scatterplot matrix for three fields in {kib} - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-flights-classification-scatterplot.png +:alt: A scatterplot matrix for three fields in {kib} +:class: screenshot +::: If you want these charts to represent data from a larger sample size or from a randomized selection of documents, you can change the default behavior. However, a larger sample size might slow down the performance of the matrix and a randomized selection might put more load on the cluster due to the more intensive query. @@ -334,9 +307,9 @@ To predict whether a specific flight is delayed: 9. Add the name of the destination index that will contain the results. In {{kib}}, the index name matches the job ID by default. It will contain a copy of the source index data where each document is annotated with the results. If the index does not exist, it will be created automatically. 10. Use default values for all other options. - ::::{dropdown} API example - ```console - PUT _ml/data_frame/analytics/model-flight-delays-classification +::::{dropdown} API example +```console +PUT _ml/data_frame/analytics/model-flight-delays-classification { "source": { "index": [ @@ -363,13 +336,13 @@ To predict whether a specific flight is delayed: ] } } - ``` +``` - 1. The field name in the `dest` index that contains the analysis results. - 2. To disable {{feat-imp}} calculations, omit this option. +1. The field name in the `dest` index that contains the analysis results. +2. To disable {{feat-imp}} calculations, omit this option. - :::: +:::: After you configured your job, the configuration details are automatically validated. If the checks are successful, you can start the job. A warning message is shown if the configuration is invalid. The message contains a suggestion to improve the configuration to be validated. 
@@ -378,30 +351,30 @@ To predict whether a specific flight is delayed: The job takes a few minutes to run. Runtime depends on the local hardware and also on the number of documents and fields that are analyzed. The more fields and documents, the longer the job runs. It stops automatically when the analysis is complete. - ::::{dropdown} API example - ```console - POST _ml/data_frame/analytics/model-flight-delays-classification/_start - ``` +::::{dropdown} API example +```console +POST _ml/data_frame/analytics/model-flight-delays-classification/_start +``` - :::: +:::: 3. Check the job stats to follow the progress in {{kib}} or use the [get {{dfanalytics-jobs}} statistics API](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-dfanalytics-stats.html). - :::{image} ../../../images/machine-learning-flights-classification-details.jpg - :alt: Statistics for a {{dfanalytics-job}} in {kib} - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-flights-classification-details.jpg +:alt: Statistics for a {{dfanalytics-job}} in {kib} +:class: screenshot +::: When the job stops, the results are ready to view and evaluate. To learn more about the job phases, see [How {{dfanalytics-jobs}} work](ml-dfa-phases.md). - ::::{dropdown} API example - ```console - GET _ml/data_frame/analytics/model-flight-delays-classification/_stats - ``` +::::{dropdown} API example +```console +GET _ml/data_frame/analytics/model-flight-delays-classification/_stats +``` - The API call returns the following response: +The API call returns the following response: - ```console-result +```console-result { "count" : 1, "data_frame_analytics" : [ @@ -481,15 +454,13 @@ To predict whether a specific flight is delayed: "loss_type" : "binomial_logistic" } } - } + } } ] } - ``` - - :::: - +``` +:::: ### Viewing {{classification}} results [flightdata-classification-results] @@ -510,7 +481,6 @@ If you want to understand how certain the model is about each prediction, you ca If you have a large number of classes, your destination index contains a large number of predicted probabilities for each document. When you create the {{classification}} job, you can use the `num_top_classes` option to modify this behavior. :::: - ::::{dropdown} API example ```console GET model-flight-delays-classification/_search @@ -543,12 +513,10 @@ The snippet below shows the probability and score details for a document in the 1. An array of values specifying the probability of the prediction and the score for each class. - The class with the highest score is the prediction. In this example, `false` has a `class_score` of 0.35 while `true` has only 0.06, so the prediction will be `false`. For more details about these values, see [`class_score`](#dfa-classification-class-score). :::: - If you chose to calculate {{feat-imp}}, the destination index also contains `ml.feature_importance` objects. Every field that is included in the analysis (known as a *feature* of the data point) is assigned a {{feat-imp}} value. It has both a magnitude and a direction (positive or negative), which indicates how each field affects a particular prediction. Only the most significant values (in this case, the top 10) are stored in the index. However, the trained model metadata also contains the average magnitude of the {{feat-imp}} values for each field across all the training data. 
You can view this summarized information in {{kib}}: :::{image} ../../../images/machine-learning-flights-classification-total-importance.jpg @@ -646,7 +614,6 @@ The snippet below shows an example of the total {{feat-imp}} and the correspondi 3. This value is the minimum {{feat-imp}} value across all the training data for this field when the predicted class is `false`. 4. This value is the maximum {{feat-imp}} value across all the training data for this field when the predicted class is `false`. - To see the top {{feat-imp}} values for each prediction, search the destination index. For example: ```console @@ -698,10 +665,8 @@ The sum of the {{feat-imp}} values for each class in this data point approximate :::: - Lastly, {{kib}} provides a scatterplot matrix in the results. It has the same functionality as the matrix that you saw in the job wizard. Its purpose is to help you visualize and explore the relationships between the numeric fields and the {{depvar}}. - ### Evaluating {{classification}} results [flightdata-classification-evaluate] Though you can look at individual results and compare the predicted value (`ml.FlightDelay_prediction`) to the actual value (`FlightDelay`), you typically need to evaluate the success of your {{classification}} model as a whole. @@ -717,7 +682,6 @@ Though you can look at individual results and compare the predicted value (`ml.F As the sample data may change when it is loaded into {{kib}}, the results of the analysis can vary even if you use the same configuration as the example. Therefore, use this information as a guideline for interpreting your own results. :::: - If you want to see the exact number of occurrences, select a quadrant in the matrix. You can also use the **Training** and **Testing** filter options to refine the contents of the matrix. Thus you can see how well the model performs on previously unseen data. You can check how many documents are `true` in the testing data, how many of them are identified correctly (*true positives*) and how many of them are identified incorrectly as `false` (*false negatives*). Likewise if you select other quadrants in the matrix, it shows the number of documents that have the `false` class as their actual value in the testing data. The matrix shows the number of documents that are correctly identified as `false` (*true negatives*) and the number of documents that are incorrectly predicted as `true` (*false positives*). When you perform {{classanalysis}} on your own data, it might take multiple iterations before you are satisfied with the results and ready to deploy the model. @@ -759,7 +723,6 @@ POST _ml/data_frame/_evaluate 1. We calculate the training error by evaluating only the training data. - Next, we calculate the generalization error that represents how well the model performed on previously unseen data: ```console @@ -787,7 +750,6 @@ POST _ml/data_frame/_evaluate 1. We evaluate only the documents that are not part of the training data. - The returned confusion matrix shows us how many data points were classified correctly (where the `actual_class` matches the `predicted_class`) and how many were misclassified (`actual_class` does not match `predicted_class`): ```console-result @@ -837,13 +799,10 @@ The returned confusion matrix shows us how many data points were classified corr 3. The name of the predicted class. 4. The number of documents that belong to the actual class and are labeled as the predicted class. 
- :::: - If you don’t want to keep the {{dfanalytics-job}}, you can delete it in {{kib}} or by using the [delete {{dfanalytics-job}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/delete-dfanalytics.html). When you delete {{dfanalytics-jobs}} in {{kib}}, you have the option to also remove the destination indices and {{data-sources}}. - ### Further readings [dfa-classification-readings] * [{{classanalysis-cap}} example (Jupyter notebook)](https://github.com/elastic/examples/tree/master/Machine%20Learning/Analytics%20Jupyter%20Notebooks) diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-finding-outliers.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-finding-outliers.md index c3790234a..b237b6c56 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-finding-outliers.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-finding-outliers.md @@ -11,8 +11,6 @@ mapped_pages: {{oldetection-cap}} is a batch analysis, it runs against your data once. If new data comes into the index, you need to do the analysis again on the altered data. :::: - - ## {{oldetection-cap}} algorithms [dfa-outlier-algorithms] In the {{stack}}, we use an ensemble of four different distance and density based {{oldetection}} methods: @@ -26,44 +24,37 @@ You don’t need to select the methods or provide any parameters, but you can ov The four algorithms don’t always agree on which points are outliers. By default, {{oldetection}} jobs use all these methods, then normalize and combine their results and give every data point in the index an {{olscore}}. The {{olscore}} ranges from 0 to 1, where the higher number represents the chance that the data point is an outlier compared to the other data points in the index. - ### Feature influence [dfa-feature-influence] Feature influence – another score calculated while detecting outliers – provides a relative ranking of the different features and their contribution towards a point being an outlier. This score allows you to understand the context or the reasoning on why a certain data point is an outlier. - ## 1. Define the problem [dfa-outlier-detection-problem] {{oldetection-cap}} in the {{stack}} can be used to detect any unusual entity in a given population. For example, to detect malicious software on a machine or unusual user behavior on a network. As {{oldetection}} operates on the assumption that the outliers make up a small proportion of the overall data population, you can use this feature in such cases. {{oldetection-cap}} is a batch analysis that works best on an entity-centric index. If your use case is based on time series data, you might want to use [{{anomaly-detect}}](../anomaly-detection.md) instead. The {{ml-features}} provide unsupervised {{oldetection}}, which means there is no need to provide a training data set. - ## 2. Set up the environment [dfa-outlier-detection-environment] Before you can use the {{stack-ml-features}}, there are some configuration requirements (such as security privileges) that must be addressed. Refer to [Setup and security](../setting-up-machine-learning.md). - ## 3. Prepare and transform data [dfa-outlier-detection-prepare-data] {{oldetection-cap}} requires specifically structured source data: a two dimensional tabular data structure. For this reason, you might need to [{{transform}}](../../transforms.md) your data to create a {{dataframe}} which can be used as the source for {{oldetection}}. 
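For instance, a minimal sketch of previewing such a pivot (the index, entity field, and metric fields below are hypothetical and need to be adapted to your own data):

```console
// Hypothetical index and field names, for illustration only
POST _transform/_preview
{
  "source": { "index": "my-raw-events" },
  "pivot": {
    "group_by": {
      "user": { "terms": { "field": "user.name" } }
    },
    "aggregations": {
      "event_count": { "value_count": { "field": "@timestamp" } },
      "total_bytes": { "sum": { "field": "bytes" } }
    }
  }
}
```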
You can find an example of how to transform your data into an entity-centric index in [this section](#weblogs-outliers). - ## 4. Create a job [dfa-outlier-detection-create-job] -{{dfanalytics-jobs-cap}} contain the configuration information and metadata necessary to perform an analytics task. You can create {{dfanalytics-jobs}} via {{kib}} or using the [create {{dfanalytics-jobs}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-dfanalytics.html). Select {{oldetection}} as the analytics type that the {{dfanalytics-job}} performs. You can also decide to include and exclude fields to/from the analysis when you create the job. +{{dfanalytics-cap}} jobs contain the configuration information and metadata necessary to perform an analytics task. You can create {{dfanalytics}} jobs via {{kib}} or using the [create {{dfanalytics}} jobs API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-dfanalytics.html). Select {{oldetection}} as the analytics type that the {{dfanalytics}} job performs. You can also decide to include and exclude fields to/from the analysis when you create the job. ::::{tip} You can view the statistics of the selectable fields in the {{dfanalytics}} wizard. The field statistics displayed in a flyout provide more meaningful context to help you select relevant fields. :::: - - ## 5. Start the job [dfa-outlier-detection-start] -You can start the job via {{kib}} or using the [start {{dfanalytics-jobs}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-dfanalytics.html) API. An {{oldetection}} job has four phases: +You can start the job via {{kib}} or using the [start {{dfanalytics}} job](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-dfanalytics.html) API. An {{oldetection}} job has four phases: * `reindexing`: documents are copied from the source index to the destination index. * `loading_data`: the job fetches the necessary data from the destination index. @@ -72,14 +63,13 @@ You can start the job via {{kib}} or using the [start {{dfanalytics-jobs}}](http After the last phase is finished, the job stops and the results are ready for evaluation. -{{oldetection-cap}} jobs – unlike other {{dfanalytics-jobs}} – run one time in their life cycle. If you’d like to run the analysis again, you need to create a new job. - +{{oldetection-cap}} jobs – unlike other {{dfanalytics}} jobs – run one time in their life cycle. If you’d like to run the analysis again, you need to create a new job. ## 6. Evaluate the results [ml-outlier-detection-evaluate] -Using the {{dfanalytics}} features to gain insights from a data set is an iterative process. After you defined the problem you want to solve, and chose the analytics type that can help you to do so, you need to produce a high-quality data set and create the appropriate {{dfanalytics-job}}. You might need to experiment with different configurations, parameters, and ways to transform data before you arrive at a result that satisfies your use case. A valuable companion to this process is the [{{evaluatedf-api}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/evaluate-dfanalytics.html), which enables you to evaluate the {{dfanalytics}} performance. It helps you understand error distributions and identifies the points where the {{dfanalytics}} model performs well or less trustworthily. +Using the {{dfanalytics}} features to gain insights from a data set is an iterative process. 
After you defined the problem you want to solve, and chose the analytics type that can help you to do so, you need to produce a high-quality data set and create the appropriate {{dfanalytics}} job. You might need to experiment with different configurations, parameters, and ways to transform data before you arrive at a result that satisfies your use case. A valuable companion to this process is the [evaluate {{dfanalytics}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/evaluate-dfanalytics.html), which enables you to evaluate the {{dfanalytics}} performance. It helps you understand error distributions and identifies the points where the {{dfanalytics}} model performs well or less trustworthily.

-To evaluate the analysis with this API, you need to annotate your index that contains the results of the analysis with a field that marks each document with the ground truth. The {{evaluatedf-api}} evaluates the performance of the {{dfanalytics}} against this manually provided ground truth.
+To evaluate the analysis with this API, you need to annotate your index that contains the results of the analysis with a field that marks each document with the ground truth. The evaluate {{dfanalytics}} API evaluates the performance of the {{dfanalytics}} against this manually provided ground truth.

The {{oldetection}} evaluation type offers the following metrics to evaluate the model performance:

* confusion matrix
* precision
@@ -88,7 +78,6 @@ The {{oldetection}} evaluation type offers the following metrics to evaluate the
* recall
* receiver operating characteristic (ROC) curve.

-
### Confusion matrix [ml-dfanalytics-confusion-matrix]

A confusion matrix provides four measures of how well the {{dfanalytics}} worked on your data set:

* True positives (TP): Class members that the analysis identified as class members.
* True negatives (TN): Not class members that the analysis identified as not class members.
@@ -98,10 +87,9 @@ A confusion matrix provides four measures of how well the {{dfanalytics}} worked
* False positives (FP): Not class members that the analysis misidentified as class members.
* False negatives (FN): Class members that the analysis misidentified as not class members.

-Although, the {{evaluatedf-api}} can compute the confusion matrix out of the analysis results, these results are not binary values (class member/not class member), but a number between 0 and 1 (which called the {{olscore}} in case of {{oldetection}}). This value captures how likely it is for a data point to be a member of a certain class. It means that it is up to the user to decide what is the threshold or cutoff point at which the data point will be considered as a member of the given class. For example, the user can say that all the data points with an {{olscore}} higher than 0.5 will be considered as outliers.
-
-To take this complexity into account, the {{evaluatedf-api}} returns the confusion matrix at different thresholds (by default, 0.25, 0.5, and 0.75).
+Although the evaluate {{dfanalytics}} API can compute the confusion matrix out of the analysis results, these results are not binary values (class member/not class member), but a number between 0 and 1 (which is called the {{olscore}} in the case of {{oldetection}}). This value captures how likely it is for a data point to be a member of a certain class. This means it is up to the user to decide the threshold or cutoff point at which a data point is considered a member of the given class. For example, the user can say that all the data points with an {{olscore}} higher than 0.5 will be considered as outliers. 
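To make the threshold idea concrete, a minimal sketch of such an evaluation call follows; the `my-outlier-results` index and the `is_outlier` ground-truth field are hypothetical, while `ml.outlier_score` is the score field the job writes:

```console
// Hypothetical results index and ground-truth field, for illustration only
POST _ml/data_frame/_evaluate
{
  "index": "my-outlier-results",
  "evaluation": {
    "outlier_detection": {
      "actual_field": "is_outlier",
      "predicted_probability_field": "ml.outlier_score",
      "metrics": {
        "confusion_matrix": { "at": [ 0.25, 0.5, 0.75 ] }
      }
    }
  }
}
```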
+To take this complexity into account, the evaluate {{dfanalytics}} API returns the confusion matrix at different thresholds (by default, 0.25, 0.5, and 0.75). ### Precision and recall [ml-dfanalytics-precision-recall] @@ -113,21 +101,17 @@ Recall shows how many of the data points that are actual class members were iden Precision and recall are computed at different threshold levels. - ### Receiver operating characteristic curve [ml-dfanalytics-roc] The receiver operating characteristic (ROC) curve is a plot that represents the performance of the binary classification process at different thresholds. It compares the rate of true positives against the rate of false positives at the different threshold levels to create the curve. From this plot, you can compute the area under the curve (AUC) value, which is a number between 0 and 1. The closer to 1, the better the algorithm performance. -The {{evaluatedf-api}} can return the false positive rate (`fpr`) and the true positive rate (`tpr`) at the different threshold levels, so you can visualize the algorithm performance by using these values. - +The evaluate {{dfanalytics}} API can return the false positive rate (`fpr`) and the true positive rate (`tpr`) at the different threshold levels, so you can visualize the algorithm performance by using these values. ## Detecting unusual behavior in the logs data set [weblogs-outliers] The goal of {{oldetection}} is to find the most unusual documents in an index. Let’s try to detect unusual behavior in the [data logs sample data set](../../overview/kibana-quickstart.md#gs-get-data-into-kibana). -1. Verify that your environment is set up properly to use {{ml-features}}. If the {{es}} {security-features} are enabled, you need a user that has authority to create and manage {{dfanalytics-jobs}}. See [Setup and security](../setting-up-machine-learning.md). - - Since we’ll be creating {{transforms}}, you also need `manage_data_frame_transforms` cluster privileges. +1. Verify that your environment is set up properly to use {{ml-features}}. If the {{es}} {{security-features}} are enabled, you need a user that has authority to create and manage {{dfanalytics}} jobs. See [Setup and security](../setting-up-machine-learning.md). Since we’ll be creating {{transforms}}, you also need `manage_data_frame_transforms` cluster privileges. 2. Create a {{transform}} that generates an entity-centric index with numeric or boolean data to analyze. @@ -137,16 +121,17 @@ The goal of {{oldetection}} is to find the most unusual documents in an index. L You can preview the {{transform}} before you create it in **{{stack-manage-app}}** > **Transforms**: - :::{image} ../../../images/machine-learning-logs-transform-preview.jpg - :alt: Creating a {{transform}} in {kib} - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-logs-transform-preview.jpg +:alt: Creating a {{transform}} in {kib} +:class: screenshot +::: Alternatively, you can use the [preview {{transform}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/preview-transform.html) and the [create {{transform}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-transform.html). - ::::{dropdown} API example - ```console - POST _transform/_preview +::::{dropdown} API example + +```console +POST _transform/_preview { "source": { "index": [ @@ -229,52 +214,50 @@ The goal of {{oldetection}} is to find the most unusual documents in an index. 
L "index": "weblog-clientip" } } - ``` - - :::: +``` +:::: For more details about creating {{transforms}}, see [Transforming the eCommerce sample data](../../transforms/ecommerce-transforms.md). 3. Start the {{transform}}. - ::::{tip} - Even though resource utilization is automatically adjusted based on the cluster load, a {{transform}} increases search and indexing load on your cluster while it runs. If you’re experiencing an excessive load, however, you can stop it. - :::: - +::::{tip} +Even though resource utilization is automatically adjusted based on the cluster load, a {{transform}} increases search and indexing load on your cluster while it runs. If you’re experiencing an excessive load, however, you can stop it. +:::: You can start, stop, and manage {{transforms}} in {{kib}}. Alternatively, you can use the [start {{transforms}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-data-frame-transform.html) API. - ::::{dropdown} API example - ```console - POST _transform/logs-by-clientip/_start - ``` +::::{dropdown} API example +```console +POST _transform/logs-by-clientip/_start +``` - :::: +:::: 4. Create a {{dfanalytics-job}} to detect outliers in the new entity-centric index. In the wizard on the **Machine Learning** > **Data Frame Analytics** page in {{kib}}, select your new {{data-source}} then use the default values for {{oldetection}}. For example: - :::{image} ../../../images/machine-learning-weblog-outlier-job-1.jpg - :alt: Create a {{dfanalytics-job}} in {kib} - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-weblog-outlier-job-1.jpg +:alt: Create a {{dfanalytics-job}} in {kib} +:class: screenshot +::: The wizard includes a scatterplot matrix, which enables you to explore the relationships between the fields. You can use that information to help you decide which fields to include or exclude from the analysis. - :::{image} ../../../images/machine-learning-weblog-outlier-scatterplot.jpg - :alt: A scatterplot matrix for three fields in {kib} - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-weblog-outlier-scatterplot.jpg +:alt: A scatterplot matrix for three fields in {kib} +:class: screenshot +::: If you want these charts to represent data from a larger sample size or from a randomized selection of documents, you can change the default behavior. However, a larger sample size might slow down the performance of the matrix and a randomized selection might put more load on the cluster due to the more intensive query. - Alternatively, you can use the [create {{dfanalytics-jobs}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-dfanalytics.html). + Alternatively, you can use the [create {{dfanalytics}} jobs API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-dfanalytics.html). - ::::{dropdown} API example - ```console - PUT _ml/data_frame/analytics/weblog-outliers +::::{dropdown} API example +```console +PUT _ml/data_frame/analytics/weblog-outliers { "source": { "index": "weblog-clientip" @@ -290,50 +273,50 @@ The goal of {{oldetection}} is to find the most unusual documents in an index. L "includes" : ["@timestamp.value_count","bytes.max","bytes.sum","request.value_count"] } } - ``` +``` - :::: +:::: After you configured your job, the configuration details are automatically validated. If the checks are successful, you can proceed and start the job. A warning message is shown if the configuration is invalid. 
The message contains a suggestion to improve the configuration to be validated. -5. Start the {{dfanalytics-job}}. +5. Start the {{dfanalytics}} job. - You can start, stop, and manage {{dfanalytics-jobs}} on the **Machine Learning** > **Data Frame Analytics** page. Alternatively, you can use the [start {{dfanalytics-jobs}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-dfanalytics.html) and [stop {{dfanalytics-jobs}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/stop-dfanalytics.html) APIs. + You can start, stop, and manage {{dfanalytics-jobs}} on the **Machine Learning** > **Data Frame Analytics** page. Alternatively, you can use the [start {{dfanalytics}} jobs](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-dfanalytics.html) and [stop {{dfanalytics}} jobs](https://www.elastic.co/guide/en/elasticsearch/reference/current/stop-dfanalytics.html) APIs. - ::::{dropdown} API example - ```console +::::{dropdown} API example +```console POST _ml/data_frame/analytics/weblog-outliers/_start - ``` +``` - :::: +:::: 6. View the results of the {{oldetection}} analysis. - The {{dfanalytics-job}} creates an index that contains the original data and {{olscores}} for each document. The {{olscore}} indicates how different each entity is from other entities. + The {{dfanalytics}} job creates an index that contains the original data and {{olscores}} for each document. The {{olscore}} indicates how different each entity is from other entities. - In {{kib}}, you can view the results from the {{dfanalytics-job}} and sort them on the outlier score: + In {{kib}}, you can view the results from the {{dfanalytics}} job and sort them on the outlier score: - :::{image} ../../../images/machine-learning-outliers.jpg - :alt: View {{oldetection}} results in {kib} - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-outliers.jpg +:alt: View {{oldetection}} results in {kib} +:class: screenshot +::: The `ml.outlier` score is a value between 0 and 1. The larger the value, the more likely they are to be an outlier. In {{kib}}, you can optionally enable histogram charts to get a better understanding of the distribution of values for each column in the result. In addition to an overall outlier score, each document is annotated with feature influence values for each field. These values add up to 1 and indicate which fields are the most important in deciding whether an entity is an outlier or inlier. For example, the dark shading on the `bytes.sum` field for the client IP `111.237.144.54` indicates that the sum of the exchanged bytes was the most influential feature in determining that that client IP is an outlier. - If you want to see the exact feature influence values, you can retrieve them from the index that is associated with your {{dfanalytics-job}}. + If you want to see the exact feature influence values, you can retrieve them from the index that is associated with your {{dfanalytics}} job. - ::::{dropdown} API example - ```console - GET weblog-outliers/_search?q="111.237.144.54" - ``` +::::{dropdown} API example +```console +GET weblog-outliers/_search?q="111.237.144.54" +``` The search results include the following {{oldetection}} scores: - ```js - ... +```js + ... "ml" : { "outlier_score" : 0.9830020666122437, "feature_influence" : [ @@ -355,8 +338,8 @@ The goal of {{oldetection}} is to find the most unusual documents in an index. L } ] } - ... - ``` + ... 
+``` :::: @@ -370,18 +353,14 @@ The goal of {{oldetection}} is to find the most unusual documents in an index. L You can highlight an area in one of the charts and the corresponding area is also highlighted in the rest of the charts. This function makes it easier to focus on specific values and areas in the results. In addition to the sample size and random scoring options, there is a **Dynamic size** option. If you enable this option, the size of each point is affected by its {{olscore}}; that is to say, the largest points have the highest {{olscores}}. The goal of these charts and options is to help you visualize and explore the outliers within your data. - Now that you’ve found unusual behavior in the sample data set, consider how you might apply these steps to other data sets. If you have data that is already marked up with true outliers, you can determine how well the {{oldetection}} algorithms perform by using the evaluate {{dfanalytics}} API. See [6. Evaluate the results](#ml-outlier-detection-evaluate). ::::{tip} -If you do not want to keep the {{transform}} and the {{dfanalytics-job}}, you can delete them in {{kib}} or use the [delete {{transform}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/delete-data-frame-transform.html) and [delete {{dfanalytics-job}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/delete-dfanalytics.html). When you delete {{transforms}} and {{dfanalytics-jobs}} in {{kib}}, you have the option to also remove the destination indices and {{data-sources}}. +If you do not want to keep the {{transform}} and the {{dfanalytics}} job, you can delete them in {{kib}} or use the [delete {{transform}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/delete-data-frame-transform.html) and [delete {{dfanalytics}} job API](https://www.elastic.co/guide/en/elasticsearch/reference/current/delete-dfanalytics.html). When you delete {{transforms}} and {{dfanalytics}} jobs in {{kib}}, you have the option to also remove the destination indices and {{data-sources}}. :::: - - ## Further reading [outlier-detection-reading] * If you want to see another example of {{oldetection}} in a Jupyter notebook, [click here](https://github.com/elastic/examples/tree/master/Machine%20Learning/Outlier%20Detection/Introduction). * [This blog post](https://www.elastic.co/blog/catching-malware-with-elastic-outlier-detection) shows you how to catch malware using {{oldetection}}. * [Benchmarking {{oldetection}} results in Elastic {{ml}}](https://www.elastic.co/blog/benchmarking-outlier-detection-in-elastic-machine-learning) - diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-overview.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-overview.md index 3b09a9e77..a31b6f6fd 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-overview.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-overview.md @@ -4,16 +4,13 @@ mapped_pages: - https://www.elastic.co/guide/en/machine-learning/current/ml-dfa-overview.html --- - - # Overview [ml-dfa-overview] - {{dfanalytics-cap}} enable you to perform different analyses of your data and annotate it with the results. By doing this, it provides additional insights into the data. [{{oldetection-cap}}](ml-dfa-finding-outliers.md) identifies unusual data points in the data set. [{{regression-cap}}](ml-dfa-regression.md) makes predictions on your data after it determines certain relationships among your data points. 
[{{classification-cap}}](ml-dfa-classification.md) predicts the class or category of a given data point in a data set. {{infer-cap}} enables you to use trained {{ml}} models against incoming data in a continuous fashion.
 
-The process leaves the source index intact, it creates a new index that contains a copy of the source data and the annotated data. You can slice and dice the data extended with the results as you normally do with any other data set. Read [How {{dfanalytics-jobs}} work](ml-dfa-phases.md) for more information.
+The process leaves the source index intact; it creates a new index that contains a copy of the source data and the annotated data. You can slice and dice the data extended with the results as you normally do with any other data set. Read [How {{dfanalytics}} jobs work](ml-dfa-phases.md) for more information.
 
-You can evaluate the {{dfanalytics}} performance by using the {{evaluatedf-api}} against a marked up data set. It helps you understand error distributions and identifies the points where the {{dfanalytics}} model performs well or less trustworthily.
+You can evaluate the {{dfanalytics}} performance by using the evaluate {{dfanalytics}} API against a marked-up data set. It helps you understand error distributions and identifies the points where the {{dfanalytics}} model performs well or less trustworthily.
 
 Consult [Introduction to supervised learning](#ml-supervised-workflow) to learn more about how to make predictions with supervised learning.
 
@@ -23,7 +20,6 @@ Consult [Introduction to supervised learning](#ml-supervised-workflow) to learn
 | {{regression}} | supervised |
 | {{classification}} | supervised |
 
-
 ## Introduction to supervised learning [ml-supervised-workflow]
 
 Elastic supervised learning enables you to train a {{ml}} model based on training examples that you provide. You can then use your model to make predictions on new data. This page summarizes the end-to-end workflow for training, evaluating and deploying a model. It gives a high-level overview of the steps required to identify and implement a solution using supervised learning.
 
@@ -36,7 +32,6 @@ The workflow for supervised learning consists of the following stages:
 
 These are iterative stages, meaning that after evaluating each step, you might need to make adjustments before you move further.
 
-
 ### Define the problem [define-problem]
 
 It’s important to take a moment and think about where {{ml}} can be most impactful. Consider what type of data you have available and what value it holds. The better you know the data, the quicker you will be able to create {{ml}} models that generate useful insights. What kinds of patterns do you want to discover in your data? What type of value do you want to predict: a category, or a numerical value? The answers help you choose the type of analysis that fits your use case.
 
@@ -48,7 +43,6 @@ After you identify the problem, consider which of the {{ml-features}} are most l
 * {{regression}}: predicts **continuous, numerical values** like the response time of a web request.
 * {{classification}}: predicts **discrete, categorical values** like whether a [DNS request originates from a malicious or benign domain](https://www.elastic.co/blog/machine-learning-in-cybersecurity-training-supervised-models-to-detect-dga-activity).
 
-
 ### Prepare and transform data [prepare-transform-data]
 
 You have defined the problem and selected an appropriate type of analysis. The next step is to produce a high-quality data set in {{es}} with a clear relationship to your training objectives.
If your data is not already in {{es}}, this is the stage where you develop your data pipeline. If you want to learn more about how to ingest data into {{es}}, refer to the [Ingest node documentation](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md).
@@ -61,7 +55,6 @@ Before you train the model, consider preprocessing the data. In practice, the ty
 {{regression-cap}} and {{classification}} require specifically structured source data: a two dimensional tabular data structure. For this reason, you might need to [{{transform}}](../../transforms.md) your data to create a {{dataframe}} which can be used as the source for these types of {{dfanalytics}}.
 
-
 ### Train, test, iterate [train-test-iterate]
 
 After your data is prepared and transformed into the right format, it is time to train the model. Training is an iterative process — every iteration is followed by an evaluation to see how the model performs.
 
@@ -74,14 +67,12 @@ During the training process, the training data is fed through the learning algor
 Once the model is trained, you can evaluate how well it predicts previously unseen data with the model generalization error. There are further evaluation types for both {{regression}} and {{classification}} analysis which provide metrics about training performance. When you are satisfied with the results, you are ready to deploy the model. Otherwise, you may want to adjust the training configuration or consider alternative ways to preprocess and represent your data.
 
-
 ### Deploy model [deploy-model]
 
 You have trained the model and are satisfied with the performance. The last step is to deploy your trained model and start using it on new data.
 
 The Elastic {{ml}} feature called {{infer}} enables you to make predictions for new data either by using it as a processor in an ingest pipeline, in a continuous {{transform}} or as an aggregation at search time. When new data comes into your ingest pipeline or you run a search on your data with an {{infer}} aggregation, the model is used to infer against the data and make predictions on it.
 
-
 ### Next steps [next-steps]
 
 * Read more about how to [transform your data](../../transforms.md) into an entity-centric index.
diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-regression.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-regression.md
index 1acbe5f28..6dab15d00 100644
--- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-regression.md
+++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-regression.md
@@ -9,24 +9,20 @@ mapped_pages:
 
 When you perform {{reganalysis}}, you must identify a subset of fields that you want to use to create a model for predicting other fields. *Feature variables* are the fields that are used to create the model. The *dependent variable* is the field you want to predict.
 
-
 ## {{regression-cap}} algorithms [dfa-regression-algorithm]
 
 {{regression-cap}} uses an ensemble learning technique that is similar to extreme gradient boosting (XGBoost) which combines decision trees with gradient boosting methodologies. XGBoost trains a sequence of decision trees and every decision tree learns from the mistakes of the forest so far. In each iteration, the trees added to the forest improve the decision quality of the combined decision forest. By default, the regression algorithm optimizes for a [loss function](dfa-regression-lossfunction.md) called mean-squared error loss.
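The loss function is selected in the job configuration itself. The following minimal sketch creates a {{regression}} job that opts into the `msle` loss function instead of the default `mse`; the `house-prices` source index, the destination index, and the `price` {{depvar}} are hypothetical names used only for illustration:

```console
PUT _ml/data_frame/analytics/house-prices-regression
{
  "source": { "index": "house-prices" },
  "dest": { "index": "house-prices-regression" },
  "analysis": {
    "regression": {
      "dependent_variable": "price",
      "loss_function": "msle"
    }
  }
}
```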
There are three types of {{feature-vars}} that you can use with these algorithms: numerical, categorical, or Boolean. Arrays are not supported. - ## 1. Define the problem [dfa-regression-problem] {{regression-cap}} can be useful in cases where a continuous quantity needs to be predicted. The values that {{reganalysis}} can predict are numerical values. If your use case requires predicting continuous, numerical values, then {{regression}} might be the suitable choice for you. - ## 2. Set up the environment [dfa-regression-environment] Before you can use the {{stack-ml-features}}, there are some configuration requirements (such as security privileges) that must be addressed. Refer to [Setup and security](../setting-up-machine-learning.md). - ## 3. Prepare and transform data [dfa-regression-prepare-data] {{regression-cap}} is a supervised {{ml}} method, which means you need to supply a labeled training data set. This data set must have values for the {{feature-vars}} and the {{depvar}} which are used to train the model. This information is used during training to identify relationships among the various characteristics of the data and the predicted value. This labeled data set also plays a critical role in model evaluation. @@ -35,10 +31,9 @@ You might also need to [{{transform}}](../../transforms.md) your data to create To learn more about how to prepare your data, refer to [the relevant section](ml-dfa-overview.md#prepare-transform-data) of the supervised learning overview. - ## 4. Create a job [dfa-regression-create-job] -{{dfanalytics-jobs-cap}} contain the configuration information and metadata necessary to perform an analytics task. You can create {{dfanalytics-jobs}} via {{kib}} or using the [create {{dfanalytics-jobs}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-dfanalytics.html). +{{dfanalytics-cap}} jobs contain the configuration information and metadata necessary to perform an analytics task. You can create {{dfanalytics}} jobs via {{kib}} or using the [create {{dfanalytics}} jobs API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-dfanalytics.html). Select {{regression}} as the analytics type for the job, then select the field that you want to predict (the {{depvar}}). You can also include and exclude fields to/from the analysis. @@ -46,11 +41,9 @@ Select {{regression}} as the analytics type for the job, then select the field t You can view the statistics of the selectable fields in the {{dfanalytics}} wizard. The field statistics displayed in a flyout provide more meaningful context to help you select relevant fields. :::: - - ## 5. Start the job [dfa-regression-start] -You can start the job via {{kib}} or using the [start {{dfanalytics-jobs}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-dfanalytics.html) API. A {{regression}} job has the following phases: +You can start the job via {{kib}} or using the [start {{dfanalytics}} jobs](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-dfanalytics.html) API. A {{regression}} job has the following phases: * `reindexing`: Documents are copied from the source index to the destination index. * `loading_data`: The job fetches the necessary data from the destination index. @@ -67,11 +60,9 @@ After the last phase is finished, the job stops and the results are ready for ev When you create a {{dfanalytics-job}}, the inference step of the process might fail if the model is too large to fit into JVM. 
For a workaround, refer to [this GitHub issue](https://github.com/elastic/elasticsearch/issues/76093).
::::
-
-
 ## 6. Evaluate the result [ml-dfanalytics-regression-evaluation]
 
-Using the {{dfanalytics}} features to gain insights from a data set is an iterative process. After you defined the problem you want to solve, and chose the analytics type that can help you to do so, you need to produce a high-quality data set and create the appropriate {{dfanalytics-job}}. You might need to experiment with different configurations, parameters, and ways to transform data before you arrive at a result that satisfies your use case. A valuable companion to this process is the [{{evaluatedf-api}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/evaluate-dfanalytics.html), which enables you to evaluate the {{dfanalytics}} performance. It helps you understand error distributions and identifies the points where the {{dfanalytics}} model performs well or less trustworthily.
+Using the {{dfanalytics}} features to gain insights from a data set is an iterative process. After you have defined the problem you want to solve and chosen the analytics type that can help you to do so, you need to produce a high-quality data set and create the appropriate {{dfanalytics}} job. You might need to experiment with different configurations, parameters, and ways to transform data before you arrive at a result that satisfies your use case. A valuable companion to this process is the [evaluate {{dfanalytics}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/evaluate-dfanalytics.html), which enables you to evaluate the {{dfanalytics}} performance. It helps you understand error distributions and identifies the points where the {{dfanalytics}} model performs well or less trustworthily.
 
 To evaluate the analysis with this API, you need to annotate your index that contains the results of the analysis with a field that marks each document with the ground truth. The {{evaluatedf-api}} evaluates the performance of the {{dfanalytics}} against this manually provided ground truth.
 
@@ -84,34 +75,28 @@ The {{regression}} evaluation type offers the following metrics to evaluate the
 * Mean squared error (MSE)
 * Mean squared logarithmic error (MSLE)
 * Pseudo-Huber loss
-* R-squared (R2)
-
+* R-squared (R^2^)
 
 ### Mean squared error [ml-dfanalytics-mse]
 
 MSE is the average of the squared differences between the true value and the predicted value: Avg((predicted value - actual value)^2^).
 
-
 ### Mean squared logarithmic error [ml-dfanalytics-msle]
 
 MSLE is a variation of mean squared error. It can be used for cases when the target values are positive and distributed with a long tail such as data on prices or population.
 
 Consult the [Loss functions for {{regression}} analyses](dfa-regression-lossfunction.md) page to learn more about loss functions.
 
-
 ### Pseudo-Huber loss [ml-dfanalytics-huber]
 
 [Pseudo-Huber loss metric](https://en.wikipedia.org/wiki/Huber_loss#Pseudo-Huber_loss_function) behaves as mean absolute error (MAE) for errors larger than a predefined value (defaults to `1`) and as mean squared error (MSE) for errors smaller than the predefined value. This loss function uses the `delta` parameter to define the transition point between MAE and MSE.
 
 Consult the [Loss functions for {{regression}} analyses](dfa-regression-lossfunction.md) page to learn more about loss functions.
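As a sketch of how these metrics are requested, the evaluate {{dfanalytics}} API accepts a `regression` evaluation section that lists the metrics described on this page; the index and field names below are placeholders:

```console
POST _ml/data_frame/_evaluate
{
  "index": "my-regression-dest-index",
  "evaluation": {
    "regression": {
      "actual_field": "price",
      "predicted_field": "ml.price_prediction",
      "metrics": {
        "mse": {},
        "msle": {},
        "huber": { "delta": 1.0 },
        "r_squared": {}
      }
    }
  }
}
```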
-
 ### R-squared [ml-dfanalytics-r-squared]
 
-R-squared (R2) represents the goodness of fit and measures how much of the variation in the data the predictions are able to explain. The value of R2 are less than or equal to 1, where 1 indicates that the predictions and true values are equal. A value of 0 is obtained when all the predictions are set to the mean of the true values. A value of 0.5 for R2 would indicate that the predictions are 1 - 0.5(1/2) (about 30%) closer to true values than their mean.
+R-squared (R^2^) represents the goodness of fit and measures how much of the variation in the data the predictions are able to explain. The value of R^2^ is less than or equal to 1, where 1 indicates that the predictions and true values are equal. A value of 0 is obtained when all the predictions are set to the mean of the true values. A value of 0.5 for R^2^ would indicate that the predictions are 1 - 0.5^(1/2)^ (about 30%) closer to the true values than their mean.
 
 ### {{feat-imp-cap}} [dfa-regression-feature-importance]
 
 {{feat-imp-cap}} provides further information about the results of an analysis and helps to interpret the results in a more subtle way. If you want to learn more about {{feat-imp}}, [click here](ml-feature-importance.md).
 
-
 ## 7. Deploy the model [dfa-regression-deploy]
 
 The model that you created is stored as {{es}} documents in internal indices. In other words, the characteristics of your trained model are saved and ready to be deployed and used as functions. The [{{infer}}](#ml-inference-reg) feature enables you to use your model in a preprocessor of an ingest pipeline or in a pipeline aggregation of a search query to make predictions about your data.
 
@@ -119,24 +104,24 @@ The model that you created is stored as {{es}} documents in internal indices. In
 1. To deploy a {{dfanalytics}} model in a pipeline, navigate to **Machine Learning** > **Model Management** > **Trained models** in the main menu, or use the [global search field](../../overview/kibana-quickstart.md#_finding_your_apps_and_objects) in {{kib}}.
 2. Find the model you want to deploy in the list and click **Deploy model** in the **Actions** menu.
 
-    :::{image} ../../../images/machine-learning-ml-dfa-trained-models-ui.png
-    :alt: The trained models UI in {kib}
-    :class: screenshot
-    :::
+:::{image} ../../../images/machine-learning-ml-dfa-trained-models-ui.png
+:alt: The trained models UI in {kib}
+:class: screenshot
+:::
 
 3. Create an {{infer}} pipeline to be able to use the model against new data through the pipeline. Add a name and a description or use the default values.
 
-    :::{image} ../../../images/machine-learning-ml-dfa-inference-pipeline.png
-    :alt: Creating an inference pipeline
-    :class: screenshot
-    :::
+:::{image} ../../../images/machine-learning-ml-dfa-inference-pipeline.png
+:alt: Creating an inference pipeline
+:class: screenshot
+:::
 
 4. Configure the pipeline processors or use the default settings.
 
-    :::{image} ../../../images/machine-learning-ml-dfa-inference-processor.png
-    :alt: Configuring an inference processor
-    :class: screenshot
-    :::
+:::{image} ../../../images/machine-learning-ml-dfa-inference-processor.png
+:alt: Configuring an inference processor
+:class: screenshot
+:::
 
 5. Configure how to handle ingest failures, or use the default settings.
 6. (Optional) Test your pipeline by running a simulation of the pipeline to confirm it produces the anticipated results.
@@ -144,20 +129,17 @@ The model that you created is stored as {{es}} documents in internal indices. 
In The model is deployed and ready to use through the {{infer}} pipeline. - ### {{infer-cap}} [ml-inference-reg] {{infer-cap}} enables you to use [trained {{ml}} models](ml-trained-models.md) against incoming data in a continuous fashion. For instance, suppose you have an online service and you would like to predict whether a customer is likely to churn. You have an index with historical data – information on the customer behavior throughout the years in your business – and a {{classification}} model that is trained on this data. The new information comes into a destination index of a {{ctransform}}. With {{infer}}, you can perform the {{classanalysis}} against the new data with the same input fields that you’ve trained the model on, and get a prediction. - #### {{infer-cap}} processor [ml-inference-processor-reg] {{infer-cap}} can be used as a processor specified in an [ingest pipeline](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md). It uses a trained model to infer against the data that is being ingested in the pipeline. The model is used on the ingest node. {{infer-cap}} pre-processes the data by using the model and provides a prediction. After the process, the pipeline continues executing (if there is any other processor in the pipeline), finally the new data together with the results are indexed into the destination index. -Check the [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) and [the {{ml}} {dfanalytics} API documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-df-analytics-apis.html) to learn more. - +Check the [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) and [the {{ml}} {{dfanalytics}} API documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-df-analytics-apis.html) to learn more. #### {{infer-cap}} aggregation [ml-inference-aggregation-reg] @@ -169,13 +151,10 @@ Check the [{{infer}} bucket aggregation](https://www.elastic.co/guide/en/elastic If you use trained model aliases to reference your trained model in an {{infer}} processor or {{infer}} aggregation, you can replace your trained model with a new one without the need of updating the processor or the aggregation. Reassign the alias you used to a new trained model ID by using the [Create or update trained model aliases API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-trained-models-aliases.html). The new trained model needs to use the same type of {{dfanalytics}} as the old one. :::: - - ## Performing {{reganalysis}} in the sample flight data set [performing-regression] Let’s try to predict flight delays by using the [sample flight data](../../overview/kibana-quickstart.md#gs-get-data-into-kibana). The data set contains information such as weather conditions, flight destinations and origins, flight distances, carriers, and the number of minutes each flight was delayed. When you create a {{regression}} job, it learns the relationships between the fields in your data to predict the value of a *{{depvar}}*, which - in this case - is the numeric `FlightDelayMins` field. For an overview of these concepts, see [*Predicting numerical values with {{regression}}*]() and [Introduction to supervised learning](ml-dfa-overview.md#ml-supervised-workflow). 
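To make the {{infer}} processor described above concrete before preparing the data, here is a minimal, hypothetical sketch of an ingest pipeline that applies a trained {{regression}} model at ingest time; the pipeline name and `model_id` are placeholders rather than objects created in this walkthrough:

```console
PUT _ingest/pipeline/flight-delay-inference
{
  "description": "Annotates incoming documents with a predicted delay",
  "processors": [
    {
      "inference": {
        "model_id": "flight-delay-model",
        "inference_config": {
          "regression": {
            "results_field": "delay_prediction"
          }
        }
      }
    }
  ]
}
```

Documents indexed through such a pipeline receive the prediction under the processor's target field (by default, within the `ml.inference` object).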
- ### Preparing your data [flightdata-regression-data] Each document in the data set contains details for a single flight, so this data is ready for analysis; it is already in a two-dimensional entity-based data structure. In general, you often need to [transform](../../transforms.md) the data into an entity-centric index before you analyze it. @@ -232,13 +211,10 @@ To be analyzed, a document must contain at least one field with a supported data :::: - ::::{note} The sample flight data is used in this example because it is easily accessible. However, the data contains some inconsistencies. For example, a flight can be both delayed and canceled. This is a good reminder that the quality of your input data affects the quality of your results. :::: - - ### Creating a {{regression}} model [flightdata-regression-model] To predict the number of minutes delayed for each flight: @@ -248,10 +224,10 @@ To predict the number of minutes delayed for each flight: You can use the wizard on the **{{ml-app}}** > **Data Frame Analytics** tab in {{kib}} or the [create {{dfanalytics-jobs}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-dfanalytics.html) API. - :::{image} ../../../images/machine-learning-flights-regression-job-1.jpg - :alt: Creating a {{dfanalytics-job}} in {kib} - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-flights-regression-job-1.jpg +:alt: Creating a {{dfanalytics-job}} in {kib} +:class: screenshot +::: 1. Choose `kibana_sample_data_flights` as the source index. 2. Choose `regression` as the job type. @@ -261,10 +237,10 @@ To predict the number of minutes delayed for each flight: The wizard includes a scatterplot matrix, which enables you to explore the relationships between the numeric fields. The color of each point is affected by the value of the {{depvar}} for that document, as shown in the legend. You can highlight an area in one of the charts and the corresponding area is also highlighted in the rest of the chart. You can use this matrix to help you decide which fields to include or exclude from the analysis. - :::{image} ../../../images/machine-learning-flightdata-regression-scatterplot.png - :alt: A scatterplot matrix for three fields in {kib} - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-flightdata-regression-scatterplot.png +:alt: A scatterplot matrix for three fields in {kib} +:class: screenshot +::: If you want these charts to represent data from a larger sample size or from a randomized selection of documents, you can change the default behavior. However, a larger sample size might slow down the performance of the matrix and a randomized selection might put more load on the cluster due to the more intensive query. @@ -274,9 +250,9 @@ To predict the number of minutes delayed for each flight: 9. Add a job ID (such as `model-flight-delay-regression`) and optionally a job description. 10. Add the name of the destination index that will contain the results of the analysis. In {{kib}}, the index name matches the job ID by default. It will contain a copy of the source index data where each document is annotated with the results. If the index does not exist, it will be created automatically. 
- ::::{dropdown} API example - ```console - PUT _ml/data_frame/analytics/model-flight-delays-regression +::::{dropdown} API example +```console +PUT _ml/data_frame/analytics/model-flight-delays-regression { "source": { "index": [ @@ -311,9 +287,9 @@ To predict the number of minutes delayed for each flight: ] } } - ``` +``` - :::: +:::: After you configured your job, the configuration details are automatically validated. If the checks are successful, you can proceed and start the job. A warning message is shown if the configuration is invalid. The message contains a suggestion to improve the configuration to be validated. @@ -322,30 +298,30 @@ To predict the number of minutes delayed for each flight: The job takes a few minutes to run. Runtime depends on the local hardware and also on the number of documents and fields that are analyzed. The more fields and documents, the longer the job runs. It stops automatically when the analysis is complete. - ::::{dropdown} API example - ```console - POST _ml/data_frame/analytics/model-flight-delays-regression/_start - ``` +::::{dropdown} API example +```console +POST _ml/data_frame/analytics/model-flight-delays-regression/_start +``` - :::: +:::: 4. Check the job stats to follow the progress in {{kib}} or use the [get {{dfanalytics-jobs}} statistics API](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-dfanalytics-stats.html). - :::{image} ../../../images/machine-learning-flights-regression-details.jpg - :alt: Statistics for a {{dfanalytics-job}} in {kib} - :class: screenshot - ::: +:::{image} ../../../images/machine-learning-flights-regression-details.jpg +:alt: Statistics for a {{dfanalytics-job}} in {kib} +:class: screenshot +::: When the job stops, the results are ready to view and evaluate. To learn more about the job phases, see [How {{dfanalytics-jobs}} work](ml-dfa-phases.md). - ::::{dropdown} API example - ```console - GET _ml/data_frame/analytics/model-flight-delays-regression/_stats - ``` +::::{dropdown} API example +```console +GET _ml/data_frame/analytics/model-flight-delays-regression/_stats +``` - The API call returns the following response: +The API call returns the following response: - ```console-result +```console-result { "count" : 1, "data_frame_analytics" : [ @@ -428,11 +404,9 @@ To predict the number of minutes delayed for each flight: } ] } - ``` - - :::: - +``` +:::: ### Viewing {{regression}} results [flightdata-regression-results] @@ -508,7 +482,6 @@ The snippet below shows an example of the total feature importance details in th 3. The minimum {{feat-imp}} value across all the training data for this field. 4. The maximum {{feat-imp}} value across all the training data for this field. - To see the top {{feat-imp}} values for each prediction, search the destination index. For example: ```console @@ -554,10 +527,8 @@ The snippet below shows a part of a document with the annotated results: :::: - Lastly, {{kib}} provides a scatterplot matrix in the results. It has the same functionality as the matrix that you saw in the job wizard. Its purpose is to likewise help you visualize and explore the relationships between the numeric fields and the {{depvar}} in your data. - ### Evaluating {{regression}} results [flightdata-regression-evaluate] Though you can look at individual results and compare the predicted value (`ml.FlightDelayMin_prediction`) to the actual value (`FlightDelayMins`), you typically need to evaluate the success of the {{regression}} model as a whole. 
@@ -654,15 +625,12 @@ POST _ml/data_frame/_evaluate 1. Evaluates only the documents that are not part of the training data. - :::: - When you have trained a satisfactory model, you can [deploy it](#dfa-regression-deploy) to make predictions about new data. If you don’t want to keep the {{dfanalytics-job}}, you can delete it. For example, use {{kib}} or the [delete {{dfanalytics-job}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/delete-dfanalytics.html). When you delete {{dfanalytics-jobs}} in {{kib}}, you have the option to also remove the destination indices and {{data-sources}}. - ## Further reading [dfa-regression-reading] * [Feature importance for {{dfanalytics}} (Jupyter notebook)](https://github.com/elastic/examples/tree/master/Machine%20Learning/Feature%20Importance) diff --git a/raw-migrated-files/kibana/kibana/xpack-ml-dfanalytics.md b/raw-migrated-files/kibana/kibana/xpack-ml-dfanalytics.md deleted file mode 100644 index 1c6302e3b..000000000 --- a/raw-migrated-files/kibana/kibana/xpack-ml-dfanalytics.md +++ /dev/null @@ -1,13 +0,0 @@ -# {{dfanalytics-cap}} [xpack-ml-dfanalytics] - -The Elastic {{ml}} {dfanalytics} feature enables you to analyze your data using {{classification}}, {{oldetection}}, and {{regression}} algorithms and generate new indices that contain the results alongside your source data. - -If you have a license that includes the {{ml-features}}, you can create {{dfanalytics-jobs}} and view their results on the **Data Frame Analytics** page in {{kib}}. For example: - -:::{image} ../../../images/kibana-classification.png -:alt: {{classification-cap}} results in {kib} -:class: screenshot -::: - -For more information about the {{dfanalytics}} feature, see [{{ml-cap}} {dfanalytics}](../../../explore-analyze/machine-learning/data-frame-analytics.md). - diff --git a/raw-migrated-files/stack-docs/machine-learning/index.md b/raw-migrated-files/stack-docs/machine-learning/index.md deleted file mode 100644 index f5a259a9f..000000000 --- a/raw-migrated-files/stack-docs/machine-learning/index.md +++ /dev/null @@ -1,3 +0,0 @@ -# Machine learning - -Migrated files from the Machine learning book. diff --git a/raw-migrated-files/stack-docs/machine-learning/ml-dfanalytics.md b/raw-migrated-files/stack-docs/machine-learning/ml-dfanalytics.md deleted file mode 100644 index 3632a6e79..000000000 --- a/raw-migrated-files/stack-docs/machine-learning/ml-dfanalytics.md +++ /dev/null @@ -1,18 +0,0 @@ -# {{dfanalytics-cap}} [ml-dfanalytics] - -::::{important} -Using {{dfanalytics}} requires source data to be structured as a two dimensional "tabular" data structure, in other words a {{dataframe}}. [{{transforms-cap}}](../../../explore-analyze/transforms.md) enable you to create {{dataframes}} which can be used as the source for {{dfanalytics}}. -:::: - - -{{dfanalytics-cap}} enable you to perform different analyses of your data and annotate it with the results. Consult [Setup and security](../../../explore-analyze/machine-learning/setting-up-machine-learning.md) to learn more about the license and the security privileges that are required to use {{dfanalytics}}. 
- -* [Overview](../../../explore-analyze/machine-learning/data-frame-analytics/ml-dfa-overview.md) -* [*Finding outliers*](../../../explore-analyze/machine-learning/data-frame-analytics/ml-dfa-finding-outliers.md) -* [*Predicting numerical values with {{regression}}*](../../../explore-analyze/machine-learning/data-frame-analytics/ml-dfa-regression.md) -* [*Predicting classes with {{classification}}*](../../../explore-analyze/machine-learning/data-frame-analytics/ml-dfa-classification.md) -* [Language identification](https://www.elastic.co/guide/en/machine-learning/current/ml-dfa-lang-ident.html) -* [*Advanced concepts*](../../../explore-analyze/machine-learning/data-frame-analytics/ml-dfa-concepts.md) -* [*API quick reference*](../../../explore-analyze/machine-learning/data-frame-analytics/ml-dfanalytics-apis.md) -* [*Resources*](../../../explore-analyze/machine-learning/data-frame-analytics/ml-dfa-resources.md) - diff --git a/raw-migrated-files/toc.yml b/raw-migrated-files/toc.yml index ab854a034..5cd90f0c9 100644 --- a/raw-migrated-files/toc.yml +++ b/raw-migrated-files/toc.yml @@ -718,7 +718,6 @@ toc: - file: kibana/kibana/using-kibana-with-security.md - file: kibana/kibana/watcher-ui.md - file: kibana/kibana/xpack-ml-aiops.md - - file: kibana/kibana/xpack-ml-dfanalytics.md - file: kibana/kibana/xpack-security-authorization.md - file: kibana/kibana/xpack-security-fips-140-2.md - file: kibana/kibana/xpack-security.md @@ -1016,9 +1015,6 @@ toc: - file: stack-docs/elastic-stack/upgrading-elastic-stack.md - file: stack-docs/elastic-stack/upgrading-elasticsearch.md - file: stack-docs/elastic-stack/upgrading-kibana.md - - file: stack-docs/machine-learning/index.md - children: - - file: stack-docs/machine-learning/ml-dfanalytics.md - file: tech-content/starting-with-the-elasticsearch-platform-and-its-solutions/index.md children: - file: tech-content/starting-with-the-elasticsearch-platform-and-its-solutions/get-elastic.md From 7f83ea7b8919c22c452c7f479d4e04b99a23b73f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Wed, 5 Feb 2025 14:21:32 +0100 Subject: [PATCH 09/15] [E&A] Refined data frame analytics advanced concepts and other pages (#327) * [E&A] Refines DFA advanced concepts. 
--- .../dfa-regression-lossfunction.md | 4 +- .../data-frame-analytics/hyperparameters.md | 4 +- .../data-frame-analytics/ml-dfa-concepts.md | 10 --- .../ml-dfa-custom-urls.md | 3 +- .../ml-dfa-limitations.md | 64 ++++++------------- .../data-frame-analytics/ml-dfa-phases.md | 19 ++---- .../data-frame-analytics/ml-dfa-resources.md | 2 - .../data-frame-analytics/ml-dfa-scale.md | 9 --- .../ml-dfanalytics-apis.md | 1 - .../ml-feature-encoding.md | 1 - .../ml-feature-importance.md | 2 - .../ml-feature-processors.md | 9 --- .../data-frame-analytics/ml-trained-models.md | 61 ++++++++---------- 13 files changed, 57 insertions(+), 132 deletions(-) diff --git a/explore-analyze/machine-learning/data-frame-analytics/dfa-regression-lossfunction.md b/explore-analyze/machine-learning/data-frame-analytics/dfa-regression-lossfunction.md index d508fd548..3562cfa92 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/dfa-regression-lossfunction.md +++ b/explore-analyze/machine-learning/data-frame-analytics/dfa-regression-lossfunction.md @@ -19,8 +19,6 @@ You can specify the loss function to be used during {{reganalysis}} when you cre Consult [the Jupyter notebook on regression loss functions](https://github.com/elastic/examples/tree/master/Machine%20Learning/Regression%20Loss%20Functions) to learn more. -::::{tip} +::::{tip} The default loss function parameter values work fine for most of the cases. It is highly recommended to use the default values, unless you fully understand the impact of the different loss function parameters. :::: - - diff --git a/explore-analyze/machine-learning/data-frame-analytics/hyperparameters.md b/explore-analyze/machine-learning/data-frame-analytics/hyperparameters.md index ce532e1c2..36f8fdda6 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/hyperparameters.md +++ b/explore-analyze/machine-learning/data-frame-analytics/hyperparameters.md @@ -13,8 +13,6 @@ You can view the hyperparameter values that were ultimately chosen by expanding Different hyperparameters may affect the model performance to a different degree. To estimate the importance of the optimized hyperparameters, analysis of variance decomposition is used. The resulting `absolute importance` shows how much the variation of a hyperparameter impacts the variation in the validation loss. Additionally, `relative importance` is also computed which gives the importance of the hyperparameter compared to the rest of the tuneable hyperparameters. The sum of all relative importances is 1. You can check these results in the response of the [get {{dfanalytics-job}} stats API](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-dfanalytics-stats.html). -::::{tip} +::::{tip} Unless you fully understand the purpose of a hyperparameter, it is highly recommended that you leave it unset and allow hyperparameter optimization to occur. 
:::: - - diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-concepts.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-concepts.md index 386c19e11..8d4b7f023 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-concepts.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-concepts.md @@ -16,13 +16,3 @@ This section explains the more complex concepts of the Elastic {{ml}} {dfanalyti * [Loss functions for {{regression}} analyses](dfa-regression-lossfunction.md) * [Hyperparameter optimization](hyperparameters.md) * [Trained models](ml-trained-models.md) - - - - - - - - - - diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-custom-urls.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-custom-urls.md index 7c4fa2778..bf6d9ef83 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-custom-urls.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-custom-urls.md @@ -21,7 +21,6 @@ When you create or edit an {{dfanalytics-job}} in {{kib}}, it simplifies the cre For each custom URL, you must supply a label. You can also optionally supply a time range. When you link to **Discover** or a {{kib}} dashboard, you’ll have additional options for specifying the pertinent {{data-source}} or dashboard name and query entities. - ## String substitution in custom URLs [ml-dfa-url-strings] You can use dollar sign ($) delimited tokens in a custom URL. These tokens are substituted for the values of the corresponding fields in the result index. For example, a custom URL might resolve to `discover#/?_g=(time:(from:'$earliest$',mode:absolute,to:'$latest$'))&_a=(filters:!(),index:'4b899bcb-fb10-4094-ae70-207d43183ffc',query:(language:kuery,query:'Carrier:"$Carrier$"'))`. In this case, the pertinent value of the `Carrier` field is passed to the target page when you click the link. @@ -30,7 +29,6 @@ You can use dollar sign ($) delimited tokens in a custom URL. These tokens are s When you create your custom URL in {{kib}}, the **Query entities** option is shown only when there are appropriate fields in the index. :::: - The `$earliest$` and `$latest$` tokens pass the beginning and end of the time span of the data to the target page. The tokens are substituted with date-time strings in ISO-8601 format. For example, the following API updates a job to add a custom URL that uses `$earliest$` and `$latest$` tokens: ```console @@ -51,6 +49,7 @@ POST _ml/data_frame/analytics/flight-delay-regression/_update When you click this custom URL, it opens up the **Discover** page and displays source data for the period one hour before and after the date of the default global settings. ::::{tip} + * The custom URL links use pop-ups. You must configure your web browser so that it does not block pop-up windows or create an exception for your {{kib}} URL. * When creating a link to a {{kib}} dashboard, the URLs for dashboards can be very long. Be careful of typos, end of line characters, and URL encoding. Also ensure you use the appropriate index ID for the target {{kib}} {data-source}. * The dates substituted for `$earliest$` and `$latest$` tokens are in ISO-8601 format and the target system must understand this format. 
diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md index 2f0bdfb7c..1cabc7305 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md @@ -4,77 +4,61 @@ mapped_pages: - https://www.elastic.co/guide/en/machine-learning/current/ml-dfa-limitations.html --- - - # Limitations [ml-dfa-limitations] - The following limitations and known problems apply to the 9.0.0-beta1 release of the Elastic {{dfanalytics}} feature. The limitations are grouped into the following categories: * [Platform limitations](#dfa-platform-limitations) are related to the platform that hosts the {{ml}} feature of the {{stack}}. * [Configuration limitations](#dfa-config-limitations) apply to the configuration process of the {{dfanalytics-jobs}}. * [Operational limitations](#dfa-operational-limitations) affect the behavior of the {{dfanalytics-jobs}} that are running. +## Platform limitations [dfa-platform-limitations] -## Platform limitations [dfa-platform-limitations] - - -### CPU scheduling improvements apply to Linux and MacOS only [dfa-scheduling-priority] +### CPU scheduling improvements apply to Linux and MacOS only [dfa-scheduling-priority] When there are many {{ml}} jobs running at the same time and there are insufficient CPU resources, the JVM performance must be prioritized so search and indexing latency remain acceptable. To that end, when CPU is constrained on Linux and MacOS environments, the CPU scheduling priority of native analysis processes is reduced to favor the {{es}} JVM. This improvement does not apply to Windows environments. +## Configuration limitations [dfa-config-limitations] -## Configuration limitations [dfa-config-limitations] - - -### {{ccs-cap}} is not supported [dfa-ccs-limitations] +### {{ccs-cap}} is not supported [dfa-ccs-limitations] {{ccs-cap}} is not supported for {{dfanalytics}}. - -### Nested fields are not supported [dfa-nested-fields-limitations] +### Nested fields are not supported [dfa-nested-fields-limitations] Nested fields are not supported for {{dfanalytics-jobs}}. These fields are ignored during the analysis. If a nested field is selected as the dependent variable for {{classification}} or {{reganalysis}}, an error occurs. - -### {{dfanalytics-jobs-cap}} cannot be updated [dfa-update-limitations] +### {{dfanalytics-jobs-cap}} cannot be updated [dfa-update-limitations] You cannot update {{dfanalytics}} configurations. Instead, delete the {{dfanalytics-job}} and create a new one. - -### {{dfanalytics-cap}} memory limitation [dfa-dataframe-size-limitations] +### {{dfanalytics-cap}} memory limitation [dfa-dataframe-size-limitations] {{dfanalytics-cap}} can only perform analyses that fit into the memory available for {{ml}}. Overspill to disk is not currently possible. For general {{ml}} settings, see [{{ml-cap}} settings in {{es}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-settings.html). When you create a {{dfanalytics-job}} and the inference step of the process fails due to the model is too large to fit into JVM, follow the steps in [this GitHub issue](https://github.com/elastic/elasticsearch/issues/76093) for a workaround. 
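One way to check against this limitation before running a job is the explain {{dfanalytics}} API, which returns a `memory_estimation` object for a proposed configuration. A minimal sketch, with a placeholder source index:

```console
POST _ml/data_frame/analytics/_explain
{
  "source": { "index": "my-source-index" },
  "analysis": {
    "outlier_detection": {}
  }
}
```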
-### {{dfanalytics-jobs-cap}} cannot use more than 2^32^ documents for training [dfa-training-docs]
+### {{dfanalytics-jobs-cap}} cannot use more than 2^32^ documents for training [dfa-training-docs]
 
 A {{dfanalytics-job}} that would use more than 2^32^ documents for training cannot be started. The limitation applies only for documents participating in training the model. If your source index contains more than 2^32^ documents, set the `training_percent` to a value that represents less than 2^32^ documents.
 
-
-### Trained models created in 7.8 are not backwards compatible [dfa-inference-bwc]
+### Trained models created in 7.8 are not backwards compatible [dfa-inference-bwc]
 
 Trained models created in version 7.8.0 are not backwards compatible with older node versions. In a mixed cluster environment, all nodes must be at least 7.8.0 to use a model created on a 7.8.0 node.
 
+## Operational limitations [dfa-operational-limitations]
 
-## Operational limitations [dfa-operational-limitations]
-
-
-### Deleting a {{dfanalytics-job}} does not delete the destination index [dfa-deletion-limitations]
+### Deleting a {{dfanalytics-job}} does not delete the destination index [dfa-deletion-limitations]
 
 The [delete {{dfanalytics-job}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/delete-dfanalytics.html) does not delete the destination index that contains the annotated data of the {{dfanalytics}}. That index must be deleted separately.
 
-
-### {{dfanalytics-jobs-cap}} runtime may vary [dfa-time-limitations]
+### {{dfanalytics-jobs-cap}} runtime may vary [dfa-time-limitations]
 
 The runtime of {{dfanalytics-jobs}} depends on numerous factors, such as the number of data points in the data set, the type of analytics, the number of fields that are included in the analysis, the supplied [hyperparameters](hyperparameters.md), the type of analyzed fields, and so on. For this reason, a general runtime value that applies to all or most of the situations does not exist. The runtime of a {{dfanalytics-job}} may take from a couple of minutes up to many hours in extreme cases.
 
 The runtime increases with an increasing number of analyzed fields in a nearly linear fashion. For data sets of more than 100,000 points, start with a low training percent. Run a few {{dfanalytics-jobs}} to see how the runtime scales with the increased number of data points and how the quality of results scales with an increased training percentage.
 
-
-### {{dfanalytics-jobs-cap}} may restart after an {{es}} upgrade [dfa-restart]
+### {{dfanalytics-jobs-cap}} may restart after an {{es}} upgrade [dfa-restart]
 
 A {{dfanalytics-job}} may be restarted from the beginning in the following cases:
 
 If any of these conditions applies, the destination index of the {{dfanalytics-job}} is deleted and the job starts again from the beginning – regardless of the phase where the job was in.
 
-
-### Documents with values of multi-element arrays in analyzed fields are skipped [dfa-multi-arrays-limitations]
+### Documents with values of multi-element arrays in analyzed fields are skipped [dfa-multi-arrays-limitations]
 
 If the value of an analyzed field (a field that is the subject of the {{dfanalytics}}) in a document is an array with more than one element, the document that contains this field is skipped during the analysis.
-
-### {{oldetection-cap}} field types [dfa-od-field-type-docs-limitations]
+### {{oldetection-cap}} field types [dfa-od-field-type-docs-limitations]
 
 {{oldetection-cap}} requires numeric or boolean data to analyze. The algorithms don’t support missing values; therefore, fields that have data types other than numeric or boolean are ignored. Documents where included fields contain missing values, null values, or an array are also ignored. Therefore a destination index may contain documents that don’t have an {{olscore}}. These documents are still reindexed from the source index to the destination index, but they are not included in the {{oldetection}} analysis and therefore no {{olscore}} is computed.
 
-
-### {{regression-cap}} field types [dfa-regression-field-type-docs-limitations]
+### {{regression-cap}} field types [dfa-regression-field-type-docs-limitations]
 
 {{regression-cap}} supports fields that are numeric, boolean, text, keyword and ip. It is also tolerant of missing values. Fields that are supported are included in the analysis; other fields are ignored. Documents where included fields contain an array are also ignored. Documents in the destination index that don’t contain a results field are not included in the {{reganalysis}}.
 
-
-### {{classification-cap}} field types [dfa-classification-field-type-docs-limitations]
+### {{classification-cap}} field types [dfa-classification-field-type-docs-limitations]
 
 {{classification-cap}} supports fields that have numeric, boolean, text, keyword, or ip data types. It is also tolerant of missing values. Fields that are supported are included in the analysis; other fields are ignored. Documents where included fields contain an array are also ignored. Documents in the destination index that don’t contain a results field are not included in the {{classanalysis}}.
 
-
-### Imbalanced class sizes affect {{classification}} performance [dfa-classification-imbalanced-classes]
+### Imbalanced class sizes affect {{classification}} performance [dfa-classification-imbalanced-classes]
 
 If your training data is very imbalanced, {{classanalysis}} may not provide good predictions. Try to avoid highly imbalanced situations. We recommend having at least 50 examples of each class and a ratio of no more than 10 to 1 for the majority to minority class labels in the training data. If your training data set is very imbalanced, consider downsampling the majority class, upsampling the minority class, or gathering more data.
 
-
-### Deeply nested objects affect {{infer}} performance [dfa-inference-nested-limitation]
+### Deeply nested objects affect {{infer}} performance [dfa-inference-nested-limitation]
 
 If the data that you run inference against contains documents that have a series of combinations of dot delimited and nested fields (for example: `{"a.b": "c", "a": {"b": "c"},...}`), the performance of the operation might be slightly slower. Consider using as simple a mapping as possible for the best performance profile.
 
-
-### Analytics runtime performance may significantly slow down with {{feat-imp}} computation [dfa-feature-importance-limitation]
+### Analytics runtime performance may significantly slow down with {{feat-imp}} computation [dfa-feature-importance-limitation]
 
 For complex models (such as those with many deep trees), the calculation of {{feat-imp}} takes significantly more time.
If a reduction in runtime is important to you, try strategies such as disabling {{feat-imp}}, reducing the amount of training data (for example, by decreasing the training percentage), setting [hyperparameter](hyperparameters.md) values, or only selecting fields that are relevant for analysis.

-
diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-phases.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-phases.md
index 9d8b57586..c216b34a8 100644
--- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-phases.md
+++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-phases.md
@@ -8,7 +8,6 @@ mapped_pages:

# How data frame analytics jobs work [ml-dfa-phases]

-
A {{dfanalytics-job}} is essentially a persistent {{es}} task. During its life cycle, it goes through four or five main phases depending on the analysis type:

* reindexing,
@@ -19,20 +18,17 @@ A {{dfanalytics-job}} is essentially a persistent {{es}} task. During its life c

Let’s take a look at the phases one by one.

-
-## Reindexing [ml-dfa-phases-reindex]
+## Reindexing [ml-dfa-phases-reindex]

During the reindexing phase the documents from the source index or indices are copied to the destination index. If you want to define settings or mappings, create the index before you start the job. Otherwise, the job creates it using default settings.

Once the destination index is built, the {{dfanalytics-job}} task calls the {{es}} [Reindex API](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html) to launch the reindexing task.

-
-## Loading data [ml-dfa-phases-load]
+## Loading data [ml-dfa-phases-load]

After the reindexing is finished, the job fetches the needed data from the destination index. It converts the data into the format that the analysis process expects, then sends it to the analysis process.

-
-## Analyzing [ml-dfa-phases-analyze]
+## Analyzing [ml-dfa-phases-analyze]

In this phase, the job generates a {{ml}} model for analyzing the data. The specific phases of analysis vary depending on the type of {{dfanalytics-job}}.

@@ -45,15 +41,12 @@ In this phase, the job generates a {{ml}} model for analyzing the data. The spec
 3. `fine_tuning_parameters`: Identifies final values for undefined hyperparameters. See [hyperparameter optimization](hyperparameters.md).
 4. `final_training`: Trains the {{ml}} model.

-
-## Writing results [ml-dfa-phases-write]
+## Writing results [ml-dfa-phases-write]

After the loaded data is analyzed, the analysis process sends back the results. Only the additional fields that the analysis calculated are written back; the ones that have been loaded in the loading data phase are not. The {{dfanalytics-job}} matches the results with the data rows in the destination index, merges them, and indexes them back to the destination index.

-
-## {{infer-cap}} [ml-dfa-phases-inference]
+## {{infer-cap}} [ml-dfa-phases-inference]

This phase exists only for {{regression}} and {{classification}} jobs. In this phase, the job validates the trained model against the test split of the data set.

-Finally, after all phases are completed, the task is marked as completed and the {{dfanalytics-job}} stops. Your data is ready to be evaluated.
-
+Finally, after all phases are completed, the task is marked as completed and the {{dfanalytics-job}} stops. Your data is ready to be evaluated.
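If you want to follow these phases while a job runs, one option is the get {{dfanalytics}} stats API (the job ID below is a placeholder); its response contains a `progress` array with a `progress_percent` entry for each phase:

```console
GET _ml/data_frame/analytics/my-dfa-job/_stats
```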
\ No newline at end of file diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-resources.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-resources.md index 03323d949..b11e79db6 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-resources.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-resources.md @@ -8,5 +8,3 @@ mapped_pages: This section contains further resources for using {{dfanalytics}}. * [Limitations](ml-dfa-limitations.md) - - diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-scale.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-scale.md index fa3a2be29..685a8876b 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-scale.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-scale.md @@ -22,14 +22,12 @@ It is important to note that there is a correlation between the training time, t The following recommendations are not sequential – the numbers just help to navigate between the list items; you can take action on one or more of them in any order. - ## 0. Start small and iterate rapidly [rapid-iteration] Training is an iterative process. Experiment with different settings and configuration options (including but not limited to hyperparameters and feature importance), then evaluate the results and decide whether they are good enough or need further experimentation. Every iteration takes time, so it is useful to start with a small set of data so you can iterate rapidly and then build up from here. - ## 1. Set a small training percent [small-training-percent] (This step only applies to {{regression}} and {{classification}} jobs.) @@ -38,7 +36,6 @@ The number of documents used for training a model has an effect on the training Consider starting with a small percentage of training data so you can complete iterations more quickly. Once you are happy with your configuration, increase the training percent. As a rule of thumb, if you have a data set with more than 100,000 data points, start with a training percent of 5 or 10. - ## 2. Disable {{feat-imp}} calculation [disable-feature-importance] (This step only applies to {{regression}} and {{classification}} jobs.) @@ -47,7 +44,6 @@ Consider starting with a small percentage of training data so you can complete i For a shorter runtime, consider disabling {{feat-imp}} for some or all iterations if you do not require it. - ## 3. Optimize the number of included fields [optimize-included-fields] You can speed up runtime by only analyzing relevant fields. @@ -58,8 +54,6 @@ By default, all the fields that are supported by the analysis type are included {{feat-imp-cap}} can help you determine the fields that contribute most to the prediction. However, as calculating {{feat-imp}} increases training time, this is a trade-off that can be evaluated during an iterative training process. :::: - - ## 4. Increase the maximum number of threads [increase-threads] You can set the maximum number of threads that are used during the analysis. The default value of `max_num_threads` is 1. Depending on the characteristics of the data, using more threads may decrease the training time at the cost of increased CPU usage. Note that trying to use more threads than the number of CPU cores has no advantage. 
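For illustration, a job configuration that raises the analysis thread count might look like the following sketch (the job ID, index names, and analysis type are placeholders):

```console
PUT _ml/data_frame/analytics/my-dfa-job
{
  "source": { "index": "source-index" },
  "dest": { "index": "dest-index" },
  "analysis": { "outlier_detection": {} },
  "max_num_threads": 4 <1>
}
```

1. Up to four threads may be used for the analysis; values above the number of CPU cores bring no additional benefit.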
@@ -72,15 +66,12 @@ To learn more about the individual phases, refer to [How {{dfanalytics-jobs}} wo If your {{ml}} nodes are running concurrent {{anomaly-detect}} or {{dfanalytics-jobs}}, then you may want to keep the maximum number of threads set to a low number – for example the default 1 – to prevent jobs competing for resources. :::: - - ## 5. Optimize the size of the source index [optimize-source-index] Even if the training percent is low, reindexing the source index – which is a mandatory step in the job creation process – may take a long time. During reindexing, the documents from the source index or indices are copied to the destination index, so you have a static copy of the analyzed data. If your data is large and you do not need to test and train on the whole source index or indices, then reduce the cost of reindexing by using a subset of your source data. This can be done by either defining a filter for the source index in the {{dfanalytics-job}} configuration, or by manually reindexing a subset of this data to use as an alternate source index. - ## 6. Configure hyperparameters [configure-hyperparameters] (This step only applies to {{regression}} and {{classification}} jobs.) diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfanalytics-apis.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfanalytics-apis.md index caacf2238..caf0eb40d 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfanalytics-apis.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfanalytics-apis.md @@ -29,4 +29,3 @@ The evaluation API endpoint has the following base: * [Update {{dfanalytics-jobs}}](https://www.elastic.co/guide/en/elasticsearch/reference/current/update-dfanalytics.html) For information about the APIs related to trained models, refer to [*API quick reference*](../nlp/ml-nlp-apis.md). - diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-feature-encoding.md b/explore-analyze/machine-learning/data-frame-analytics/ml-feature-encoding.md index b511a1743..e6f0feb6a 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-feature-encoding.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-feature-encoding.md @@ -16,4 +16,3 @@ mapped_pages: When the model makes predictions on new data, the data needs to be processed in the same way it was trained. {{ml-cap}} model inference in the {{stack}} does this automatically, so the automatically applied encodings are used in each call for inference. Refer to {{infer}} for [{{classification}}](ml-dfa-classification.md#ml-inference-class) and [{{regression}}](ml-dfa-regression.md#ml-inference-reg). [{{feat-imp-cap}}](ml-feature-importance.md) is calculated for the original categorical fields, not the automatically encoded features. - diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-feature-importance.md b/explore-analyze/machine-learning/data-frame-analytics/ml-feature-importance.md index 922b9619d..f6463c464 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-feature-importance.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-feature-importance.md @@ -40,5 +40,3 @@ For {{classanalysis}}, the sum of the {{feat-imp}} values approximates the predi By default, {{feat-imp}} values are not calculated. To generate this information, when you create a {{dfanalytics-job}} you must specify the `num_top_feature_importance_values` property. 
For example, see [Performing {{reganalysis}} in the sample flight data set](ml-dfa-regression.md#performing-regression) and [Performing {{classanalysis}} in the sample flight data set](ml-dfa-classification.md#performing-classification).

The {{feat-imp}} values are stored in the {{ml}} results field for each document in the destination index. The number of {{feat-imp}} values for each document might be less than the `num_top_feature_importance_values` property value. For example, it returns only features that had a positive or negative effect on the prediction.
-
-
diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-feature-processors.md b/explore-analyze/machine-learning/data-frame-analytics/ml-feature-processors.md
index 89bae43d2..2c4ea1f2a 100644
--- a/explore-analyze/machine-learning/data-frame-analytics/ml-feature-processors.md
+++ b/explore-analyze/machine-learning/data-frame-analytics/ml-feature-processors.md
@@ -4,11 +4,8 @@ mapped_pages:
   - https://www.elastic.co/guide/en/machine-learning/current/ml-feature-processors.html
 ---

-
-
 # Feature processors [ml-feature-processors]

-
{{dfanalytics-cap}} automatically includes a [Feature encoding](ml-feature-encoding.md) phase, which transforms categorical features into numerical ones. If you want to have more control over the encoding methods that are used for specific fields, however, you can define feature processors. If there are any remaining categorical features after your processors run, they are addressed in the automatic feature encoding phase.

The feature processors that you defined are part of the analytics process; when data comes through the aggregation or pipeline, the processors run against the new data. The resulting features are ephemeral; they are not stored in the index. This provides a mechanism to create features that can be used at search and ingest time and don’t take up space in the index.

@@ -22,9 +19,3 @@ Available feature processors:
 * [n-gram encoding](https://www.elastic.co/guide/en/machine-learning/current/ngram-encoding.html)
 * [One hot encoding](https://www.elastic.co/guide/en/machine-learning/current/one-hot-encoding.html)
 * [Target mean encoding](https://www.elastic.co/guide/en/machine-learning/current/target-mean-encoding.html)
-
-
-
-
-
-
diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-trained-models.md b/explore-analyze/machine-learning/data-frame-analytics/ml-trained-models.md
index 0912c9a37..69639ecfc 100644
--- a/explore-analyze/machine-learning/data-frame-analytics/ml-trained-models.md
+++ b/explore-analyze/machine-learning/data-frame-analytics/ml-trained-models.md
@@ -11,33 +11,32 @@ In {{kib}}, you can view and manage your trained models in **{{stack-manage-app}

 Alternatively, you can use APIs like [get trained models](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-trained-models.html) and [delete trained models](https://www.elastic.co/guide/en/elasticsearch/reference/current/delete-trained-models.html).

-
 ## Deploying trained models [deploy-dfa-trained-models]

-
 ### Models trained by {{dfanalytics}} [_models_trained_by_dfanalytics]

 1. To deploy a {{dfanalytics}} model in a pipeline, navigate to **Machine Learning** > **Model Management** > **Trained models** in the main menu, or use the [global search field](../../overview/kibana-quickstart.md#_finding_your_apps_and_objects) in {{kib}}.
+
 2. Find the model you want to deploy in the list and click **Deploy model** in the **Actions** menu.
-    :::{image} ../../../images/machine-learning-ml-dfa-trained-models-ui.png
-    :alt: The trained models UI in {kib}
-    :class: screenshot
-    :::
+:::{image} ../../../images/machine-learning-ml-dfa-trained-models-ui.png
+:alt: The trained models UI in {kib}
+:class: screenshot
+:::

 3. Create an {{infer}} pipeline to be able to use the model against new data through the pipeline. Add a name and a description or use the default values.

-    :::{image} ../../../images/machine-learning-ml-dfa-inference-pipeline.png
-    :alt: Creating an inference pipeline
-    :class: screenshot
-    :::
+:::{image} ../../../images/machine-learning-ml-dfa-inference-pipeline.png
+:alt: Creating an inference pipeline
+:class: screenshot
+:::

 4. Configure the pipeline processors or use the default settings.

-    :::{image} ../../../images/machine-learning-ml-dfa-inference-processor.png
-    :alt: Configuring an inference processor
-    :class: screenshot
-    :::
+:::{image} ../../../images/machine-learning-ml-dfa-inference-processor.png
+:alt: Configuring an inference processor
+:class: screenshot
+:::

 5. Configure how to handle ingest failures, or use the default settings.
 6. (Optional) Test your pipeline by running a simulation of the pipeline to confirm it produces the anticipated results.
@@ -45,76 +44,72 @@ Alternatively, you can use APIs like [get trained models](https://www.elastic.co

 The model is deployed and ready to use through the {{infer}} pipeline.

-
 ### Models trained by other methods [_models_trained_by_other_methods]

 You can also supply trained models that are not created by a {{dfanalytics-job}} but adhere to the appropriate [JSON schema](https://github.com/elastic/ml-json-schemas). Likewise, you can use third-party models to perform natural language processing (NLP) tasks. If you want to use these trained models in the {{stack}}, you must store them in {{es}} documents by using the [create trained models API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-trained-models.html). For more information about NLP models, refer to [*Deploy trained models*](../nlp/ml-nlp-deploy-models.md).

-
 ## Exporting and importing models [export-import]

 Models trained in Elasticsearch are portable and can be transferred between clusters. This is particularly useful when models are trained in isolation from the cluster where they are used for inference. The following instructions show how to use [`curl`](https://curl.se/) and [`jq`](https://stedolan.github.io/jq/) to export a model as JSON and import it to another cluster.

 1. Given a model *name*, find the model *ID*. You can use `curl` to call the [get trained models API](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-trained-models.html) to list all models with their IDs.

-    ```bash
+```bash
    curl -s -u username:password \
    -X GET "http://localhost:9200/_ml/trained_models" \
    | jq . -C \
    | more
-    ```
+```

    If you want to show just the model IDs available, use `jq` to select a subset.

-    ```bash
+```bash
    curl -s -u username:password \
    -X GET "http://localhost:9200/_ml/trained_models" \
    | jq -C -r '.trained_model_configs[].model_id'
-    ```
+```

-    ```bash
+```bash
    flights1-1607953694065
    flights0-1607953585123
    lang_ident_model_1
-    ```
+```

    In this example, you are exporting the model with ID `flights1-1607953694065`.

2. Using `curl` from the command line, again use the [get trained models API](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-trained-models.html) to export the entire model definition and save it to a JSON file.
-    ```bash
+```bash
    curl -u username:password \
    -X GET "http://localhost:9200/_ml/trained_models/flights1-1607953694065?exclude_generated=true&include=definition&decompress_definition=false" \
    | jq '.trained_model_configs[0] | del(.model_id)' \
    > flights1.json
-    ```
+```

-    A few observations:
+A few observations:

-    * Exporting models requires using `curl` or a similar tool that can **stream** the model over HTTP into a file. If you use the {{kib}} Console, the browser might be unresponsive due to the size of exported models.
-    * Note the query parameters that are used during export. These parameters are necessary to export the model in a way that it can later be imported again and used for inference.
-    * You must unnest the JSON object by one level to extract just the model definition. You must also remove the existing model ID to avoid ID collisions when you import it again. You can do these steps inline using `jq`, or apply them to the downloaded JSON file afterwards using `jq` or other tools.
+    * Exporting models requires using `curl` or a similar tool that can **stream** the model over HTTP into a file. If you use the {{kib}} Console, the browser might be unresponsive due to the size of exported models.
+    * Note the query parameters that are used during export. These parameters are necessary to export the model in a way that it can later be imported again and used for inference.
+    * You must unnest the JSON object by one level to extract just the model definition. You must also remove the existing model ID to avoid ID collisions when you import it again. You can do these steps inline using `jq`, or apply them to the downloaded JSON file afterwards using `jq` or other tools.

3. Import the saved model using `curl` to upload the JSON file to the [create trained models API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-trained-models.html). When you specify the URL, you can also set the model ID to something new using the last path part of the URL.

-    ```bash
+```bash
    curl -u username:password \
    -H 'Content-Type: application/json' \
    -X PUT "http://localhost:9200/_ml/trained_models/flights1-imported" \
    --data-binary @flights1.json
-    ```
-
+```

::::{note}
+
* Models exported from the [get trained models API](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-trained-models.html) are limited in size by the [http.max_content_length](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html) global configuration value in {{es}}. The default value is `100mb` and may need to be increased depending on the size of the model being exported.
* Connection timeouts can occur, for example, when model sizes are very large or your cluster is under load. If needed, you can increase [timeout configurations](https://ec.haxx.se/usingcurl/usingcurl-timeouts) for `curl` (for example, `curl --max-time 600`) or your client of choice.
::::

If you also want to copy the {{dfanalytics-job}} to the new cluster, you can export and import jobs in the **{{stack-manage-app}}** app in {{kib}}. Refer to [Exporting and importing {{ml}} jobs](../anomaly-detection/move-jobs.md).

-
## Importing an external model to the {{stack}} [import-external-model-to-es]

It is possible to import a model to your {{es}} cluster even if the model is not trained by Elastic {{dfanalytics}}.
Eland supports [importing models](https://www.elastic.co/guide/en/elasticsearch/client/eland/current/machine-learning.html) directly through its APIs. Please refer to the latest [Eland documentation](https://eland.readthedocs.io/en/latest/index.html) for more information on supported model types and other details of using Eland to import models.

From 368fd9cc72bdd0bb6abf0b832763694951f01546 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Wed, 5 Feb 2025 15:09:26 +0100
Subject: [PATCH 10/15] [E&A] Refines NLP section (#328)

* [E&A] Refines NLP section.

* [E&A] Adds authentication methods.
---
 .../anomaly-detection-scale.md | 2 +-
 explore-analyze/machine-learning/nlp.md | 17 +++++------
 .../machine-learning/nlp/ml-nlp-apis.md | 1 -
 .../machine-learning/nlp/ml-nlp-auto-scale.md | 15 +---------
 .../nlp/ml-nlp-classify-text.md | 10 ++-----
 .../nlp/ml-nlp-deploy-model.md | 2 --
 .../nlp/ml-nlp-deploy-models.md | 5 ----
 .../nlp/ml-nlp-extract-info.md | 4 ---
 .../nlp/ml-nlp-import-model.md | 28 +++++++++++++------
 .../machine-learning/nlp/ml-nlp-inference.md | 8 ------
 .../machine-learning/nlp/ml-nlp-lang-ident.md | 8 +-----
 .../machine-learning/nlp/ml-nlp-overview.md | 2 --
 .../nlp/ml-nlp-search-compare.md | 3 --
 .../nlp/ml-nlp-select-model.md | 3 +-
 .../nlp/ml-nlp-test-inference.md | 1 -
 15 files changed, 36 insertions(+), 73 deletions(-)

diff --git a/explore-analyze/machine-learning/anomaly-detection/anomaly-detection-scale.md b/explore-analyze/machine-learning/anomaly-detection/anomaly-detection-scale.md
index df26ce710..4512dcce5 100644
--- a/explore-analyze/machine-learning/anomaly-detection/anomaly-detection-scale.md
+++ b/explore-analyze/machine-learning/anomaly-detection/anomaly-detection-scale.md
@@ -9,7 +9,7 @@ There are many advanced configuration options for {{anomaly-jobs}}, some of them

In this guide, you’ll learn how to:

-* Understand the impact of configuration options on the performance of {anomaly-jobs}
+* Understand the impact of configuration options on the performance of {{anomaly-jobs}}

Prerequisites:

diff --git a/explore-analyze/machine-learning/nlp.md b/explore-analyze/machine-learning/nlp.md
index b14799a42..44f6f1325 100644
--- a/explore-analyze/machine-learning/nlp.md
+++ b/explore-analyze/machine-learning/nlp.md
@@ -7,12 +7,13 @@ mapped_pages:

You can use {{stack-ml-features}} to analyze natural language data and make predictions.
-* [*Overview*](nlp/ml-nlp-overview.md)
-* [*Deploy trained models*](nlp/ml-nlp-deploy-models.md)
-* [*Trained model autoscaling*](nlp/ml-nlp-auto-scale.md)
-* [*Add NLP {{infer}} to ingest pipelines*](nlp/ml-nlp-inference.md)
-* [*API quick reference*](nlp/ml-nlp-apis.md)
+* [Overview](nlp/ml-nlp-overview.md)
+* [Deploy trained models](nlp/ml-nlp-deploy-models.md)
+* [Trained model autoscaling](nlp/ml-nlp-auto-scale.md)
+* [Add NLP {{infer}} to ingest pipelines](nlp/ml-nlp-inference.md)
+* [API quick reference](nlp/ml-nlp-apis.md)
 * [ELSER](nlp/ml-nlp-elser.md)
-* [*Examples*](nlp/ml-nlp-examples.md)
-* [*Limitations*](nlp/ml-nlp-limitations.md)
-
+* [E5](nlp/ml-nlp-e5.md)
+* [Language identification](nlp/ml-nlp-lang-ident.md)
+* [Examples](nlp/ml-nlp-examples.md)
+* [Limitations](nlp/ml-nlp-limitations.md)
diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-apis.md b/explore-analyze/machine-learning/nlp/ml-nlp-apis.md
index 71370e704..78e8b550e 100644
--- a/explore-analyze/machine-learning/nlp/ml-nlp-apis.md
+++ b/explore-analyze/machine-learning/nlp/ml-nlp-apis.md
@@ -34,4 +34,3 @@ The {{infer}} APIs have the following base:
 * [Delete inference endpoint](https://www.elastic.co/guide/en/elasticsearch/reference/current/delete-inference-api.html)
 * [Get inference endpoint](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-inference-api.html)
 * [Perform inference](https://www.elastic.co/guide/en/elasticsearch/reference/current/post-inference-api.html)
-
diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-auto-scale.md b/explore-analyze/machine-learning/nlp/ml-nlp-auto-scale.md
index 5323de43a..885679552 100644
--- a/explore-analyze/machine-learning/nlp/ml-nlp-auto-scale.md
+++ b/explore-analyze/machine-learning/nlp/ml-nlp-auto-scale.md
@@ -16,8 +16,6 @@ There are two ways to enable autoscaling:
 To fully leverage model autoscaling, it is highly recommended to enable [{{es}} deployment autoscaling](../../../deploy-manage/autoscaling.md).
 ::::

-
-
 ## Enabling autoscaling through APIs - adaptive allocations [nlp-model-adaptive-allocations]

 Model allocations are independent units of work for NLP tasks. If you set the number of threads and allocations for a model manually, they remain constant even when not all the available resources are fully used or when the load on the model requires more resources. Instead of setting the number of allocations manually, you can enable adaptive allocations to set the number of allocations based on the load on the process. This can help you to manage performance and cost more easily. (Refer to the [pricing calculator](https://cloud.elastic.co/pricing) to learn more about the possible costs.)
@@ -31,7 +29,6 @@ You can enable adaptive allocations by using:
 If the new allocations fit on the current {{ml}} nodes, they are immediately started. If more resource capacity is needed for creating new model allocations, then, if {{ml}} autoscaling is enabled, your {{ml}} node is scaled up to provide enough resources for the new allocation. The number of model allocations can be scaled down to 0. They cannot be scaled up to more than 32 allocations, unless you explicitly set the maximum number of allocations to a higher value. Adaptive allocations must be set up independently for each deployment and [{{infer}} endpoint](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html).
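For example, a minimal sketch of enabling adaptive allocations when starting a model deployment through the API (the model ID and allocation bounds are placeholders):

```console
POST _ml/trained_models/my-model/deployment/_start
{
  "adaptive_allocations": {
    "enabled": true,
    "min_number_of_allocations": 1,
    "max_number_of_allocations": 10
  }
}
```

The same `adaptive_allocations` object can also be passed to the update trained model deployment API to change the behavior of an existing deployment.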
- ### Optimizing for typical use cases [optimize-use-case] You can optimize your model deployment for typical use cases, such as search and ingest. When you optimize for ingest, the throughput will be higher, which increases the number of {{infer}} requests that can be performed in parallel. When you optimize for search, the latency will be lower during search processes. @@ -39,7 +36,6 @@ You can optimize your model deployment for typical use cases, such as search and * If you want to optimize for ingest, set the number of threads to `1` (`"threads_per_allocation": 1`). * If you want to optimize for search, set the number of threads to greater than `1`. Increasing the number of threads will make the search processes more performant. - ## Enabling autoscaling in {{kib}} - adaptive resources [nlp-model-adaptive-resources] You can enable adaptive resources for your models when starting or updating the model deployment. Adaptive resources make it possible for {{es}} to scale up or down the available resources based on the load on the process. This can help you to manage performance and cost more easily. When adaptive resources are enabled, the number of vCPUs that the model deployment uses is set automatically based on the current load. When the load is high, the number of vCPUs that the process can use is automatically increased. When the load is low, the number of vCPUs that the process can use is automatically decreased. @@ -53,7 +49,6 @@ Refer to the tables in the [Model deployment resource matrix](#auto-scaling-matr :class: screenshot ::: - ## Model deployment resource matrix [auto-scaling-matrix] The used resources for trained model deployments depend on three factors: @@ -68,13 +63,10 @@ If you use {{es}} on-premises, vCPUs level ranges are derived from the `total_ml On Serverless, adaptive allocations are automatically enabled for all project types. However, the "Adaptive resources" control is not displayed in {{kib}} for Observability and Security projects. :::: - - ### Deployments in Cloud optimized for ingest [_deployments_in_cloud_optimized_for_ingest] In case of ingest-optimized deployments, we maximize the number of model allocations. - #### Adaptive resources enabled [_adaptive_resources_enabled] | Level | Allocations | Threads | vCPUs | @@ -85,7 +77,6 @@ In case of ingest-optimized deployments, we maximize the number of model allocat * The Cloud console doesn’t directly set an allocations limit; it only sets a vCPU limit. This vCPU limit indirectly determines the number of allocations, calculated as the vCPU limit divided by the number of threads. - #### Adaptive resources disabled [_adaptive_resources_disabled] | Level | Allocations | Threads | vCPUs | @@ -96,12 +87,10 @@ In case of ingest-optimized deployments, we maximize the number of model allocat * The Cloud console doesn’t directly set an allocations limit; it only sets a vCPU limit. This vCPU limit indirectly determines the number of allocations, calculated as the vCPU limit divided by the number of threads. - ### Deployments in Cloud optimized for search [_deployments_in_cloud_optimized_for_search] In case of search-optimized deployments, we maximize the number of threads. The maximum number of threads that can be claimed depends on the hardware your architecture has. - #### Adaptive resources enabled [_adaptive_resources_enabled_2] | Level | Allocations | Threads | vCPUs | @@ -112,7 +101,6 @@ In case of search-optimized deployments, we maximize the number of threads. 
The * The Cloud console doesn’t directly set an allocations limit; it only sets a vCPU limit. This vCPU limit indirectly determines the number of allocations, calculated as the vCPU limit divided by the number of threads. - #### Adaptive resources disabled [_adaptive_resources_disabled_2] | Level | Allocations | Threads | vCPUs | @@ -121,5 +109,4 @@ In case of search-optimized deployments, we maximize the number of threads. The | Medium | 2 (if threads=16) statically | maximum that the hardware allows (for example, 16) | 32 if available | | High | Maximum available set in the Cloud console *, statically | maximum that the hardware allows (for example, 16) | Maximum available set in the Cloud console, statically | -* The Cloud console doesn’t directly set an allocations limit; it only sets a vCPU limit. This vCPU limit indirectly determines the number of allocations, calculated as the vCPU limit divided by the number of threads. - +\* The Cloud console doesn’t directly set an allocations limit; it only sets a vCPU limit. This vCPU limit indirectly determines the number of allocations, calculated as the vCPU limit divided by the number of threads. diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-classify-text.md b/explore-analyze/machine-learning/nlp/ml-nlp-classify-text.md index a037c428f..750a0c5c1 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-classify-text.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-classify-text.md @@ -11,13 +11,11 @@ These NLP tasks enable you to identify the language of text and classify or labe * [Text classification](#ml-nlp-text-classification) * [Zero-shot text classification](#ml-nlp-zero-shot) - -## {{lang-ident-cap}} [_lang_ident_cap] +## {{lang-ident-cap}} [_lang_ident_cap] The {{lang-ident}} model is provided out-of-the box in your {{es}} cluster. You can find the documentation of the model on the [{{lang-ident-cap}}](ml-nlp-lang-ident.md) page under the Built-in models section. - -## Text classification [ml-nlp-text-classification] +## Text classification [ml-nlp-text-classification] Text classification assigns the input text to one of multiple classes that best describe the text. The classes used depend on the model and the data set that was used to train it. Based on the number of classes, two main types of classification exist: binary classification, where the number of classes is exactly two, and multi-class classification, where the number of classes is more than two. @@ -39,8 +37,7 @@ Likewise, you might use a trained model to perform multi-class classification an ... ``` - -## Zero-shot text classification [ml-nlp-zero-shot] +## Zero-shot text classification [ml-nlp-zero-shot] The zero-shot classification task offers the ability to classify text without training a model on a specific set of classes. Instead, you provide the classes when you deploy the model or at {{infer}} time. It uses a model trained on a large data set that has gained a general language understanding and asks the model how well the labels you provided fit with your text. @@ -95,4 +92,3 @@ The task returns the following result: ``` Since you can adjust the labels while you perform {{infer}}, this type of task is exceptionally flexible. If you are consistently using the same labels, however, it might be better to use a fine-tuned text classification model. 
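To supply the classes at {{infer}} time rather than at deployment time, a request along the following lines can be used (the model ID, input text, and labels are placeholders):

```console
POST _ml/trained_models/my-zero-shot-model/_infer
{
  "docs": [
    { "text_field": "This is a very happy person" }
  ],
  "inference_config": {
    "zero_shot_classification": {
      "labels": [ "glad", "sad", "scared", "mad" ] <1>
    }
  }
}
```

1. The candidate classes are passed as `labels` in the `inference_config` override, so they can change from request to request without retraining or redeploying the model.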
-
diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-deploy-model.md b/explore-analyze/machine-learning/nlp/ml-nlp-deploy-model.md
index 8d6dbde7d..b24ee042d 100644
--- a/explore-analyze/machine-learning/nlp/ml-nlp-deploy-model.md
+++ b/explore-analyze/machine-learning/nlp/ml-nlp-deploy-model.md
@@ -22,7 +22,6 @@ Each deployment will be fine-tuned automatically based on its specific purpose y
 Since eland uses APIs to deploy the models, you cannot see the models in {{kib}} until the saved objects are synchronized. You can follow the prompts in {{kib}}, wait for automatic synchronization, or use the [sync {{ml}} saved objects API](https://www.elastic.co/guide/en/kibana/current/machine-learning-api-sync.html).
 ::::

-
 You can define the resource usage level of the NLP model during model deployment. The resource usage levels behave differently depending on [adaptive resources](ml-nlp-auto-scale.md#nlp-model-adaptive-resources) being enabled or disabled. When adaptive resources are disabled but {{ml}} autoscaling is enabled, vCPU usage of Cloud deployments is derived from the Cloud console and functions as follows:

 * Low: This level limits resources to two vCPUs, which may be suitable for development, testing, and demos depending on your parameters. It is not recommended for production use.
@@ -31,7 +30,6 @@ You can define the resource usage level of the NLP model during model deployment

 For the resource levels when adaptive resources are enabled, refer to [*Trained model autoscaling*](ml-nlp-auto-scale.md).

-
 ## Request queues and search priority [infer-request-queues]

 Each allocation of a model deployment has a dedicated queue to buffer {{infer}} requests. The size of this queue is determined by the `queue_capacity` parameter in the [start trained model deployment API](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-trained-model-deployment.html). When the queue reaches its maximum capacity, new requests are declined until some of the queued requests are processed, creating available capacity once again. When multiple ingest pipelines reference the same deployment, the queue can fill up, resulting in rejected requests. Consider using dedicated deployments to prevent this situation.
diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-deploy-models.md b/explore-analyze/machine-learning/nlp/ml-nlp-deploy-models.md
index 0cecc80ab..ebe39c2b6 100644
--- a/explore-analyze/machine-learning/nlp/ml-nlp-deploy-models.md
+++ b/explore-analyze/machine-learning/nlp/ml-nlp-deploy-models.md
@@ -11,8 +11,3 @@ If you want to perform {{nlp}} tasks in your cluster, you must deploy an appropr
 2. [Import the trained model and vocabulary](ml-nlp-import-model.md).
 3. [Deploy the model in your cluster](ml-nlp-deploy-model.md).
 4. [Try it out](ml-nlp-test-inference.md).
-
-
-
-
-
diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-extract-info.md b/explore-analyze/machine-learning/nlp/ml-nlp-extract-info.md
index 4c538757c..b66356afd 100644
--- a/explore-analyze/machine-learning/nlp/ml-nlp-extract-info.md
+++ b/explore-analyze/machine-learning/nlp/ml-nlp-extract-info.md
@@ -11,7 +11,6 @@ These NLP tasks enable you to extract information from your unstructured text:
 * [Fill-mask](#ml-nlp-mask)
 * [Question answering](#ml-nlp-question-answering)

-
 ## Named entity recognition [ml-nlp-ner]

 The named entity recognition (NER) task can identify and categorize certain entities - typically proper nouns - in your unstructured text.
Named entities usually refer to objects in the real world such as persons, locations, organizations, and other miscellaneous entities that are consistently referenced by a proper name. @@ -53,7 +52,6 @@ The task returns the following result: ... ``` - ## Fill-mask [ml-nlp-mask] The objective of the fill-mask task is to predict a missing word from a text sequence. The model uses the context of the masked word to predict the most likely word to complete the text. @@ -80,7 +78,6 @@ The task returns the following result: ... ``` - ## Question answering [ml-nlp-question-answering] The question answering (or extractive question answering) task makes it possible to get answers to certain questions by extracting information from the provided text. @@ -105,4 +102,3 @@ The answer is shown by the object below: } ... ``` - diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-import-model.md b/explore-analyze/machine-learning/nlp/ml-nlp-import-model.md index 9f5b02171..b61d79069 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-import-model.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-import-model.md @@ -9,17 +9,14 @@ mapped_pages: If you want to install a trained model in a restricted or closed network, refer to [these instructions](https://www.elastic.co/guide/en/elasticsearch/client/eland/current/machine-learning.html#ml-nlp-pytorch-air-gapped). :::: - After you choose a model, you must import it and its tokenizer vocabulary to your cluster. When you import the model, it must be chunked and imported one chunk at a time for storage in parts due to its size. ::::{note} Trained models must be in a TorchScript representation for use with {{stack-ml-features}}. :::: - [Eland](https://github.com/elastic/eland) is an {{es}} Python client that provides a simple script to perform the conversion of Hugging Face transformer models to their TorchScript representations, the chunking process, and upload to {{es}}; it is therefore the recommended import method. You can either install the Python Eland client on your machine or use a Docker image to build Eland and run the model import script. - ## Import with the Eland client installed [ml-nlp-import-script] 1. Install the [Eland Python client](https://www.elastic.co/guide/en/elasticsearch/client/eland/current/installation.html) with PyTorch extra dependencies. @@ -30,7 +27,7 @@ Trained models must be in a TorchScript representation for use with {{stack-ml-f 2. Run the `eland_import_hub_model` script to download the model from Hugging Face, convert it to TorchScript format, and upload to the {{es}} cluster. For example: - ```shell + ``` eland_import_hub_model \ --cloud-id \ <1> -u -p \ <2> @@ -43,10 +40,8 @@ Trained models must be in a TorchScript representation for use with {{stack-ml-f 3. Specify the identifier for the model in the Hugging Face model hub. 4. Specify the type of NLP task. Supported values are `fill_mask`, `ner`, `question_answering`, `text_classification`, `text_embedding`, `text_expansion`, `text_similarity`, and `zero_shot_classification`. - For more details, refer to [https://www.elastic.co/guide/en/elasticsearch/client/eland/current/machine-learning.html#ml-nlp-pytorch](https://www.elastic.co/guide/en/elasticsearch/client/eland/current/machine-learning.html#ml-nlp-pytorch). 
- ## Import with Docker [ml-nlp-import-docker] If you want to use Eland without installing it, run the following command: @@ -65,9 +60,26 @@ docker run -it --rm docker.elastic.co/eland/eland \ --start ``` -Replace the `$ELASTICSEARCH_URL` with the URL for your {{es}} cluster. Refer to [Authentication methods](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-authentication.html) to learn more. +Replace the `$ELASTICSEARCH_URL` with the URL for your {{es}} cluster. Refer to [Authentication methods](#ml-nlp-authentication) to learn more. + +## Authentication methods [ml-nlp-authentication] +The following authentication options are available when using the import script: +* username/password authentication (specified with the `-u` and `-p` options): + +```bash +eland_import_hub_model --url https://: -u -p ... +``` + +* username/password authentication (embedded in the URL): + +```bash +eland_import_hub_model --url https://:@: ... +``` +* API key authentication: -$$$ml-nlp-authentication$$$ \ No newline at end of file +```bash +eland_import_hub_model --url https://: --es-api-key ... +``` diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-inference.md b/explore-analyze/machine-learning/nlp/ml-nlp-inference.md index b66dbf4b4..b407bcac6 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-inference.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-inference.md @@ -12,7 +12,6 @@ After you [deploy a trained model in your cluster](ml-nlp-deploy-models.md), you 3. [Ingest documents](#ml-nlp-inference-ingest-docs). 4. [View the results](#ml-nlp-inference-discover). - ## Add an {{infer}} processor to an ingest pipeline [ml-nlp-inference-processor] In {{kib}}, you can create and edit pipelines in **{{stack-manage-app}}** > **Ingest Pipelines**. To open **Ingest Pipelines**, find **{{stack-manage-app}}** in the main menu, or use the [global search field](../../overview/kibana-quickstart.md#_finding_your_apps_and_objects). @@ -94,8 +93,6 @@ In {{kib}}, you can create and edit pipelines in **{{stack-manage-app}}** > **In 3. If everything looks correct, close the panel, and click **Create pipeline**. The pipeline is now ready for use. - - ## Ingest documents [ml-nlp-inference-ingest-docs] You can now use your ingest pipeline to perform NLP tasks on your data. @@ -120,7 +117,6 @@ PUT ner-test To use the `annotated_text` data type in this example, you must install the [mapper annotated text plugin](https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-annotated-text.html). For more installation details, refer to [Add plugins provided with {{ess}}](https://www.elastic.co/guide/en/cloud/current/ec-adding-elastic-plugins.html). :::: - You can then use the new pipeline to index some documents. For example, use a bulk indexing request with the `pipeline` query parameter for your NER pipeline: ```console @@ -168,8 +164,6 @@ However, those web log messages are unlikely to contain enough words for the mod Set the reindex `size` option to a value smaller than the `queue_capacity` for the trained model deployment. Otherwise, requests might be rejected with a "too many requests" 429 error code. :::: - - ## View the results [ml-nlp-inference-discover] Before you can verify the results of the pipelines, you must [create {{data-sources}}](../../find-and-organize/data-views.md). 
Then you can explore your data in **Discover**: @@ -190,7 +184,6 @@ In this {{lang-ident}} example, the `ml.inference.predicted_value` contains the To learn more about ingest pipelines and all of the other processors that you can add, refer to [Ingest pipelines](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md). - ## Common problems [ml-nlp-inference-common-problems] If you encounter problems while using your trained model in an ingest pipeline, check the following possible causes: @@ -201,7 +194,6 @@ If you encounter problems while using your trained model in an ingest pipeline, These common failure scenarios and others can be captured by adding failure processors to your pipeline. For more examples, refer to [Handling pipeline failures](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md#handling-pipeline-failures). - ## Further reading [nlp-example-reading] * [How to deploy NLP: Text Embeddings and Vector Search](https://www.elastic.co/blog/how-to-deploy-nlp-text-embeddings-and-vector-search) diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-lang-ident.md b/explore-analyze/machine-learning/nlp/ml-nlp-lang-ident.md index 1b49ec918..d2699017c 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-lang-ident.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-lang-ident.md @@ -13,7 +13,6 @@ The longer the text passed into the {{lang-ident}} model, the more accurately th {{lang-ident-cap}} takes into account Unicode boundaries when the feature set is built. If the text has diacritical marks, then the model uses that information for identifying the language of the text. In certain cases, the model can detect the source language even if it is not written in the script that the language traditionally uses. These languages are marked in the supported languages table (see below) with the `Latn` subtag. {{lang-ident-cap}} supports Unicode input. - ## Supported languages [ml-lang-ident-supported-languages] The table below contains the ISO codes and the English names of the languages that {{lang-ident}} supports. If a language has a 2-letter `ISO 639-1` code, the table contains that identifier. Otherwise, the 3-letter `ISO 639-2` code is used. The `Latn` subtag indicates that the language is transliterated into Latin script. @@ -59,8 +58,7 @@ The table below contains the ISO codes and the English names of the languages th | hi-Latn | Hindi | no | Norwegian | | | | hmn | Hmong | ny | Chichewa | | | - -## Example of {{lang-ident}} [ml-lang-ident-example] +## Example of {{lang-ident}} [ml-lang-ident-example] In the following example, we feed the {{lang-ident}} trained model a short Hungarian text that contains diacritics and a couple of English words. The model identifies the text correctly as Hungarian with high probability. @@ -97,7 +95,6 @@ POST _ingest/pipeline/_simulate 2. Specifies the number of languages to report by descending order of probability. 3. The source object that contains the text to identify. - In the example above, the `num_top_classes` value indicates that only the top five languages (that is to say, the ones with the highest probability) are reported. The request returns the following response: @@ -158,9 +155,6 @@ The request returns the following response: 1. Contains scores for the most probable languages. 2. The ISO identifier of the language with the highest probability. 
-
-
 ## Further reading [ml-lang-ident-readings]

 * [Multilingual search using {{lang-ident}} in {{es}}](https://www.elastic.co/blog/multilingual-search-using-language-identification-in-elasticsearch)
-
diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-overview.md b/explore-analyze/machine-learning/nlp/ml-nlp-overview.md
index dc7459e92..ba8dc6a99 100644
--- a/explore-analyze/machine-learning/nlp/ml-nlp-overview.md
+++ b/explore-analyze/machine-learning/nlp/ml-nlp-overview.md
@@ -7,7 +7,6 @@ mapped_pages:

 {{nlp-cap}} (NLP) refers to the way in which we can use software to understand natural language in spoken word or written text.

-
 ## NLP in the {{stack}} [nlp-elastic-stack]

 Elastic offers a wide range of possibilities to leverage natural language processing.

@@ -20,7 +19,6 @@ You can **upload and manage NLP models** using the Eland client and the [{{stack

 You can **store embeddings in your {{es}} vector database** if you generate [dense vector](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html) or [sparse vector](https://www.elastic.co/guide/en/elasticsearch/reference/current/sparse-vector.html) model embeddings outside of {{es}}.

-
 ## What is NLP? [what-is-nlp]

 Classically, NLP was performed using linguistic rules, dictionaries, regular expressions, and {{ml}} for specific tasks such as automatic categorization or summarization of text. In recent years, however, deep learning techniques have taken over much of the NLP landscape. Deep learning capitalizes on the availability of large-scale data sets, cheap computation, and techniques for learning at scale with less human involvement. Pre-trained language models that use a transformer architecture have been particularly successful. For example, BERT is a pre-trained language model that was released by Google in 2018. Since that time, it has become the inspiration for most of today’s modern NLP techniques. The {{stack}} {{ml}} features are structured around BERT and transformer models. These features support BERT’s tokenization scheme (called WordPiece) and transformer models that conform to the standard BERT model interface. For the current list of supported architectures, refer to [Compatible third party models](ml-nlp-model-ref.md).
diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-search-compare.md b/explore-analyze/machine-learning/nlp/ml-nlp-search-compare.md
index dd3b90e78..b0e181e10 100644
--- a/explore-analyze/machine-learning/nlp/ml-nlp-search-compare.md
+++ b/explore-analyze/machine-learning/nlp/ml-nlp-search-compare.md
@@ -10,7 +10,6 @@ The {{stack-ml-features}} can generate embeddings, which you can use to search i
 * [Text embedding](#ml-nlp-text-embedding)
 * [Text similarity](#ml-nlp-text-similarity)

-
 ## Text embedding [ml-nlp-text-embedding]

 Text embedding is a task which produces a mathematical representation of text called an embedding. The {{ml}} model turns the text into an array of numerical values (also known as a *vector*). Pieces of content with similar meaning have similar representations. This means it is possible to determine whether different pieces of text are either semantically similar, different, or even opposite by using a mathematical similarity function.
@@ -37,7 +36,6 @@ The task returns the following result:
 ...
 ```

-
 ## Text similarity [ml-nlp-text-similarity]

 The text similarity task estimates how similar two pieces of text are to each other and expresses the similarity in a numeric value. This is commonly referred to as cross-encoding.
This task is useful for ranking document text when comparing it to another provided text input. @@ -67,4 +65,3 @@ In the example above, every string in the `docs` array is compared individually } ... ``` - diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-select-model.md b/explore-analyze/machine-learning/nlp/ml-nlp-select-model.md index 3e307f3ea..9b45527ba 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-select-model.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-select-model.md @@ -5,9 +5,8 @@ mapped_pages: # Select a trained model [ml-nlp-select-model] -Per the [*Overview*](ml-nlp-overview.md), there are multiple ways that you can use NLP features within the {{stack}}. After you determine which type of NLP task you want to perform, you must choose an appropriate trained model. +Per the [Overview](ml-nlp-overview.md), there are multiple ways that you can use NLP features within the {{stack}}. After you determine which type of NLP task you want to perform, you must choose an appropriate trained model. The simplest method is to use a model that has already been fine-tuned for the type of analysis that you want to perform. For example, there are models and data sets available for specific NLP tasks on [Hugging Face](https://huggingface.co/models). These instructions assume you’re using one of those models and do not describe how to create new models. For the current list of supported model architectures, refer to [Compatible third party models](ml-nlp-model-ref.md). If you choose to perform {{lang-ident}} by using the `lang_ident_model_1` that is provided in the cluster, no further steps are required to import or deploy the model. You can skip to using the model in [ingestion pipelines](ml-nlp-inference.md). - diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-test-inference.md b/explore-analyze/machine-learning/nlp/ml-nlp-test-inference.md index efeb45add..bd0185f0a 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-test-inference.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-test-inference.md @@ -52,4 +52,3 @@ In this example, the response contains the annotated text output and the recogni ``` If you are satisfied with the results, you can add these NLP tasks in your [ingestion pipelines](ml-nlp-inference.md). 
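If you prefer to test a deployed model from the Dev Tools Console instead of the UI, a minimal sketch with the infer trained model API looks like this (the model ID and input text are placeholders):

```console
POST _ml/trained_models/my-ner-model/_infer
{
  "docs": [
    { "text_field": "Elastic was founded in Amsterdam" }
  ]
}
```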
-

From 1305342d685197446c435d1f0afe5d8d13e59a82 Mon Sep 17 00:00:00 2001
From: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
Date: Wed, 5 Feb 2025 15:36:41 +0100
Subject: [PATCH 11/15] [Search] Update/consolidate ingest section, move search pipelines content (#326)

---
 .../nlp/inference-processing.md | 4 +-
 .../es-ingestion-overview.md | 41 -----------------
 raw-migrated-files/toc.yml | 2 -
 solutions/search/ingest-for-search.md | 46 +++++++++++--------
 .../search/search-pipelines.md | 42 ++++++++---------
 solutions/toc.yml | 2 +
 6 files changed, 52 insertions(+), 85 deletions(-)
 delete mode 100644 raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md
 rename raw-migrated-files/elasticsearch/elasticsearch-reference/ingest-pipeline-search.md => solutions/search/search-pipelines.md (81%)

diff --git a/explore-analyze/machine-learning/nlp/inference-processing.md b/explore-analyze/machine-learning/nlp/inference-processing.md
index e9f668b52..6452b7cee 100644
--- a/explore-analyze/machine-learning/nlp/inference-processing.md
+++ b/explore-analyze/machine-learning/nlp/inference-processing.md
@@ -5,7 +5,7 @@ mapped_pages:

 # Inference processing [ingest-pipeline-search-inference]

-When you create an index through the **Content** UI, a set of default ingest pipelines are also created, including a ML inference pipeline. The [ML inference pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific-ml-reference) uses inference processors to analyze fields and enrich documents with the output. Inference processors use ML trained models, so you need to use a built-in model or [deploy a trained model in your cluster^](ml-nlp-deploy-models.md) to use this feature.
+When you create an index through the **Content** UI, a set of default ingest pipelines is also created, including an ML inference pipeline. The [ML inference pipeline](/solutions/search/search-pipelines.md#ingest-pipeline-search-details-specific-ml-reference) uses inference processors to analyze fields and enrich documents with the output. Inference processors use ML trained models, so you need to use a built-in model or [deploy a trained model in your cluster^](ml-nlp-deploy-models.md) to use this feature.

 This guide focuses on the ML inference pipeline, its use, and how to manage it.
@@ -129,7 +129,7 @@ To ensure the ML inference pipeline will be run when ingesting documents, you mu

 ## Learn More [ingest-pipeline-search-inference-learn-more]

-* See [Overview](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-in-enterprise-search) for information on the various pipelines that are created.
+* See [Overview](/solutions/search/search-pipelines.md#ingest-pipeline-search-in-enterprise-search) for information on the various pipelines that are created.
 * Learn about [ELSER](ml-nlp-elser.md), Elastic’s proprietary retrieval model for semantic search with sparse vectors.
* [NER HuggingFace Models](https://huggingface.co/models?library=pytorch&pipeline_tag=token-classification&sort=downloads) * [Text Classification HuggingFace Models](https://huggingface.co/models?library=pytorch&pipeline_tag=text-classification&sort=downloads) diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md deleted file mode 100644 index 5f6cac5c9..000000000 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md +++ /dev/null @@ -1,41 +0,0 @@ -# Add data to {{es}} [es-ingestion-overview] - -There are multiple ways to ingest data into {{es}}. The option that you choose depends on whether you’re working with timestamped data or non-timestamped data, where the data is coming from, its complexity, and more. - -::::{tip} -You can load [sample data](../../../manage-data/ingest.md#_add_sample_data) into your {{es}} cluster using {{kib}}, to get started quickly. - -:::: - - - -## General content [es-ingestion-overview-general-content] - -General content is data that does not have a timestamp. This could be data like vector embeddings, website content, product catalogs, and more. For general content, you have the following options for adding data to {{es}} indices: - -* [API](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs.html): Use the {{es}} [Document APIs](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs.html) to index documents directly, using the Dev Tools [Console](../../../explore-analyze/query-filter/tools/console.md), or cURL. - - If you’re building a website or app, then you can call Elasticsearch APIs using an [{{es}} client](https://www.elastic.co/guide/en/elasticsearch/client/index.html) in the programming language of your choice. If you use the Python client, then check out the `elasticsearch-labs` repo for various [example notebooks](https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/search/python-examples). - -* [File upload](../../../manage-data/ingest.md#upload-data-kibana): Use the {{kib}} file uploader to index single files for one-off testing and exploration. The GUI guides you through setting up your index and field mappings. -* [Web crawler](https://github.com/elastic/crawler): Extract and index web page content into {{es}} documents. -* [Connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html): Sync data from various third-party data sources to create searchable, read-only replicas in {{es}}. - - -## Timestamped data [es-ingestion-overview-timestamped] - -Timestamped data in {{es}} refers to datasets that include a timestamp field. If you use the [Elastic Common Schema (ECS)](https://www.elastic.co/guide/en/ecs/{{ecs_version}}/ecs-reference.html), this field is named `@timestamp`. This could be data like logs, metrics, and traces. - -For timestamped data, you have the following options for adding data to {{es}} data streams: - -* [Elastic Agent and Fleet](https://www.elastic.co/guide/en/fleet/current/fleet-overview.html): The preferred way to index timestamped data. Each Elastic Agent based integration includes default ingestion rules, dashboards, and visualizations to start analyzing your data right away. You can use the Fleet UI in {{kib}} to centrally manage Elastic Agents and their policies. 
-* [Beats](https://www.elastic.co/guide/en/beats/libbeat/current/beats-reference.html): If your data source isn’t supported by Elastic Agent, use Beats to collect and ship data to Elasticsearch. You install a separate Beat for each type of data to collect. -* [Logstash](https://www.elastic.co/guide/en/logstash/current/introduction.html): Logstash is an open source data collection engine with real-time pipelining capabilities that supports a wide variety of data sources. You might use this option because neither Elastic Agent nor Beats supports your data source. You can also use Logstash to persist incoming data, or if you need to send the data to multiple destinations. -* [Language clients](../../../manage-data/ingest/ingesting-data-from-applications.md): The linked tutorials demonstrate how to use {{es}} programming language clients to ingest data from an application. In these examples, {{es}} is running on Elastic Cloud, but the same principles apply to any {{es}} deployment. - -::::{tip} -If you’re interested in data ingestion pipelines for timestamped data, use the decision tree in the [Elastic Cloud docs](../../../manage-data/ingest.md#ec-data-ingest-pipeline) to understand your options. - -:::: - - diff --git a/raw-migrated-files/toc.yml b/raw-migrated-files/toc.yml index 5cd90f0c9..22b82aa65 100644 --- a/raw-migrated-files/toc.yml +++ b/raw-migrated-files/toc.yml @@ -603,7 +603,6 @@ toc: - file: elasticsearch/elasticsearch-reference/document-level-security.md - file: elasticsearch/elasticsearch-reference/documents-indices.md - file: elasticsearch/elasticsearch-reference/elasticsearch-intro-deploy.md - - file: elasticsearch/elasticsearch-reference/es-ingestion-overview.md - file: elasticsearch/elasticsearch-reference/es-security-principles.md - file: elasticsearch/elasticsearch-reference/esql-using.md - file: elasticsearch/elasticsearch-reference/field-and-document-access-control.md @@ -618,7 +617,6 @@ toc: - file: elasticsearch/elasticsearch-reference/index-modules-analysis.md - file: elasticsearch/elasticsearch-reference/index-modules-mapper.md - file: elasticsearch/elasticsearch-reference/ingest-enriching-data.md - - file: elasticsearch/elasticsearch-reference/ingest-pipeline-search.md - file: elasticsearch/elasticsearch-reference/ingest.md - file: elasticsearch/elasticsearch-reference/install-elasticsearch.md - file: elasticsearch/elasticsearch-reference/ip-filtering.md diff --git a/solutions/search/ingest-for-search.md b/solutions/search/ingest-for-search.md index a1190d32a..0fc7cd312 100644 --- a/solutions/search/ingest-for-search.md +++ b/solutions/search/ingest-for-search.md @@ -6,35 +6,45 @@ mapped_urls: - https://www.elastic.co/guide/en/serverless/current/elasticsearch-ingest-your-data.html --- -# Ingest for search +# Ingest for search use cases -% What needs to be done: Lift-and-shift +% ---- +% navigation_title: "Ingest for search use cases" +% ---- -% Scope notes: guidance on what ingest options you might want to use for search - connectors, crawler ... +$$$elasticsearch-ingest-time-series-data$$$ +::::{note} +This page covers ingest methods specifically for search use cases. If you're working with a different use case, refer to the [ingestion overview](/manage-data/ingest.md) for more options. +:::: -% Use migrated content from existing pages that map to this page: +Search use cases usually focus on general **content**, typically text-heavy data that does not have a timestamp. This could be data like knowledge bases, website content, product catalogs, and more. 
-% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md -% - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-ingest-data-through-api.md -% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/ingest-pipeline-search.md -% - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-ingest-your-data.md +Once you've decided how to [deploy Elastic](/deploy-manage/index.md), the next step is getting your content into {{es}}. Your choice of ingestion method depends on where your content lives and how you need to access it. -% Internal links rely on the following IDs being on this page (e.g. as a heading ID, paragraph ID, etc): +There are several methods to ingest data into {{es}} for search use cases. Choose one or more based on your requirements. -$$$elasticsearch-ingest-time-series-data$$$ +::::{tip} +If you just want to do a quick test, you can load [sample data](/manage-data/ingest/sample-data.md) into your {{es}} cluster using the UI. +:::: + +## Use APIs [es-ingestion-overview-apis] -$$$ingest-pipeline-search-details-specific-ml-reference$$$ +You can use the [`_bulk` API](https://www.elastic.co/docs/api/doc/elasticsearch/v8/group/endpoint-document) to add data to your {{es}} indices, using any HTTP client, including the [{{es}} client libraries](/solutions/search/site-or-app/clients.md). -$$$ingest-pipeline-search-in-enterprise-search$$$ +While the {{es}} APIs can be used for any data type, Elastic provides specialized tools that optimize ingestion for specific use cases. -$$$ingest-pipeline-search-details-generic-reference$$$ +## Use specialized tools [es-ingestion-overview-general-content] -$$$ingest-pipeline-search-details-specific-custom-reference$$$ +You can use these specialized tools to add general content to {{es}} indices. -$$$ingest-pipeline-search-details-specific-reference-processors$$$ +| Method | Description | Notes | +|--------|-------------|-------| +| [**Web crawler**](https://github.com/elastic/crawler) | Programmatically discover and index content from websites and knowledge bases | Crawl public-facing web content or internal sites accessible via HTTP proxy | +| [**Search connectors**](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html) | Third-party integrations to popular content sources like databases, cloud storage, and business applications | Choose from a range of Elastic-built connectors or build your own in Python using the Elastic connector framework | +| [**File upload**](/manage-data/ingest/tools/upload-data-files.md) | One-off manual uploads through the UI | Useful for testing or very small-scale use cases, but not recommended for production workflows | -$$$ingest-pipeline-search-details-specific$$$ +### Process data at ingest time -$$$ingest-pipeline-search-pipeline-settings-using-the-api$$$ +You can also transform and enrich your content at ingest time using [ingest pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md). -$$$ingest-pipeline-search-pipeline-settings$$$ \ No newline at end of file +The Elastic UI has a set of tools for creating and managing indices optimized for search use cases. You can also manage your ingest pipelines in this UI. Learn more in [](search-pipelines.md). 
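+
+For example, you could define a small pipeline and apply it while adding documents through the `_bulk` API. This is a minimal sketch; the pipeline, index, and field names below are placeholders:
+
+```console
+PUT _ingest/pipeline/my-search-content-pipeline
+{
+  "description": "Example: tidy up a title field at ingest time",
+  "processors": [
+    { "trim": { "field": "title", "ignore_missing": true } },
+    { "lowercase": { "field": "title", "ignore_missing": true } }
+  ]
+}
+
+POST _bulk?pipeline=my-search-content-pipeline
+{ "index": { "_index": "my-content-index" } }
+{ "title": "  Getting Started With Search  " }
+```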
\ No newline at end of file diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/ingest-pipeline-search.md b/solutions/search/search-pipelines.md similarity index 81% rename from raw-migrated-files/elasticsearch/elasticsearch-reference/ingest-pipeline-search.md rename to solutions/search/search-pipelines.md index 5e27a802f..78ede152e 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/ingest-pipeline-search.md +++ b/solutions/search/search-pipelines.md @@ -1,9 +1,8 @@ -# Ingest pipelines in Search [ingest-pipeline-search] +# Ingest pipelines for search use cases [ingest-pipeline-search] You can manage ingest pipelines through Elasticsearch APIs or Kibana UIs. -The **Content** UI under **Search** has a set of tools for creating and managing indices optimized for search use cases (non time series data). You can also manage your ingest pipelines in this UI. - +The **Content** UI under **Search** has a set of tools for creating and managing indices optimized for search use cases (non-time series data). You can also manage your ingest pipelines in this UI. ## Find pipelines in Content UI [ingest-pipeline-search-where] @@ -18,12 +17,11 @@ To find this tab in the Kibana UI: The tab is highlighted in this screenshot: -:::{image} ../../../images/elasticsearch-reference-ingest-pipeline-ent-search-ui.png +:::{image} /images/elasticsearch-reference-ingest-pipeline-ent-search-ui.png :alt: ingest pipeline ent search ui :class: screenshot ::: - ## Overview [ingest-pipeline-search-in-enterprise-search] These tools can be particularly helpful by providing a layer of customization and post-processing of documents. For example: @@ -34,11 +32,11 @@ These tools can be particularly helpful by providing a layer of customization an It can be a lot of work to set up and manage production-ready pipelines from scratch. Considerations such as error handling, conditional execution, sequencing, versioning, and modularization must all be taken into account. -To this end, when you create indices for search use cases, (including [Elastic web crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html), [connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html). , and API indices), each index already has a pipeline set up with several processors that optimize your content for search. +To this end, when you create indices for search use cases (including web crawler, search connectors, and API indices), each index already has a pipeline set up with several processors that optimize your content for search. -This pipeline is called `search-default-ingestion`. While it is a "managed" pipeline (meaning it should not be tampered with), you can view its details via the Kibana UI or the Elasticsearch API. You can also [read more about its contents below](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-generic-reference). +This pipeline is called `search-default-ingestion`. While it is a "managed" pipeline (meaning it should not be tampered with), you can view its details via the Kibana UI or the Elasticsearch API. You can also [read more about its contents below](#ingest-pipeline-search-details-generic-reference). -You can control whether you run some of these processors. While all features are enabled by default, they are eligible for opt-out. 
For [Elastic crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html) and [connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html). , you can opt out (or back in) per index, and your choices are saved. For API indices, you can opt out (or back in) by including specific fields in your documents. [See below for details](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-pipeline-settings-using-the-api). +You can control whether you run some of these processors. While all features are enabled by default, they are eligible for opt-out. For [Elastic crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html) and [connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html), you can opt out (or back in) per index, and your choices are saved. For API indices, you can opt out (or back in) by including specific fields in your documents. [See below for details](#ingest-pipeline-search-pipeline-settings-using-the-api). At the deployment level, you can change the default settings for all new indices. This will not affect existing indices. @@ -48,7 +46,7 @@ Each index also provides the capability to easily create index-specific ingest p 2. `<index-name>@custom` 3. `<index-name>@ml-inference` -Like `search-default-ingestion`, the first of these is "managed", but the other two can and should be modified to fit your needs. You can view these pipelines using the platform tools (Kibana UI, Elasticsearch API), and can also [read more about their content below](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific). +Like `search-default-ingestion`, the first of these is "managed", but the other two can and should be modified to fit your needs. You can view these pipelines using the platform tools (Kibana UI, Elasticsearch API), and can also [read more about their content below](#ingest-pipeline-search-details-specific). ## Pipeline Settings [ingest-pipeline-search-pipeline-settings] @@ -97,10 +95,10 @@ If the pipeline is not specified, the underscore-prefixed fields will actually b ### `search-default-ingestion` Reference [ingest-pipeline-search-details-generic-reference] -You can access this pipeline with the [Elasticsearch Ingest Pipelines API](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-pipeline-api.html) or via Kibana’s [Stack Management > Ingest Pipelines](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md#create-manage-ingest-pipelines) UI. +You can access this pipeline with the [Elasticsearch Ingest Pipelines API](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-pipeline-api.html) or via Kibana’s [Stack Management > Ingest Pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md#create-manage-ingest-pipelines) UI. ::::{warning} -This pipeline is a "managed" pipeline. That means that it is not intended to be edited. Editing/updating this pipeline manually could result in unintended behaviors, or difficulty in upgrading in the future. 
If you want to make customizations, we recommend you utilize index-specific pipelines (see below), specifically [the `<index-name>@custom` pipeline](#ingest-pipeline-search-details-specific-custom-reference). :::: @@ -118,12 +116,12 @@ This pipeline is a "managed" pipeline. That means that it is not intended to be #### Control flow parameters [ingest-pipeline-search-details-generic-reference-params] -The `search-default-ingestion` pipeline does not always run all processors. It utilizes a feature of ingest pipelines to [conditionally run processors](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md#conditionally-run-processor) based on the contents of each individual document. +The `search-default-ingestion` pipeline does not always run all processors. It utilizes a feature of ingest pipelines to [conditionally run processors](/manage-data/ingest/transform-enrich/ingest-pipelines.md#conditionally-run-processor) based on the contents of each individual document. * `_extract_binary_content` - if this field is present and has a value of `true` on a source document, the pipeline will attempt to run the `attachment`, `set_body`, and `remove_replacement_chars` processors. Note that the document will also need an `_attachment` field populated with base64-encoded binary data in order for the `attachment` processor to have any output. If the `_extract_binary_content` field is missing or `false` on a source document, these processors will be skipped. * `_reduce_whitespace` - if this field is present and has a value of `true` on a source document, the pipeline will attempt to run the `remove_extra_whitespace` and `trim` processors. These processors only apply to the `body` field. If the `_reduce_whitespace` field is missing or `false` on a source document, these processors will be skipped. -Crawler, Native Connectors, and Connector Clients will automatically add these control flow parameters based on the settings in the index’s Pipeline tab. To control what settings any new indices will have upon creation, see the deployment wide content settings. See [Pipeline Settings](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-pipeline-settings). +Crawler, Native Connectors, and Connector Clients will automatically add these control flow parameters based on the settings in the index’s Pipeline tab. To control what settings any new indices will have upon creation, see the deployment-wide content settings. See [Pipeline Settings](#ingest-pipeline-search-pipeline-settings). ### Index-specific ingest pipelines [ingest-pipeline-search-details-specific] @@ -139,7 +137,7 @@ The "copy and customize" button is not available at all Elastic subscription lev #### `<index-name>` Reference [ingest-pipeline-search-details-specific-reference] -This pipeline looks and behaves a lot like the [`search-default-ingestion` pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-generic-reference), but with [two additional processors](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific-reference-processors). +This pipeline looks and behaves a lot like the [`search-default-ingestion` pipeline](#ingest-pipeline-search-details-generic-reference), but with [two additional processors](#ingest-pipeline-search-details-specific-reference-processors). ::::{warning} You should not rename this pipeline. :::: ::::{warning} -This pipeline is a "managed" pipeline. That means that it is not intended to be edited. 
Editing/updating this pipeline manually could result in unintended behaviors, or difficulty in upgrading in the future. If you want to make customizations, we recommend you utilize [the `<index-name>@custom` pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific-custom-reference). +This pipeline is a "managed" pipeline. That means that it is not intended to be edited. Editing/updating this pipeline manually could result in unintended behaviors, or difficulty in upgrading in the future. If you want to make customizations, we recommend you utilize [the `<index-name>@custom` pipeline](#ingest-pipeline-search-details-specific-custom-reference). :::: @@ -156,7 +154,7 @@ This pipeline is a "managed" pipeline. That means that it is not intended to be ##### Processors [ingest-pipeline-search-details-specific-reference-processors] -In addition to the processors inherited from the [`search-default-ingestion` pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-generic-reference), the index-specific pipeline also defines: +In addition to the processors inherited from the [`search-default-ingestion` pipeline](#ingest-pipeline-search-details-generic-reference), the index-specific pipeline also defines: * `index_ml_inference_pipeline` - this uses the [Pipeline](https://www.elastic.co/guide/en/elasticsearch/reference/current/pipeline-processor.html) processor to run the `<index-name>@ml-inference` pipeline. This processor will only be run if the source document includes a `_run_ml_inference` field with the value `true`. * `index_custom_pipeline` - this uses the [Pipeline](https://www.elastic.co/guide/en/elasticsearch/reference/current/pipeline-processor.html) processor to run the `<index-name>@custom` pipeline. @@ -168,7 +166,7 @@ Like the `search-default-ingestion` pipeline, the `` pipeline does n * `_run_ml_inference` - if this field is present and has a value of `true` on a source document, the pipeline will attempt to run the `index_ml_inference_pipeline` processor. If the `_run_ml_inference` field is missing or `false` on a source document, this processor will be skipped. -Crawler, Native Connectors, and Connector Clients will automatically add these control flow parameters based on the settings in the index’s Pipeline tab. To control what settings any new indices will have upon creation, see the deployment wide content settings. See [Pipeline Settings](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-pipeline-settings). +Crawler, Native Connectors, and Connector Clients will automatically add these control flow parameters based on the settings in the index’s Pipeline tab. To control what settings any new indices will have upon creation, see the deployment-wide content settings. See [Pipeline Settings](#ingest-pipeline-search-pipeline-settings). #### `<index-name>@ml-inference` Reference [ingest-pipeline-search-details-specific-ml-reference] @@ -194,7 +192,7 @@ The `monitor_ml` Elasticsearch cluster permission is required in order to manage This pipeline is empty to start (no processors), but can be added to via the Kibana UI either through the Pipelines tab of your index, or from the **Stack Management > Ingest Pipelines** page. Unlike the `search-default-ingestion` pipeline and the `<index-name>` pipeline, this pipeline is NOT "managed". -You are encouraged to make additions and edits to this pipeline, provided its name remains the same. This provides a convenient hook from which to add custom processing and transformations for your data. 
Be sure to read the [docs for ingest pipelines](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) to see what options are available. +You are encouraged to make additions and edits to this pipeline, provided its name remains the same. This provides a convenient hook from which to add custom processing and transformations for your data. Be sure to read the [docs for ingest pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md) to see what options are available. ::::{warning} You should not rename this pipeline. :::: ## Upgrading notes [ingest-pipeline-search-upgrading-notes] ::::{dropdown} Expand to see upgrading notes -* `app_search_crawler` - Since 8.3, {{app-search-crawler}} has utilized this pipeline to power its binary content extraction. You can read more about this pipeline and its usage in the [App Search Guide](https://www.elastic.co/guide/en/app-search/current/web-crawler-reference.html#web-crawler-reference-binary-content-extraction). When upgrading from 8.3 to 8.5+, be sure to note any changes that you made to the `app_search_crawler` pipeline. These changes should be re-applied to each index’s `<index-name>@custom` pipeline in order to ensure a consistent data processing experience. In 8.5+, the [index setting to enable binary content](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-pipeline-settings) is required **in addition** to the configurations mentioned in the [App Search Guide](https://www.elastic.co/guide/en/app-search/current/web-crawler-reference.html#web-crawler-reference-binary-content-extraction). +* `app_search_crawler` - Since 8.3, {{app-search-crawler}} has utilized this pipeline to power its binary content extraction. You can read more about this pipeline and its usage in the [App Search Guide](https://www.elastic.co/guide/en/app-search/current/web-crawler-reference.html#web-crawler-reference-binary-content-extraction). When upgrading from 8.3 to 8.5+, be sure to note any changes that you made to the `app_search_crawler` pipeline. These changes should be re-applied to each index’s `<index-name>@custom` pipeline in order to ensure a consistent data processing experience. In 8.5+, the [index setting to enable binary content](#ingest-pipeline-search-pipeline-settings) is required **in addition** to the configurations mentioned in the [App Search Guide](https://www.elastic.co/guide/en/app-search/current/web-crawler-reference.html#web-crawler-reference-binary-content-extraction). 
+* `ent_search_crawler` - Since 8.4, the Elastic web crawler has utilized this pipeline to power its binary content extraction. You can read more about this pipeline and its usage in the [Elastic web crawler Guide](https://www.elastic.co/guide/en/enterprise-search/current/crawler-managing.html#crawler-managing-binary-content). When upgrading from 8.4 to 8.5+, be sure to note any changes that you made to the `ent_search_crawler` pipeline. These changes should be re-applied to each index’s `<index-name>@custom` pipeline in order to ensure a consistent data processing experience. In 8.5+, the [index setting to enable binary content](#ingest-pipeline-search-pipeline-settings) is required **in addition** to the configurations mentioned in the [Elastic web crawler Guide](https://www.elastic.co/guide/en/enterprise-search/current/crawler-managing.html#crawler-managing-binary-content). * `ent-search-generic-ingestion` - Since 8.5, Native Connectors, Connector Clients, and new (>8.4) Elastic web crawler indices all made use of this pipeline by default. This pipeline evolved into the `search-default-ingestion` pipeline. -* `search-default-ingestion` - Since 9.0, Connectors have made use of this pipeline by default. You can [read more about this pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-generic-reference) above. As this pipeline is "managed", any modifications that were made to `app_search_crawler` and/or `ent_search_crawler` should NOT be made to `search-default-ingestion`. Instead, if such customizations are desired, you should utilize [Index-specific ingest pipelines](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific), placing all modifications in the `<index-name>@custom` pipeline(s). +* `search-default-ingestion` - Since 9.0, Connectors have made use of this pipeline by default. You can [read more about this pipeline](#ingest-pipeline-search-details-generic-reference) above. As this pipeline is "managed", any modifications that were made to `app_search_crawler` and/or `ent_search_crawler` should NOT be made to `search-default-ingestion`. Instead, if such customizations are desired, you should utilize [Index-specific ingest pipelines](#ingest-pipeline-search-details-specific), placing all modifications in the `<index-name>@custom` pipeline(s). 
-:::: +:::: \ No newline at end of file diff --git a/solutions/toc.yml b/solutions/toc.yml index dd37ca2d8..237052b94 100644 --- a/solutions/toc.yml +++ b/solutions/toc.yml @@ -619,6 +619,8 @@ toc: - file: search/building-search-in-your-app-or-site.md - file: search/search-templates.md - file: search/ingest-for-search.md + children: + - file: search/search-pipelines.md - file: search/full-text.md children: - file: search/full-text/search-with-synonyms.md From cb20f34b25bd6c40bab7032b0dc3ad49eee88973 Mon Sep 17 00:00:00 2001 From: kosabogi <105062005+kosabogi@users.noreply.github.com> Date: Wed, 5 Feb 2025 15:37:46 +0100 Subject: [PATCH 12/15] Adds 'Backup, high availability, and resilience tools' landing page (#301) * Adds Backup and high availability landing page --------- Co-authored-by: Brandon Morelli Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> --- deploy-manage/tools.md | 53 +++++++++++++++++++++++++++++++----------- 1 file changed, 40 insertions(+), 13 deletions(-) diff --git a/deploy-manage/tools.md b/deploy-manage/tools.md index dbbe87b9f..85342ade8 100644 --- a/deploy-manage/tools.md +++ b/deploy-manage/tools.md @@ -1,32 +1,59 @@ --- mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/high-availability.html + +applies: + stack: all + hosted: all + ece: all + eck: all --- + # Backup, high availability, and resilience tools [high-availability] -Your data is important to you. Keeping it safe and available is important to Elastic. Sometimes your cluster may experience hardware failure or a power loss. To help you plan for this, {{es}} offers a number of features to achieve high availability despite failures. Depending on your deployment type, you might need to provision servers in different zones or configure external repositories to meet your organization’s availability needs. +Elastic provides comprehensive tools to safeguard data, ensure continuous availability, and maintain resilience. These tools are designed to support disaster recovery strategies, enabling businesses to protect critical information, minimize downtime, and maintain high availability in case of unexpected failures. In this section, you'll learn about these tools and how to implement them in your environment. + +For strategies to design resilient clusters, see **[Availability and resilience](production-guidance/availability-and-resilience.md)**. + +::::{note} +The snapshot and restore and cross-cluster replication features are currently not available for Elastic Cloud Serverless projects. These features will be introduced in the future. For more information, refer to [Serverless differences](/deploy-manage/deploy/elastic-cloud/differences-from-other-elasticsearch-offerings.md#elasticsearch-differences-serverless-feature-categories). +:::: + +## Snapshot and restore + +Snapshots in Elasticsearch are point-in-time backups that include your cluster's data, settings, and overall state. They capture all the information necessary to restore your cluster to a specific moment in time, making them essential for protecting data, recovering from unexpected issues, and transferring data between clusters. Snapshots are a reliable way to ensure the safety of your data and maintain the continuity of your operations. -* **[Design for resilience](production-guidance/availability-and-resilience.md)** - Distributed systems like Elasticsearch are designed to keep working even if some of their components have failed. 
An Elasticsearch cluster can continue operating normally if some of its nodes are unavailable or disconnected, as long as there are enough well-connected nodes to take over the unavailable node’s responsibilities. If you’re designing a smaller cluster, you might focus on making your cluster resilient to single-node failures. Designers of larger clusters must also consider cases where multiple nodes fail at the same time. -* **[Cross-cluster replication](tools/cross-cluster-replication.md)** To effectively distribute read and write operations across nodes, the nodes in a cluster need good, reliable connections to each other. To provide better connections, you typically co-locate the nodes in the same data center or nearby data centers. Co-locating nodes in a single location exposes you to the risk of a single outage taking your entire cluster offline. To maintain high availability, you can prepare a second cluster that can take over in case of disaster by implementing {{ccr}} (CCR). CCR provides a way to automatically synchronize indices from a leader cluster to a follower cluster. This cluster could be in a different data center or even a different content from the leader cluster. If the primary cluster fails, the secondary cluster can take over. ::::{tip} You can also use CCR to create secondary clusters to serve read requests in geo-proximity to your users. :::: -* **[Snapshots](tools/snapshot-and-restore.md)** Take snapshots of your cluster that can be restored in case of failure. +You can perform the following tasks to manage snapshots and snapshot repositories: +- **[Register a repository](tools/snapshot-and-restore/manage-snapshot-repositories.md):** Configure storage repositories (for example, S3, Azure, Google Cloud) to store snapshots. The way that you register repositories differs depending on your deployment method: - **[Elastic Cloud Hosted](tools/snapshot-and-restore/elastic-cloud-hosted.md):** Deployments come with a preconfigured S3 repository for automatic backups, simplifying the setup process. You can also register external repositories, such as Azure and Google Cloud, for more flexibility. - **[Elastic Cloud Enterprise](tools/snapshot-and-restore/cloud-enterprise.md):** Repository configuration is managed through the Elastic Cloud Enterprise user interface and automatically linked to deployments. - **[Elastic Cloud on Kubernetes](tools/snapshot-and-restore/cloud-on-k8s.md) and [self-managed](tools/snapshot-and-restore/self-managed.md) deployments:** Repositories must be configured manually. +- **[Create snapshots](tools/snapshot-and-restore/create-snapshots.md):** Manually or automatically create backups of your cluster. +- **[Restore a snapshot](tools/snapshot-and-restore/restore-snapshot.md):** Recover indices, data streams, or the entire cluster to revert to a previous state. You can choose to restore specific parts of a snapshot, such as a single index, or perform a full restore. +To reduce storage costs for infrequently accessed data while maintaining access, you can also create **[searchable snapshots](tools/snapshot-and-restore/searchable-snapshots.md)**. +::::{note} +Snapshot configurations vary across Elastic Cloud Hosted, Elastic Cloud Enterprise (ECE), Elastic Cloud on Kubernetes (ECK), and self-managed deployments. +:::: +## Cross-cluster replication (CCR) +**[Cross-cluster replication (CCR)](tools/cross-cluster-replication.md)** is a feature in Elasticsearch that allows you to replicate data in real time from a leader cluster to one or more follower clusters. 
This replication ensures that data is synchronized across clusters, providing continuity, redundancy, and enhanced data accessibility. -* **[Snapshots](tools/snapshot-and-restore.md)** +CCR provides a way to automatically synchronize indices from a leader cluster to a follower cluster. This cluster could be in a different data center or even a different continent from the leader cluster. If the primary cluster fails, the secondary cluster can take over. - Take snapshots of your cluster that can be restored in case of failure. +::::{note} +CCR relies on **[remote clusters](remote-clusters.md)** functionality to establish and manage connections between the leader and the follower clusters. +:::: +You can perform the following tasks to manage cross-cluster replication: +- **[Set up CCR](tools/cross-cluster-replication/set-up-cross-cluster-replication.md):** Configure leader and follower clusters for data replication. +- **[Manage CCR](tools/cross-cluster-replication/manage-cross-cluster-replication.md):** Monitor and manage replicated indices. +- **[Automate replication](tools/cross-cluster-replication/manage-auto-follow-patterns.md):** Use auto-follow patterns to automatically replicate newly created indices. +- **Set up failover clusters:** Configure **[uni-directional](tools/cross-cluster-replication/uni-directional-disaster-recovery.md)** or **[bi-directional](tools/cross-cluster-replication/bi-directional-disaster-recovery.md)** CCR for redundancy and disaster recovery. +- **[Review cluster upgrade considerations when using CCR](upgrade.md):** If you're using CCR, then you might need to upgrade your clusters in a specific order to prevent errors. Review the considerations and recommended procedures for performing upgrades on CCR leaders and followers. \ No newline at end of file From 6e9cc30e967e944ddb83de709c5c0b7ae7b32114 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Wed, 5 Feb 2025 15:52:48 +0100 Subject: [PATCH 13/15] [E&A] Refines NLP model conceptual docs. 
(#330) --- explore-analyze/machine-learning/nlp.md | 1 + .../nlp/ml-nlp-built-in-models.md | 5 - .../machine-learning/nlp/ml-nlp-e5.md | 46 ++---- .../machine-learning/nlp/ml-nlp-elser.md | 79 +++------- .../machine-learning/nlp/ml-nlp-model-ref.md | 26 +--- .../machine-learning/nlp/ml-nlp-rerank.md | 144 +++++++----------- .../machine-learning/nlp/nlp-example.md | 23 +-- 7 files changed, 99 insertions(+), 225 deletions(-) diff --git a/explore-analyze/machine-learning/nlp.md b/explore-analyze/machine-learning/nlp.md index 44f6f1325..0c9bc6c90 100644 --- a/explore-analyze/machine-learning/nlp.md +++ b/explore-analyze/machine-learning/nlp.md @@ -13,6 +13,7 @@ You can use {{stack-ml-features}} to analyze natural language data and make pred * [Add NLP {{infer}} to ingest pipelines](nlp/ml-nlp-inference.md) * [API quick reference](nlp/ml-nlp-apis.md) * [ELSER](nlp/ml-nlp-elser.md) +* [Elastic Rerank](nlp/ml-nlp-rerank.md) * [E5](nlp/ml-nlp-e5.md) * [Language identification](nlp/ml-nlp-lang-ident.md) * [Examples](nlp/ml-nlp-examples.md) diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-built-in-models.md b/explore-analyze/machine-learning/nlp/ml-nlp-built-in-models.md index a1cbaefc1..6aaf6ccd5 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-built-in-models.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-built-in-models.md @@ -10,8 +10,3 @@ There are {{nlp}} models that are available for use in every cluster out-of-the- * [ELSER](ml-nlp-elser.md) trained by Elastic * [E5](ml-nlp-e5.md) * [{{lang-ident-cap}}](ml-nlp-lang-ident.md) - - - - - diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-e5.md b/explore-analyze/machine-learning/nlp/ml-nlp-e5.md index fe269a514..47b6d1f3e 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-e5.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-e5.md @@ -4,11 +4,8 @@ mapped_pages: - https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-e5.html --- - - # E5 [ml-nlp-e5] - EmbEddings from bidirEctional Encoder rEpresentations - or E5 - is a {{nlp}} model that enables you to perform multi-lingual semantic search by using dense vector representations. This model is recommended for non-English language documents and queries. If you want to perform semantic search on English language documents, use the [ELSER](ml-nlp-elser.md) model. [Semantic search](../../../solutions/search/semantic-search.md) provides you search results based on contextual meaning and user intent, rather than exact keyword matches. @@ -17,14 +14,12 @@ E5 has two versions: one cross-platform version which runs on any hardware and o Refer to the model cards of the [multilingual-e5-small](https://huggingface.co/elastic/multilingual-e5-small) and the [multilingual-e5-small-optimized](https://huggingface.co/elastic/multilingual-e5-small-optimized) models on HuggingFace for further information including licensing. - ## Requirements [e5-req] To use E5, you must have the [appropriate subscription](https://www.elastic.co/subscriptions) level for semantic search or the trial period activated. Enabling trained model autoscaling for your E5 deployment is recommended. Refer to [*Trained model autoscaling*](ml-nlp-auto-scale.md) to learn more. - ## Download and deploy E5 [download-deploy-e5] The easiest and recommended way to download and deploy E5 is to use the [{{infer}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-apis.html). @@ -32,8 +27,8 @@ The easiest and recommended way to download and deploy E5 is to use the [{{infer 1. 
In {{kib}}, navigate to the **Dev Console**. 2. Create an {{infer}} endpoint with the `elasticsearch` service by running the following API request: ```console PUT _inference/text_embedding/my-e5-model { "service": "elasticsearch", "service_settings": { "num_allocations": 1, "num_threads": 1, "model_id": ".multilingual-e5-small" } } ``` The API request automatically initiates the model download and then deploys the model. Refer to the [`elasticsearch` {{infer}} service documentation](../../../solutions/search/inference-api/elasticsearch-inference-integration.md) to learn more about the available settings. After you've created the E5 {{infer}} endpoint, it’s ready to be used for semantic search. The easiest way to perform semantic search in the {{stack}} is to [follow the `semantic_text` workflow](../../../solutions/search/semantic-search/semantic-search-semantic-text.md). ### Alternative methods to download and deploy E5 [alternative-download-deploy-e5] You can also download and deploy the E5 model either from **{{ml-app}}** > **Trained Models**, from **Search** > **Indices**, or by using the trained models API in Dev Console. ::::{important} For most cases, the preferred version is the **Intel and Linux optimized** model; it is recommended to download and deploy that version. :::: ::::{dropdown} Using the Trained Models page #### Using the Trained Models page [trained-model-e5] @@ -87,7 +79,6 @@ For most cases, the preferred version is the **Intel and Linux optimized** model :::: ::::{dropdown} Using the search indices UI #### Using the search indices UI [elasticsearch-e5] @@ -111,12 +102,10 @@ Alternatively, you can download and deploy the E5 model to an {{infer}} pipeline :class: screenshot ::: When your E5 model is deployed and started, it is ready to be used in a pipeline. :::: ::::{dropdown} Using the trained models API in Dev Console #### Using the trained models API in Dev Console [dev-console-e5] @@ -124,38 +113,37 @@ When your E5 model is deployed and started, it is ready to be used in a pipeline 1. In {{kib}}, navigate to the **Dev Console**. 2. Create the E5 model configuration by running the following API call: ```console PUT _ml/trained_models/.multilingual-e5-small { "input": { "field_names": ["text_field"] } } ``` The API call automatically initiates the model download if the model is not downloaded yet. 3. Deploy the model by using the [start trained model deployment API](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-trained-model-deployment.html) with a deployment ID: ```console POST _ml/trained_models/.multilingual-e5-small/deployment/_start?deployment_id=for_search ``` :::: ## Deploy the E5 model in an air-gapped environment [air-gapped-install-e5] -If you want to install E5 in an air-gapped environment, you have the following options: * put the model artifacts into a directory inside the config directory on all master-eligible nodes (for `multilingual-e5-small` and `multilingual-e5-small-linux-x86-64`) * install the model by using HuggingFace (for `multilingual-e5-small` model only). 
+If you want to install E5 in an air-gapped environment, you have the following options: +* put the model artifacts into a directory inside the config directory on all master-eligible nodes (for `multilingual-e5-small` and `multilingual-e5-small-linux-x86-64`) +* install the model by using HuggingFace (for `multilingual-e5-small` model only). ### Model artifact files [e5-model-artifacts] For the `multilingual-e5-small` model, you need the following files in your system: -``` +```url https://ml-models.elastic.co/multilingual-e5-small.metadata.json https://ml-models.elastic.co/multilingual-e5-small.pt https://ml-models.elastic.co/multilingual-e5-small.vocab.json ``` For the optimized version, you need the following files in your system: -``` +```url https://ml-models.elastic.co/multilingual-e5-small_linux-x86_64.metadata.json https://ml-models.elastic.co/multilingual-e5-small_linux-x86_64.pt https://ml-models.elastic.co/multilingual-e5-small_linux-x86_64.vocab.json ``` ### Using file-based access [_using_file_based_access_3] For file-based access, follow these steps: 1. Download the [model artifact files](#e5-model-artifacts). 2. Put the files into a `models` subdirectory inside the `config` directory of your {{es}} deployment. 3. Point your {{es}} deployment to the model directory by adding the following line to the `config/elasticsearch.yml` file: ```yml xpack.ml.model_repository: file://${path.home}/config/models/ ``` 4. Repeat step 2 and step 3 on all master-eligible nodes. 5. [Restart](../../../deploy-manage/maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md#restart-cluster-rolling) the master-eligible nodes one by one. 6. Navigate to the **Trained Models** page in {{kib}}, the E5 model can be found in the list of trained models. 7. Click the **Add trained model** button, select the E5 model version you downloaded in step 1 and want to deploy and click **Download**. The selected model will be downloaded from the model directory where you put in step 2. 8. After the download is finished, start the deployment by clicking the **Start deployment** button. 9. Provide a deployment ID, select the priority, and set the number of allocations and threads per allocation values. 10. Click **Start**. ### Using the HuggingFace repository [_using_the_huggingface_repository] You can install the `multilingual-e5-small` model in a restricted or closed network by pointing the `eland_import_hub_model` script to the model’s local files. For an offline install, the model first needs to be cloned locally; Git and [Git Large File Storage](https://git-lfs.com/) are required to be installed in your system. Once it’s uploaded to {{es}}, the model will have the ID specified by `--es-model-id`. If it is not set, the model ID is derived from `--hub-model-id`; spaces and path delimiters are converted to double underscores `__`. ## Disclaimer [terms-of-use-e5] Customers may add third party trained models for management in Elastic. These models are not owned by Elastic. While Elastic will support the integration with these models in the performance according to the documentation, you understand and agree that Elastic has no control over, or liability for, the third party models or the underlying training data they may utilize. diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md index 750fc1c2f..8f8cb44c7 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md @@ -4,11 +4,8 @@ mapped_pages: - https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-elser.html --- # ELSER [ml-nlp-elser] - Elastic Learned Sparse EncodeR - or ELSER - is a retrieval model trained by Elastic that enables you to perform [semantic search](../../../solutions/search/vector/sparse-vector-elser.md) to retrieve more relevant search results. This search type provides you search results based on contextual meaning and user intent, rather than exact keyword matches. 
ELSER is an out-of-domain model which means it does not require fine-tuning on your own data, making it adaptable for various use cases out of the box. @@ -19,15 +16,12 @@ This model is recommended for English language documents and queries. If you wan While ELSER V2 is generally available, ELSER V1 is in [preview] and will remain in technical preview. :::: - - ## Tokens - not synonyms [elser-tokens] ELSER expands the indexed and searched passages into collections of terms that are learned to co-occur frequently within a diverse set of training data. The terms that the text is expanded into by the model *are not* synonyms for the search terms; they are learned associations capturing relevance. These expanded terms are weighted as some of them are more significant than others. Then the {{es}} [sparse vector](https://www.elastic.co/guide/en/elasticsearch/reference/current/sparse-vector.html) (or [rank features](https://www.elastic.co/guide/en/elasticsearch/reference/current/rank-features.html)) field type is used to store the terms and weights at index time, and to search against later. This approach provides a more understandable search experience compared to vector embeddings. However, attempting to directly interpret the tokens and weights can be misleading, as the expansion essentially results in a vector in a very high-dimensional space. Consequently, certain tokens, especially those with low weight, contain information that is intertwined with other low-weight tokens in the representation. In this regard, they function similarly to a dense vector representation, making it challenging to separate their individual contributions. This complexity can potentially lead to misinterpretations if not carefully considered during analysis. - ## Requirements [elser-req] To use ELSER, you must have the [appropriate subscription](https://www.elastic.co/subscriptions) level for semantic search or the trial period activated. @@ -36,10 +30,8 @@ To use ELSER, you must have the [appropriate subscription](https://www.elastic.c The minimum dedicated ML node size for deploying and using the ELSER model is 4 GB in Elasticsearch Service if [deployment autoscaling](../../../deploy-manage/autoscaling.md) is turned off. Turning on autoscaling is recommended because it allows your deployment to dynamically adjust resources based on demand. Better performance can be achieved by using more allocations or more threads per allocation, which requires bigger ML nodes. Autoscaling provides bigger nodes when required. If autoscaling is turned off, you must provide suitably sized nodes yourself. :::: - Enabling trained model autoscaling for your ELSER deployment is recommended. Refer to [*Trained model autoscaling*](ml-nlp-auto-scale.md) to learn more. - ## ELSER v2 [elser-v2] Compared to the initial version of the model, ELSER v2 offers improved retrieval accuracy and more efficient indexing. This enhancement is attributed to the extension of the training data set, which includes high-quality question and answer pairs and the improved FLOPS regularizer which reduces the cost of computing the similarity between a query and a document. @@ -48,14 +40,12 @@ ELSER v2 has two versions: one cross-platform version which runs on any hardware If you want to learn more about the ELSER V2 improvements, refer to [this blog post](https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-1). - ### Upgrading to ELSER v2 [upgrade-elser-v2] ELSER v2 is not backward compatible. 
If you indexed your data with ELSER v1, you need to reindex it with an ingest pipeline referencing ELSER v2 to be able to use v2 for search. This [tutorial](../../../solutions/search/vector/sparse-vector-elser.md) shows you how to create an ingest pipeline with an {{infer}} processor that uses ELSER v2, and how to reindex your data through the pipeline. Additionally, the `elasticsearch-labs` GitHub repository contains an interactive [Python notebook](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/model-upgrades/upgrading-index-to-use-elser.ipynb) that walks through upgrading an index to ELSER V2. ## Download and deploy ELSER [download-deploy-elser] The easiest and recommended way to download and deploy ELSER is to use the [{{infer}} API](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-apis.html). 1. In {{kib}}, navigate to the **Dev Console**. 2. Create an {{infer}} endpoint with the ELSER service by running the following API request: ```console PUT _inference/sparse_embedding/my-elser-model { "service": "elasticsearch", "service_settings": { "adaptive_allocations": { "enabled": true, "min_number_of_allocations": 1, "max_number_of_allocations": 4 }, "num_threads": 1, "model_id": ".elser_model_2_linux-x86_64" } } ``` The API request automatically initiates the model download and then deploys the model. This example uses [autoscaling](ml-nlp-auto-scale.md) through adaptive allocation. -Refer to the [ELSER {{infer}} service documentation](../../../solutions/search/inference-api/elser-inference-integration.md) to learn more about the available settings. +Refer to the [ELSER {{infer}} integration documentation](../../../solutions/search/inference-api/elser-inference-integration.md) to learn more about the available settings. After you've created the ELSER {{infer}} endpoint, it’s ready to be used for semantic search. The easiest way to perform semantic search in the {{stack}} is to [follow the `semantic_text` workflow](../../../solutions/search/semantic-search/semantic-search-semantic-text.md). ### Alternative methods to download and deploy ELSER [alternative-download-deploy] You can also download and deploy ELSER either from **{{ml-app}}** > **Trained Models**, from **Search** > **Indices**, or by using the trained models API in Dev Console. ::::{note} For most cases, the preferred version is the **Intel and Linux optimized** model; it is recommended to download and deploy that version. :::: ::::{dropdown} Using the Trained Models page #### Using the Trained Models page [trained-model] @@ -97,7 +85,6 @@ You can also download and deploy ELSER either from **{{ml-app}}** > **Trained Mo :::: ::::{dropdown} Using the search indices UI #### Using the search indices UI [elasticsearch] @@ -124,10 +111,8 @@ Alternatively, you can download and deploy ELSER to an {{infer}} pipeline using :class: screenshot ::: :::: ::::{dropdown} Using the trained models API in Dev Console #### Using the trained models API in Dev Console [dev-console] 1. In {{kib}}, navigate to the **Dev Console**. 2. 
Create the ELSER model configuration by running the following API call: ```console PUT _ml/trained_models/.elser_model_2 { "input": { "field_names": ["text_field"] } } ``` The API call automatically initiates the model download if the model is not downloaded yet. 3. Deploy the model by using the [start trained model deployment API](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-trained-model-deployment.html) with a deployment ID: ```console POST _ml/trained_models/.elser_model_2/deployment/_start?deployment_id=for_search ``` You can deploy the model multiple times with different deployment IDs. :::: ## Deploy ELSER in an air-gapped environment [air-gapped-install] If you want to deploy ELSER in a restricted or closed network, you have two options: * create your own HTTP/HTTPS endpoint with the model artifacts on it, * put the model artifacts into a directory inside the config directory on all [master-eligible nodes](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html#master-node). ### Model artifact files [elser-model-artifacts] For the cross-platform version, you need the following files in your system: ```url https://ml-models.elastic.co/elser_model_2.metadata.json https://ml-models.elastic.co/elser_model_2.pt https://ml-models.elastic.co/elser_model_2.vocab.json ``` For the optimized version, you need the following files in your system: ```url https://ml-models.elastic.co/elser_model_2_linux-x86_64.metadata.json https://ml-models.elastic.co/elser_model_2_linux-x86_64.pt https://ml-models.elastic.co/elser_model_2_linux-x86_64.vocab.json ``` ### Using an HTTP server [_using_an_http_server] INFO: If you use an existing HTTP server, note that the model downloader only supports passwordless HTTP servers. You can use any HTTP service to deploy ELSER. This example uses the official Nginx Docker image to set up a new service. 4. Verify that Nginx runs properly by visiting the following URL in your browser: ```url http://{IP_ADDRESS_OR_HOSTNAME}:8080/elser_model_2.metadata.json ``` If Nginx runs properly, you see the content of the metadata file of the model. 5. Point your Elasticsearch deployment to the model artifacts on the HTTP server by adding the following line to the `config/elasticsearch.yml` file: ```yml xpack.ml.model_repository: http://{IP_ADDRESS_OR_HOSTNAME}:8080 ``` 6. Repeat step 5 on all master-eligible nodes. 7. [Restart](../../../deploy-manage/maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md#restart-cluster-rolling) the master-eligible nodes one by one. 8. Navigate to the **Trained Models** page in {{kib}}, the ELSER model can be found in the list of trained models. 9. Click the **Add trained model** button, select the ELSER model version you want to deploy, and click **Download**. 10. After the download is finished, start the deployment by clicking the **Start deployment** button. 11. Provide a deployment ID, select the priority, and set the number of allocations and threads per allocation values. 12. Click **Start**. The HTTP server is only required for downloading the model. After the download has finished, you can stop and delete the service. You can stop the Docker image used in this example by running the following command: ```shell docker stop ml-models ``` ### Using file-based access [_using_file_based_access] For file-based access, follow these steps: 1. Download the [model artifact files](#elser-model-artifacts). 2. Put the files into a `models` subdirectory inside the `config` directory of your Elasticsearch deployment. 3. Point your Elasticsearch deployment to the model directory by adding the following line to the `config/elasticsearch.yml` file: ```yml xpack.ml.model_repository: file://${path.home}/config/models/ ``` 4. Repeat step 2 and step 3 on all master-eligible nodes. 5. [Restart](../../../deploy-manage/maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md#restart-cluster-rolling) the master-eligible nodes one by one. 6. Navigate to the **Trained Models** page in {{kib}}, the ELSER model can be found in the list of trained models. 7. Click the **Add trained model** button, select the ELSER model version you want to deploy, and click **Download**. 8. After the download is finished, start the deployment by clicking the **Start deployment** button. 9. Provide a deployment ID, select the priority, and set the number of allocations and threads per allocation values. 10. 
Click **Start**. ## Testing ELSER [_testing_elser] You can test the deployed model in {{kib}}. Navigate to **Model Management** > **Trained Models** from the main menu, or use the [global search field](../../overview/kibana-quickstart.md#_finding_your_apps_and_objects) in {{kib}}. Locate the deployed ELSER model in the list of trained models, then select **Test model** from the Actions menu. The results contain a list of ten random values for the selected field along wit :class: screenshot ::: ## Performance considerations [performance] * ELSER works best on small-to-medium sized fields that contain natural language. For connector or web crawler use cases, this aligns best with fields like *title*, *description*, *summary*, or *abstract*. As ELSER encodes the first 512 tokens of a field, it may not provide as relevant of results for large fields. For example, `body_content` on web crawler documents, or body fields resulting from extracting text from office documents with connectors. For larger fields like these, consider "chunking" the content into multiple values, where each chunk can be under 512 tokens. To learn more about ELSER performance, refer to the [Benchmark information](#elser-benchmarks). ## Pre-cleaning input text [pre-cleaning] The quality of the input text significantly affects the quality of the embeddings. To achieve the best results, it’s recommended to clean the input text before generating embeddings. The exact preprocessing you may need to do heavily depends on your text. For example, if your text contains HTML tags, use the [HTML strip processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/htmlstrip-processor.html) in an ingest pipeline to remove unnecessary elements. Always review and clean your input text before ingestion to eliminate any irrelevant entities that might affect the results. ## Recommendations for using ELSER [elser-recommendations] To gain the biggest value out of ELSER trained models, consider following this list of recommendations. * If quick response time is important in your use case, we recommend keeping {{ml}} resources available at all times by setting `min_allocations` to `1`. * Setting `min_allocations` to `0` can save on costs for non-critical use cases or testing environments. * Enabling [autoscaling](ml-nlp-auto-scale.md) through adaptive allocations or adaptive resources makes it possible for {{es}} to scale up or down the available resources of your ELSER deployment based on the load on the process. * Use dedicated, optimized ELSER {{infer}} endpoints for ingest and search use cases. - - * When deploying a trained model in {{kib}}, you can select for which case you want to optimize your ELSER deployment. - * If you use the trained model or {{infer}} APIs and want to optimize your ELSER trained model deployment or {{infer}} endpoint for ingest, set the number of threads to `1` (`"num_threads": 1`). - * If you use the trained model or {{infer}} APIs and want to optimize your ELSER trained model deployment or {{infer}} endpoint for search, set the number of threads to greater than `1`. - - + * When deploying a trained model in {{kib}}, you can select for which case you want to optimize your ELSER deployment. + * If you use the trained model or {{infer}} APIs and want to optimize your ELSER trained model deployment or {{infer}} endpoint for ingest, set the number of threads to `1` (`"num_threads": 1`). 
+    * If you use the trained model or {{infer}} APIs and want to optimize your ELSER trained model deployment or {{infer}} endpoint for search, set the number of threads to greater than `1`.

## Further reading [further-readings]

* [Perform semantic search with `semantic_text` using the ELSER endpoint](../../../solutions/search/semantic-search/semantic-search-semantic-text.md)
* [Perform semantic search with ELSER](../../../solutions/search/vector/sparse-vector-elser.md)

-
## Benchmark information [elser-benchmarks]

::::{important}
The recommended way to use ELSER is through the [{{infer}} API](../../../solutions/search/inference-api/elser-inference-integration.md) as a service.
::::

-
The following sections provide information about how ELSER performs on different hardware and compares the model performance to {{es}} BM25 and other strong baselines.

-
### Version overview [version-overview]

ELSER V2 has an **optimized** version that is designed to run only on Linux with an x86-64 CPU architecture and a **cross-platform** version that can be run on any platform.

-
#### ELSER V2 [version-overview-v2]

Besides the performance improvements, the biggest change in ELSER V2 is the introduction of the first platform specific ELSER model - that is, a model optimized to run only on Linux with an x86-64 CPU architecture. The optimized model is designed to work best on newer Intel CPUs, but it works on AMD CPUs as well. It is recommended to use the new optimized Linux-x86-64 model for all new users of ELSER as it is significantly faster than the cross-platform model which can be run on any platform.

ELSER V2 produces significantly higher quality embeddings than ELSER V1. Regardless of which ELSER V2 model you use (optimized or cross-platform), the particular embeddings produced are the same.

-
### Qualitative benchmarks [elser-qualitative-benchmarks]

The metric that is used to evaluate ELSER’s ranking ability is the Normalized Discounted Cumulative Gain (NDCG) which can handle multiple relevant documents and fine-grained document ratings. The metric is applied to a fixed-sized list of retrieved documents which, in this case, is the top 10 documents (NDCG@10).
@@ -359,9 +325,7 @@ The table below shows the performance of ELSER V2 compared to BM 25. ELSER V2 ha
:::{image} ../../../images/machine-learning-ml-nlp-bm25-elser-v2.png
:alt: ELSER V2 benchmarks compared to BM25
:::
-
-*NDCG@10 for BEIR data sets for BM25 and ELSER V2 - higher values are better)*
-
+*NDCG@10 for BEIR data sets for BM25 and ELSER V2 - higher values are better*

### Hardware benchmarks [elser-hw-benchmarks]

@@ -369,8 +333,6 @@ The table below shows the performance of ELSER V2 compared to BM 25. ELSER V2 ha
While the goal is to create a model that is as performant as possible, retrieval accuracy always takes precedence over speed; this is one of the design principles of ELSER. Consult the tables below to learn more about the expected model performance. The values refer to operations performed on two data sets and different hardware configurations. Your data set has an impact on the model performance. Run tests on your own data to have a more realistic view of the model performance for your use case.
::::

-
-
#### ELSER V2 [_elser_v2]

Overall, the optimized V2 model ingested at a max rate of 26 docs/s, compared with the ELSER V1 max rate of 14 docs/s from the ELSER V1 benchmark, resulting in a 90% increase in throughput.
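To make the deployment settings from the recommendations above concrete, the following is a minimal sketch that starts two deployments of the optimized model under different deployment IDs, one tuned for ingest and one for search. The deployment IDs, allocation counts, and thread counts are illustrative assumptions, not values taken from the benchmark:

```console
# Ingest-optimized deployment: one thread per allocation, throughput scaled with allocations (illustrative values)
POST _ml/trained_models/.elser_model_2_linux-x86_64/deployment/_start?deployment_id=elser_ingest&number_of_allocations=4&threads_per_allocation=1

# Search-optimized deployment: more threads per allocation for lower per-request latency (illustrative values)
POST _ml/trained_models/.elser_model_2_linux-x86_64/deployment/_start?deployment_id=elser_search&number_of_allocations=1&threads_per_allocation=2
```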
@@ -381,7 +343,6 @@ The performance of virtual cores (that is, when the number of allocations is gre The length of the documents in your particular dataset will have a significant impact on your throughput numbers. :::: - Refer to [this blog post](https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-1) to learn more about ELSER V2 improved performance. :::{image} ../../../images/machine-learning-ml-nlp-elser-bm-summary.png diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md b/explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md index 39d9fe7d5..3e5e7f058 100644 --- a/explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md +++ b/explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md @@ -4,16 +4,12 @@ mapped_pages: - https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-model-ref.html --- - - # Compatible third party models [ml-nlp-model-ref] - ::::{note} The minimum dedicated ML node size for deploying and using the {{nlp}} models is 16 GB in Elasticsearch Service if [deployment autoscaling](../../../deploy-manage/autoscaling.md) is turned off. Turning on autoscaling is recommended because it allows your deployment to dynamically adjust resources based on demand. Better performance can be achieved by using more allocations or more threads per allocation, which requires bigger ML nodes. Autoscaling provides bigger nodes when required. If autoscaling is turned off, you must provide suitably sized nodes yourself. :::: - The {{stack-ml-features}} support transformer models that conform to the standard BERT model interface and use the WordPiece tokenization algorithm. The current list of supported architectures is: @@ -37,7 +33,6 @@ These models are listed by NLP task; for more information about those tasks, ref **Models highlighted in bold** in the list below are recommended for evaluation purposes and to get started with the Elastic {{nlp}} features. - ## Third party fill-mask models [ml-nlp-model-ref-mask] * [BERT base model](https://huggingface.co/bert-base-uncased) @@ -45,7 +40,6 @@ These models are listed by NLP task; for more information about those tasks, ref * [MPNet base model](https://huggingface.co/microsoft/mpnet-base) * [RoBERTa large model](https://huggingface.co/roberta-large) - ## Third party named entity recognition models [ml-nlp-model-ref-ner] * [BERT base NER](https://huggingface.co/dslim/bert-base-NER) @@ -54,7 +48,6 @@ These models are listed by NLP task; for more information about those tasks, ref * [**DistilBERT base uncased finetuned conll03 English**](https://huggingface.co/elastic/distilbert-base-uncased-finetuned-conll03-english) * [DistilBERT fa zwnj base NER](https://huggingface.co/HooshvareLab/distilbert-fa-zwnj-base-ner) - ## Third party question answering models [ml-nlp-model-ref-question-answering] * [BERT large model (uncased) whole word masking finetuned on SQuAD](https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad) @@ -62,7 +55,6 @@ These models are listed by NLP task; for more information about those tasks, ref * [Electra base squad2](https://huggingface.co/deepset/electra-base-squad2) * [TinyRoBERTa squad2](https://huggingface.co/deepset/tinyroberta-squad2) - ## Third party sparse embedding models [ml-nlp-model-ref-sparse-embedding] Sparse embedding models should be configured with the `text_expansion` task type. 
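Once a model from the list below has been imported (for example, with Eland) and deployed, you can spot-check it with the infer trained model API. This is a minimal sketch; the model ID is a hypothetical placeholder for whatever ID the model was given at import time:

```console
POST _ml/trained_models/my-imported-sparse-model/_infer
{
  "docs": [
    { "text_field": "how do sparse embedding models score documents" }
  ]
}
```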
@@ -71,7 +63,6 @@ Sparse embedding models should be configured with the `text_expansion` task type * [aken12/splade-japanese-v3](https://huggingface.co/aken12/splade-japanese-v3) * [hotchpotch/japanese-splade-v2](https://huggingface.co/hotchpotch/japanese-splade-v2) - ## Third party text embedding models [ml-nlp-model-ref-text-embedding] Text Embedding models are designed to work with specific scoring functions for calculating the similarity between the embeddings they produce. Examples of typical scoring functions are: `cosine`, `dot product` and `euclidean distance` (also known as `l2_norm`). @@ -103,7 +94,6 @@ Using `DPREncoderWrapper`: * [dpr-question_encoder single nq base](https://huggingface.co/facebook/dpr-question_encoder-single-nq-base) * [dpr-question_encoder multiset base](https://huggingface.co/facebook/dpr-question_encoder-multiset-base) - ## Third party text classification models [ml-nlp-model-ref-text-classification] * [BERT base uncased emotion](https://huggingface.co/nateraw/bert-base-uncased-emotion) @@ -113,7 +103,6 @@ Using `DPREncoderWrapper`: * [FinBERT](https://huggingface.co/ProsusAI/finbert) * [Twitter roBERTa base for Sentiment Analysis](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment) - ## Third party text similarity models [ml-nlp-model-ref-text-similarity] You can use these text similarity models for [semantic re-ranking](../../../solutions/search/ranking/semantic-reranking.md#semantic-reranking-in-es). @@ -122,7 +111,6 @@ You can use these text similarity models for [semantic re-ranking](../../../solu * [ms marco MiniLM L6 v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) * [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) - ## Third party zero-shot text classification models [ml-nlp-model-ref-zero-shot] * [BART large mnli](https://huggingface.co/facebook/bart-large-mnli) @@ -133,14 +121,12 @@ You can use these text similarity models for [semantic re-ranking](../../../solu * [NLI RoBERTa base](https://huggingface.co/cross-encoder/nli-roberta-base) * [SqueezeBERT](https://huggingface.co/typeform/squeezebert-mnli) - ## Expected model output [_expected_model_output] Models used for each NLP task type must output tensors of a specific format to be used in the Elasticsearch NLP pipelines. Here are the expected outputs for each task type. - ### Fill mask expected model output [_fill_mask_expected_model_output] Fill mask is a specific kind of token classification; it is the base training task of many transformer models. @@ -151,7 +137,7 @@ Here is an example with a single sequence `"The capital of [MASK] is Paris"` and Should output: -``` +```json [ [ [ 0, 0, 0, 0, 0, 0, 0 ], // The @@ -166,7 +152,6 @@ Should output: The predicted value here for `[MASK]` is `"France"` with a score of 1.2. - ### Named entity recognition expected model output [_named_entity_recognition_expected_model_output] Named entity recognition is a specific token classification task. Each token in the sequence is scored related to a specific set of classification labels. For the Elastic Stack, we use Inside-Outside-Beginning (IOB) tagging. Elastic supports any NER entities as long as they are IOB tagged. The default values are: "O", "B_MISC", "I_MISC", "B_PER", "I_PER", "B_ORG", "I_ORG", "B_LOC", "I_LOC". @@ -177,7 +162,7 @@ The response format must be a float tensor with `shape(, , )`. 
Here is an example with two sequences for a binary classification model of "happy" and "sad":

-```
+```json
[
  [
// happy, sad
@@ -215,14 +198,13 @@ Here is an example with two sequences for a binary classification model of "happ
  ]
]
```

-
### Zero-shot text classification expected model output [_zero_shot_text_classification_expected_model_output]

Zero-shot text classification allows text to be classified for arbitrary labels not necessarily part of the original training. Each sequence is combined with the label given some hypothesis template. The model then scores each of these combinations according to `[entailment, neutral, contradiction]`. The output of the model must be a float tensor with `shape(<number of sequences>, <number of labels>, 3)`.

Here is an example with a single sequence classified against 4 labels:

-```
+```json
[
  [
// entailment, neutral, contradiction
diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-rerank.md b/explore-analyze/machine-learning/nlp/ml-nlp-rerank.md
index 24490c339..421e8dd43 100644
--- a/explore-analyze/machine-learning/nlp/ml-nlp-rerank.md
+++ b/explore-analyze/machine-learning/nlp/ml-nlp-rerank.md
@@ -17,35 +17,29 @@ The model can significantly improve search result quality by reordering results
 
 When reranking BM25 results, it provides an average 40% improvement in ranking quality on a diverse benchmark of retrieval tasks, matching the performance of models 11x its size.
 
+## Availability and requirements [ml-nlp-rerank-availability]
 
-## Availability and requirements [ml-nlp-rerank-availability]
-
-::::{warning}
+::::{warning}
 This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
 ::::
 
-
-
-### Elastic Cloud Serverless [ml-nlp-rerank-availability-serverless]
+### Elastic Cloud Serverless [ml-nlp-rerank-availability-serverless]
 
 Elastic Rerank is available in {{es}} Serverless projects as of November 25, 2024.
 
-
-### Elastic Cloud Hosted and self-managed deployments [ml-nlp-rerank-availability-elastic-stack]
+### Elastic Cloud Hosted and self-managed deployments [ml-nlp-rerank-availability-elastic-stack]
 
 Elastic Rerank is available in Elastic Stack version 8.17+:
 
 * To use Elastic Rerank, you must have the appropriate subscription level or the trial period activated.
 * A 4GB ML node
 
-  ::::{important}
+  ::::{important}
  Deploying the Elastic Rerank model in combination with ELSER (or other hosted models) requires at minimum an 8GB ML node. The current maximum size for trial ML nodes is 4GB (defaults to 1GB).
  ::::
 
-
-
-## Download and deploy [ml-nlp-rerank-deploy]
+## Download and deploy [ml-nlp-rerank-deploy]
 
 To download and deploy Elastic Rerank, use the [create inference API](../../../solutions/search/inference-api/elasticsearch-inference-integration.md) to create an {{es}} service `rerank` endpoint.
 
@@ -54,15 +48,13 @@ Refer to this [Python notebook](https://github.com/elastic/elasticsearch-labs/bl
 
 ::::
 
-
-
-### Create an inference endpoint [ml-nlp-rerank-deploy-steps]
+### Create an inference endpoint [ml-nlp-rerank-deploy-steps]
 
 1. In {{kib}}, navigate to the **Dev Console**.
 2.
Create an {{infer}} endpoint with the Elastic Rerank service by running: - ```console - PUT _inference/rerank/my-rerank-model +```console +PUT _inference/rerank/my-rerank-model { "service": "elasticsearch", "service_settings": { @@ -75,42 +67,37 @@ Refer to this [Python notebook](https://github.com/elastic/elasticsearch-labs/bl "model_id": ".rerank-v1" } } - ``` - - ::::{note} - The API request automatically downloads and deploys the model. This example uses [autoscaling](ml-nlp-auto-scale.md) through adaptive allocation. - :::: +``` +::::{note} +The API request automatically downloads and deploys the model. This example uses [autoscaling](ml-nlp-auto-scale.md) through adaptive allocation. +:::: -::::{note} +::::{note} You might see a 502 bad gateway error in the response when using the {{kib}} Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the {{ml-app}} UI. If using the Python client, you can set the `timeout` parameter to a higher value. :::: - After creating the Elastic Rerank {{infer}} endpoint, it’s ready to use with a [`text_similarity_reranker`](https://www.elastic.co/guide/en/elasticsearch/reference/current/retriever.html#text-similarity-reranker-retriever-example-elastic-rerank) retriever. - -## Deploy in an air-gapped environment [ml-nlp-rerank-deploy-verify] +## Deploy in an air-gapped environment [ml-nlp-rerank-deploy-verify] If you want to deploy the Elastic Rerank model in a restricted or closed network, you have two options: * Create your own HTTP/HTTPS endpoint with the model artifacts on it * Put the model artifacts into a directory inside the config directory on all master-eligible nodes. - -### Model artifact files [ml-nlp-rerank-model-artifacts] +### Model artifact files [ml-nlp-rerank-model-artifacts] For the cross-platform version, you need the following files in your system: -``` +```url https://ml-models.elastic.co/rerank-v1.metadata.json https://ml-models.elastic.co/rerank-v1.pt https://ml-models.elastic.co/rerank-v1.vocab.json ``` - -### Using an HTTP server [_using_an_http_server_2] +### Using an HTTP server [_using_an_http_server_2] INFO: If you use an existing HTTP server, note that the model downloader only supports passwordless HTTP servers. @@ -131,7 +118,7 @@ You can use any HTTP service to deploy the model. This example uses the official 4. Verify that Nginx runs properly by visiting the following URL in your browser: - ``` + ```url http://{IP_ADDRESS_OR_HOSTNAME}:8080/rerank-v1.metadata.json ``` @@ -139,7 +126,7 @@ You can use any HTTP service to deploy the model. This example uses the official 5. Point your {{es}} deployment to the model artifacts on the HTTP server by adding the following line to the `config/elasticsearch.yml` file: - ``` + ```yml xpack.ml.model_repository: http://{IP_ADDRESS_OR_HOSTNAME}:8080 ``` @@ -155,8 +142,7 @@ The HTTP server is only required for downloading the model. After the download h docker stop ml-models ``` - -### Using file-based access [_using_file_based_access_2] +### Using file-based access [_using_file_based_access_2] For a file-based access, follow these steps: @@ -164,7 +150,7 @@ For a file-based access, follow these steps: 2. Put the files into a `models` subdirectory inside the `config` directory of your {{es}} deployment. 3. 
Point your {{es}} deployment to the model directory by adding the following line to the `config/elasticsearch.yml` file: - ``` + ```yml xpack.ml.model_repository: file://${path.home}/config/models/ ``` @@ -172,8 +158,7 @@ For a file-based access, follow these steps: 5. [Restart](../../../deploy-manage/maintenance/start-stop-services/full-cluster-restart-rolling-restart-procedures.md#restart-cluster-rolling) the master-eligible nodes one by one. 6. Create an inference endpoint to deploy the model per [these steps](#ml-nlp-rerank-deploy-steps). - -## Limitations [ml-nlp-rerank-limitations] +## Limitations [ml-nlp-rerank-limitations] * English language only * Maximum context window of 512 tokens @@ -182,61 +167,49 @@ For a file-based access, follow these steps: When the combined inputs exceed the 512 token limit, a balanced truncation strategy is used. If both the query and input text are longer than 255 tokens each then both are truncated, otherwise the longest is truncated. - - -## Performance considerations [ml-nlp-rerank-perf-considerations] +## Performance considerations [ml-nlp-rerank-perf-considerations] It’s important to note that if you rerank to depth `n` then you will need to run `n` inferences per query. This will include the document text and will therefore be significantly more expensive than inference for query embeddings. Hardware can be scaled to run these inferences in parallel, but we would recommend shallow reranking for CPU inference: no more than top-30 results. You may find that the preview version is cost prohibitive for high query rates and low query latency requirements. We plan to address performance issues for GA. - -## Model specifications [ml-nlp-rerank-model-specs] +## Model specifications [ml-nlp-rerank-model-specs] * Purpose-built for English language content * Relatively small: 184M parameters (86M backbone + 98M embedding layer) * Matches performance of billion-parameter reranking models * Built directly into {{es}} - no external services or dependencies needed - -## Model architecture [ml-nlp-rerank-arch-overview] +## Model architecture [ml-nlp-rerank-arch-overview] Elastic Rerank is built on the [DeBERTa v3](https://arxiv.org/abs/2111.09543) language model architecture. 
The model employs several key architectural features that make it particularly effective for reranking: * **Disentangled attention mechanism** enables the model to: - - * Process word content and position separately - * Learn more nuanced relationships between query and document text - * Better understand the semantic importance of word positions and relationships + * Process word content and position separately + * Learn more nuanced relationships between query and document text + * Better understand the semantic importance of word positions and relationships * **ELECTRA-style pre-training** uses: + * A GAN-like approach to token prediction + * Simultaneous training of token generation and detection + * Enhanced parameter efficiency compared to traditional masked language modeling - * A GAN-like approach to token prediction - * Simultaneous training of token generation and detection - * Enhanced parameter efficiency compared to traditional masked language modeling - - - -## Training process [ml-nlp-rerank-arch-training] +## Training process [ml-nlp-rerank-arch-training] Here is an overview of the Elastic Rerank model training process: * **Initial relevance extraction** - - * Fine-tunes the pre-trained DeBERTa [CLS] token representation - * Uses a GeLU activation and dropout layer - * Preserves important pre-trained knowledge while adapting to the reranking task + * Fine-tunes the pre-trained DeBERTa [CLS] token representation + * Uses a GeLU activation and dropout layer + * Preserves important pre-trained knowledge while adapting to the reranking task * **Trained by distillation** + * Uses an ensemble of bi-encoder and cross-encoder models as a teacher + * Bi-encoder provides nuanced negative example assessment + * Cross-encoder helps differentiate between positive and negative examples + * Combines strengths of both model types - * Uses an ensemble of bi-encoder and cross-encoder models as a teacher - * Bi-encoder provides nuanced negative example assessment - * Cross-encoder helps differentiate between positive and negative examples - * Combines strengths of both model types - - - -### Training data [ml-nlp-rerank-arch-data] +### Training data [ml-nlp-rerank-arch-data] The training data consists of: @@ -250,14 +223,11 @@ The data preparation process includes: * Basic cleaning and fuzzy deduplication * Multi-stage prompting for diverse topics (on the synthetic portion of the training data only) * Varied query types: + * Keyword search + * Exact phrase matching + * Short and long natural language questions - * Keyword search - * Exact phrase matching - * Short and long natural language questions - - - -### Negative sampling [ml-nlp-rerank-arch-sampling] +### Negative sampling [ml-nlp-rerank-arch-sampling] The model uses an advanced sampling strategy to ensure high-quality rankings: @@ -265,14 +235,11 @@ The model uses an advanced sampling strategy to ensure high-quality rankings: * Uses five negative samples per query - more than typical approaches * Applies probability distribution shaped by document scores for sampling * Deep sampling benefits: + * Improves model robustness across different retrieval depths + * Enhances score calibration + * Provides better handling of document diversity - * Improves model robustness across different retrieval depths - * Enhances score calibration - * Provides better handling of document diversity - - - -### Training optimization [ml-nlp-rerank-arch-optimization] +### Training optimization [ml-nlp-rerank-arch-optimization] The training process 
incorporates several key optimizations: @@ -286,20 +253,17 @@ Implemented parameter averaging along optimization trajectory: * Eliminates need for traditional learning rate scheduling and provides improvement in the final model quality - -## Performance [ml-nlp-rerank-performance] +## Performance [ml-nlp-rerank-performance] Elastic Rerank shows significant improvements in search quality across a wide range of retrieval tasks. - -### Overview [ml-nlp-rerank-performance-overview] +### Overview [ml-nlp-rerank-performance-overview] * Average 40% improvement in ranking quality when reranking BM25 results * 184M parameter model matches performance of 2B parameter alternatives * Evaluated across 21 different datasets using the BEIR benchmark suite - -### Key benchmark results [ml-nlp-rerank-performance-benchmarks] +### Key benchmark results [ml-nlp-rerank-performance-benchmarks] * Natural Questions: 90% improvement * MS MARCO: 85% improvement @@ -308,8 +272,7 @@ Elastic Rerank shows significant improvements in search quality across a wide ra For detailed benchmark information, including complete dataset results and methodology, refer to the [Introducing Elastic Rerank blog](https://www.elastic.co/search-labs/blog/elastic-semantic-reranker-part-2). - -## Further resources [ml-nlp-rerank-resources] +## Further resources [ml-nlp-rerank-resources] **Documentation**: @@ -325,4 +288,3 @@ For detailed benchmark information, including complete dataset results and metho **Python notebooks**: * [End-to-end example using Elastic Rerank in Python](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/12-semantic-reranking-elastic-rerank.ipynb) - diff --git a/explore-analyze/machine-learning/nlp/nlp-example.md b/explore-analyze/machine-learning/nlp/nlp-example.md index 73e5b70b0..f9094ecdc 100644 --- a/explore-analyze/machine-learning/nlp/nlp-example.md +++ b/explore-analyze/machine-learning/nlp/nlp-example.md @@ -4,11 +4,8 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/nlp-example.html --- - - # NLP example [nlp-example] - This guide focuses on a concrete task: getting a machine learning trained model loaded into Elasticsearch and set up to enrich your documents. Elasticsearch supports many different ways to use machine learning models. In this guide, we will use a trained model to enrich documents at ingest time using ingest pipelines configured within Kibana’s **Content** UI. @@ -32,8 +29,7 @@ Follow the instructions to load a text classification model and set it up to enr * [Summary](#nlp-example-summary) * [Learn more](#nlp-example-learn-more) - -## Create an {{ecloud}} deployment [nlp-example-cloud-deployment] +## Create an {{ecloud}} deployment [nlp-example-cloud-deployment] Your deployment will need a machine learning instance to upload and deploy trained models. @@ -45,7 +41,6 @@ Follow the steps to **Create** a new deployment. Make sure to add capacity to th Enriching documents using machine learning was introduced in Enterprise Search **8.5.0**, so be sure to use version **8.5.0 or later**. - ## Clone Eland [nlp-example-clone-eland] Elastic’s [Eland](https://github.com/elastic/eland) tool makes it easy to upload trained models to your deployment via Docker. @@ -60,8 +55,7 @@ cd eland docker build -t elastic/eland . 
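# Optional sanity check (an assumption, not part of the original steps):
# docker images elastic/eland   # the freshly built image should be listed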
``` - -## Deploy the trained model [nlp-example-deploy-model] +## Deploy the trained model [nlp-example-deploy-model] Now that you have a deployment and a way to upload models, you will need to choose a trained model that fits your data. [Hugging Face](https://huggingface.co/) has a large repository of publicly available trained models. The model you choose will depend on your data and what you would like to do with it. @@ -89,8 +83,7 @@ docker run -it --rm --network host \ This script should take roughly 2-3 minutes to run. Once your model has been successfully deployed to your Elastic deployment, navigate to Kibana’s **Trained Models** page to verify it is ready. You can find this page under **Machine Learning > Analytics** menu and then **Trained Models > Model Management**. If you do not see your model in the list, you may need to click **Synchronize your jobs and trained models**. Your model is now ready to be used. - -## Create an index and define an ML inference pipeline [nlp-example-create-index-and-define-ml-inference-pipeline] +## Create an index and define an ML inference pipeline [nlp-example-create-index-and-define-ml-inference-pipeline] We are now ready to use Kibana’s **Content** UI to enrich our documents with inference data. Before we ingest photo comments into Elasticsearch, we will first create an ML inference pipeline. The pipeline will enrich the incoming photo comments with inference data indicating if the comments are positive. @@ -136,8 +129,7 @@ Next, we’ll add an inference pipeline. You can also run example documents through a simulator and review the pipeline before creating it. - -## Index documents [nlp-example-index-documents] +## Index documents [nlp-example-index-documents] At this point, everything is ready to enrich documents at index time. @@ -184,8 +176,7 @@ The document has new fields with the enriched data. The `ml.inference.positivity From here, we can write search queries to boost on `ml.inference.positivity_result.predicted_value`. This field will also be stored in a top-level `positivity_result` field if the model was confident enough. - -## Summary [nlp-example-summary] +## Summary [nlp-example-summary] In this guide, we covered how to: @@ -195,8 +186,7 @@ In this guide, we covered how to: * Enrich documents with inference results from the trained model at ingest time. * Query your search engine and sort by `positivity_result`. 
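To make that last step concrete, here is a minimal sketch of a query that boosts positive comments. The `photo-comments` index name and the `comment` field are assumptions for illustration; the inference output field is the one described earlier, and the exact label value depends on the model you deployed:

```console
GET photo-comments/_search
{
  "query": {
    "bool": {
      "must": {
        "match": { "comment": "sunset at the beach" }
      },
      "should": {
        "term": {
          "ml.inference.positivity_result.predicted_value": {
            "value": "POSITIVE",
            "boost": 2.0
          }
        }
      }
    }
  }
}
```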
- -## Learn more [nlp-example-learn-more] +## Learn more [nlp-example-learn-more] * [Compatible third party models^](ml-nlp-model-ref.md) * [NLP Overview^](ml-nlp-overview.md) @@ -204,4 +194,3 @@ In this guide, we covered how to: * [Deploying a model ML guide^](ml-nlp-deploy-models.md) * [Eland Authentication methods^](ml-nlp-import-model.md#ml-nlp-authentication) * [Adding inference pipelines](inference-processing.md#ingest-pipeline-search-inference-add-inference-processors) - From d8770dd9a9516b6f1c4fb4ef5658d5b8d32237a9 Mon Sep 17 00:00:00 2001 From: florent-leborgne Date: Wed, 5 Feb 2025 17:02:32 +0100 Subject: [PATCH 14/15] Review Discover section and formatting (#331) --- .../discover/discover-get-started.md | 89 +++++++----------- .../discover/discover-search-for-relevance.md | 18 ++-- explore-analyze/discover/document-explorer.md | 40 ++++---- .../discover/run-pattern-analysis-discover.md | 2 +- explore-analyze/discover/save-open-search.md | 5 +- explore-analyze/discover/search-sessions.md | 18 +--- .../discover/show-field-statistics.md | 32 +++---- explore-analyze/discover/try-esql.md | 31 +++--- explore-analyze/query-filter/tools/console.md | 6 +- ...icon.png => kibana-discover-add-field.png} | Bin 10 files changed, 97 insertions(+), 144 deletions(-) rename images/{kibana-discover-add-icon.png => kibana-discover-add-field.png} (100%) diff --git a/explore-analyze/discover/discover-get-started.md b/explore-analyze/discover/discover-get-started.md index 095ce25d0..92d604e3b 100644 --- a/explore-analyze/discover/discover-get-started.md +++ b/explore-analyze/discover/discover-get-started.md @@ -24,21 +24,19 @@ Select the data you want to explore, and then specify the time range in which to 1. Find **Discover** in the navigation menu or by using the [global search field](../../get-started/the-stack.md#kibana-navigation-search). 2. Select the data view that contains the data you want to explore. - - ::::{tip} - {{kib}} requires a [{{data-source}}](../find-and-organize/data-views.md) to access your Elasticsearch data. A {{data-source}} can point to one or more indices, [data streams](../../manage-data/data-store/index-types/data-streams.md), or [index aliases](https://www.elastic.co/guide/en/elasticsearch/reference/current/alias.html). When adding data to {{es}} using one of the many integrations available, sometimes data views are created automatically, but you can also create your own. - :::: - - - If you’re using sample data, data views are automatically created and are ready to use. - - :::{image} ../../images/kibana-discover-data-view.png - :alt: How to set the {{data-source}} in Discover - :class: screenshot - ::: + ::::{tip} + By default, {{kib}} requires a [{{data-source}}](../find-and-organize/data-views.md) to access your Elasticsearch data. A {{data-source}} can point to one or more indices, [data streams](../../manage-data/data-store/index-types/data-streams.md), or [index aliases](https://www.elastic.co/guide/en/elasticsearch/reference/current/alias.html). When adding data to {{es}} using one of the many integrations available, sometimes data views are created automatically, but you can also create your own. + + You can also [try {{esql}}](try-esql.md), that let's you query any data you have in {{es}} without specifying a {{data-source}} first. + :::: + If you’re using sample data, data views are automatically created and are ready to use. 
+ :::{image} ../../images/kibana-discover-data-view.png + :alt: How to set the {{data-source}} in Discover + :class: screenshot + :width: 300px + ::: 3. If needed, adjust the [time range](../query-filter/filtering.md), for example by setting it to the **Last 7 days**. - The range selection is based on the default time field in your data view. If you are using the sample data, this value was set when the data view was created. If you are using your own data view, and it does not have a time field, the range selection is not available. @@ -56,29 +54,19 @@ You can later filter the data that shows in the chart and in the table by specif **Discover** provides utilities designed to help you make sense of your data: 1. In the sidebar, check the available fields. It’s very common to have hundreds of fields. Use the search at the top of that sidebar to look for specific terms in the field names. - - In this example, we’ve entered `ma` in the search field to find the `manufacturer` field. - - ![Fields list that displays the top five search results](../../images/kibana-discover-sidebar-available-fields.png "") - - ::::{tip} - You can combine multiple keywords or characters. For example, `geo dest` finds `geo.dest` and `geo.src.dest`. - :::: + In this example, we’ve entered `ma` in the search field to find the `manufacturer` field. + ![Fields list that displays the top five search results](../../images/kibana-discover-sidebar-available-fields.png "title =40%") + ::::{tip} + You can combine multiple keywords or characters. For example, `geo dest` finds `geo.dest` and `geo.src.dest`. + :::: 2. Select a field to view its most frequent values. - - **Discover** shows the top 10 values and the number of records used to calculate those values. + **Discover** shows the top 10 values and the number of records used to calculate those values. 3. Select the **Plus** icon to add fields to the results table. You can also drag them from the list into the table. - - :::{image} ../../images/kibana-discover-add-icon.png - :alt: How to add a field as a column in the table - :class: screenshot - ::: - - When you add fields to the table, the **Summary** column is replaced. - - ![Document table with fields for manufacturer](../../images/kibana-document-table.png "") + ![How to add a field as a column in the table](../../images/kibana-discover-add-field.png "title =50%") + When you add fields to the table, the **Summary** column is replaced. + ![Document table with fields for manufacturer](../../images/kibana-document-table.png "") 4. Arrange the view to your liking to display the fields and data you care most about using the various display options of **Discover**. For example, you can change the order and size of columns, expand the table to be in full screen or collapse the chart and the list of fields. Check [Customize the Discover view](document-explorer.md). 5. **Save** your changes to be able to open the same view later on and explore your data further. @@ -92,9 +80,8 @@ What happens if you forgot to define an important value as a separate field? Or, 2. Select the **Type** of the new field. 3. **Name** the field. Name it in a way that corresponds to the way other fields of the data view are named. You can set a custom label and description for the field to make it more recognizable in your data view. 4. Define the value that you want the field to show. By default, the field value is retrieved from the source data if it already contains a field with the same name. 
You can customize this with the following options: - - * **Set value**: Define a script that will determine the value to show for the field. For more information on adding fields and Painless scripting language examples, refer to [Explore your data with runtime fields](../find-and-organize/data-views.md#runtime-fields). - * **Set format**: Set your preferred format for displaying the value. Changing the format can affect the value and prevent highlighting in Discover. + - **Set value**: Define a script that will determine the value to show for the field. For more information on adding fields and Painless scripting language examples, refer to [Explore your data with runtime fields](../find-and-organize/data-views.md#runtime-fields). + - **Set format**: Set your preferred format for displaying the value. Changing the format can affect the value and prevent highlighting in Discover. 5. In the advanced settings, you can adjust the field popularity to make it appear higher or lower in the fields list. By default, Discover orders popular fields from most selected to least selected. 6. **Save** your new field. @@ -135,16 +122,13 @@ In the following example, we’re adding 2 fields: A simple "Hello world" field, If a field can be [aggregated](../aggregations.md), you can quickly visualize it in detail by opening it in **Lens** from **Discover**. **Lens** is the default visualization editor in {{kib}}. 1. In the list of fields, find an aggregatable field. For example, with the sample data, you can look for `day_of_week`. - - ![Top values for the day_of_week field](../../images/kibana-discover-day-of-week.png "") + ![Top values for the day_of_week field](../../images/kibana-discover-day-of-week.png "title =60%") 2. In the popup, click **Visualize**. - - {{kib}} creates a **Lens** visualization best suited for this field. + {{kib}} creates a **Lens** visualization best suited for this field. 3. In **Lens**, from the **Available fields** list, drag and drop more fields to refine the visualization. In this example, we’re adding the `manufacturer.keyword` field onto the workspace, which automatically adds a breakdown of the top values to the visualization. - - ![Visualization that opens from Discover based on your data](../../images/kibana-discover-from-visualize.png "") + ![Visualization that opens from Discover based on your data](../../images/kibana-discover-from-visualize.png "") 4. Save the visualization if you’d like to add it to a dashboard or keep it in the Visualize library for later use. @@ -160,13 +144,12 @@ You can use **Discover** to compare and diff the field values of multiple result 1. Select the results you want to compare from the Documents or Results tab in Discover. 2. From the **Selected** menu in the table toolbar, choose **Compare selected**. The comparison view opens and shows the selected results next to each other. 3. Compare the values of each field. By default the first result selected shows as the reference for displaying differences in the other results. When the value remains the same for a given field, it’s displayed in green. When the value differs, it’s displayed in red. - - ::::{tip} - You can change the result used as reference by selecting **Pin for comparison** in the contextual menu of any other result. - :::: + ::::{tip} + You can change the result used as reference by selecting **Pin for comparison** in the contextual menu of any other result. 
+ :::: - ![Comparison view in Discover](../../images/kibana-discover-compare-rows.png "") + ![Comparison view in Discover](../../images/kibana-discover-compare-rows.png "") 4. Optionally, customize the **Comparison settings** to your liking. You can for example choose to not highlight the differences, to show them more granularly at the line, word, or character level, or even to hide fields where the value matches for all results. 5. Exit the comparison view at any time using the **Exit comparison mode** button. @@ -193,15 +176,15 @@ Dive into an individual document to view its fields and the documents that occur 2. Scan through the fields and their values. You can filter the table in several ways: - * If you find a field of interest, hover your mouse over the **Field** or **Value** columns for filters and additional options. - * Use the search above the table to filter for specific fields or values, or filter by field type using the options to the right of the search field. - * You can pin some fields by clicking the left column to keep them displayed even if you filter the table. + * If you find a field of interest, hover your mouse over the **Field** or **Value** columns for filters and additional options. + * Use the search above the table to filter for specific fields or values, or filter by field type using the options to the right of the search field. + * You can pin some fields by clicking the left column to keep them displayed even if you filter the table. - ::::{tip} - You can restrict the fields listed in the detailed view to just the fields that you explicitly added to the **Discover** table, using the **Selected only** toggle. In ES|QL mode, you also have an option to hide fields with null values. - :::: + ::::{tip} + You can restrict the fields listed in the detailed view to just the fields that you explicitly added to the **Discover** table, using the **Selected only** toggle. In ES|QL mode, you also have an option to hide fields with null values. + :::: -3. To navigate to a view of the document that you can bookmark and share, select ** View single document**. +3. To navigate to a view of the document that you can bookmark and share, select **View single document**. 4. To view documents that occurred before or after the event you are looking at, select **View surrounding documents**. diff --git a/explore-analyze/discover/discover-search-for-relevance.md b/explore-analyze/discover/discover-search-for-relevance.md index 93a4e2740..ec51da7b4 100644 --- a/explore-analyze/discover/discover-search-for-relevance.md +++ b/explore-analyze/discover/discover-search-for-relevance.md @@ -28,17 +28,11 @@ This example shows how to use **Discover** to list your documents from most rele 6. To turn off sorting by the `timestamp` field, click the **field sorted** option, and then click **Clear sorting.** 7. Open the **Pick fields to sort by** menu, and then click **_score**. 8. Select **High-Low**. - - :::{image} ../../images/kibana-field-sorting-popover.png - :alt: Field sorting popover - :class: screenshot - ::: - - Your table now sorts documents from most to least relevant. - - :::{image} ../../images/kibana-discover-search-for-relevance.png - :alt: Documents are sorted from most relevant to least relevant. - :class: screenshot - ::: + ![Field sorting popover](../../images/kibana-field-sorting-popover.png "title =50%") + Your table now sorts documents from most to least relevant. 
+ :::{image} ../../images/kibana-discover-search-for-relevance.png + :alt: Documents are sorted from most relevant to least relevant. + :class: screenshot + ::: diff --git a/explore-analyze/discover/document-explorer.md b/explore-analyze/discover/document-explorer.md index 5d637b21e..1d9eaa527 100644 --- a/explore-analyze/discover/document-explorer.md +++ b/explore-analyze/discover/document-explorer.md @@ -31,9 +31,9 @@ Customize the appearance of the document table and its contents to your liking. * To move a single column, drag its header and drop it to the position you want. You can also open the column’s contextual options, and select **Move left** or **Move right** in the available options. * To move multiple columns, click **Columns**. In the pop-up, drag the column names to their new order. * To resize a column, drag the right edge of the column header until the column is the width that you want. - - Column widths are stored with a Discover session. When you add a Discover session as a dashboard panel, it appears the same as in **Discover**. - + ::::{tip} + Column widths are stored with a Discover session. When you add a Discover session as a dashboard panel, it appears the same as in **Discover**. + :::: ### Customize the table density [document-explorer-density] @@ -54,7 +54,7 @@ When the number of results returned by your search query (displayed at the top o On the last page of the table, a message indicates that you’ve reached the end of the loaded search results. From that message, you can choose to load more results to continue exploring. -![Limit sample size in Discover](../../images/kibana-discover-limit-sample-size.png "") +![Limit sample size in Discover](../../images/kibana-discover-limit-sample-size.png "title =50%") ### Sort the fields [document-explorer-sort-data] @@ -66,20 +66,15 @@ To add or remove a sort on a single field, click the column header, and then sel To sort by multiple fields: 1. Click the **Sort fields** option. - - :::{image} ../../images/kibana-document-explorer-sort-data.png - :alt: Pop-up in document table for sorting columns - :class: screenshot - ::: + ![Pop-up in document table for sorting columns](../../images/kibana-document-explorer-sort-data.png "title =50%") 2. To add fields to the sort, select their names from the dropdown menu. - - By default, columns are sorted in the order they are added. - - :::{image} ../../images/kibana-document-explorer-multi-field.png - :alt: Multi field sort in the document table - :class: screenshot - ::: + By default, columns are sorted in the order they are added. + :::{image} ../../images/kibana-document-explorer-multi-field.png + :alt: Multi field sort in the document table + :class: screenshot + :width: 50% + ::: 3. To change the sort order, select a field in the pop-up, and then drag it to the new location. @@ -90,8 +85,7 @@ Change how {{kib}} displays a field. 1. Click the column header for the field, and then select **Edit data view field.** 2. In the **Edit field** form, change the field name and format. - - For detailed information on formatting options, refer to [Format data fields](../find-and-organize/data-views.md#managing-fields). + For detailed information on formatting options, refer to [Format data fields](../find-and-organize/data-views.md#managing-fields). @@ -101,11 +95,11 @@ Narrow your results to a subset of documents so you’re comparing just the data 1. Select the documents you want to compare. 2. Click the **Selected** option, and then select **Show selected documents only**. 
- - :::{image} ../../images/kibana-document-explorer-compare-data.png - :alt: Compare data in the document table - :class: screenshot - ::: + :::{image} ../../images/kibana-document-explorer-compare-data.png + :alt: Compare data in the document table + :class: screenshot + :width: 50% + ::: You can also compare individual field values using the [**Compare selected** option](discover-get-started.md#compare-documents-in-discover). diff --git a/explore-analyze/discover/run-pattern-analysis-discover.md b/explore-analyze/discover/run-pattern-analysis-discover.md index 3cefd5df1..ee5e0d198 100644 --- a/explore-analyze/discover/run-pattern-analysis-discover.md +++ b/explore-analyze/discover/run-pattern-analysis-discover.md @@ -21,5 +21,5 @@ This example uses the [sample web logs data](../overview/kibana-quickstart.md#gs :class: screenshot ::: -1. (optional) Apply filters to one or more patterns. **Discover** only displays documents that match the selected patterns. Additionally, you can remove selected patterns from **Discover**, resulting in the display of only those documents that don’t match the selected pattern. These options enable you to remove unimportant messages and focus on the more important, actionable data during troubleshooting. You can also create a categorization {{anomaly-job}} directly from the **Patterns** tab to find anomalous behavior in the selected pattern. +5. (optional) Apply filters to one or more patterns. **Discover** only displays documents that match the selected patterns. Additionally, you can remove selected patterns from **Discover**, resulting in the display of only those documents that don’t match the selected pattern. These options enable you to remove unimportant messages and focus on the more important, actionable data during troubleshooting. You can also create a categorization {{anomaly-job}} directly from the **Patterns** tab to find anomalous behavior in the selected pattern. diff --git a/explore-analyze/discover/save-open-search.md b/explore-analyze/discover/save-open-search.md index b3ce0dcd5..35191f2a9 100644 --- a/explore-analyze/discover/save-open-search.md +++ b/explore-analyze/discover/save-open-search.md @@ -1,9 +1,10 @@ --- +navigation_title: Save a search for reuse mapped_pages: - https://www.elastic.co/guide/en/kibana/current/save-open-search.html --- -# Save a search for reuse [save-open-search] +# Discover sessions: Save a search for reuse [save-open-search] A saved Discover session is a convenient way to reuse a search that you’ve created in **Discover**. Discover sessions are good for saving a configured view of Discover to use later or adding search results to a dashboard, and can also serve as a foundation for building visualizations. @@ -28,7 +29,7 @@ By default, a Discover session stores the query text, filters, and current view 4. Click **Save**. 5. To reload your search results in **Discover**, click **Open** in the toolbar, and select the saved Discover session. - If the saved Discover session is associated with a different {{data-source}} than is currently selected, opening the saved Discover session changes the selected {{data-source}}. The query language used for the saved Discover session is also automatically selected. +If the saved Discover session is associated with a different {{data-source}} than is currently selected, opening the saved Discover session changes the selected {{data-source}}. The query language used for the saved Discover session is also automatically selected. 
diff --git a/explore-analyze/discover/search-sessions.md b/explore-analyze/discover/search-sessions.md index 68189b360..ca24ecfe7 100644 --- a/explore-analyze/discover/search-sessions.md +++ b/explore-analyze/discover/search-sessions.md @@ -33,24 +33,14 @@ Save your search session from **Discover** or **Dashboard**, and when your sessi You’re trying to understand a trend you see on a dashboard. You need to look at several years of data, currently in [cold storage](../../manage-data/lifecycle/data-tiers.md#cold-tier), but you don’t have time to wait. You want {{kib}} to continue working in the background, so tomorrow you can open your browser and pick up where you left off. 1. Load your dashboard. - - Your search session begins automatically. The icon after the dashboard title displays the current state of the search session. A clock icon indicates the search session is in progress. A checkmark indicates that the search session is complete. + Your search session begins automatically. The icon after the dashboard title displays the current state of the search session. A clock icon indicates the search session is in progress. A checkmark indicates that the search session is complete. 2. To continue a search in the background, click the clock icon, and then click **Save session**. - - :::{image} ../../images/kibana-search-session-awhile.png - :alt: Search Session indicator displaying the current state of the search - :class: screenshot - ::: - - Once you save a search session, you can start a new search, navigate to a different application, or close the browser. + ![Search Session indicator displaying the current state of the search](../../images/kibana-search-session-awhile.png "title =50%") + Once you save a search session, you can start a new search, navigate to a different application, or close the browser. 3. To view your saved search sessions, go to the **Search Sessions** management page using the navigation menu or the [global search field](../../get-started/the-stack.md#kibana-navigation-search). For a saved or completed session, you can also open this view from the search sessions popup. - - :::{image} ../../images/kibana-search-sessions-menu.png - :alt: Search Sessions management view with actions for inspecting - :class: screenshot - ::: + ![Search Sessions management view with actions for inspecting](../../images/kibana-search-sessions-menu.png "") 4. Use the edit menu in **Search Sessions** to: diff --git a/explore-analyze/discover/show-field-statistics.md b/explore-analyze/discover/show-field-statistics.md index a54cd55dd..1be92ebdf 100644 --- a/explore-analyze/discover/show-field-statistics.md +++ b/explore-analyze/discover/show-field-statistics.md @@ -16,31 +16,29 @@ This example explores the fields in the [sample web logs data](../overview/kiban 2. Expand the {{data-source}} dropdown, and select **Kibana Sample Data Logs**. 3. If you don’t see any results, expand the time range, for example, to **Last 7 days**. 4. Click **Field statistics**. + The table summarizes how many documents in the sample contain each field for the selected time period the number of distinct values, and the distribution. - The table summarizes how many documents in the sample contain each field for the selected time period the number of distinct values, and the distribution. - - :::{image} ../../images/kibana-field-statistics-view.png - :alt: Field statistics view in Discover showing a summary of document data. 
- :class: screenshot - ::: + :::{image} ../../images/kibana-field-statistics-view.png + :alt: Field statistics view in Discover showing a summary of document data. + :class: screenshot + ::: 5. Expand the `hour_of_day` field. + For numeric fields, **Discover** provides the document statistics, minimum, median, and maximum values, a list of top values, and a distribution chart. Use this chart to get a better idea of how the values in the data are clustered. - For numeric fields, **Discover** provides the document statistics, minimum, median, and maximum values, a list of top values, and a distribution chart. Use this chart to get a better idea of how the values in the data are clustered. - - :::{image} ../../images/kibana-field-statistics-numeric.png - :alt: Field statistics for a numeric field. - :class: screenshot - ::: + :::{image} ../../images/kibana-field-statistics-numeric.png + :alt: Field statistics for a numeric field. + :class: screenshot + ::: 6. Expand the `geo.coordinates` field. - For geo fields, **Discover** provides the document statistics, examples, and a map of the coordinates. + For geo fields, **Discover** provides the document statistics, examples, and a map of the coordinates. - :::{image} ../../images/kibana-field-statistics-geo.png - :alt: Field statistics for a geo field. - :class: screenshot - ::: + :::{image} ../../images/kibana-field-statistics-geo.png + :alt: Field statistics for a geo field. + :class: screenshot + ::: 7. Explore additional field types to see the statistics that **Discover** provides. 8. To create a visualization of the field data, click ![Click the magnifying glass icon to create a visualization of the data in Lens](../../images/kibana-visualization-icon.png "") or ![Click the Maps icon to explore the data in a map](../../images/kibana-map-icon.png "") in the **Actions** column. diff --git a/explore-analyze/discover/try-esql.md b/explore-analyze/discover/try-esql.md index 0fe49d71f..1ef61c3c9 100644 --- a/explore-analyze/discover/try-esql.md +++ b/explore-analyze/discover/try-esql.md @@ -10,7 +10,7 @@ The Elasticsearch Query Language, {{esql}}, makes it easier to explore your data In this tutorial we’ll use the {{kib}} sample web logs in Discover and Lens to explore the data and create visualizations. ::::{tip} -For the complete {{esql}} documentation, including tutorials, examples and the full syntax reference, refer to the [{{es}} documentation](../query-filter/languages/esql.md). For a more detailed overview of {{esql}} in {{kib}}, refer to [Use {{esql}} in Kibana](../query-filter/languages/esql-kibana.md). +For the complete {{esql}} documentation, refer to the [{{esql}} documentation](../query-filter/languages/esql.md). For a more detailed overview of {{esql}} in {{kib}}, refer to [Use {{esql}} in Kibana](../query-filter/languages/esql-kibana.md). :::: @@ -42,19 +42,15 @@ Let’s say we want to find out what operating system users have and how much RA 1. We’re specifically looking for data from the sample web logs we just installed. 2. We’re only keeping the `machine.os` and `machine.ram` fields in the results table. - - ::::{tip} - Put each processing command on a new line for better readability. - :::: + ::::{tip} + Put each processing command on a new line for better readability. + :::: 3. Click **▶Run**. - - ![An image of the query result](../../images/kibana-esql-machine-os-ram.png "") - - ::::{note} - {{esql}} keywords are not case sensitive. 
- - :::: + ![An image of the query result](../../images/kibana-esql-machine-os-ram.png "") + ::::{note} + {{esql}} keywords are not case sensitive. + :::: Let’s add `geo.dest` to our query, to find out the geographical destination of the visits, and limit the results. @@ -68,13 +64,10 @@ Let’s add `geo.dest` to our query, to find out the geographical destination of ``` 2. Click **▶Run** again. You can notice that the table is now limited to 10 results. The visualization also updated automatically based on the query, and broke down the data for you. - - ::::{note} - When you don’t specify any specific fields to retain using `KEEP`, the visualization isn’t broken down automatically. Instead, an additional option appears above the visualization and lets you select a field manually. - :::: - - - ![An image of the extended query result](../../images/kibana-esql-limit.png "") + ::::{note} + When you don’t specify any specific fields to retain using `KEEP`, the visualization isn’t broken down automatically. Instead, an additional option appears above the visualization and lets you select a field manually. + :::: + ![An image of the extended query result](../../images/kibana-esql-limit.png "") We will now take it a step further to sort the data by machine ram and filter out the `GB` destination. diff --git a/explore-analyze/query-filter/tools/console.md b/explore-analyze/query-filter/tools/console.md index 180b3b196..718fe11a3 100644 --- a/explore-analyze/query-filter/tools/console.md +++ b/explore-analyze/query-filter/tools/console.md @@ -218,9 +218,9 @@ You can export requests: * **to a TXT file**, by using the **Export requests** button. When using this method, all content of the input panel is copied, including comments, requests, and payloads. All of the formatting is preserved and allows you to re-import the file later, or to a different environment, using the **Import requests** button. - ::::{tip} - When importing a TXT file containing Console requests, the current content of the input panel is replaced. Export it first if you don’t want to lose it, or find it in the **History** tab if you already ran the requests. - :::: + ::::{tip} + When importing a TXT file containing Console requests, the current content of the input panel is replaced. Export it first if you don’t want to lose it, or find it in the **History** tab if you already ran the requests. + :::: * by copying them individually as **curl**, **JavaScript**, or **Python**. To do this, select a request, then open the contextual menu and select **Copy as**. When using this action, requests are copied individually to your clipboard. You can save your favorite language to make the copy action faster the next time you use it. diff --git a/images/kibana-discover-add-icon.png b/images/kibana-discover-add-field.png similarity index 100% rename from images/kibana-discover-add-icon.png rename to images/kibana-discover-add-field.png From 1edfde1b61ea9da0eeb7858fe1d8ed267a5bc158 Mon Sep 17 00:00:00 2001 From: Jan Calanog Date: Wed, 5 Feb 2025 19:55:46 +0100 Subject: [PATCH 15/15] Update docs-build.yml (#342) --- .github/workflows/docs-build.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/docs-build.yml b/.github/workflows/docs-build.yml index b0eb00e1d..b78b30b37 100644 --- a/.github/workflows/docs-build.yml +++ b/.github/workflows/docs-build.yml @@ -11,3 +11,4 @@ jobs: strict: false permissions: contents: read + pull-requests: read