[manage data] Add missing content #550

Merged
merged 3 commits into from
Feb 20, 2025
132 changes: 128 additions & 4 deletions manage-data/data-store/manage-data-from-the-command-line.md
@@ -6,9 +6,133 @@ mapped_urls:

# Manage data from the command line

% What needs to be done: Lift-and-shift
Learn how to index, update, retrieve, search, and delete documents in an {{es}} cluster from the command line.

% Use migrated content from existing pages that map to this page:
::::{tip}
If you are looking for a user interface for {{es}} and your data, head on over to [Kibana](/get-started/the-stack.md)! Not only are there amazing visualization and index management tools, Kibana includes realistic sample data sets to play with so that you can get to know what you *could* do with your data.
::::

## Before you begin [before-you-begin]

On the **Overview** page for your new cluster in the Cloud UI, copy the {{es}} endpoint URL under **Endpoints**.

These examples use the `elastic` user. If you didn’t copy down the password for the `elastic` user, you can [reset the password](/deploy-manage/users-roles/cluster-or-deployment-auth/built-in-users.md).

To use these examples, you also need to have the [curl](http://curl.haxx.se/) command installed.
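
To avoid retyping the endpoint and credentials in every command, one option is to keep them in shell variables and wrap `curl` in a small function. The following is a sketch only: the variable names `ES_URL`, `ES_USER`, and `ES_PASSWORD` and the function name `es_curl` are assumptions made for illustration, not something {{es}} or the Cloud UI defines.

```bash
# Hypothetical variable names; replace the placeholder values with your own.
ES_URL="ELASTICSEARCH_URL"   # host part of the endpoint copied from the Cloud UI
ES_USER="elastic"
ES_PASSWORD="PASSWORD"

# es_curl PATH [extra curl options...]
# Sends an authenticated request with a JSON content type to https://$ES_URL/PATH.
es_curl() {
  local path="$1"
  shift
  curl -sS -u "${ES_USER}:${ES_PASSWORD}" \
    -H 'Content-Type: application/json' \
    "https://${ES_URL}/${path}" "$@"
}
```

With the variables filled in, `es_curl my_index/_doc -XPOST -d '{"title": "One"}'` would send the same kind of request as the full `curl` commands on this page.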


## Indexing [indexing]

To index a document into {{es}}, `POST` your document:

```bash
curl -u USER:PASSWORD https://ELASTICSEARCH_URL/my_index/_doc -XPOST -H 'Content-Type: application/json' -d '{
"title": "One", "tags": ["ruby"]
}'
```

To show that the operation worked, {{es}} returns a JSON response that looks like `{"_index":"my_index","_type":"_doc","_id":"0KNPhW4BnhCSymaq_3SI","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":1}`.

In this example, the index `my_index` is created dynamically when the first document is inserted into it. All documents in {{es}} have a `type` and an `id`, which are echoed as `"_type":"_doc"` and `"_id":"0KNPhW4BnhCSymaq_3SI"` in the JSON response. If no ID is specified during indexing, a random `id` is generated.
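
Because the generated ID is needed for later requests, it can be handy to pull it out of the response. The sketch below uses a plain-text match with `sed`; the function name `extract_id` is hypothetical, and a JSON-aware tool would be more robust if one is available.

```bash
# extract_id: reads an indexing response on stdin and prints the first "_id" value.
extract_id() {
  sed -n 's/.*"_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Canned response copied from the example above, so no cluster is needed:
echo '{"_index":"my_index","_type":"_doc","_id":"0KNPhW4BnhCSymaq_3SI","_version":1,"result":"created"}' | extract_id
# prints 0KNPhW4BnhCSymaq_3SI
```

In practice, the output of the indexing `curl` command (run with `-s` to silence the progress meter) would be piped into `extract_id`.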


### Bulk indexing [bulk-indexing]

To achieve the best possible performance, use the bulk API.

To index some additional documents with the bulk API:

```bash
curl -u USER:PASSWORD https://ELASTICSEARCH_URL/my_index/_bulk -XPOST -H 'Content-Type: application/json' -d '
{"index": {}}
{"title": "Two", "tags": ["ruby", "python"] }
{"index": {}}
{"title": "Three", "tags": ["java"] }
{"index": {}}
{"title": "Four", "tags": ["ruby", "php"] }
'
```

Elasticsearch returns a JSON response similar to this one:

```json
{"took":694,"errors":false,"items":[{"index":{"_index":"my_index","_type":"_doc","_id":"0aNqhW4BnhCSymaqFHQn","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1,"status":201}},{"index":{"_index":"my_index","_type":"_doc","_id":"0qNqhW4BnhCSymaqFHQn","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":1,"status":201}},{"index":{"_index":"my_index","_type":"_doc","_id":"06NqhW4BnhCSymaqFHQn","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":2,"_primary_term":1,"status":201}}]}
```
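
For larger batches, the request body is easier to manage in a file. The sketch below assumes a hypothetical file name, `bulk_docs.ndjson`, and made-up documents. It uses curl's `--data-binary` option because `-d` strips newlines when it reads from a file, and the bulk API expects newline-delimited JSON whose last line also ends with a newline.

```bash
# Hypothetical file name and documents. Each action line is followed by its document line.
cat > bulk_docs.ndjson <<'EOF'
{"index": {}}
{"title": "Five", "tags": ["go"]}
{"index": {}}
{"title": "Six", "tags": ["rust", "python"]}
EOF

# The here-document terminates the last line with a newline.
# Confirm it: the final byte of the file should be a newline.
[ -z "$(tail -c 1 bulk_docs.ndjson)" ] && echo "bulk_docs.ndjson ends with a newline"

# send_bulk: posts the file to the index-scoped bulk endpoint.
send_bulk() {
  # --data-binary sends the file as-is, newlines included.
  curl -u USER:PASSWORD "https://ELASTICSEARCH_URL/my_index/_bulk" -XPOST \
    -H 'Content-Type: application/x-ndjson' --data-binary @bulk_docs.ndjson
}
```

After replacing `USER`, `PASSWORD`, and `ELASTICSEARCH_URL`, call `send_bulk` to send the request.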


## Updating [updating]

To update an existing document in {{es}}, `POST` the updated document to `https://ELASTICSEARCH_URL/my_index/_doc/ID`, where `ID` is the `_id` of the document.

For example, to update the last document indexed from the previous example with `"_id":"06NqhW4BnhCSymaqFHQn"`:

```bash
curl -u USER:PASSWORD https://ELASTICSEARCH_URL/my_index/_doc/06NqhW4BnhCSymaqFHQn -XPOST -H 'Content-Type: application/json' -d '{
"title": "Four updated", "tags": ["ruby", "php", "python"]
}'
```

The JSON response shows that the version counter for the document was incremented to `"_version":2` to reflect the update.


## Retrieving documents [retrieving-documents]

To retrieve a specific document you indexed, request it by ID. This example retrieves the document updated in the previous section, which has the ID `06NqhW4BnhCSymaqFHQn`:

```bash
curl -u USER:PASSWORD https://ELASTICSEARCH_URL/my_index/_doc/06NqhW4BnhCSymaqFHQn
```

This request didn’t include `GET`, as the method is implied if you don’t specify anything else. If the document you are looking for exists, {{es}} returns `"found":true` along with the document as part of the JSON response. Otherwise, the JSON response contains `"found":false`.
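
In a script, the `found` flag can drive a simple branch. The following is a minimal sketch with a hypothetical function name, `doc_found`. It is a plain-text check, so it could be fooled by a document whose own content contains the same text; treat it as an illustration rather than a robust parser.

```bash
# doc_found: reads a JSON response on stdin and succeeds if it reports "found": true.
# The pattern tolerates the extra whitespace that ?pretty=true adds.
doc_found() {
  grep -Eq '"found"[[:space:]]*:[[:space:]]*true'
}

# Canned responses, so no cluster is needed:
echo '{"_index":"my_index","_id":"1","found":true}' | doc_found && echo "document exists"
echo '{"_index":"my_index","_id":"2","found":false}' | doc_found || echo "document missing"
```

Against a real cluster, the output of `curl -s -u USER:PASSWORD https://ELASTICSEARCH_URL/my_index/_doc/ID` would be piped into `doc_found`.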


## Searching [searching]

You issue search requests for documents with one of these {{es}} endpoints:

```bash
https://ELASTICSEARCH_URL/_search
https://ELASTICSEARCH_URL/INDEX_NAME/_search
```

You can send either a `GET` or a `POST` request with URI search parameters. If you omit the method, curl defaults to a `GET` request:

```bash
curl -u USER:PASSWORD 'https://ELASTICSEARCH_URL/my_index/_search?q=title:T*'
```

For an explanation of the allowed parameters, check [URI Search](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search).

To make {{es}} return a more human-readable JSON response, add `?pretty=true` to the request:

```bash
curl -u USER:PASSWORD 'https://ELASTICSEARCH_URL/my_index/_search?pretty=true' -H 'Content-Type: application/json' -d '{
"query": {
"query_string": {"query": "*"}
}
}'
```

For performance reasons, `?pretty=true` is not recommended in production. You can verify the performance difference yourself by checking the `took` field in the JSON response, which tells you how long {{es}} took to evaluate the search, in milliseconds. When we tested these examples ourselves, the responses reported `"took" : 4` against `"took" : 18`, a substantial difference.
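
To compare the two requests, it helps to print only the `took` value from each response. The sketch below uses a hypothetical function name, `extract_took`, and canned responses that reuse the two values quoted above; it does not measure anything itself.

```bash
# extract_took: reads a search response on stdin and prints the first "took" value.
extract_took() {
  sed -n 's/.*"took"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compact response, as returned without ?pretty=true:
echo '{"took":4,"timed_out":false}' | extract_took
# prints 4

# Pretty-printed response, spread over several lines:
printf '{\n  "took" : 18,\n  "timed_out" : false\n}\n' | extract_took
# prints 18
```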

For a full explanation of how the request body is structured, check [Elasticsearch Request Body documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html). You can also execute multiple queries in one request with the [Multi Search API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-msearch).
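
As one illustration of a request body, the sketch below stores a `match` query in a file and sends it with `-d @FILE`. The file name `search_ruby.json`, the function name `search_ruby`, and the choice of the `tags` field are assumptions made for this example.

```bash
# Hypothetical file name. The query matches documents whose "tags" field contains "ruby"
# and limits the response to five hits.
cat > search_ruby.json <<'EOF'
{
  "query": {
    "match": {"tags": "ruby"}
  },
  "size": 5
}
EOF

# search_ruby: sends the stored query to the index used in the earlier examples.
search_ruby() {
  curl -u USER:PASSWORD 'https://ELASTICSEARCH_URL/my_index/_search?pretty=true' \
    -H 'Content-Type: application/json' -d @search_ruby.json
}
```

After replacing `USER`, `PASSWORD`, and `ELASTICSEARCH_URL`, call `search_ruby` to run the search.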


## Deleting [deleting]

You delete documents from {{es}} by sending `DELETE` requests.

To delete a single document by ID from an earlier example:

```bash
curl -u USER:PASSWORD https://ELASTICSEARCH_URL/my_index/_doc/06NqhW4BnhCSymaqFHQn -XDELETE
```

To delete a whole index, here `my_index`:

```bash
curl -u USER:PASSWORD https://ELASTICSEARCH_URL/my_index -XDELETE
```

{{es}} returns `{"acknowledged":true}` in the JSON response to indicate that the index was deleted.
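
To double-check the deletion, you can send a `HEAD` request for the index and inspect the HTTP status code: `200` means the index exists and `404` means it does not. The following sketch uses hypothetical function names, `index_http_status` and `describe_index_status`.

```bash
# index_http_status INDEX: prints only the HTTP status code of a HEAD request for INDEX.
index_http_status() {
  curl -s -o /dev/null -w '%{http_code}' -I -u USER:PASSWORD "https://ELASTICSEARCH_URL/$1"
}

# describe_index_status CODE: turns the status code into a short message.
describe_index_status() {
  case "$1" in
    200) echo "index exists" ;;
    404) echo "index not found" ;;
    *)   echo "unexpected HTTP status: $1" ;;
  esac
}
```

After replacing the placeholders, `describe_index_status "$(index_http_status my_index)"` reports whether `my_index` is still there.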

% - [ ] ./raw-migrated-files/cloud/cloud/ec-working-with-elasticsearch.md
% - [ ] ./raw-migrated-files/cloud/cloud-enterprise/ece-working-with-elasticsearch.md
82 changes: 76 additions & 6 deletions manage-data/ingest/transform-enrich/data-enrichment.md
@@ -6,15 +6,85 @@ mapped_urls:

# Data enrichment

% What needs to be done: Lift-and-shift
You can use the [enrich processor](asciidocalypse://docs/elasticsearch/docs/reference/ingestion-tools/enrich-processor/enrich-processor.md) to add data from your existing indices to incoming documents during ingest.

% Use migrated content from existing pages that map to this page:
For example, you can use the enrich processor to:

% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/ingest-enriching-data.md
% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/index-mgmt.md
* Identify web services or vendors based on known IP addresses
* Add product information to retail orders based on product IDs
* Supplement contact information based on an email address
* Add postal codes based on user coordinates

% Internal links rely on the following IDs being on this page (e.g. as a heading ID, paragraph ID, etc):

## How the enrich processor works [how-enrich-works]

Most processors are self-contained and only change *existing* data in incoming documents.

:::{image} ../../../images/elasticsearch-reference-ingest-process.svg
:alt: ingest process
:::

The enrich processor adds *new* data to incoming documents and requires a few special components:

:::{image} ../../../images/elasticsearch-reference-enrich-process.svg
:alt: enrich process
:::

$$$enrich-policy$$$

enrich policy
: A set of configuration options used to add the right enrich data to the right incoming documents.

An enrich policy contains:

* A list of one or more *source indices* which store enrich data as documents
* The *policy type* which determines how the processor matches the enrich data to incoming documents
* A *match field* from the source indices used to match incoming documents
* *Enrich fields* containing enrich data from the source indices you want to add to incoming documents

Before it can be used with an enrich processor, an enrich policy must be [executed](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-enrich-execute-policy). When executed, an enrich policy uses enrich data from the policy’s source indices to create a streamlined system index called the *enrich index*. The processor uses this index to match and enrich incoming documents.


$$$source-index$$$

source index
: An index which stores enrich data you’d like to add to incoming documents. You can create and manage these indices just like a regular {{es}} index. You can use multiple source indices in an enrich policy. You can also use the same source index in multiple enrich policies.

$$$enrich-index$$$

$$$enrich-policy$$$
enrich index
: A special system index tied to a specific enrich policy.

Directly matching incoming documents to documents in source indices could be slow and resource intensive. To speed things up, the enrich processor uses an enrich index.

Enrich indices contain enrich data from source indices but have a few special properties to help streamline them:

* They are system indices, meaning they’re managed internally by {{es}} and only intended for use with enrich processors and the {{esql}} `ENRICH` command.
* They always begin with `.enrich-*`.
* They are read-only, meaning you can’t directly change them.
* They are [force merged](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-forcemerge) for fast retrieval.
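
From the command line, the components described above map onto two requests: one that creates the enrich policy and one that executes it to build the enrich index. The following is a minimal sketch. The source index `users`, the policy name `users-policy`, the field names, and the function name are all hypothetical.

```bash
# Hypothetical policy: match incoming documents on "email" against the "users" source index
# and copy the listed enrich fields.
cat > users-policy.json <<'EOF'
{
  "match": {
    "indices": "users",
    "match_field": "email",
    "enrich_fields": ["first_name", "last_name", "city"]
  }
}
EOF

create_and_execute_policy() {
  # Create the enrich policy from the file above.
  curl -u USER:PASSWORD "https://ELASTICSEARCH_URL/_enrich/policy/users-policy" \
    -XPUT -H 'Content-Type: application/json' -d @users-policy.json
  # Execute the policy to build its enrich index.
  curl -u USER:PASSWORD "https://ELASTICSEARCH_URL/_enrich/policy/users-policy/_execute" -XPUT
}
```

Here `match` is the policy type, `indices` names the source index, `match_field` is the match field, and `enrich_fields` lists the enrich fields.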

## Manage enrich policies [manage-enrich-policies]

Use the **Enrich Policies** view to add data from your existing indices to incoming documents during ingest. An enrich policy contains:

* The policy type that determines how the policy matches the enrich data to incoming documents
* The source indices that store enrich data as documents
* The fields from the source indices used to match incoming documents
* The enrich fields containing enrich data from the source indices that you want to add to incoming documents
* An optional [query](asciidocalypse://docs/elasticsearch/docs/reference/query-languages/query-dsl-match-all-query.md).

:::{image} ../../../images/elasticsearch-reference-management-enrich-policies.png
:alt: Enrich policies
:class: screenshot
:::

When creating an enrich policy, the UI walks you through the configuration setup and selecting the fields. Before you can use the policy with an enrich processor or {{esql}} query, you must execute the policy.

When executed, an enrich policy uses enrich data from the policy’s source indices to create a streamlined system index called the enrich index. The policy uses this index to match and enrich incoming documents.
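
Once a policy has been executed, an ingest pipeline can reference it through an enrich processor. The sketch below assumes a hypothetical, already executed policy named `users-policy` that matches on an `email` field. The pipeline name `user_lookup`, the target field `user`, and the sample document are also made up for illustration.

```bash
# Hypothetical pipeline: look up the incoming document's "email" in the enrich index
# of "users-policy" and store the matching enrich data under "user".
cat > user-lookup-pipeline.json <<'EOF'
{
  "processors": [
    {
      "enrich": {
        "policy_name": "users-policy",
        "field": "email",
        "target_field": "user"
      }
    }
  ]
}
EOF

create_pipeline_and_ingest() {
  # Create the ingest pipeline from the file above.
  curl -u USER:PASSWORD "https://ELASTICSEARCH_URL/_ingest/pipeline/user_lookup" \
    -XPUT -H 'Content-Type: application/json' -d @user-lookup-pipeline.json
  # Index a document through the pipeline so that it is enriched during ingest.
  curl -u USER:PASSWORD "https://ELASTICSEARCH_URL/my_index/_doc?pipeline=user_lookup" \
    -XPOST -H 'Content-Type: application/json' -d '{"email": "user@example.com"}'
}
```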

Check out these examples:

* [Example: Enrich your data based on geolocation](/manage-data/ingest/transform-enrich/example-enrich-data-based-on-geolocation.md)
* [Example: Enrich your data based on exact values](/manage-data/ingest/transform-enrich/example-enrich-data-based-on-exact-values.md)
* [Example: Enrich your data by matching a value to a range](/manage-data/ingest/transform-enrich/example-enrich-data-by-matching-value-to-range.md)

This file was deleted.
