Add transform to spec #307

Merged
merged 23 commits into from
Jul 6, 2022

Conversation

Contributor

@eyalkraft eyalkraft commented Mar 30, 2022

What does this PR do?

This is the first iteration of adding Elasticsearch transforms to the spec.

This PR allows adding a transform asset to packages, with an optional destination index template to be installed with it. It also allows installing a transform without starting it.

Upon installation of a package that contains a transform asset, the create behavior is as follows:

  1. If a destination index template is included:
    1. Create the destination index template.
      The name of the created index template is derived from the data type, the name of the package, and the name of the containing folder.
      For example: logs-example_package.example_name.
    2. Add the appropriate _meta fields to mark the matching indices as managed by Fleet.
    3. If a fields directory is included:
      1. Use the defined fields as the destination index template mappings.
    4. If the destination index doesn't already exist:
      1. Initialize the index.
  2. Create the transform in the default namespace using kibana_system privileges.
    The name of the created transform is derived from the data type, the name of the package, and the name of the containing folder.
    For example: logs-example_package.example_name-default-0.0.1.
    The namespace and version are added by Fleet.
  3. If the transform is configured to be started upon installation (no configuration defaults to true):
    1. Start the transform.

Upon upgrading a package that contains a transform asset, the upgrade behavior is as follows:

  1. If a destination index template is included:
    1. Delete the destination index template.
    2. Matching indices are not deleted.
  2. If the transform is started:
    1. Stop the transform.
  3. Delete the transform.
  4. From here, follow the creation behavior.
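
The creation and upgrade sequences described above can be sketched as a small planning function. This only illustrates the ordering given in the PR description and is not Fleet's implementation; the `TransformAsset` class, its field names, and the step strings are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TransformAsset:
    """Hypothetical description of one packaged transform (illustration only)."""
    has_index_template: bool = False
    has_fields: bool = False
    dest_index_exists: bool = False
    start: bool = True               # no configuration defaults to true
    currently_started: bool = False  # only relevant on upgrade


def creation_steps(asset: TransformAsset) -> List[str]:
    """Return the ordered creation steps for a transform asset."""
    steps: List[str] = []
    if asset.has_index_template:
        steps.append("create destination index template")
        steps.append("add Fleet _meta fields")
        if asset.has_fields:
            steps.append("use fields as template mappings")
        if not asset.dest_index_exists:
            steps.append("initialize destination index")
    steps.append("create transform")
    if asset.start:
        steps.append("start transform")
    return steps


def upgrade_steps(asset: TransformAsset) -> List[str]:
    """Return the ordered upgrade steps: tear down, then follow creation."""
    steps: List[str] = []
    if asset.has_index_template:
        # Matching indices are not deleted, only the template.
        steps.append("delete destination index template")
    if asset.currently_started:
        steps.append("stop transform")
    steps.append("delete transform")
    return steps + creation_steps(asset)
```

For a package with no destination index template, `creation_steps(TransformAsset())` yields only the create and start steps.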

The recommended naming convention for the source and destination indices is the data stream naming scheme, or patterns matching the data stream naming scheme.
Data streams are named in the following way: {type}-{dataset}-{namespace}.
For example: logs-example_package.example_name-default.
For more information see the introduction to the Elastic data stream naming scheme.
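
As a rough illustration of the {type}-{dataset}-{namespace} scheme, the sketch below splits a name into its three parts. It assumes that the dataset and namespace contain no hyphens, so the hyphen can serve as the separator; the function and class names are hypothetical.

```python
from typing import NamedTuple


class DataStreamName(NamedTuple):
    data_type: str
    dataset: str
    namespace: str


def parse_data_stream_name(name: str) -> DataStreamName:
    """Split '{type}-{dataset}-{namespace}' into its parts.

    Assumption: dataset and namespace contain no '-' characters.
    """
    parts = name.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a {{type}}-{{dataset}}-{{namespace}} name: {name!r}")
    return DataStreamName(*parts)


def format_data_stream_name(data_type: str, dataset: str, namespace: str) -> str:
    """Build a data stream name from its three parts."""
    return f"{data_type}-{dataset}-{namespace}"
```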

What doesn't this PR allow?

This list of possible changes will be further discussed in follow up issues:

  1. Transform installations to namespaces other than default.
  2. Installing a transform without starting it. (Now included in this PR.)
  3. Installing a transform using any user other than kibana_system (such as the installing user or a service user). This limits the source indices that package-installed transforms can use.
  4. Smart upgrading of transforms (as opposed to destroy and re-create).
  5. Chained transform (or other asset) dependencies and control over their installation order.
  6. Installing stored scripts.

Why is it important?

  1. Required for the Cloud Security Posture integration
  2. Already documented as supported

Checklist

Related issues

@eyalkraft eyalkraft requested a review from a team as a code owner March 30, 2022 14:15
elasticmachine commented Mar 30, 2022

💚 Build Succeeded

Build stats

  • Start Time: 2022-06-27T17:22:40.192+0000

  • Duration: 11 min 43 sec

@mtojek mtojek requested review from joshdover and ruflin March 30, 2022 14:58
Contributor

ruflin commented Mar 31, 2022

I'm happy to see that we complete the package spec with the things we already support / have implemented. We should have a follow-up discussion on how transforms should work in the future. Even though it works today, I would argue it is less than ideal. Many conversations happened in the past in #23 and linked issues. There were also discussions with the team behind transforms on how it could be improved, but I'm not sure where we landed there. @kevinlog you might know?

@eyalkraft eyalkraft requested a review from mtojek March 31, 2022 08:44
Contributor

@mtojek mtojek left a comment

Few nit-picks on my side. Regarding the transform logic, I'm leaving the final decision about this PR for @joshdover.

# https://www.elastic.co/guide/en/elasticsearch/reference/current/transforms.html
type: file
contentMediaType: "application/json"
pattern: '^.+\.json$'
Contributor

Do you think we should apply any naming requirements? For example, to prepend a package name to the file. Is it possible that another package can override this transform?

Contributor Author

@eyalkraft eyalkraft Mar 31, 2022

I'm not sure, but I think the filename doesn't affect the created transform at all.
As you can see in our integration (works locally), we have two different transforms in different directories with the same file name (default.json) and they don't override each other.
@joshdover could probably explain better the exact meaning and implications of the filenames and the containing directory names.
The Endpoint Security integration transforms look the same.

Contributor

@mtojek mtojek Mar 31, 2022

endpoint security integration transforms look the same.

From the spec perspective, all Endpoint security integration transforms are illegal as they haven't been documented. This is the reason why you need to deal with it now. We will enable the strict package validation soon and it will block the publishing of packages with undocumented (= illegal) features. cc @pzl

I'm not sure but I think the filename doesn't effect the created Transform at all.
As you can see in our integration (works locally) we have two different transforms in different directories with the same file name (default.json) and they don't override each other.

Let me elaborate here: what will happen if we have another package foobar with exactly the same transforms and directories? When we install it, will it overwrite your transforms?

Contributor

cc @pzl

Contributor Author

Let me elaborate here, what will happen if we have another package foobar with exact transforms and directories? When we install it, will it overwrite your transforms?

To my understanding, the answer is no.
It seems the id of the transform is derived from both the integration package name and the transform directory name.

Screen Shot 2022-03-31 at 15 04 25

Additionally, after talking with @CohenIdo about the naming of the transform files, I understood he was instructed to name them default.json, so I'll change the spec to reflect this requirement.

Anyway, I'm sure Josh will shed more light on this topic.

@eyalkraft thanks for doing this, it's something I should have done some time back when my team originally added transforms.

@mtojek

I'm not sure but I think the filename doesn't effect the created Transform at all.
Let me elaborate here, what will happen if we have another package foobar with exact transforms and directories? When we install it, will it overwrite your transforms?

Currently, when transforms are installed, the package name is prepended (along with the directory path) to the actual name of the transform. Thus, it doesn't matter what the definition .json file is named. So, as long as the packages themselves are named differently, I don't think there will be any collisions. Here is what our transforms are named when installed. (@pzl let me know if I have anything wrong here)

image

That being said, I am OK with any naming convention we want to enforce and we can update the names of files to match.

We will enable the strict package validation soon and it will block the publishing of packages with undocumented (= illegal) features.

For clarity, adding this PR to the package spec will "document" the transform feature and will no longer make it illegal, correct?

Further, let us know when we can test against strict package enforcement so we can fix any other problems on our end and be prepared for when it is live. I would ask that we can publish packages as is for the next couple weeks so that we can push our package with changes for the 8.2 release and in case we need to push anything additional to fix bugs.

Contributor

That being said, I am OK with any naming convention we want to enforce and we can update the names of files to match.

That's the killer argument for this issue. I guess we can resolve the conversation. Thanks for jumping in, Kevin.

For clarity, adding this PR to the package spec will "document" the transform feature and will no longer make it illegal, correct?

Yes, that's the idea.

Further, let us know when we can test against strict package enforcement so we can fix any other problems on our end and be prepared for when it is live. I would ask that we can publish packages as is for the next couple weeks so that we can push our package with changes for the 8.2 release and in case we need to push anything additional to fix bugs.

@eyalkraft It would be great if we can copy Endpoint's transforms here as test cases to make sure they are aligned with new spec requirements. @kevinlog What would be the best source to copy from?

@kevinlog kevinlog Mar 31, 2022

@eyalkraft @mtojek

@ eyalkraft It would be great if we can copy Endpoint's transforms here as test cases to make sure they are aligned with new spec requirements. @ kevinlog What would be the best source to copy from?

Either of the definitions here is good: https://github.com/elastic/endpoint-package/tree/master/package/endpoint/elasticsearch/transform

Contributor Author

Added both

@eyalkraft eyalkraft self-assigned this Mar 31, 2022
Contributor

@joshdover joshdover left a comment

I've been meaning to sit down and write some thoughts on the open issue for this #23, but now is as good a time as any. I see two main problems with the existing implementation of transforms as used by the Endpoint package:

An index template is required, but there is no explicit link in the package (or the package install implementation) between the template and the transform

Transforms require an index template as well (which is another 'illegal' feature), yet in Endpoint's current transform implementation there is no explicit link between the index template and transform. This creates problems when a transform's source index is another transform's destination index, as @CohenIdo noted in #23 (comment). In such a scenario, it'd be ideal for our package install code to be able to be aware of both the dependencies between transforms and between a transform and its index template. This would allow us to install the index templates and create the destination indices before setting up the transforms.

The CSP plugin is currently hacking around this limitation by manually creating the index template in their Kibana plugin. This means the index template is currently duplicated between the Kibana codebase and the CSP package source code.

Adding support for namespace-specific transforms

We need to keep this future requirement in mind and have an idea of how we will support it (see private issue linked below). Currently the transforms contain the default namespace directly in the transform's name and source/dest indices. We should instead provide some sort of naming convention/pattern that does not include the namespace and rely on the package install code to add the namespace at package install time. In the future, we'll be providing an API to add namespaces and the associated resources, see elastic/kibana#121118


I think we should redesign how this works a bit. What about supporting a YAML file like this?

description: Latest Endpoint metadata document per host
frequency: 10s
source:
  # namespace would be appended as `metrics-endpoint.metadata-default*`
  index_prefix: metrics-endpoint.metadata-
  query: |
    {
      "range": {
        "@timestamp": {
          "gt": "now-90d/d"
        }
      }
    }
dest:
  # namespace would be appended as `metrics-endpoint.metadata_current-default`
  index_prefix: metrics-endpoint.metadata_current-
  # Raw index template json
  index_template: |
latest: |
  {
    "unique_key": [
      "elastic.agent.id"
    ],
    "sort": "@timestamp"
  }
sync: |
  {
    "time": {
      "field": "event.ingested",
      "delay": "10s"
    }
  }
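
The comments in the YAML above describe how the namespace would be appended to each index_prefix at install time. A minimal sketch of that resolution, assuming the install code simply concatenates prefix and namespace (the function name and return shape are hypothetical, not part of Fleet):

```python
from typing import Dict


def resolve_transform_indices(source_prefix: str, dest_prefix: str, namespace: str) -> Dict[str, str]:
    """Append the namespace to the source and destination index prefixes.

    The source becomes a wildcard pattern; the destination a concrete name,
    as described in the YAML comments of the proposal.
    """
    return {
        "source": f"{source_prefix}{namespace}*",
        "dest": f"{dest_prefix}{namespace}",
    }
```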

One challenge here is that to solve the above problems we may also need to support the existing pattern that the Endpoint plugin uses. Alternatively, we could switch over to a new pattern in a given stack version and older Endpoint packages wouldn't be installable in those newer Stack versions. @kevinlog would this be acceptable?

@kevinlog

@joshdover

Alternatively, we could switch over to a new pattern in a given stack version and older Endpoint packages wouldn't be installable in those newer Stack versions. @kevinlog would this be acceptable?

This would be OK from our perspective since we tear down the old transform and install the new one on each package upgrade. We always tie a new package release to a new stack release. One thing to keep in mind is that the new package doesn't get installed until a user with proper permissions visits Fleet or Security in Kibana. I know there is talk of installing packages in ES at upgrade time which would solve this drift. I still think we would be OK since the older package/assets would already be installed and we would only try to install the new package with the new patterns.

Let me know if the above makes sense.

@joshdover
Contributor

One thing to keep in mind is that the new package doesn't get installed until a user with proper permissions visits Fleet or Security in Kibana. I know there is talk of installing packages in ES at upgrade time which would solve this drift. I still think we would be OK since the older package/assets would already be installed and we would only try to install the new package with the new patterns.

Is this still true? Fleet changed to auto upgrade packages on boot using the kibana_system user in 8.0 without requiring any users to visit the UI. Is there custom code in the Endpoint side that still needs to run separate from Fleet's package install/upgrade process?

sophiec20 commented Mar 31, 2022

Transforms require an index template as well (which is another 'illegal' feature)

It is best practice for transforms (pivot transforms), when deployed as part of a package, to explicitly define the destination index mappings. This can be achieved by either:
a) creating the index with its defined mappings, or
b) creating an index template and specifying deduce_mappings: false as a transform setting

I agree with @joshdover that it should be part of the package.
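
To illustrate option b), the sketch below builds a pivot transform body with deduce_mappings disabled under settings, so the destination mappings would come from the index template rather than being deduced. The helper name, the grouping field, and the aggregation are hypothetical; only the overall body layout (source, dest, pivot, settings) follows the transform API.

```python
from typing import Any, Dict


def build_pivot_transform(source_index: str, dest_index: str, group_field: str) -> Dict[str, Any]:
    """Build a pivot transform body that relies on predefined mappings."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "pivot": {
            # Hypothetical grouping and aggregation, for illustration only.
            "group_by": {group_field: {"terms": {"field": group_field}}},
            "aggregations": {"doc_count": {"value_count": {"field": group_field}}},
        },
        # Do not deduce destination mappings; rely on the index template.
        "settings": {"deduce_mappings": False},
    }
```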

I'll also add that a requirement to specify an alias would help future-proof upgrade scenarios.

Adding support for namespace-specific transforms

Controlling the name of the transform and its destination and source indices might not be applicable to all use cases. It might fit for the endpoint use case, but there are multiple future requests for packages which contain transforms (such as Host Risk Score and Beaconing, from a different part of Security). These do not currently conform to the Fleet naming conventions for data streams afaik.

What naming conventions are being enforced with this PR? Can we review to ensure there is enough flexibility to cover other transform use cases?

@kevinlog

@joshdover

Is this still true? Fleet changed to auto upgrade packages on boot using the kibana_system user in 8.0 without requiring any users to visit the UI. Is there custom code in the Endpoint side that still needs to run separate from Fleet's package install/upgrade process?

You are correct - there is no other custom code on the Endpoint side. I simply forgot about this change, apologies for the confusion.

contents:
- description: An Elasticsearch transform folder
type: folder
pattern: '^[a-z|_]+$'
Contributor

CI complains about your test case. Should this pattern also contain 0-9?

Contributor Author

Done (and removed |)
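
For reference, a quick check of why the original pattern was a problem: inside a character class, `|` is a literal pipe rather than alternation, and digits are not matched. The snippet compares the pattern under review with `^[a-z0-9_]+$`, the form quoted later in this thread; the helper name is hypothetical.

```python
import re

OLD_PATTERN = re.compile(r"^[a-z|_]+$")    # pattern under review
NEW_PATTERN = re.compile(r"^[a-z0-9_]+$")  # pattern after the fix


def folder_name_ok(pattern: "re.Pattern[str]", name: str) -> bool:
    """Return True if the folder name matches the given spec pattern."""
    return pattern.match(name) is not None
```

With the old pattern, a folder named `example_transformation_01` is rejected because of the digits, while a name containing a literal `|` is accepted.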

@@ -0,0 +1,46 @@
{
Contributor

Could you please also add the transforms referenced by Kevin in the other conversation?

Contributor Author

Done

Contributor Author

eyalkraft commented Apr 3, 2022

@sophiec20: What naming conventions are being enforced with this PR? Can we review to ensure there is enough flexibility to cover other transform use cases.

As can be seen in the spec, this PR allows:

  • Optionally adding a directory called transform under the elasticsearch directory to represent transform assets.
  • The transform directory may contain subdirectories. Each one of these directories represents a transform asset to be installed upon package installation. The names of these subdirectories are allowed to contain lowercase letters, numbers and underscores. The directory name ends up being part of the transform id.
  • Each one of these subdirectories (for example elasticsearch/transform/example_transformation_01/) should contain a default.json file representing the transformation itself.
  • The spec says nothing about the source/destination indices of the transforms.

Regarding the naming of the created transform, as can be seen here:
Screen Shot 2022-03-31 at 15 04 25
The package assets that were installed:

endpoint/elasticsearch/transform/metadata_current/default.json
endpoint/elasticsearch/transform/metadata_united/default.json
cis_kubernetes_benchmark/elasticsearch/transform/latest/default.json
cis_kubernetes_benchmark/elasticsearch/transform/score/default.json

The ids of the created transforms:

endpoint.metadata_current-default-1.6.0-dev.0
endpoint.metadata_united-default-1.6.0-dev.0
cis_kubernetes_benchmark.latest-default-0.0.1
cis_kubernetes_benchmark.score-default-0.0.1

Note that it's not the spec that determines these ids but the package installation code in Kibana.
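
The mapping from asset path to transform id observed above can be reproduced with a short sketch. This reconstructs the pattern visible in the listed examples and is not the Kibana installation code; the function name and the path layout check are assumptions.

```python
def observed_transform_id(asset_path: str, version: str, namespace: str = "default") -> str:
    """Derive the transform id seen in the examples above.

    Expects '<package>/elasticsearch/transform/<folder>/default.json'.
    The file name itself does not appear in the id.
    """
    parts = asset_path.split("/")
    if len(parts) != 5 or parts[1:3] != ["elasticsearch", "transform"]:
        raise ValueError(f"unexpected transform asset path: {asset_path!r}")
    package, folder = parts[0], parts[3]
    return f"{package}.{folder}-{namespace}-{version}"
```

Because the package name is part of the id, two packages with identical transform folder names produce different ids.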

@joshdover: I think we should redesign how this works a bit, what about supporting a yaml file like this?

Not going into the complicated details of your comment, but just noting that mixing YAML with raw JSON embedded as strings can result in pretty ugly parsing logic and might lead to mistakes while creating these assets. One thing that really helped us during the integration development is that the asset formats are (almost?) identical to the exported saved-object JSON. This makes sense, since under the hood Kibana actually uses the "import saved object" API for installing package assets. This let us first create whatever asset we wanted our integration to have in Kibana using the UI, and only then export it as JSON and add it to our integration (no manual asset creation).

This is your YAML parsed to JSON:
{
  "description": "Latest Endpoint metadata document per host",
  "frequency": "10s",
  "source": {
    "index_prefix": "metrics-endpoint.metadata-",
    "query": "{\n  \"range\": {\n    \"@timestamp\": {\n      \"gt\": \"now-90d/d\"\n    }\n  }\n}\n"
  },
  "dest": {
    "index_prefix": "metrics-endpoint.metadata_current-",
    "index_template": ""
  },
  "latest": "{\n  \"unique_key\": [\n    \"elastic.agent.id\"\n  ],\n  \"sort\": \"@timestamp\"\n}\n",
  "sync": "{\n  \"time\": {\n    \"field\": \"event.ingested\",\n    \"delay\": \"10s\"\n  }\n}"
}

I'm probably missing the value of having the raw index template JSON inside the YAML as a string. What is it?
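
To make the parsing concern concrete: with JSON embedded as strings, every consumer needs a second decoding pass per embedded field. A minimal sketch, using a trimmed copy of the parsed document above (the helper name is hypothetical):

```python
import json
from typing import Any, Dict, Iterable


def decode_embedded_json(doc: Dict[str, Any], keys: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of doc with the given string fields parsed as JSON."""
    decoded = dict(doc)
    for key in keys:
        decoded[key] = json.loads(doc[key])
    return decoded


# Trimmed version of the mixed YAML/JSON result: 'latest' is still a string.
mixed = {
    "frequency": "10s",
    "latest": "{\n  \"unique_key\": [\n    \"elastic.agent.id\"\n  ],\n  \"sort\": \"@timestamp\"\n}\n",
}

clean = decode_embedded_json(mixed, ["latest"])
```

The caller also has to know which fields hold embedded JSON, which is exactly the kind of implicit knowledge that tends to cause mistakes.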

Alternatively, what's wrong with going for yaml alone?

Alternative YAML suggestion:
description: Latest Endpoint metadata document per host
frequency: 10s
source:
  # namespace would be appended as `metrics-endpoint.metadata-default*`
  index_prefix: metrics-endpoint.metadata-
  query:
    range:
      "@timestamp":
        gt: "now-90d/d"
dest:
  # namespace would be appended as `metrics-endpoint.metadata_current-default`
  index_prefix: metrics-endpoint.metadata_current-
  index_template:
latest:
  unique_key:
    - elastic.agent.id
  sort: "@timestamp"
sync:
  time:
    field: event.ingested
    delay: 10s

And as a json this looks like:

{
  "description": "Latest Endpoint metadata document per host",
  "frequency": "10s",
  "source": {
    "index_prefix": "metrics-endpoint.metadata-",
    "query": {
      "range": {
        "@timestamp": {
          "gt": "now-90d/d"
        }
      }
    }
  },
  "dest": {
    "index_prefix": "metrics-endpoint.metadata_current-",
    "index_template": null
  },
  "latest": {
    "unique_key": [
      "elastic.agent.id"
    ],
    "sort": "@timestamp"
  },
  "sync": {
    "time": {
      "field": "event.ingested",
      "delay": "10s"
    }
  }
}

@eyalkraft
Contributor Author

Also another suggestion, since:

  1. This spec change is a blocker for the @elastic/cloud-posture-security team (specifically for Add Kubernetes CIS Benchmark integration integrations#2930)
  2. Transformations as integration-package assets already exist de-facto, and this PR changes the spec to reflect it

What do you think about merging this PR to align the spec with Kibana & the endpoint integration package, and opening another issue to continue discussing the problems and open ends with today's implementation of transformation assets?

I'll happily open the follow-up issue with all the information from this thread if you all agree.

@mtojek @ruflin @joshdover @kevinlog @sophiec20

@joshdover
Contributor

Controlling the name of the transform and its destination and source indices might not be applicable to all use cases. It might fit for the endpoint use case, but there are multiple future requests for packages which contain transforms (such as Host Risk Score and Beaconing, from a different part of Security). These do not currently conform to the Fleet naming conventions for data streams afaik.

@sophiec20 Is there a reason they can't conform to the data stream naming convention? It's a goal for all data ingested by Elastic integrations to conform to this convention as it provides many benefits across the entire architecture.

Not going into the complicated details of your comment but just noting that mixing yaml and raw json inside as strings can result with a pretty ugly parsing logic, and might lead to mistakes while creating these assets.

Alternatively, what's wrong with going for yaml alone?

@eyalkraft Yeah I'm not too particular on the mix of yaml and json in my example. Going pure YAML probably makes sense. The goal of my example was to demonstrate how accommodating my concerns around the associated index template and dynamic namespace could work.

What do you think about merging this PR to align the spec with Kibana & the endpoint integration package, and opening another issue to continue discussing the problems and open ends with today's implementation of transformation assets?

If we're going to make breaking changes to how this works, I'd rather not add it to the spec now since then other packages may start to use this and we'd then need to support two ways of doing the same thing from the install side.

If we defer on adding it to the spec now, could CSP continue using the existing implementation in Kibana today and then accommodate a new updated pattern with a hard version requirement (constraints.kibana.version) in the future, similar to what Endpoint is able to accommodate? Can we make an exception in the Integrations repo CI to allow the CSP packages to merge with this violation for now?

Contributor

CohenIdo commented Apr 4, 2022

Hey @joshdover, thanks for syncing everyone on the index-template issue.
I want to add something else in the same context:
When adding a new transform whose source index does not exist, we also get an error. As a test, I created a transform that uses logs-foo as a source index, and when I installed the package I got the following error message:

 Failure to install package [cis_kubernetes_benchmark]: [ResponseError: validation_exception: [validation_exception] Reason: Validation Failed: 1: no such index [logs-foo];]

So I think that, in addition to adding an index-template section to the transform config, we will also need a create_index boolean flag.

Contributor

mtojek commented Apr 5, 2022

Can we make an exception in the Integrations repo CI to allow the CSP packages to merge with this violation for now?

@joshdover To be honest, I'm on the fence, as we obey spec rules. We don't have a mechanism to disable spec validation now, and even if we had one, it would be a backdoor many developers could use :) So rather than disabling the spec, I would recommend merging this PR, as it just legalizes a "transform directory":

    - description: Folder containing Elasticsearch Transforms
      # https://www.elastic.co/guide/en/elasticsearch/reference/current/transforms.html
      type: folder
      name: transform
      required: false
      contents:
        - description: An Elasticsearch transform folder
          type: folder
          pattern: '^[a-z0-9_]+$'
          contents:
            - description: An Elasticsearch transform file
              type: file
              contentMediaType: "application/json"
              pattern: '^default\.json$'
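
As a rough idea of what the spec snippet above permits, the following sketch checks file paths (relative to the package root) against the two patterns. It is not the package-spec validator; the function name and error messages are hypothetical.

```python
import re
from typing import List

FOLDER_PATTERN = re.compile(r"^[a-z0-9_]+$")
FILE_PATTERN = re.compile(r"^default\.json$")


def check_transform_paths(paths: List[str]) -> List[str]:
    """Return a list of violations for elasticsearch/transform/<folder>/<file> paths."""
    errors: List[str] = []
    for path in paths:
        parts = path.split("/")
        if len(parts) != 4 or parts[:2] != ["elasticsearch", "transform"]:
            errors.append(f"{path}: expected elasticsearch/transform/<folder>/<file>")
            continue
        folder, filename = parts[2], parts[3]
        if not FOLDER_PATTERN.match(folder):
            errors.append(f"{path}: folder name {folder!r} does not match the spec pattern")
        if not FILE_PATTERN.match(filename):
            errors.append(f"{path}: file name {filename!r} must be default.json")
    return errors
```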

Transformations as integration-package assets already exist de-facto, and this PR changes the spec to reflect it

^ this, to be honest. We already "committed" a crime, let's just confirm it :)

@joshdover
Contributor

Transformations as integration-package assets already exist de-facto, and this PR changes the spec to reflect it

^ this, to be honest. We already "committed" a crime, let's just confirm it :)

This will mean we have to support this for quite a while until we do a v2 of the spec, which is unfortunate since it's only used by 1 package today, one that we have very tight control over. If we add this way of configuring transforms to the spec any package can start using it and we already know this way has problems.

This current way of configuring transforms even poses problems for the very use case we're trying to unblock here and we're already having to hack around it in the CSP Kibana plugin. If we're already having to hack around it, why don't we do this:

  1. Don't add this 'broken' way of configuring transforms to the spec
  2. Don't add transforms to the CSP package at all until we have fleshed out the long-term support for this in the spec
  3. Extend the hack in the CSP Kibana plugin to handle all the transform installation logic until we flesh out the long-term support for transforms in the spec

I just don't see the benefit of adding a broken feature to the spec that we're going to have to keep supporting. We've lived with this 'illegal' feature for some time, so why not leave it as-is until we have the time to really fix it, rather than committing to supporting its problems for potentially years when there's not much incentive to do so?

In terms of the long-term approach, how much do we still need to decide? Could we iron out the details in the next week or so?

Contributor

@joshdover joshdover left a comment

Looking very close! A couple last suggestions

@kevinlog

@joshdover

@kevinlog The uninstall logic in Fleet is very basic right now. We keep track of references of all assets we create during install, regardless of which part of the install process created the asset, and then just delete those assets.

So I think since we will have a reference to the index template we created in the legacy package, when we do the upgrade, the old template should get removed regardless. Note that we don't delete indices, would that be required?

If we conform to the proposed index template naming, it seems it would imply that we should rename the destination indices for our transforms.

For instance: https://github.com/elastic/endpoint-package/blob/master/package/endpoint/elasticsearch/index_template/metrics-metadata-current.json

Our index is coded to be: metrics-endpoint.metadata_current_default which is different than the proposed index template naming convention package-name.index-name, note the metrics qualifier. It's OK to create new indices and index our data into those new indices from scratch, however we should clean up the old indices. Maybe we have a one-time cleanup to account for this which should be a special case?

Let me know if the above makes sense. I may be missing something.

cc @eyalkraft @pzl @joeypoon

@eyalkraft
Contributor Author

@kevinlog

If we conform to the proposed index template naming, it seems it would imply that we should rename the destination indices for our transforms.

I don't think that would be necessary.

The name of the created index template will be derived from the name of the package and the name of the containing folder, For example: example_package.example_name.

Note that this talks about the name of the index template itself, and not about the index pattern(s).
Your index template could have the same index pattern as of now.

See for example one of the index templates we use:
Screen Shot 2022-06-26 at 16 35 40
It doesn't have logs in the name, but only in the index pattern.

Your index template
Screen Shot 2022-06-26 at 16 38 05
Could be renamed to endpoint.metadata_current and keep the index pattern as is.


I'm updating the spec and issue descriptions with the naming information as it was a bit hidden in one of the previous comments.

Contributor Author

@eyalkraft eyalkraft left a comment


TODO: Index naming constraints.

Could be done on the following iteration.

"$id": "#root/dest/index"
title: Index
type: string
pattern: "^.*$" # TODO: Enforce the data stream naming scheme.
Contributor Author


Destination index naming constraints

- type: string
examples:
- kibana_sample_data_ecommerce
pattern: "^.*$" # TODO: Enforce the data stream naming scheme.
Contributor Author


Source index naming constraints
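
To illustrate what the TODO on these patterns might eventually mean, here is a rough sketch comparing the current placeholder with a stricter check. The stricter regex is an assumption based on the general `<type>-<dataset>-<namespace>` data stream layout; it is not part of this spec, and `is_valid_index_name` is a hypothetical helper.

```python
import re

# Placeholder currently in the spec: accepts any single-line string.
PLACEHOLDER = re.compile(r"^.*$")

# Hypothetical stricter pattern (an assumption, not part of the spec):
# three non-empty, hyphen-free segments, <type>-<dataset>-<namespace>.
DATA_STREAM_LIKE = re.compile(r"^[^-]+-[^-]+-[^-]+$")


def is_valid_index_name(name: str, pattern: re.Pattern = DATA_STREAM_LIKE) -> bool:
    """Return True if the whole name matches the given pattern."""
    return pattern.fullmatch(name) is not None
```

With the placeholder, a name such as kibana_sample_data_ecommerce is accepted; under the assumed stricter pattern it would be rejected, while a name shaped like logs-nginx.access-default would pass.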

@kevinlog

@eyalkraft

Note that this talks about the name of the index template itself, and not about the index pattern(s).
Your index template could have the same index pattern as of now.

You are correct here, thanks for showing your team's example as well!

@joshdover my prior concerns about upgrades and deleting old indices shouldn't be an issue. We can change the name of our index template and/or move the template files in the directory, but we can leave the index patterns the same.

cc @pzl @joeypoon

@eyalkraft eyalkraft requested a review from joshdover June 27, 2022 10:00
@joshdover
Contributor

Note that this talks about the name of the index template itself, and not about the index pattern(s). Your index template could have the same index pattern as of now.

See for example one of the index templates we use: [screenshot] It doesn't have logs in the name, but only in the index pattern.

@eyalkraft Is there a reason not to include the type of the data in the name of the index template? I think it should match the existing pattern we have for all other index templates which does include the type prefix, for example the logs-nginx.access template matches logs-nginx.access-*.

You are right that the name of the index template shouldn't impact the name of the destination indices, but IMO, they should actually match so this isn't confusing or inconsistent.
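
A small sketch of the convention described here, where the index template name and the pattern it matches stay aligned. The helper names are hypothetical, and Python's `fnmatchcase` is only an approximation of how Elasticsearch evaluates index patterns.

```python
from fnmatch import fnmatchcase


def default_index_pattern(template_name: str) -> str:
    """Pattern a template would use if name and pattern are kept aligned."""
    return f"{template_name}-*"


def template_matches(template_name: str, index_name: str) -> bool:
    """Check whether an index name falls under the template's aligned pattern."""
    return fnmatchcase(index_name, default_index_pattern(template_name))
```

For example, `default_index_pattern("logs-nginx.access")` yields `logs-nginx.access-*`, so an index in that family matches while one with a different type prefix does not.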

@eyalkraft
Contributor Author

@joshdover

@eyalkraft Is there a reason not to include the type of the data in the name of the index template? I think it should match the existing pattern we have for all other index templates which does include the type prefix, for example the logs-nginx.access template matches logs-nginx.access-*.

Actually, there is no reason I know of not to include the type of the data in the name of the index template.
I was under the impression that the names we use had gone through some verification, but I guess I was wrong.
We will rename our index templates to match the suggested pattern.

I'm updating the PR and the description to match.

cc @kevinlog sorry for the confusion

@eyalkraft
Contributor Author

Following another quick sync with @joshdover regarding the naming of the transform, we agreed it makes sense for the transform to be named in a similar way to the index template (i.e., to include the data type prefix).
This means both CSP and endpoint's transforms would be renamed.
So for example
endpoint.metadata_current-default-8.3.0 -> metrics-endpoint.metadata_current-default-8.3.0.

As with the index template names, the name of the transform doesn't affect the destination/source indices it runs on, but ideally it correlates with them to provide a clear user experience.
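
The naming agreed on in this thread can be sketched as follows. The function names are hypothetical; the formats follow the examples given in the PR description and in the comment above, with the namespace and version suffix added by Fleet.

```python
def index_template_name(data_type: str, package: str, folder: str) -> str:
    """<type>-<package>.<folder>, e.g. logs-example_package.example_name."""
    return f"{data_type}-{package}.{folder}"


def transform_name(
    data_type: str,
    package: str,
    folder: str,
    namespace: str,
    version: str,
) -> str:
    """Index template name plus the namespace and package version suffix."""
    base = index_template_name(data_type, package, folder)
    return f"{base}-{namespace}-{version}"
```

With these helpers, `transform_name("metrics", "endpoint", "metadata_current", "default", "8.3.0")` produces the renamed endpoint transform from the example above.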

@kevinlog

Contributor

@joshdover joshdover left a comment


LGTM, thank you for locking down the transform fields.

Member

@jsoriano jsoriano left a comment


🚀

@joshdover joshdover merged commit 1395e87 into elastic:main Jul 6, 2022
@joshdover
Contributor

Implementation for supporting this in Fleet's installation code will be discussed in elastic/kibana#134321. This is not yet scheduled.

@joshdover
Contributor

I've opened a follow-up issue for the future improvements that still need to be made, many of which were discussed in this PR: #370

Development

Successfully merging this pull request may close these issues.

Add Facility for deploying ElasticSearch Transform