Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LogstashFormatter: Move field_skip_set creation to __init__ #97

Merged
merged 1 commit into from
Mar 28, 2024

Conversation

feliixx
Copy link
Contributor

@feliixx feliixx commented Mar 22, 2024

Commit baf2118 introduced a new set field_skip_set to store the fields to
skip, unfortunately this field is created too early, causing modification
of FORMATTER_RECORD_FIELD_SKIP_LIST by library users to be ignored.
Let's create the field at the end of init instead to fix the regression.

Fixes #96

Commit baf2118 introduced a new set `field_skip_set` to store the fields to
skip, unfortunately this field is created too early, causing modification
of `FORMATTER_RECORD_FIELD_SKIP_LIST` by library users to be ignored.
Let's create the field at the end of __init__ instead to fix the regression.

Fixes eht16#96
@eht16
Copy link
Owner

eht16 commented Mar 24, 2024

Thanks for raising the issue and fixing it.

Just for the records: it was possible if the "constants" module have been imported before the "formatter" module, e.g.:

from logstash_async.constants import constants
constants.FORMATTER_RECORD_FIELD_SKIP_LIST = \
    constants.FORMATTER_RECORD_FIELD_SKIP_LIST + [
        'custom_data', 'custom_message', 'func_name', 'interpreter',
        'interpreter_version', 'line', 'logsource',
        'logstash_async_version', 'pid', 'process_name', 'program',
        'thread_name']

from logstash_async.handler import AsynchronousLogstashHandler
from logstash_async.formatter import LogstashFormatter

But this might be a bit cumbersome and most probably most linters won't like this.

@eht16
Copy link
Owner

eht16 commented Mar 24, 2024

Since the FORMATTER_LOGSTASH_ECS_MESSAGE_FIELD_LIST setting is affected in the same way, we should move it too.

Should be like:

diff --git a/logstash_async/formatter.py b/logstash_async/formatter.py
index cc9cf09..e57c84a 100644
--- a/logstash_async/formatter.py
+++ b/logstash_async/formatter.py
@@ -27,9 +27,6 @@ class LogstashFormatter(logging.Formatter):
 
     _basic_data_types = (type(None), bool, str, int, float)
 
-    field_skip_set = set(constants.FORMATTER_RECORD_FIELD_SKIP_LIST)
-    top_level_field_set = set(constants.FORMATTER_LOGSTASH_MESSAGE_FIELD_LIST)
-
     class MessageSchema:
         TIMESTAMP = '@timestamp'
         VERSION = '@version'
@@ -89,6 +86,9 @@ class LogstashFormatter(logging.Formatter):
         self._prefetch_logsource()
         self._prefetch_program_name()
 
+        self.field_skip_set = set(constants.FORMATTER_RECORD_FIELD_SKIP_LIST)
+        self.top_level_field_set = set(constants.FORMATTER_LOGSTASH_MESSAGE_FIELD_LIST)
+
     # ----------------------------------------------------------------------
     def _prefetch_interpreter(self):
         """Override when needed"""
@@ -270,10 +270,13 @@ class LogstashEcsFormatter(LogstashFormatter):
     }
 
     normalize_ecs_message = constants.FORMATTER_LOGSTASH_ECS_NORMALIZE_MESSAGE
-    top_level_field_set = {*constants.FORMATTER_LOGSTASH_ECS_MESSAGE_FIELD_LIST,
-                           *__schema_dict.values()}
     MessageSchema = type('MessageSchema', (LogstashFormatter.MessageSchema,), __schema_dict)
 
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self.top_level_field_set = {*constants.FORMATTER_LOGSTASH_ECS_MESSAGE_FIELD_LIST,
+                                    *self.__schema_dict.values()}
+
     def _get_primary_fields(self, record):
         message = super()._get_primary_fields(record)
         Schema = self.MessageSchema
@@ -407,13 +410,16 @@ class DjangoLogstashEcsFormatter(DjangoLogstashFormatter, LogstashEcsFormatter):
         'REQ_REFERER': 'http.request.referrer',
     }
 
-    top_level_field_set = LogstashEcsFormatter.top_level_field_set | set(__schema_dict.values())
     MessageSchema = type(
         'MessageSchema',
         (DjangoLogstashFormatter.MessageSchema, LogstashEcsFormatter.MessageSchema),
         __schema_dict,
     )
 
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self.top_level_field_set = self.top_level_field_set | set(self.__schema_dict.values())
+
     def _remove_excluded_fields(self, message):
         message.pop('status_code', None)
         super()._remove_excluded_fields(message)
@@ -498,13 +504,16 @@ class FlaskLogstashEcsFormatter(FlaskLogstashFormatter, LogstashEcsFormatter):
         'REQ_ID': 'http.request.id',
     }
 
-    top_level_field_set = LogstashEcsFormatter.top_level_field_set | set(__schema_dict.values())
     MessageSchema = type(
         'MessageSchema',
         (FlaskLogstashFormatter.MessageSchema, LogstashEcsFormatter.MessageSchema),
         __schema_dict,
     )
 
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self.top_level_field_set = self.top_level_field_set | set(self.__schema_dict.values())
+
     def _remove_excluded_fields(self, message):
         message.pop('status_code', None)
         super()._remove_excluded_fields(message)

@feliixx do you want to add this to this PR?
If not, I'll add it later.

@andriilahuta since you are using the ECS formatters, would you mind giving it a try?

@feliixx
Copy link
Contributor Author

feliixx commented Mar 26, 2024

Hi @eht16 thanks for your quick feedback !

Smart idea to fix the issue by changing the import order, but it does feel a bit odd.

Since the FORMATTER_LOGSTASH_ECS_MESSAGE_FIELD_LIST setting is affected in the same way, we should move it too.

Agreed, I've thought of doing it in #97 but as you noticed it's not as trivial, and I'm not familiar with the ECS formatter so I choose not to do it.

@feliixx do you want to add this to this PR?

Thanks, but I'd rather not to as I'm not familiar with this part, so I leave it to you :)

@eht16
Copy link
Owner

eht16 commented Mar 28, 2024

Smart idea to fix the issue by changing the import order, but it does feel a bit odd.

Yes, even though technically correct, it's not so nice.

@feliixx do you want to add this to this PR?

Thanks, but I'd rather not to as I'm not familiar with this part, so I leave it to you :)

Alright, perfectly fine.

@eht16 eht16 merged commit 6c60c27 into eht16:master Mar 28, 2024
5 checks passed
@andriilahuta
Copy link
Contributor

My original intention for this was simpler subclassing, something like this:

class Fmt(LogstashFormatter):
    field_skip_set = {*LogstashFormatter.field_skip_set, ...}

But it shouldn't be too difficult to do this in constructor as well.

While I don't currently have time to properly test this, I don't expect any issues with this PR's approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Modification to constants.FORMATTER_RECORD_FIELD_SKIP_LIST ignored since v3.0.0
3 participants