Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dev/1.15.9 #85

Merged
merged 3 commits into from
Dec 15, 2023
Merged

Dev/1.15.9 #85

merged 3 commits into from
Dec 15, 2023

Conversation

alexjhawk
Copy link
Collaborator

No description provided.

@alexjhawk alexjhawk self-assigned this Dec 13, 2023
@alexjhawk
Copy link
Collaborator Author

To make access easier to 'log once' functionality, I have enabled it directly within the Logger.java class. This has also been enabled within the configuration class in the abstract connector framework.

Copy link
Collaborator

@TomKimsey TomKimsey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is room for improvement here regarding how the the stored "logged" messages are done.

Right now there is no limit on the arraylist size which if used improperly could cause issues for connectors utilizing this functionality. I think its beneficial to guard against this possibility. Worst case scenario is that a message gets printed again way after it was first printed.

The second thing is the uniqueness of the keys. Its probably not practical or efficient to use the full log message as the key but have you considered using something like a HashMap. That way the key could be the log message string and the value could be the stringKey value used for a second check if needed. Or you just have a very limited sized arraylist of the full log messages and just have a reasonable maxStringLength check with methods to set the maxListSize for applications needing more log once capability.

*
* @since 1.15.9
*/
private static final List logOnceList = new ArrayList(); // List<String>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be implemented as a FIFO instead to protect against this list getting too large?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what you mean by using a FIFO here.

This was implemented with the reduction of memory in mind, thus the reason for storing only the key and not the body of the log message.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an application were to use this functionality in a way that is not in the spirt of the implementation it may generate up to an infinite amount of new keys over a a long enough runtime.

Example: (this would be bound by the amount of values possible for whatever datatype tag is.
Logger.LOG_WARN_ONCE(tag.valuestring(), tagValueOutOfBounds);

if it was implemented as a fifo you could allow the logOnce list to grow to a set amount of entries, then trim the oldest so that even if the mechanism was abused you would not have memory issues.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be implemented, but it was outside my original scope and makes the implementation less simple/straightforward.

This also checks for/guards against a case that I think is an application-level concern. We don't currently impose limits on other ArrayLists, including those used for retry messages, queued payloads, response error messages, etc., in our connectors, most of which are certain to be larger and more volatile.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a public library and as such i think we should aim to provide the most stable library solution we can for users. I dont expect users of the library to dig into the source to attempt to find potential issues/bugs.

I do not think it would be unreasonable to have a sane default, then allow for limit adjustment via a method
Logger.setOnceMessageListSize(100)
Logger.setOnceMEssageListSize(-1) //no limit

We probably should look into other uses of arraylist and see if we can improve the user experience in terms of noting in the javadocs implications of growing the list too big and/or setting limits which could be modifiable by the application if needed.

I even think generating exceptions for out of size conditions would not be unreasonable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good idea to cap the size to prevent resource issues, but we can add that later. It appears the aim of this PR is to address missing configuration options.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is a good idea, and is really applicable to many components in the library. I've created issue #86 to investigate and address this at a larger scale in the future.

+ ".");
+ ".";
if (logOnce) {
Logger.LOG_WARN_ONCE(missingKey, logMessage);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this key implementation is not very unique. If another log message was added referencing the same config file key and used the same scheme it would not be printed but it would be unique.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is within the abstract connector framework's configuration class, and the goal of the .logMissingField methods is to provide a single point for handling log output associated with missing configuration fields.

I opted for directly using the key here because it is generally safe to assume there is no other need for another/other log messages elsewhere.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think even a simple improvement would be

Logger.LOG_WARN_ONCE("logMissingField:"+missingKey, logMessage);

Its not unreasonable to thing someone may add in somewhere else
Logger.LOG_WARN_ONCE(invalidValueKey, logMessage);
and it would not be printed because the key could be exactly the same.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a uniqueness prefix

@TomKimsey
Copy link
Collaborator

Or you could just hash the log message string to store in a limited size arraylist. Then just compare hashes.

@alexjhawk
Copy link
Collaborator Author

Or you could just hash the log message string to store in a limited size arraylist. Then just compare hashes.

I considered this solution, but I think it is just extra processing that is ultimately unnecessary.

Also, using the log once methods in the logger is the responsibility of the implementing application, and I don't think it makes sense to spend additional runtime power in the library to work around the potential for compiled code that contains improper usage.

  • If an application uses the same key twice currently, the worst-case scenario is that a log message is not output.
  • In the proposed solution, the worst-case scenario would be much more impactful but also unknown. The implementing application invokes the log once method(s) an unknown number of times with messages that are an unknown size, and calculating hashes has the potential to drag down performance in connectors.

@TomKimsey
Copy link
Collaborator

Or you could just hash the log message string to store in a limited size arraylist. Then just compare hashes.

I considered this solution, but I think it is just extra processing that is ultimately unnecessary.

Also, using the log once methods in the logger is the responsibility of the implementing application, and I don't think it makes sense to spend additional runtime power in the library to work around the potential for compiled code that contains improper usage.

The javadocs do not suggest or define "proper usage".

  • If an application uses the same key twice currently, the worst-case scenario is that a log message is not output.

-Not printing a message because of poor key choice would be bad.

  • In the proposed solution, the worst-case scenario would be much more impactful but also unknown. The implementing application invokes the log once method(s) an unknown number of times with messages that are an unknown size, and calculating hashes has the potential to drag down performance in connectors.

The javadocs do not state that this should not be used in cyclic operations and the functionality suggests thats exactly how it should be used. Under "normal" conditions the performance impact would probably not be measurable.

@alexjhawk
Copy link
Collaborator Author

Or you could just hash the log message string to store in a limited size arraylist. Then just compare hashes.

I considered this solution, but I think it is just extra processing that is ultimately unnecessary.
Also, using the log once methods in the logger is the responsibility of the implementing application, and I don't think it makes sense to spend additional runtime power in the library to work around the potential for compiled code that contains improper usage.

The javadocs do not suggest or define "proper usage".

  • If an application uses the same key twice currently, the worst-case scenario is that a log message is not output.

-Not printing a message because of poor key choice would be bad.

  • In the proposed solution, the worst-case scenario would be much more impactful but also unknown. The implementing application invokes the log once method(s) an unknown number of times with messages that are an unknown size, and calculating hashes has the potential to drag down performance in connectors.

The javadocs do not state that this should not be used in cyclic operations and the functionality suggests thats exactly how it should be used. Under "normal" conditions the performance impact would probably not be measurable.

I can update the Javadocs and include a unique prefix to the abstract connector config logMissingField methods.

You're right that the Javadocs should not indicate that this does not apply to cyclic operations, as this system was designed exactly for that case (suppressing duplicate log messages generated by accessing ? in a cyclic manner).

The performance impact of introducing additional calculations to a cyclic process, though, such as the hashing of log messages, is a very reasonable concern. This is less relevant for single-connector installations, but for scenarios where a connector may be multi-loaded with an additional component, performance has shown to be paramount in maintaining the stability of the solution.

Copy link
Contributor

@it-hms it-hms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One constraint this adds is that it requires unique config keys. I believe that two different JSON objects can have the same key but different meanings. So we just need to make this obvious.

*
* @since 1.15.9
*/
private static final List logOnceList = new ArrayList(); // List<String>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good idea to cap the size to prevent resource issues, but we can add that later. It appears the aim of this PR is to address missing configuration options.

*
* @since 1.15.9
*/
private static final List logOnceList = new ArrayList(); // List<String>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer hash map to array for faster lookups.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Migrated to a HashSet-based implementation, which in Java is backed by a HashMap.

@alexjhawk
Copy link
Collaborator Author

Will update with HashMap, improved Javadocs, and uniqueness prefix in framework config .logMissingField methods.

Added methods to the Logger.java class which allow
for logging an event only once. This is tracked
internally by the logger using a specified key value.

This facilitates the implementation of a mechanism
which logs missing fields at most once.
Updated the AbstractConnectorConfig.java class so that
it provides overloads of the #logMissingField() method
which allow for suppressing the log after one time.
@alexjhawk
Copy link
Collaborator Author

Will update with HashMap, improved Javadocs, and uniqueness prefix in framework config .logMissingField methods.

  • Update with HashMapHashSet
  • Improve Javadocs
  • Add framework config log key uniqueness prefix

@alexjhawk alexjhawk added enhancement New feature or request framework Improvements, additions, fixes, or documentation to/for the abstract connector framework labels Dec 14, 2023
@alexjhawk alexjhawk merged commit 7f094eb into main Dec 15, 2023
5 checks passed
@alexjhawk alexjhawk deleted the dev/1.15.9 branch December 15, 2023 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request framework Improvements, additions, fixes, or documentation to/for the abstract connector framework
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants