Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible issue with alarm "disable until" menu and "enabled" field in kafka and elasticsearch #3304

Open
tynanford opened this issue Mar 2, 2025 · 11 comments

Comments

@tynanford
Copy link
Contributor

We noticed at some point that the alarm log search stopped working. There was this error in the logger service which suggested the enabled field somehow has a date string instead of bool value:

SEVERE: Failed to parse the searched alarm log messages. co.elastic.clients.elasticsearch.core.search.Hit@145b4471
com.fasterxml.jackson.databind.exc.InvalidFormatException: Cannot deserialize value of type `boolean` from String "2025-03-01T22:09:00": only "true"/"True"/"TRUE" or "false"/"False"/"FALSE" recognized
 at [Source: UNKNOWN; line: -1, column: -1] (through reference chain: org.phoebus.alarm.logging.rest.AlarmLogMessage["enabled"])
	at com.fasterxml.jackson.databind.exc.InvalidFormatException.from(InvalidFormatException.java:67)
	at com.fasterxml.jackson.databind.DeserializationContext.weirdStringException(DeserializationContext.java:1851)
	at com.fasterxml.jackson.databind.DeserializationContext.handleWeirdStringValue(DeserializationContext.java:1079)
	at com.fasterxml.jackson.databind.deser.std.StdDeserializer._parseBooleanPrimitive(StdDeserializer.java:453)
	at com.fasterxml.jackson.databind.deser.std.NumberDeserializers$BooleanDeserializer.deserialize(NumberDeserializers.java:225)
	at com.fasterxml.jackson.databind.deser.std.NumberDeserializers$BooleanDeserializer.deserialize(NumberDeserializers.java:200)
	at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
	at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:322)
	at com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:4569)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2798)
	at com.fasterxml.jackson.databind.ObjectMapper.treeToValue(ObjectMapper.java:3261)
	at org.phoebus.alarm.logging.rest.AlarmLogSearchUtil.lambda$search$25(AlarmLogSearchUtil.java:282)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
	at org.phoebus.alarm.logging.rest.AlarmLogSearchUtil.search(AlarmLogSearchUtil.java:287)
	at org.phoebus.alarm.logging.rest.SearchController.search(SearchController.java:79)
	at jdk.internal.reflect.GeneratedMethodAccessor63.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)
	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:150)
	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:117)
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:895)
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)
	at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1070)
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:963)
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
	at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:655)
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:764)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.springframework.web.filter.AbstractRequestLoggingFilter.doFilterInternal(AbstractRequestLoggingFilter.java:289)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:197)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:135)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:360)
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:399)
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:890)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1789)
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
	at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
	at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
	at java.base/java.lang.Thread.run(Thread.java:840)

I can reproduce this with phoebus v4.7.3 or the master branch by using either the calendar picker or pre-defined shelving_options dropdown. After disabling an alarm this way, elasticsearch has the date string in the enabled field. It also looks like kafka has the date string in enabled so it doesn't look like an issue in the alarm logger:

./kafka-console-consumer.sh  --bootstrap-server localhost:9092 --topic *** --from-beginning
{"severity":"INVALID","message":"STATE_ALARM","value":"FAULT","time":{"seconds":1740865677,"nano":422468811},"current_severity":"INVALID","current_message":"STATE_ALARM"}
{"user":"tford","host":"***","description":"0 is INVALID alarm, 1 is NO_ALARM","enabled":"2025-03-02T03:35:28.209688746","latching":false,"guidance":[{"title":"Advice","details":"This is just for fun"}],"displays":[{"title":"AlarmExampleScreen","details":"alarm-example.bob"}]}

The alarm-server and alarm-logger service are running v4.7.3 but we tested with the master branch phoebus client. Is this working for anyone or has anyone else ran into this issue too?

@tynanford
Copy link
Contributor Author

Well I am confused but it looks like maybe this is the intended behavior after all?

https://github.com/ControlSystemStudio/phoebus/blob/master/app/alarm/model/src/main/java/org/phoebus/applications/alarm/messages/EnabledSerializer.java#L29

https://github.com/ControlSystemStudio/phoebus/blob/master/app/alarm/model/src/main/java/org/phoebus/applications/alarm/messages/EnabledDeserializer.java#L51

So it could be the alarm logger needs to handle the case where enabled is a datestring and then if so, assume that means enabled is false?

@jeonghanlee
Copy link
Contributor

It may be more than what we fixed in the previous PR #3030

Could you describe how we recreate this issue?

@tynanford
Copy link
Contributor Author

Here are the steps to re-create the issue:

  • Query alarm logger and see that it works with curl or phoebus alarm log table
  • In alarm tree in phoebus: Right click -> Configure Item -> Either disable with the calendar dropdown or the predefined list (e.g. disable for 6 hours)
  • Query alarm logger again with curl or with phoebus alarm log table
  • Check the logs from the alarm logger service - The SEVERE: Failed to parse the searched alarm log messages should appear

The issue seems to me with how the enabled field in alarm server/kafka is handled. If you use the "Disable Until" feature it stuffs the date string into the enabled field. But from the code in the alarm logger, it seems like this field is expected to be a boolean. I am exploring a fix to the alarm logger serialization to convert date string to false if this indeed is a bug.

I also see null entries in the alarm log with curl or a browser when creating a "Disable Until" event. Not sure if that is related

It would be nice to see if anyone can re-produce this at their site since I am not using the latest version of phoebus or the services. Maybe this has all been cleaned up on the master branch

@jeonghanlee
Copy link
Contributor

jeonghanlee commented Mar 6, 2025

@tynanford It looks like the bug you saw is the same one that we recently identified and fixed. I couldn't reproduce the issue. And I will review the exact troubleshooting documentation when I have some time.

Tomorrow, please join the meeting where we will demonstrate the scenario together. You can evaluate the ALS-U "semi-production" version of all services through the pas-demo repository with our repository. Since the environment is local, you can delete it after testing.

Additionally, I recommend rebuilding Phoebus with the patch code from this pull request: #3030. You only need to change one line if you prefer not to use the current master branch.

See you later!
@jeonghanlee

@shroffk
Copy link
Member

shroffk commented Mar 6, 2025

@tynanford your observation is correct.

With the introduction of the "disable until" functionality the enable field which previously only allowed boolean type now needed to support 2 types, boolean and String representing a datetime value.

The serialization and deserialization classes in the alarm server and application were updated to support the above change...

Now, we should work on the short term fix that @jeonghanlee mentioned and update the alarm logger service and the alarm log table client to handle the situation where the enable field can be both a boolean or string.

It looks like the current mapping for the enable field is already a "string/keyword"


we can update the serializer/deserializer for the POJOs and then allow both "true/false" and "date time"... we will have to update the POJO to change the enable filed from boolean to string.

AlarmLogTableItem and AlarmLogMessage will have to be updated. We can actually add a dedicated deserializer

    @JsonDeserialize(using = AlarmEnabledDeserializer.class)
    private String enabled;

    public static class AlarmEnableDeserializer extends JsonDeserializer<Boolean> {

    public AlarmEnableDeserializer() {
    }

    @Override
    public Boolean deserialize(JsonParser p, DeserializationContext ctxt) throws IOException, JsonProcessingException {
        // TODO add any additional type handling we need to
        return p.getText();
    }
}

I don't see any usage in the code where having the enable field as a boolean is helping so I think we should simply update the field and test.

@tynanford
Copy link
Contributor Author

Hey great minds think alike! ;) I've been playing around with this patch and it seems to fix the issue like you said. Can test some more and create a PR

In the future I could see it being nice to include the date string in the alarm log messages. So then people can see "Oh Joe disabled that alarm until next tuesday and then Sally disabled it even further out to Friday". Wonder in the next release if we could add a new field for disabled_until for config messages and make enabled only a boolean again

diff --git a/services/alarm-logger/src/main/java/org/phoebus/alarm/logging/rest/AlarmLogMessage.java b/services/alarm-logger/src/main/java/org/phoebus/alarm/logging/rest/AlarmLogMessage.java
index ae96b1630..1e6e3d60e 100644
--- a/services/alarm-logger/src/main/java/org/phoebus/alarm/logging/rest/AlarmLogMessage.java
+++ b/services/alarm-logger/src/main/java/org/phoebus/alarm/logging/rest/AlarmLogMessage.java
@@ -11,6 +11,7 @@ import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
 
 import java.io.IOException;
 import java.time.Instant;
+import java.time.LocalDateTime;
 import java.time.ZoneId;
 import java.time.format.DateTimeFormatter;
 
@@ -34,6 +35,7 @@ public class AlarmLogMessage {
     private String user;
     private String host;
     private String command;
+    @JsonDeserialize(using = EnabledFieldDeserializer.class)
     private boolean enabled;
 
     public String getConfig() {
@@ -168,4 +170,25 @@ public class AlarmLogMessage {
             return Instant.from(formatter.parse(p.getText()));
         }
     }
-}
\ No newline at end of file
+
+    public static class EnabledFieldDeserializer extends JsonDeserializer<Boolean> {
+
+        private static final DateTimeFormatter formatter = DateTimeFormatter.ISO_LOCAL_DATE_TIME;
+
+        @Override
+        public Boolean deserialize(JsonParser p, DeserializationContext ctxt) throws IOException, JsonProcessingException {
+            String text = p.getText();
+            try {
+                return Boolean.parseBoolean(text);
+            } catch (Exception e) {
+                try {
+                    LocalDateTime.parse(text, formatter);
+                    return false; // A date string means the alarm is disabled
+                } catch (Exception ex) {
+                    // If all else fails, return false?
+                    return false;
+                }
+            }
+        }
+    }
+}

@shroffk
Copy link
Member

shroffk commented Mar 6, 2025

Ha! :)

I have not been able to test my PR with your issue but if you could do that it would be great. It should show messages like "disabled until ...."
Let me know if the changes from the PR are useful.

@jeonghanlee
Copy link
Contributor

Why do we work during the night? Thank you @shroffk and @tynanford to make the system much better than before! I like it!

@shroffk
Copy link
Member

shroffk commented Mar 6, 2025

I am closing this PR #3314 cause I think we need to handle the new enable object better. It consists of the boolean and a time representation.

I don't want to change the elastic mapping as far as possible

@tynanford
Copy link
Contributor Author

tynanford commented Mar 6, 2025

Hi @shroffk , do you not like the deserialization idea for the logger API code? We lose the ability to show the timestamp in the alarm log table UI like your PR had, but it keeps the enabled type in the API the same and fixes the issue above.

@shroffk
Copy link
Member

shroffk commented Mar 7, 2025

I think the deserialization is a good solution.
We can address the possibility of adding the additional timestamp as a separate task.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants