Maturing Shotover - part 1 #378
Replies: 4 comments
-
Transform Chains
Currently, transform chains allow only a single wrapper struct to traverse them at a time. This is because the wrapper holds a set of mutable references to each transform in the chain and owns them for the life of the request, and the transform method requires a mutable reference to the transform struct. This makes the chain easier to reason about and easier to manage on a per-connection basis, but it may also hurt performance, since it is harder to have multiple requests in flight across the chain structure. Moving transforms to a generator approach may allow us to keep the transform function semantics, simplify some things, and process multiple messages across a chain at any given point in time.
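To make the borrowing constraint concrete, here is a minimal, synchronous sketch of a chain whose wrapper holds exclusive borrows of the remaining transforms. It is not Shotover's actual code: `Message`, `Wrapper`, `Transform`, `Counter`, `Suffix` and `run_chain` are hypothetical names chosen for illustration, and the real transforms are async.

```rust
/// Hypothetical stand-in for a message travelling through the chain.
#[derive(Debug, Clone, PartialEq)]
struct Message(String);

/// Hypothetical transform trait. Note the `&mut self` receiver.
trait Transform {
    fn transform(&mut self, wrapper: Wrapper<'_>) -> Message;
}

/// The wrapper carries the message plus an exclusive borrow of every
/// transform that has not run yet.
struct Wrapper<'a> {
    message: Message,
    remaining: &'a mut [Box<dyn Transform>],
}

impl<'a> Wrapper<'a> {
    /// Hands the message to the next transform, or returns it once the
    /// end of the chain is reached.
    fn call_next(self) -> Message {
        let Wrapper { message, remaining } = self;
        match remaining.split_first_mut() {
            Some((head, tail)) => head.transform(Wrapper {
                message,
                remaining: tail,
            }),
            None => message,
        }
    }
}

/// Keeps per-transform state, which is why `&mut self` is convenient.
struct Counter {
    seen: u64,
}

impl Transform for Counter {
    fn transform(&mut self, mut wrapper: Wrapper<'_>) -> Message {
        self.seen += 1;
        wrapper.message.0.push_str(&format!("#{}", self.seen));
        wrapper.call_next()
    }
}

/// Stateless transform that appends a fixed suffix.
struct Suffix(&'static str);

impl Transform for Suffix {
    fn transform(&mut self, mut wrapper: Wrapper<'_>) -> Message {
        wrapper.message.0.push_str(self.0);
        wrapper.call_next()
    }
}

/// Runs one message through the chain. The `&mut` borrow of the whole
/// chain lasts until the message comes back out.
fn run_chain(chain: &mut [Box<dyn Transform>], message: Message) -> Message {
    Wrapper {
        message,
        remaining: chain,
    }
    .call_next()
}
```

Because `run_chain` takes the chain by `&mut`, the borrow checker rejects a second wrapper on the same chain while the first is still in flight, which is the single-request limitation described above.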
-
RDF as a potential solution
At the last place I worked we had a system where, as a request moved through the system, various modules/transforms would add data to the request. Since anyone could write a transform, we never knew what data would be added to the request. To solve this we made the request an RDF graph and stored the graph in a Fuseki (Apache Jena component) data store. Any transform could then access any data inserted by any other transform or process. This makes it possible to do Colored Connections easily. It also makes it easy to share information about which nodes are closest (latency-wise) and which nodes have duplicates of which data sources (though the clustering software already knows).
I gave a talk about this application at ApacheCon 2 years ago (the last face-to-face meeting).
I used a similar system to manage a system with several disparate configuration files that had to be kept in sync. In effect we merged all the configs into one space and could then request a data point by any alias. We could also execute some logic to derive new data points.
Since RDF is generally not well known, I could give a talk about it and the solution we used for the application I mentioned above, as well as an early application that did data mapping.
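As a rough illustration of the idea, the sketch below models a request as a set of subject/predicate/object triples that any transform can add to or read from. It is only an in-memory toy: it does not use Jena, Fuseki or SPARQL, and `RequestGraph`, `insert`, `objects` and the `ex:` predicate names are all hypothetical.

```rust
use std::collections::BTreeSet;

/// One RDF-style statement: (subject, predicate, object), kept as plain
/// strings for the purposes of this sketch.
type Triple = (String, String, String);

/// Hypothetical per-request graph that travels with the request.
#[derive(Default)]
struct RequestGraph {
    triples: BTreeSet<Triple>,
}

impl RequestGraph {
    /// Adds a statement; inserting the same triple twice has no effect.
    fn insert(&mut self, subject: &str, predicate: &str, object: &str) {
        self.triples.insert((
            subject.to_string(),
            predicate.to_string(),
            object.to_string(),
        ));
    }

    /// Returns every object recorded for the given subject and predicate,
    /// in sorted order.
    fn objects(&self, subject: &str, predicate: &str) -> Vec<&str> {
        self.triples
            .iter()
            .filter(|(s, p, _)| s.as_str() == subject && p.as_str() == predicate)
            .map(|(_, _, o)| o.as_str())
            .collect()
    }
}

/// A transform that records who authenticated on this connection.
fn auth_transform(graph: &mut RequestGraph) {
    graph.insert("conn:1", "ex:user", "alice");
}

/// A transform that records latency information it happens to know.
fn topology_transform(graph: &mut RequestGraph) {
    graph.insert("conn:1", "ex:closestNode", "node-b");
}

/// A later transform that reads data it did not write itself.
fn describe(graph: &RequestGraph) -> String {
    let user = graph.objects("conn:1", "ex:user");
    let node = graph.objects("conn:1", "ex:closestNode");
    format!("user={:?} closest={:?}", user, node)
}
```

The point of the design is that `describe` needs no knowledge of which transform produced the data, only of the predicate it wants to look up.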
-
I think doing a talk / bit of education on this approach would be super useful. Do you have any background reading you would recommend?
-
A very short introduction: https://sites.google.com/site/restframework/introduction-to-rdf
The examples there talk about writing RDF in XML as though that is the only format. RDF is a conceptual framework; RDF/XML is the XML serialization of a specific dataset. There are other serializations that are, in my opinion, easier to read. So don't get caught up in trying to read the RDF/XML, just understand how it works.
The best source, and probably the largest rabbit hole, is the W3C itself. RDF is, like HTML, a W3C recommendation. The most recent W3C primer on RDF can be found at: https://www.w3.org/TR/rdf11-primer/
If you want to play with it, check out Fuseki on Docker. Fuseki is part of the Apache Jena project (https://jena.apache.org) and a reference implementation for the W3C RDF and SPARQL recommendations.
Finally, I have a very old implementation of a Jena storage layer on Cassandra. I have never tested it under load and it is based on old Jena code, so it would need to be updated. The code is at https://github.com/Claudenw/jena-on-cassandra
-
Next Steps
The following section is a list of tasks / changes required to support Redis caching, authentication for Redis, and general development on Shotover.
Message/Messages structure
Shared config maps:
Colored connections
Connection setup
Stop abusing Clone trait
`new` function that is also async.
Module structure refactor
Error handling audit
There are some areas and some transforms that just straight up swallow or log an error but don't do anything sensible with it.
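As a small illustration of what such an audit would look for, the sketch below contrasts a function that logs an error and carries on with a default value against one that propagates the error to its caller with `?`. The names (`TransformError`, `parse_len`, `parse_len_swallowing`, `handle`) are hypothetical and not taken from the Shotover codebase.

```rust
use std::fmt;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum TransformError {
    EmptyFrame,
}

impl fmt::Display for TransformError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TransformError::EmptyFrame => write!(f, "frame was empty"),
        }
    }
}

impl std::error::Error for TransformError {}

/// Reads a length prefix from the first byte of the frame.
fn parse_len(frame: &[u8]) -> Result<usize, TransformError> {
    frame
        .first()
        .map(|b| *b as usize)
        .ok_or(TransformError::EmptyFrame)
}

/// The pattern to look for: the error is logged and replaced by a
/// default, so the caller cannot tell that anything went wrong.
fn parse_len_swallowing(frame: &[u8]) -> usize {
    match parse_len(frame) {
        Ok(len) => len,
        Err(e) => {
            eprintln!("ignoring error: {e}");
            0
        }
    }
}

/// The alternative: `?` hands the error to the caller, who can decide
/// whether to retry, close the connection, or reply with an error.
fn handle(frame: &[u8]) -> Result<String, TransformError> {
    let len = parse_len(frame)?;
    Ok(format!("payload of {len} bytes"))
}
```

With the swallowing version an empty frame is indistinguishable from a frame whose length prefix is zero, whereas `handle` keeps the two cases apart.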