A federated search service with result aggregation, moderation, and a feedback mechanism that can suggest updates to the source data.
If you are interested in topics similar to those explored in this project, I'd be happy to chat. If you think my approach makes no sense, I'd love to hear from you: create an issue and let me know where I went wrong.
Now let me tell you more about the ODIS idea…
A much fuller description of the problem that motivated the project is available in the Search for a Better Search paper, but in short...
We have increasing amounts of data all around us, and discoverability is a challenge. This is especially true in systems where fine-grained access controls are necessary across organisational boundaries.
In such systems, public crawling and indexing are not possible. Centralising data in data warehouses and lakes is currently the go-to solution, but that means data governance has to be centralised too. While this works in some organisations, it poses significant challenges in heterogeneous systems and in searches that span multiple organisations.
The increasing amount of data gives rise to a growing number of possible search results.
- To improve data discoverability in systems where centralisation of data is not an option
- To improve search results using the searcher's relative location and the information in the network topology
- To find a mechanism to send feedback and suggest data improvements from the results back to the data sources
- To get ready for a more distributed data landscape
So far there is:
- Sample Data Service - to provide test and demo data for the project using a range of formats.
- Data Exchange Network Node - A server that acts as a node in the network and handles the federation of requests.
Later there will be:
- A 'Standard' that allows easy searching of data across a distributed network of services, through a range of interactions.
- A Demo (or two)
- The search query will be federated (distributed) in a mesh search network
- The results will be filtered by the source systems
- The results will be moderated by the network
- The results will be aggregated before presenting them to the user
- The feedback will be pushed back through the network to the source systems to suggest updates
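
The steps above can be sketched in a few lines of Python. This is only a toy, in-memory illustration and not the project's implementation: the `Node` and `Result` classes, the role-based filter, the blocklist used for moderation, and the node names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Result:
    source: str  # name of the node that holds the record
    title: str


@dataclass
class Node:
    """Hypothetical mesh node holding local records and links to peers."""
    name: str
    records: dict  # title -> set of roles allowed to see it
    peers: list = field(default_factory=list)
    feedback: list = field(default_factory=list)

    def local_search(self, query, role):
        # Filtering happens at the source: hidden records never leave the node.
        return [
            Result(self.name, title)
            for title, roles in self.records.items()
            if query.lower() in title.lower() and role in roles
        ]

    def search(self, query, role, seen=None):
        # Federate the query through the mesh, visiting each node only once.
        seen = set() if seen is None else seen
        if self.name in seen:
            return []
        seen.add(self.name)
        results = self.local_search(query, role)
        for peer in self.peers:
            results.extend(peer.search(query, role, seen))
        return results

    def send_feedback(self, source, note, seen=None):
        # Pass the suggestion from peer to peer until it reaches its source.
        seen = set() if seen is None else seen
        if self.name in seen:
            return False
        seen.add(self.name)
        if self.name == source:
            self.feedback.append(note)
            return True
        return any(p.send_feedback(source, note, seen) for p in self.peers)


def moderate(results, blocked_sources):
    # Stand-in for network moderation: drop results from blocked sources.
    return [r for r in results if r.source not in blocked_sources]


def aggregate(results):
    # Merge results with the same title, listing every source that holds it.
    merged = {}
    for r in results:
        merged.setdefault(r.title, []).append(r.source)
    return {title: sorted(sources) for title, sources in merged.items()}


# Three hypothetical nodes wired into a small mesh (with a cycle).
hr = Node("hr", {"Jane Doe": {"staff"}, "Jane Doe payroll": {"payroll"}})
it = Node("it", {"Jane Doe": {"staff"}})
gateway = Node("gateway", {})
gateway.peers = [hr, it]
hr.peers = [it, gateway]
it.peers = [hr]

found = gateway.search("jane doe", role="staff")
summary = aggregate(moderate(found, blocked_sources=set()))
delivered = gateway.send_feedback("it", "Surname is misspelt")
```

In a real network each hop would be an authenticated HTTP call rather than a method call, and moderation would likely be more than a blocklist, but the shape of the flow (federate, filter at source, moderate, aggregate, feed back) stays the same.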
- Find Me Button - Search for and update your personal information held by any department across the Civil Service.
- Product Search - Building the 'bigger picture' of a product through single search without the data warehouse.
- Acronym Buster - An oversimplified example of why topology as context can help improve search results.
- No Need for Catalogs - Not really a concrete use case, but an idea of what could be possible with a federated search approach in terms of data management and governance.
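
To make the Acronym Buster idea a little more concrete, here is a minimal sketch of how topology could act as context: results from sources that are fewer hops away from the searcher are ranked first. The graph, the department names, and the `rank_by_proximity` helper are hypothetical, and hop count is just one possible notion of "distance".

```python
from collections import deque


def hop_distances(graph, start):
    """Breadth-first search: hops from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def rank_by_proximity(results, graph, searcher):
    """Order (source, expansion) pairs so the nearest sources come first."""
    dist = hop_distances(graph, searcher)
    # Sources that cannot be reached from the searcher sort last.
    return sorted(results, key=lambda r: dist.get(r[0], float("inf")))


# Hypothetical topology: a finance team sits under finance, IT is further away.
graph = {
    "finance-team": ["finance"],
    "finance": ["finance-team", "hq"],
    "hq": ["finance", "it"],
    "it": ["hq"],
}

# Two expansions of the acronym "PO", each held by a different source.
results = [("it", "Product Owner"), ("finance", "Purchase Order")]
ranked = rank_by_proximity(results, graph, searcher="finance-team")
```

For a searcher in the finance team, the finance expansion is one hop away and the IT expansion is three, so the finance meaning is listed first.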
While I hope the use cases are of use, another way to look at what the search (or data exchange) network can offer is to look at the possible interactions within the system.
But distributed systems can do more than just facilitate a search. The CRM example illustrates how a hypermedia approach to API design can help in building extensible distributed systems.
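
As a rough illustration of the hypermedia idea (this is not the project's CRM example), the sketch below builds a HAL-style resource whose `_links` section tells the client what it can do next, so the client follows link relations instead of hard-coding URLs. The URLs, the `orders` and `suggest-update` relation names, and both helper functions are assumptions made for the example.

```python
import json


def hal_resource(href, properties, links=None):
    """Build a HAL-style document: plain properties plus a `_links` section."""
    doc = dict(properties)
    doc["_links"] = {"self": {"href": href}}
    for rel, target in (links or {}).items():
        doc["_links"][rel] = {"href": target}
    return doc


def follow(doc, rel):
    """Return the URL for a link relation, as a hypermedia client would."""
    return doc["_links"][rel]["href"]


# Hypothetical customer record exposed by one node in the network.
customer = hal_resource(
    "/customers/42",
    {"name": "Jane Doe"},
    links={
        "orders": "/customers/42/orders",
        "suggest-update": "/customers/42/feedback",
    },
)

feedback_url = follow(customer, "suggest-update")
body = json.dumps(customer, indent=2)
```

Because the client only depends on the relation name, a node can add new relations or move a resource to a different URL without breaking existing consumers.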
At the moment the project is at the conceptual design stage. Have a look at the solution's architecture as it is developed.
- API: OpenSearch, OpenAPI, REST, HAL, HATEOAS, JSON-LD
- Security: OAuth 2.0, OpenID Connect (OIDC), JSON Web Tokens (JWT)
- Back-end: Python, Django
- Front-end: JavaScript, Design System, React