One of the integrations we are going to work on is the one with scikit-learn.
This conversation is to collect requirements and features for calling scikit-learn through the kglab abstraction layer.
After taking a look at the APIs provided by popular data science libraries, my view is that these are the interesting scikit-learn and scipy functionalities we could start with:
Allow converting kglab's KnowledgeGraph data structures to an observation matrix (to be defined), an adjacency matrix, and a condensed distance matrix as defined by scipy. This will allow building further flows (or "pipelines", chains of function calls) that users can assemble to go from a KnowledgeGraph representation to a graph algebra representation. This is critical, as we need either to pick first principles or to provide different alternatives according to the type of graph and the tasks the users may want to accomplish.
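To make the conversion step more concrete, here is a minimal sketch of what it could produce. It does not use kglab's API: the `triples` list and the `adjacency_from_triples` helper are hypothetical stand-ins for whatever a KnowledgeGraph would expose, and treating each adjacency row as an observation vector with the Jaccard metric is only one possible choice. The `pdist` and `squareform` calls are the actual scipy functions for condensed and square distance matrices.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Hypothetical stand-in for (subject, predicate, object) triples
# that would be extracted from a kglab KnowledgeGraph.
triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:worksWith", "ex:carol"),
    ("ex:dave", "ex:knows", "ex:carol"),
]


def adjacency_from_triples(triples):
    """Build a directed boolean adjacency matrix, ignoring predicates."""
    nodes = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    index = {node: i for i, node in enumerate(nodes)}
    adj = np.zeros((len(nodes), len(nodes)), dtype=bool)
    for s, _, o in triples:
        adj[index[s], index[o]] = True
    return nodes, adj


nodes, adj = adjacency_from_triples(triples)

# Assumption: each row of the adjacency matrix is used as an observation.
# pdist returns the condensed distance matrix in scipy's format:
# a flat array of length n * (n - 1) / 2.
condensed = pdist(adj, metric="jaccard")

# squareform converts the condensed form to a full n x n matrix and back.
square = squareform(condensed)
```

The condensed form is what `scipy.cluster.hierarchy.linkage` accepts directly, which is why it is a useful target for later flows.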
After 1, let's start with an example flow in kglab for SciPy's hierarchical clustering. It would be nice to have a flow that allows simple clustering. This implies providing switches to:
These are currently in no particular order; it will take some time to figure out which principles to import from scikit-learn and scipy in order to build proper user flows between knowledge graphs as represented in RDF/kglab and graph algebra representations.
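As a starting point for discussing the hierarchical clustering flow, here is a rough sketch of how the "switches" could surface as keyword arguments. The `cluster_nodes` function, its parameter names, and the toy observation matrix are assumptions made for illustration, not a proposed kglab API; `pdist`, `linkage`, and `fcluster` are the existing scipy calls.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_nodes(observations, metric="euclidean", method="average", n_clusters=2):
    """Hypothetical flow: observations -> condensed distances -> flat clusters.

    metric     : distance metric passed to scipy's pdist
    method     : linkage method passed to scipy's linkage
    n_clusters : maximum number of flat clusters to extract
    """
    condensed = pdist(observations, metric=metric)
    tree = linkage(condensed, method=method)
    # criterion="maxclust" cuts the tree into at most n_clusters flat clusters.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Toy observation matrix standing in for one derived from a KnowledgeGraph:
# two rows near the origin and two rows near (5, 5).
observations = np.array([
    [0.0, 0.0],
    [0.1, 0.0],
    [5.0, 5.0],
    [5.1, 5.0],
])

labels = cluster_nodes(observations, metric="euclidean", method="average", n_clusters=2)
```

Each switch maps onto exactly one scipy parameter, so users could change the metric or linkage method without touching the rest of the flow.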
Please provide feedback and suggestions. I will create a GitHub project around this effort.
@SultanOrazbayev mentioned the importance of having a descriptive summary of general metrics about a graph, something like pandas' DataFrame.describe(). These are the metrics that could be useful in a hypothetical SubgraphMatrix.describe():
cc: @tomaarsen @SultanOrazbayev