You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
As a basic user of a class expression learning algorithm, I am mainly interested in getting a good result in a short amount of time. In general, it is great that a refinement-based algorithm can (in theory) generate any expression of the infinite search space. However, time-wise ⌚, it seems to increase the runtime of the whole process when the top-to-bottom refinement operator generates many candidates, for which a bad quality is already foreseeable. Hence, I would like to be able to limit the search to generate expressions, that have a chance to achieve a quality that is larger than 0.0 F1-measure. Or, to formulate it the other way around: I would like that the refinement operator does not generate expressions, that will have a performance of 0.0.
Example
Let's assume that our KG contains all data from Wikipedia and that we have a learning problem, that comprises only humans as positive and negative examples. I would like to see that the refinement operator does not generate expressions that are obviously not helpful in this setup, e.g., ∃ capital.⊤ is not a helpful expression, because assuming that our KG is free of errors, non of the humans has a capital. Similarly, ∀ capital.⊥ is not helpful either, because this also does not separate the positive and negative examples.
Suggestion
For each "pattern" of class expressions, it might be possible to ask the endpoint for classes or properties that could be useful in this setup. This would allow to reduce the number of COUNT SPARQL queries that are needed to evaluate generated expressions by running a single SPARQL query that collects potential candidates beforehand.
In the following, let <e_1> <e_2> <...> <e_n> denote the set of examples.
C
When generating a class expression that comprises one or several named classes, it could make sense to ask which classes the positive examples have.
For this type of expression, it is important that at least one negative example has the property p. Otherwise, the performance will always be 0. We can query that as before, but use the negative examples.
The text was updated successfully, but these errors were encountered:
I see that the merit of ignoring all Object Property Restrictions (e.g. ∃ capital.⊤ and ∀ capital.⊥ ),
where the union of the domains of roles (let denote it with DR) does not intersect with E^+ = { <e_1> <e_2> ... <e_n> }, i.e., (e r, x) \not \in KG, s.t. e \in E^+, r \not in DR, and x \in AllNamedIndividuals .
Since, if E^+ is about people, no need to take a look at ∃ capital.⊤ and ∀ capital.⊥.
Yet, ignoring ∃ capital.⊤ can be deadly wrong if a goal concept resembles ∃ lives.∃ capital.⊤.
Therefore, we cannot plainly ignore such concepts if you are suggestion this.
Beside, the refinement operators of symbolic models (e.g. CELOE) might already implemented your suggestion.
C
When generating a class expression that comprises one or several named classes, it could make sense to ask which classes the positive examples have.
User Story
As a basic user of a class expression learning algorithm, I am mainly interested in getting a good result in a short amount of time. In general, it is great that a refinement-based algorithm can (in theory) generate any expression of the infinite search space. However, time-wise ⌚, it seems to increase the runtime of the whole process when the top-to-bottom refinement operator generates many candidates, for which a bad quality is already foreseeable. Hence, I would like to be able to limit the search to generate expressions, that have a chance to achieve a quality that is larger than 0.0 F1-measure. Or, to formulate it the other way around: I would like that the refinement operator does not generate expressions, that will have a performance of 0.0.
Example
Let's assume that our KG contains all data from Wikipedia and that we have a learning problem, that comprises only humans as positive and negative examples. I would like to see that the refinement operator does not generate expressions that are obviously not helpful in this setup, e.g.,
∃ capital.⊤
is not a helpful expression, because assuming that our KG is free of errors, non of the humans has a capital. Similarly,∀ capital.⊥
is not helpful either, because this also does not separate the positive and negative examples.Suggestion
For each "pattern" of class expressions, it might be possible to ask the endpoint for classes or properties that could be useful in this setup. This would allow to reduce the number of
COUNT
SPARQL queries that are needed to evaluate generated expressions by running a single SPARQL query that collects potential candidates beforehand.In the following, let
<e_1> <e_2> <...> <e_n>
denote the set of examples.C
When generating a class expression that comprises one or several named classes, it could make sense to ask which classes the positive examples have.
Note that all classes would be listed, that have at least a single instance in the set of examples.
∃ p.⊤
or≥1 p.⊤
Similarly, it would be possible to list properties.
∀ p.⊥
For this type of expression, it is important that at least one negative example has the property
p
. Otherwise, the performance will always be 0. We can query that as before, but use the negative examples.The text was updated successfully, but these errors were encountered: