Removes typing extension #57
base: master
The features of this extension may be implemented by the intersection with (?, instance of, MY_TYPE)
How do you type resources now?
Is it always ok? (I do not have any counter-example in mind)
An example: Or (it's not valid in RDF but I think we can allow it in your data model)
It's a bit ugly to join an
👍 @yhamoudi
Yes. Types were optional information provided to improve precision. With this PR they would become mandatory...
My basic point of view is: we should try to keep the data model as simple as possible so that it is easy to maintain and understand. I am afraid of a feature explosion in the data model that would make module creation very difficult (which is why I personally dislike the reverse predicates, whose only advantage, for the Wikidata module, is to reduce the tree size).
Are you sure that you will be able to add type annotations to each resource/triple? IMHO we should only add them when you are sure they are relevant, i.e. when they are explicitly stated in the question, as in "Who is Bach", which would be rewritten "Bach ∩ (?, instance of, person)", or "In which country is Paris", which would be rewritten as something like "(Paris, [located in, in, location], ?) ∩ (?, instance of, country)". That is because I don't see how you can do good enough typing everywhere without real knowledge of the semantics of each word. For example, will you be able to understand that "mother" may be both a relationship and a movie? Another example: typing the output of "Where is Paris?" is very tricky. But I would be very happy to be wrong about this, so feel free to convince me.
Side remark, because I believe it will come up again quickly: please do not parse "When is born X" as "(X, birth, ?) ∩ (?, instance of, date)", because it has no real meaning: the range of a "birth" predicate would usually be an event, and casting it to a date, either by an intersection with "(?, instance of, date)" or by a type annotation, makes no semantic sense. Moreover, it makes simple module development far more difficult (the module has to make clever guesses from a "birth" predicate and a "date" type to see that it is a "birth date" we are looking for).
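The two rewritings quoted above ("Who is Bach" and "In which country is Paris") can be sketched in a few lines of Python. This is only an illustration of the idea of replacing a type annotation with an intersection: the class names (`Missing`, `Resource`, `Triple`, `Intersection`) and the helpers `with_type` and `render` are hypothetical and are not taken from the project's actual code.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Missing:
    """The unknown '?' placeholder (hypothetical class)."""


@dataclass(frozen=True)
class Resource:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: "Node"
    # A tuple of resources stands for a list of alternative predicates.
    predicate: Union["Node", Tuple[Resource, ...]]
    object: "Node"


@dataclass(frozen=True)
class Intersection:
    operands: Tuple["Node", ...]


Node = Union[Missing, Resource, Triple, Intersection]


def with_type(node, type_name):
    """Constrain `node` with an '(?, instance of, TYPE)' triple instead of a type annotation."""
    constraint = Triple(Missing(), Resource("instance of"), Resource(type_name))
    return Intersection((node, constraint))


def render(node):
    """Print a tree in the notation used in this thread."""
    if isinstance(node, Missing):
        return "?"
    if isinstance(node, Resource):
        return node.value
    if isinstance(node, tuple):
        return "[" + ", ".join(render(n) for n in node) + "]"
    if isinstance(node, Triple):
        parts = (node.subject, node.predicate, node.object)
        return "(" + ", ".join(render(p) for p in parts) + ")"
    if isinstance(node, Intersection):
        return " ∩ ".join(render(n) for n in node.operands)
    raise TypeError(f"unsupported node: {node!r}")


who_is_bach = with_type(Resource("Bach"), "person")

located = (Resource("located in"), Resource("in"), Resource("location"))
country_of_paris = with_type(Triple(Resource("Paris"), located, Missing()), "country")
```

Calling `render(who_is_bach)` gives back the notation used in the comment, and the type constraint is just another triple that any module able to handle "instance of" can process.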
Could you expand on it? I believe that adding some "instance of" triples is cleaner, because we could imagine that the module rewrites the triples it knows about and then the libmodule applies the "instance of" triples using the resource value-type and the JSON-LD @type. If you see a simpler way to use type annotations, please expand on it. I would be very happy to have something simpler than that.
On 21/02/2015 17:34, Thomas Tanon wrote:
Because module developers would have to implement a simplification step
It's exactly why I've proposed the filter based on value-type and @type.
What is the difference between type and value-type? What is the JSON serialization of typing?
The serialization of resources specifies a type ("resource") and a value-type ("time", "string", "resource-jsonld"...). See the spec for more details.
The serialization of the type extension has not been specified yet.
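The filter based on value-type and @type mentioned earlier in this thread could look roughly like the sketch below. It assumes serialized resources are plain dictionaries with the "type" and "value-type" keys described above; the `graph` key holding the JSON-LD payload, the sample values, and the function names `jsonld_types` and `filter_by_type` are assumptions made for illustration, not the specified serialization.

```python
def jsonld_types(resource):
    """Return the JSON-LD @type values of a serialized resource as a list."""
    graph = resource.get("graph", {})  # assumed location of the JSON-LD payload
    types = graph.get("@type", [])
    return [types] if isinstance(types, str) else list(types)


def filter_by_type(resources, value_type=None, jsonld_type=None):
    """Keep only the resources compatible with the requested value-type and/or @type."""
    kept = []
    for resource in resources:
        if value_type is not None and resource.get("value-type") != value_type:
            continue
        if jsonld_type is not None and jsonld_type not in jsonld_types(resource):
            continue
        kept.append(resource)
    return kept


# Hypothetical candidate answers returned by a module.
candidates = [
    {"type": "resource", "value": "1685-03-31", "value-type": "time"},
    {"type": "resource", "value": "Bach", "value-type": "string"},
    {
        "type": "resource",
        "value": "Johann Sebastian Bach",
        "value-type": "resource-jsonld",
        "graph": {"@type": "Person", "name": "Johann Sebastian Bach"},
    },
]

dates = filter_by_type(candidates, value_type="time")
people = filter_by_type(candidates, jsonld_type="Person")
```

With such a filter living in the libmodule, a module would only need to answer the triples it knows about and would not have to handle "instance of" itself.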
I have not been clear. I was talking about the type from the data model (which is removed in this pull request) and the value-type from the serialization. But after re-reading the doc, I have no more questions about this.
I know that Watson uses thousands of types and that it's an important feature, so they probably manage to perform very accurate typing.
I'm not sure that I understand this remark (especially about "filter based on value-type and @type"). You say (?) that having 2 triples instead of 1 is better because 2 different modules can try to solve them. For instance, let's consider Depending on the data model we have:
I agree that removing types will solve this kind of thing, but I'm not sure that it's the clean way to do it. Indeed, by the same reasoning there are a lot of other parts that could be split:
I think we have 3 possibilities:
I dislike the use of Moreover, I think that we should take into account the computation time needed to solve a question. When there are only 4-5 modules to query it's easy; it could be more difficult if there were 100 modules. The shortest is the normal form, the quickest will be the algorithm (there is a balance to find between the accuracy of the answer and the speed needed to obtain it).