Define goals [meta] #2
Comments
Thanks for taking this initiative @andru! cc @elf-pavlik? Here's my attempt at some answers.
|
Worth keeping in mind: http://aims.fao.org/agrovoc |
When thinking about goals, I usually try to answer these questions...
Applying this methodology I think @simonv3 is onto a good start. We think there is a need for a more machine-friendly database of (at least) plant data, but we should take it a step further to gauge how much effort to put into it. (and getting answers might involve reaching back out to the community)
This is a bit "stream of consciousness", but I think these are important things to ask before diving in too deep. The answer may end up being "there is no good reason, it'd just be fun" - which is fine too. :) |
We should also consider how it fits together with other attempts already present in the current ecosystem. IMO the README should list all the other known and relevant efforts. Besides the previously mentioned AGROVOC, I also recall Practical Plants |
Great point. That site is a great reference, I had not come across it before. I've been studying it for a couple hours and there are a lot of interesting things we can learn from how they've approached it. The thing that amazes me is the sheer depth and breadth of characteristics there are for even a narrow sampling of plants. Organizing all of that is going to be a real challenge. I like the focus on "useful plants" - more broad than vegetables, but less difficult to approach than the entirety of the natural kingdom. |
@elf-pavlik somehow I've not come across AGROVOC before. Thanks for sharing. Agreed on identifying similar projects early. Edit: see the wiki.
@tkeifer I like your approach to identifying a problem and addressing needs. I'll try to be guided by it as I get some thoughts out.
Thanks for mentioning Practical Plants; it's a wiki I developed some time back, and it's in large part my experience of pulling that project together that motivates my desire to see better open data available, particularly for names and for physical and environmental data, and to see better tools for collectively stewarding that data. My work on a WIP agro-tech app, Hortomatic, and my involvement in the ongoing semantic preparation of the Flora of North America also inform some of the direction I'd like to see on the technical side, but that's for a later conversation.
Here are my thoughts on some problems that need to be solved, and some goals and ethos for the project, written stream-of-consciousness style.
Problems to solve / niches to serve
Some ideas for Goals
Related thoughts
|
Great to see this conversation happening all :) With FarmBot, at the very least, we'd like to access a database of common edible crop names and representative icons. While we could create and host that alone, we see a lot of value in pooling the effort to allow many apps to use/create/maintain the data. We're currently planning to use the OpenFarm API for this need, because eventually we want to use OpenFarm guide data as well. Regarding centralized vs distributed, I think a hybrid approach would be cool. Distributed in the sense that different apps will exist in order to create special data around the crops/do things with the data for that specific application. And centralized in the sense that we maintain a 'canonical' data set that all apps are using/contributing to. As far as I understand, that's why we want to separate the crops db from OpenFarm guides in the first place. Let OpenFarm do guides, Hortomatic do garden planning/tracking, FarmBot do its thing, etc; and the crops db be the shared resource among them all. |
As far as the types of data this crop db holds, I think it would be neat to not set out any limitations, but rather allow it to grow in any direction based on the overlapping needs of the beneficiaries. So for example, we might start out with just common names. Then we find out that two beneficiaries (FarmBot and Hortomatic) want icons. So those two communities can spearhead that component of the data set. Then we find out that group X and OpenFarm want to share photos. So those two can spearhead that. Just spit ballin here :) |
Hey everyone! Thanks for getting this conversation started! I am building a farm management platform called farmOS (http://farmos.org), and I have been putting some thought into designing a standardized data type for storing cultivation data for various crops/varieties/species. It would be great if we could all work together on a common data format that is useful to everyone! I agree with a lot of the things said above. Here are some thoughts I would add (or echo):
So really, I think what I am leaning towards is more of a "standard data format" than a specific collection. Perhaps the definition of that data format could be its own Git repository, and OpenFarm's collection of crops could be a separate one, which the OpenFarm web app refers to for information. (I am not very familiar with the OpenFarm architecture, so I'm not sure if that makes sense or not.) The main challenge in collaborating on something like this is defining what actual data we all need represented, and where that is the same and where that is different. I'll organize my notes and post some of those details soon. Excited to continue the conversation! |
I agree with your ideas on plurality. There is rarely a one-size-fits-all approach to data. It seems to me that a good dividing line for purpose here is horticulture, which keeps the scope, and by extension the schema, restricted to something manageable. If the project works we could clone the model for similar databases covering related agricultural domains.
I think there's absolutely scope for someone's new variety of tomato to be in this crop database, but I agree with your general point on the power of a distributed dataset. @roryaronson I think you talk a lot of sense when it comes to what the database holds. That we grow it organically based on our needs. There's also some overlap with your thoughts here Mike... if I need GDD data for Hortomatic and nobody else does, then I could start a database with just GDD data, but using some shared standards on naming, schema, etc, to make that database compatible. |
Actually, rolling on from that last comment. Maybe we should be making multiple, distinct databases which share a common naming scheme, each covering a small purpose. These would be a product of our collective needs but for example...
A shared naming scheme is the tricky part. There are often multiple ways to refer to a crop depending on the nomenclature used, so we'd have to come up with some strict rules on which nomenclature gets used where. |
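To make the idea of several small databases joined by one naming scheme more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the key format (normalized scientific name plus an optional cultivar), the `crop_key` and `merged_record` helpers, and the sample records are hypothetical and not something agreed in this thread.

```python
from typing import Optional


def crop_key(scientific_name: str, cultivar: Optional[str] = None) -> str:
    """Build a hypothetical shared key: normalized binomial plus optional cultivar."""
    key = "-".join(scientific_name.lower().split())
    if cultivar:
        key += ":" + "-".join(cultivar.lower().split())
    return key


# Two independent, single-purpose databases keyed the same way.
# The records are placeholders, not real reference data.
names_db = {
    crop_key("Solanum lycopersicum"): {"common_names": ["tomato"]},
}
gdd_db = {
    crop_key("Solanum lycopersicum"): {"gdd_base_temp_c": 10},
}


def merged_record(key: str, *databases: dict) -> dict:
    """Combine whatever each database knows about one crop key."""
    record: dict = {}
    for db in databases:
        record.update(db.get(key, {}))
    return record


# Inconsistent spacing and casing still resolve to the same key.
tomato = merged_record(crop_key("solanum  LYCOPERSICUM"), names_db, gdd_db)
```

The point of the sketch is that each database can be maintained by whoever needs it, and an app only has to agree on the key rules to combine them. Deciding which nomenclature feeds the key is the hard part and is left open here.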
A few questions/comments: There is some great goals content in this issue; would it be worth extracting it into a wiki page, separate from the conversation? It also helps to separate possible tech solutions and preferences from user needs, IMHO. Could a set of user stories be developed? e.g.
There are also (at least) two key groups of needs:
What's missing from these sources, such that they don't meet your needs?
So what about starting to develop a linked data model? Or simply a set of models and their properties, which could be translated into linked data formats later. Quick background: I'm quite interested in this area, have worked on OpenFoodNetwork a fair bit, a little on an API for Growstuff and explored food data modelling on Freebase before it was eaten by Google. |
I was thinking about this conversation over the weekend, trying to look at it at a high level, and had some insights... (bear with my explanation)
It seems to me we are struggling with the question of how to efficiently represent what is essentially plant genetics. For any given living thing, subtle changes in genetic makeup result in traits which we (as humans) then classify into manageable groups: in our scenario, fruits and vegetables and all their divisions. The result is a near-infinite combination of characteristics that we could potentially need to represent in a database - as gene identification technology advances, we could find that the diversity in our vegetables is tremendously greater than our fairly myopic view of it suggests. While "Tomato A" looks exactly like "Tomato B", it may have a single gene difference that makes it more cold-tolerant and, as such, would be referred to by a completely different name. Obviously, we can't gene-sequence every single vegetable and store that data for the average backyard gardener to search (though that would be cool), so we need to abstract it out a little bit.
So, looking at it from the opposite perspective, I asked "how do two people currently differentiate between different crops?" I realized that we do this by evaluating a very small number of traits, most of which are visual. I'll use peppers as an example - if we look at one that is round and orange, we all agree: "that is a habanero." In lieu of hard scientific fact, we generally go with a loose naming convention by majority rule.
So what is my point? I envision some sort of object-storage mechanism which allows attributes to be applied and then grouped through a crowd-sourced mechanism. "Object A" is placed into the database and a very small core set of attributes is fixed: height, sun requirements, spacing recommendations, etc. The rest - specifically names, varieties, etc. - are left to a kind of crowd-sourced tagging mechanism. If 50 people look at a picture of our object and say "That's a tomato", we go with tomato. If a tomato expert logs in and says "that's a cold-weather cherry tomato", we apply the tags. There could be some sort of weighting applied to bubble good tags up. I don't know if such an object-storage mechanism exists, but I thought I'd throw this out there and see what you guys thought. |
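As a rough illustration of the weighted tagging idea, the sketch below sums vote weights per tag and keeps the tags that clear a threshold. The `rank_tags` function, the weights (1.0 for an ordinary vote, 5.0 for an expert) and the `min_score` cutoff are all assumptions made up for this example; no such scheme has been specified in the thread.

```python
from collections import defaultdict


def rank_tags(votes, min_score=1.0):
    """Sum per-tag vote weights and return tags scoring at least min_score, best first.

    votes: iterable of (tag, weight) pairs, where weight is a hypothetical
    trust score for the voter (for example 1.0 for anyone, more for experts).
    """
    scores = defaultdict(float)
    for tag, weight in votes:
        scores[tag.strip().lower()] += weight
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(tag, score) for tag, score in ranked if score >= min_score]


# 50 ordinary votes for "tomato", two expert tags, and one stray low-weight vote.
votes = [("tomato", 1.0)] * 50 + [
    ("cherry tomato", 5.0),
    ("cold-tolerant", 5.0),
    ("pepper", 0.5),
]
ranked = rank_tags(votes)
```

With these made-up numbers, "tomato" ranks first, the two expert tags follow, and the stray "pepper" vote falls below the cutoff. How to assign trust weights fairly is the open question this sketch does not answer.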
If the models, properties, data, etc. are to be useful to a wide range of groups, I wouldn't use tagging. It would be much more beneficial to define a strongly typed set of models and properties. However, a system that allows people to enter the information like a wiki, based on those models, could be good. |
Could you explain "define a strongly typed set of models and properties" in more detail? I'm not sure I follow. |
Basically what's now being debated in #5. So define a model, e.g. |
... that would allow for the basic set of attributes to be defined, and then other people to define "varieties" or "cultivars" that extend it. I think that jibes with what you're saying, yeah? |
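For readers unsure what a strongly typed base model with extending cultivars might look like, here is one possible sketch using Python dataclasses. It is not the model proposed in #5: the class names, field names, the `resolved` helper and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Crop:
    """Base model: a small fixed set of typed attributes (field names are hypothetical)."""
    scientific_name: str
    common_names: List[str] = field(default_factory=list)
    sun: Optional[str] = None          # e.g. "full" or "partial"
    spacing_cm: Optional[float] = None


@dataclass
class Cultivar(Crop):
    """A variety that extends a parent crop and falls back to it for unset attributes."""
    cultivar_name: str = ""
    parent: Optional[Crop] = None

    def resolved(self, attr: str):
        """Return this cultivar's own value, or the parent's when it is unset."""
        value = getattr(self, attr)
        if value is None and self.parent is not None:
            return getattr(self.parent, attr)
        return value


# Placeholder values, not horticultural advice.
brassica = Crop("Brassica oleracea", ["broccoli"], sun="full", spacing_cm=45.0)
eps = Cultivar(
    "Brassica oleracea",
    cultivar_name="Early Purple Sprouting",
    spacing_cm=60.0,
    parent=brassica,
)
```

Here `eps.resolved("sun")` falls back to the parent's value while `eps.resolved("spacing_cm")` uses the cultivar's own. Inheriting from a parent record is only one way to express "extends"; a flat record with a parent reference would work too.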
I missed that... it looks close though! My experience has been that even the simplest of assumptions around types, variety, etc.. tends to fail hardcore in the plant world, so I was trying to think of a method of reference that was super flexible and didn't involve many rules. I'm interested to see how that idea evolves. |
I think taxonomy is a useful compromise. We accept in naming a cultivar that the genetics are variable and that a term like Brassica oleracea 'Early Purple Sprouting' represents an arbitrary genetic community which we choose to label for our own needs. Taxonomy accepts this because the alternative, attempting to model the huge complexity of genetics, is not only practically impossible, but I don't see why it would be desirable. When a group of plants has genetically diverged from another enough to have different qualities, and defining that group of genetics as distinct is useful to humans, we assign a new name in order that we can communicate about it. To me the lack of 1-to-1 mapping with genetics is not a flaw we need to figure out, it's a fundamentally useful abstraction we can't do without.
I'd say visual traits are no more important to food crops than any of the other traits. There is also taste, aroma, texture, life stage timing, shelf life, environmental tolerances and preferences, etc. These things cannot be detected and recorded without rigorous study, and I wouldn't trust a digital crowd sourced methodology to get it right.
I find no fault in your method, only that this is more or less how taxonomy has worked for generations and the result is the taxonomy and nomenclature we currently use. I might be misunderstanding something in your proposal, but I don't see an improvement over current taxonomy. Could you expand on what problem it solves? |
@andru - I hadn't yet seen the taxonomy that was started, so this was not really meant as an argument that another approach would be better. To clarify though... I was not suggesting we represent the plants genetically in the repository, only pointing out that in the absence of hard scientific differentiators (looking at pictures on the internet or browsing a farmers market, for example) people revert to visual indicators. To use your example - the average person talking to a farmer is probably less likely to look at something and say "that is Brassica oleracea" than "that is Early Purple Sprouting" - so it might make sense to approach building a database in a less classification-intensive way than the traditional Family-Genus-Species model. This also takes into account the fact that a large majority of users may be operating below these levels anyway, discussing cultivars and varieties rather than genus and species. Hope that helps clarify somewhat... |
Thanks for clarifying. Totally agreed that scientific nomenclature is unfamiliar to most people. I think we need to use the taxonomic model to provide structural relationships in the data, and have a very extensive list of common names in all languages so that people can access the data in whatever way is familiar to them. |
@andru I agree that while the scientific nomenclature will likely not be used by most apps or people, it seems the best thing we have for structuring the data. Each app can then choose to only represent the common names if desired. |
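One way to picture this split between taxonomic structure and common-name access is the sketch below: records are keyed by scientific name, and a separate index maps (language, common name) pairs back to them. The record layout and the `build_index` and `lookup` helpers are assumptions for illustration only, and the name lists are deliberately tiny.

```python
# Taxonomic records keyed by scientific name; common names grouped by language code.
taxa = {
    "Solanum lycopersicum": {
        "family": "Solanaceae",
        "genus": "Solanum",
        "common_names": {"en": ["tomato"], "es": ["tomate", "jitomate"], "fr": ["tomate"]},
    },
    "Capsicum chinense": {
        "family": "Solanaceae",
        "genus": "Capsicum",
        "common_names": {"en": ["habanero"]},
    },
}


def build_index(records):
    """Map (language, case-folded common name) to the set of matching scientific names."""
    index = {}
    for scientific_name, record in records.items():
        for lang, names in record["common_names"].items():
            for name in names:
                index.setdefault((lang, name.casefold()), set()).add(scientific_name)
    return index


def lookup(index, name, lang):
    """Return the scientific names known by this common name in this language."""
    return sorted(index.get((lang, name.casefold()), set()))


index = build_index(taxa)
matches = lookup(index, "Jitomate", "es")
```

An app that only wants common names can work entirely through `lookup`, while the underlying records keep the family and genus relationships. The index returns a list because one common name can refer to more than one species.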
I had some thoughts after the talk around climates and websites like this one. I know this is not the priority, but I wanted to write it down somewhere. |
Seems to me we might discuss a little what our aims are for the project in order to inform the technical discussion and plans for implementation...
It would be good to try and get some interested parties together to go over some of this, to make sure we're building something which serves the broadest base possible within the confines of something achievable.