🍑 ③ Set up API #2
👋 Let me know when you're getting started here!
As I'm not as familiar with APIs, I'm not sure when it's best to start thinking about the structure of this, but the general idea is that I would like the entire dictionary to have an API, so that different companies and products can tap into the database of words. Part of the build functionality will include a bot function that can help to auto-correct exclusive terminology. What other context can I provide that would be helpful?
I think the features and functionality that you have in mind are quite clear, but I'm unsure about the technical implementation. The questions below will help us figure out how to store the data and how to query it.
Only definitions for now, but there will be layers of information (eventually things like parts of speech, connecting like-terms, etc).
It's definitely layered in that some definitions will require alternate definitions, or sub definitions (as you noticed). It's probably also important to note the future URL feature I'd like to be able to integrate, which could affect how we structure the data.
As hinted at by the above comments - I think this may be easier to parse if you split this task into 3 parts:
This is primarily relevant here because the
Hi all! My thoughts are pretty in line with what @lynncyrin mentioned above.
Definitely. In my experience, it has been really helpful to figure out how all of the content fits together as a system, and to explore it as much as possible to surface the edge cases. For instance, I imagine that the structure could be described as (for a start):
An alternative here would be limiting the API to providing the data only, and allowing the frontend (website, app, slackbot) to take care of rendering the HTML on its own. This could be kind of nice because the API wouldn't be responsible for supporting different platforms or environments.
Have either of you worked with or have thoughts about GraphQL? I've been using it in my projects for a year or so and have grown to really like it. It comes with some of its own overhead, but a bonus here is that the API (again) doesn't need to grow with more endpoints to satisfy the needs of particular use-cases. I've also found that defining a schema (whether you are going to use GraphQL or not) can be really helpful in the planning phase, just to figure out how all of the data fits together. Question: other than marking suggested alternative words as inclusive/non-inclusive, any thoughts on how the overall structure might be different from a thesaurus? Other question: @lynncyrin — you spoke at !!Con West earlier this year, didn't you?
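To make the schema idea a bit more concrete, here's a very rough sketch of how the pieces mentioned so far (definitions, sub-definitions, alternatives, related terms, future URLs) might fit together if we went the GraphQL route. The type and field names are placeholders for discussion, not a proposed final structure:

```js
// Hypothetical schema sketch in GraphQL SDL, kept as a plain JS string so it
// could later be fed to graphql-tools/Apollo if we go that way.
// Every name here is a placeholder, not a decision.
const typeDefs = /* GraphQL */ `
  type Word {
    slug: ID!
    term: String!
    definitions: [Definition!]!
    alternatives: [Alternative!]   # inclusive/non-inclusive suggestions
    relatedTerms: [Word!]          # connecting like-terms
  }

  type Definition {
    content: String!               # plain text; rendering is left to the frontend
    subDefinitions: [Definition!]  # alternate / sub definitions
    links: [String!]               # the future URL feature
  }

  type Alternative {
    term: String!
    inclusive: Boolean!
  }

  type Query {
    word(slug: ID!): Word
    words(slugs: [ID!]!): [Word!]!
  }
`;

module.exports = { typeDefs };
```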
You're headed down the right path @good-idea! I would encourage splitting the task: a. figure out how to encode the data structure into this repo, eg. yaml / json / etc. file? database with a CMS? etc.
I would recommend against picking technologies at this point; the encoding of the data structure (eg. task
Thank you, everyone! I broke this task up into several issues so we can have more targeted discussions under the relevant issues. Also, this may be obvious to everyone, but if you haven't already seen it, what I've built so far is here: https://www.selfdefined.app/
@lynncyrin yay, I thought I recognized you! We didn't meet but I was in the audience. @tatianamac https://www.selfdefined.app/ is looking great!
Just throwing this out there: I noticed that since the site is being hosted on Netlify, the project could leverage Netlify functions and query the JSON/markdown files for a basic lookup. I know that the aim is to leverage Algolia in the future as well, so this could be a step forward towards that. There's also an alternative approach. Wondering if @good-idea / @ovlb / @tatianamac have thoughts before I start a PR.
@denistsoi I have some thoughts, but currently no time to write them out. I will try to do so over the weekend. Also, my thoughts don’t matter too much :)
Thanks @ovlb. I think this would be helpful if, say, the project is ever moved off GitHub.
Sorry for the delay in answering! I think we have to answer additional questions before we hop into implementation mode. Let me expand my reasoning a bit:
I guess the first task where an API is needed is the Twitter bot. As soon as we have this task specced out (which questions does the bot answer, what happens if there is no definition/no alternative words, …) we can build something that solves these problems. As we all have limited resources, I think it is vital to use our energy wisely and try to focus on implementations that – hopefully – stand the test of time. An API that uses the file system will most likely not do that.

Re: Including raw definitions in the build: Having the definitions in the deployed markup and/or the client-side JS bundle (not entirely sure what you would like to target) seems very detrimental to the performance of the site. I would advise against this.
Sorry for not sending this sooner @ovlb, but I had drafted an earlier response on my phone and forgot to get round to sending it.
Netlify functions (or any serverless function) can act as an endpoint. According to Netlify, you can add the functions folder into the working directory.
The only issue is billing (the free tier is 125k API calls per month, or 100 hours).
By hooking into the build pipeline, we could define the serverless functions. (Re Algolia, I believe we need to provide an index of all the items that should be searchable; we could feed it with all definitions as JSON.) I don't think it's necessary to require a database unless there's some data processing later down the line, or we want to host the API via Heroku or something. (You could theoretically store all definitions on a headless CMS and run the build on update.)
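For illustration, here's a rough sketch of what that build-time Algolia step might look like with the Algolia JS client. This is not a working integration; the index name, environment variables, data file path, and record shape are all assumptions:

```js
// Rough build-step sketch: push compiled definitions into an Algolia index.
// Index name, env var names, and the shape/location of data.json are assumptions.
const algoliasearch = require('algoliasearch');
const definitions = require('./functions/data.json'); // hypothetical compiled output

const client = algoliasearch(process.env.ALGOLIA_APP_ID, process.env.ALGOLIA_ADMIN_KEY);
const index = client.initIndex('definitions');

// Algolia needs a stable objectID per record so re-runs update rather than duplicate.
const records = definitions.map((d) => ({
  objectID: d.slug,
  term: d.term,
  definition: d.definition,
}));

index
  .saveObjects(records)
  .then(() => console.log(`Indexed ${records.length} definitions`))
  .catch((err) => console.error('Algolia indexing failed', err));
```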
The API from previous posts was for users to call the service and have the definition served as a response (e.g. /define “word”) rather than have the user copy/paste. The consumers of the API would be users or bots, but ultimately, by sharing a link, the API would return the definitions of the requested words. e.g. user flow:
I mentioned including the markup because, technically, you could serve the raw text files as a route, and that would be the most rudimentary proof of concept for an API. If the app were self-hosted, performance considerations would be a higher priority, whereas using a statically generated site like 11ty removes that concern until the app reaches the free-tier limit. Re: API design/structure or security, I haven't considered them yet; I just wanted to start a discussion.
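As a minimal sketch of that rudimentary proof of concept, here's roughly what a /define-style Netlify function could look like, assuming the definitions get compiled into a JSON file next to the function at build time; the file path and field names are assumptions:

```js
// functions/define.js — minimal sketch of a /define endpoint as a Netlify function.
// Assumes a data.json compiled at build time; path and field names are placeholders.
const definitions = require('./data.json');

exports.handler = async (event) => {
  const word = (event.queryStringParameters || {}).word || '';
  const match = definitions.find(
    (d) => d.term.toLowerCase() === word.toLowerCase()
  );

  if (!match) {
    return {
      statusCode: 404,
      body: JSON.stringify({ error: `No definition found for "${word}"` }),
    };
  }

  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(match),
  };
};
```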
I'm not working at the moment, so I thought I'd offer my time to the project rather than doing coding interviews (and facing the emotional rejection that comes with those). Also, since I'm in Hong Kong right now, recruitment is slow due to nCoV-2019.
(I wanted to jump in and say thank you for this discussion and your time!!! Finding this particular sort of help has been difficult, so your expertise is really valued. I want these discussions to happen concurrently with the build of the webapp, so we can be mindful of the dictionary's infrastructure for its future plans.)
Hey @denistsoi, thanks so much for your answer. I think I misunderstood one point about including the raw markdown content in the first place, which led me down the wrong path. Would you aim to compile the definitions into JSON objects during build time and store them somewhere in the repo? It feels much more feasible to start building it with this background knowledge. I would say: go for it!
Hey @ovlb
I think requiring the definitions as
Having
Update - I created this structured
I got this info via the collection
In selfdefined/web-app#72 I created an OpenAPI spec for the app based on the high-level stuff I saw on the app and in the linked issues from this thread. I don't know much of anything about Netlify, so if there are requirements for the data elements to match that format for some reason, the models could be changed in the spec to accommodate. I think this also addresses selfdefined/web-app#6 and should handle how you can maintain the dictionary when the word list gets large, and how to manage linking of new words, synonyms, and alternatives. I also leaned towards adding internationalization support via language of origin and translations. Not really sure if that would be helpful at all, but I was just thinking about the context of words. If you haven't worked with OpenAPI before, you can take that text and put it in http://editor.swagger.io/ to get a good visual UI to see how things work out. Once code is running and the spec matches, you can actually use that interface to make some calls to the service or have it output some cURL commands for you.
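For anyone who hasn't worked with OpenAPI before, here is a deliberately tiny sketch of the kind of contract being described, written as a plain JS object. This is illustrative only, not the actual spec from selfdefined/web-app#72; the path, parameter, and schema names are placeholders:

```js
// Illustrative only — NOT the spec from selfdefined/web-app#72.
// A minimal OpenAPI 3 document as a JS object, showing a multi-word lookup.
module.exports = {
  openapi: '3.0.3',
  info: { title: 'Self-Defined API (sketch)', version: '0.0.1' },
  paths: {
    '/definitions': {
      get: {
        summary: 'Look up one or more words',
        parameters: [
          {
            name: 'words',
            in: 'query',
            required: true,
            schema: { type: 'array', items: { type: 'string' } },
            style: 'form',
            explode: false, // serialize as a single comma-separated value
          },
        ],
        responses: {
          '200': {
            description: 'Matching definitions',
            content: {
              'application/json': {
                schema: {
                  type: 'array',
                  items: { $ref: '#/components/schemas/Definition' },
                },
              },
            },
          },
        },
      },
    },
  },
  components: {
    schemas: {
      Definition: {
        type: 'object',
        properties: {
          term: { type: 'string' },
          definition: { type: 'string' },
          alternatives: { type: 'array', items: { type: 'string' } },
        },
      },
    },
  },
};
```

With `style: 'form'` and `explode: false`, the `words` parameter serializes as one comma-separated query value (e.g. `?words=term-one,term-two`), which is one way to support returning more than one word per request.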
@hibaymj - actually that's a good start, and it didn't occur to me to go down that route 👍 I suppose we wouldn't need to use Netlify functions if we go down this route, since the spec could just be used for codegen. I'm a bit rusty on how that works, so I'll have to look at that part again. The other thing I forgot was where someone could pass in multiple terms and render a page
Hmmm... I need to think about this some more. The project right now is a statically generated site hosted on Netlify; my thoughts are:
Another thing to mention: @good-idea mentions type definitions for GraphQL in https://github.com/tatianamac/selfdefined/issues/13. Using that, you could also generate an OpenAPI spec from the GraphQL type definitions. Alternatively, we could reverse this process if we wanted a GraphQL endpoint: https://github.com/IBM/openapi-to-graphql. I suppose my bias is to use something like
I forked the project and am hosting it on Netlify with Netlify functions.
The definitions are in JSON, located at https://github.com/denistsoi/selfdefined/blob/master/functions/data.json. I wanted to get some thoughts before I submit a PR (wanna get some rest before I do any more improvements). I'm thinking about how this might look if someone were to query from, say, Slack or Twitter (e.g. getting raw text back instead of it being in
The Open API 3 tools are really robust and getting more capable over time. Regarding GQL, it's a lot more trouble than you're thinking if you have low API capabilities and your goal is to be cross-linked and work with other systems. More importantly however, if you look closely at the GET operations, you'll see I added support for multiple words to come back. This would mean just tailoring the query parameters would be necessary to support returning more than 1 word in a request. Writing the API contract first is extremely valuable for the project, but generating it isn't really all that beneficial. The contract also has you define schemas as well. Hope this helps!
That’s true; suppose I just wanted to couple the API with Netlify and the current build process. I think it’s a good approach if the API is to be abstracted out of the build process.
FYI: I haven’t forgotten this discussion and providing feedback is on my to-do list. Sorry for the delay.
It seems as though the API functionality is being worked into the app itself instead of using a database to store definitions, so is that the prevailing design? Decoupling the data from the app makes APIs much easier to build. They're also easier to scale should you decide to offer it as a service to companies down the road. I would like to help in the area of APIs since I have experience there.
Hi everyone! My name is Leo and I'd love to help with this process if I can. Having read the entire thread, I feel compelled to echo the concerns of @BrentonPoke. It sounds like the web app is only a small portion of the overall vision for this project. Maintaining a separate API that can be consumed by both our web app as well as 3rd parties would be much better in the long run.

I also have concerns about storing HTML in the data. I think that allows the API to be too opinionated. The frontend should be making decisions on how the data appears visually. If there is a major design change on the frontend, we would have to touch every single entry in the database as opposed to fixing the structure in a single JS file. That's expensive over time.

It seems like a relational database would be the best way to go, given that everything seems to link back to a single source of truth: the root word. AFAIK that should still give us the freedom to use GraphQL on the frontend to query the data. As others have mentioned, once the data structure has been defined, we'll have a better idea of what options we have regarding tech stack.

@tatianamac have you finalized, from the user's perspective, what you'd like the API to be able to do? Once that's finalized, we can set up a call to discuss the data structure. I realize that I'm arriving late to a months-long conversation, so I apologize if I'm stepping on any toes. Is there a point person with whom I should touch base?
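To make the "root word as the single source of truth" idea concrete, here is a purely illustrative sketch of how one entry might be shaped. None of these field names are decided; the point is only that the data stays plain and the frontend decides presentation:

```js
// Purely illustrative entry shape — field names and nesting are NOT decided.
// The root word is the single source of truth; everything else hangs off it.
const exampleEntry = {
  term: 'example-word',             // the root word (placeholder)
  definitions: [
    {
      content: 'Plain-text definition; rendering is left to each frontend.',
      subDefinitions: [],           // alternate / sub definitions nest here
    },
  ],
  alternatives: [
    { term: 'suggested-alternative', inclusive: true },
    { term: 'flagged-term', inclusive: false },
  ],
  relatedTerms: ['another-word'],   // cross-links by slug/ID, not by copying data
};

module.exports = exampleEntry;
```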
Thanks for sharing this prototype, but we need read/write capabilities for our API, and we'd already decided on a non-relational database, as the needs of our schema are likely to evolve over time. I should be able to get the API repo started this weekend.
@simonw Thank you for putting this together! We'll definitely keep this in mind for future approaches; for now I think we'll likely proceed as we have set out. @Ljyockey Thank you! Please let me know if you need anything from me.
My apologies if anyone was waiting on me to proceed. I'm struggling to keep up with my commitments in the wake of the pandemic. Hoping to get started in the next week or two!
hi @Ljyockey! I recognise things are very weird right now—are you still interested in setting this up? No worries either way, just going through the tickets. ✨🙏🏽
Hey Tatiana! It’s looking less and less like I’ll be able to dedicate much time to this in the upcoming month after all. I’ll let you know as soon as that changes! 🙏🏽
Given the PR for Netlify open authoring (https://github.com/greatislander/selfdefined/pull/1/checks), I think doubling down on Netlify serverless functions is more feasible... Edit: looking at Netlify functions again, the starter tier is only 125k calls per month. For Glitch, we can get 4,000 per hour on the free tier, so I might start it with Glitch first.
Happy birthday @tatianamac! I came here from your birthday tweet. I have a keen interest in ontology, taxonomy, and natural language processing. This project intrigues me and I would love to contribute and help in any way I can.

I see there is a discussion about how best to design the API and storage for the dictionary as well as the schema. In the ontology space there is an XML-based protocol for defining concepts and their relationships called RDF. There is also a W3C recommendation called JSON-LD that fills the same space. We are using it in the IoT space to make metadata around devices and physical locations. It could prove useful in defining dictionary terms and how they relate. With regard to storage, I would also recommend using a NoSQL storage platform, not necessarily for flexibility but in order to provide more performance for retrieval.

I'd like to offer my experience with cloud architecture, serverless development, and NoSQL database design. I can help us get an initial environment set up on Azure for development and production. The free tier for Azure Functions includes 1 million executions per month. CosmosDB (the Microsoft NoSQL database that supports the MongoDB API as well as a Graph API) has a similarly generous free tier. For reference, Troy Hunt of HaveIBeenPwned.com discusses the costs of running his site on Cloudflare and Azure Functions: "It's costing me 2.6c per day to support 141M monthly queries of 517M records." A lot of this is due to a great caching model and usage of serverless functions to save on the backend. Azure Functions supports JavaScript, Python, C#, and Java, so it will accommodate whatever language the team is comfortable with. In addition, the Azure API Management service will make it easier to provide managed APIs for partners who would consume the service at free as well as paid tiers.

What I'd advise is that we identify an MVP of the service (it looks like a prime candidate would be the Twitter bot @tatianamac was talking about), design the API that would support the bot, and build the first revision. Perhaps we can set up a regular cadence of weekly calls (come as you can) to get some traction on the solution.
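For a concrete picture of the bot-facing MVP, here is a minimal sketch of what the lookup could look like as an HTTP-triggered Azure Function in JavaScript. The data source, paths, and field names are placeholders, and this is not a commitment to any particular storage; a CosmosDB binding could replace the JSON file later:

```js
// functions/define/index.js — sketch of an HTTP-triggered Azure Function (JavaScript).
// Data source and field names are placeholders; swap in a CosmosDB binding later.
const definitions = require('../data.json'); // hypothetical compiled definitions

module.exports = async function (context, req) {
  const word = (req.query.word || '').toLowerCase();
  const match = definitions.find((d) => d.term.toLowerCase() === word);

  context.res = match
    ? {
        status: 200,
        headers: { 'Content-Type': 'application/json' },
        body: match,
      }
    : {
        status: 404,
        body: { error: `No definition found for "${req.query.word}"` },
      };
};
```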
Thank you, @aguywithcode! What a helpful and thorough digest. I've restarted the Slack channel, which I'd love for you to join! To me, there are two potential MVPs:
I'm going to switch over and address your comments on the OpenAPI pull request now. Anyone else who has been on this thread is welcome to join the Slack channel as well if you'd like to continue to work on this! ✨
Oscar, Amy, Kurt, Tatiana and I have been discussing the API in the dev channel of Slack. I mentioned using AWS. However, Oscar and Tatiana would prefer to avoid Amazon, Google and Facebook due to issues with their ethics. We continued discussing additional solutions that avoid these. Oscar mentioned Algolia as a solution for searching. We've looked at using a managed database on DigitalOcean. Kurt has been using Netlify functions for search, and this would fit as a potential solution for the API code. Amy shared this article as a solution to keep the serverless function within the DigitalOcean ecosystem.
I'm hesitant to use Algolia as the free plan doesn't allow for much customization, and I wonder what the results of search terms would be. I'd love to run a few different options locally (like Elasticsearch) and compare against what Algolia provides.
@kkemple I can say that Elasticsearch provides a lot of options for customization. I created an e-commerce site using it over the last year, and the client had a bunch of custom data access coming from their internal warehouse management software. I agree that it would be good to test different solutions out. Maybe the first step would be to create a local database with the terms and test it out with different platforms?
We could apply for the OSS tier immediately?
Like that idea.
Do we want to have a standardized local environment for this? I think we won't really be using it much in development if we're going with a managed database, service, and serverless functions. Or do we want to divvy up the different types of services and then present them?
For the future, if we're using auth for clients, would Netlify's auth work for this? Would we use API keys?
Did anyone ever look into Linode to see if they would be able to provide anything for this project?
@BrentonPoke I don't believe so, at least among the people who talked about it yesterday. Linode wasn't brought up. Do you have some more context outside of what's in this issue? I'm new here so I'm not fully aware of all that's going on.
(As an aside, when we're ready to start building stuff for this, I'd ask that we open it up as a new repo within the selfdefined organisation so we keep the web app separate from the API. Eventually we will want the web app to be served by the API, but this way it keeps concerns separate.)
@tatianamac I was going to start on some local testing of this tomorrow night. If you create a repo I'm happy to push what I'm working on up so other people can review. Also going to try and live stream a bit, though it'll be my first time so we'll see how that goes.
There you go: https://github.com/selfdefined/api
Should I create a new issue to continue this discussion over there?
Let’s keep the discussion here for context, but the implementation in the new repo. I’ve created a pinned issue in the new repo sending folks who are interested to this thread.
@ovlb What do you think about transferring this issue over there? (I don't have strong feelings or experience either way.)
@tatianamac Ah, didn’t know this was possible. Sounds like a plan :)
Been thinking about this more, and Craft CMS would handle the majority of the needs of the dictionary. It would only need to be extended to allow for submissions of new words and revisions of existing ones, with an approval process for both. It handles all the user requirements I could think of and allows for localization/translation "sites". It provides an excellent interface for content management and integrates with Elasticsearch through a plugin. It also has a headless GraphQL/REST API mode that allows for working with Eleventy. The CMS is very extendable, and I know we could create our own module to handle the submissions and revisions from community members. I use it almost daily at my full-time job and really like it. Thoughts?
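As a rough sketch of how that headless mode could feed the existing Eleventy build, a global data file can pull entries from a CMS endpoint at build time. The URL and response shape below are placeholders, and it assumes Node 18+ for the global fetch:

```js
// _data/definitions.js — sketch of an Eleventy global data file that pulls
// entries from a headless CMS at build time. URL and response shape are placeholders.
// Assumes Node 18+ (global fetch); otherwise swap in node-fetch.
module.exports = async function () {
  const response = await fetch('https://cms.example.com/api/definitions.json');
  if (!response.ok) {
    throw new Error(`CMS request failed: ${response.status}`);
  }
  const entries = await response.json();
  // Whatever is returned here becomes available in templates as `definitions`.
  return entries;
};
```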
So, because Craft CMS handles a lot of the features we'd need, I'm going to spin up a test environment in it. I can do that a lot quicker than a test MySQL instance. If this isn't something that will work, I'm happy to stop, but until I get some direction this is how I will proceed.
@ssmith-wombatweb Sorry for not getting back to you for so long. I'm glad you started thinking about a CMS, since it is quite a logical conclusion given our needs. I have only limited experience with Craft (and it's years old at that), so I trust your judgement fully. In short: go for it :)