Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use new metadata and download endpoints #286

Closed
wants to merge 6 commits into from

Conversation

NajiObeid
Copy link
Contributor

No description provided.

- updater now uses the new metadata endpoint to check for edition updates.
- updater now uses the new download endpoint to directly download
  editions.
MakeOutput() ([]byte, error)
}

// Download exposes methods needed to check for and perform update to a set of mmdb editions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"perform updates"?

pkg/geoipupdate/download/download.go Outdated Show resolved Hide resolved
databaseDir string,
preserveFileTimes bool,
editionIDs []string,
verbose bool,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A thought I have with this is if we are changing the API significantly anyway, I wonder if we could clean up more of the mistakes from the existing API. IMO it was largely a mistake. And I think we are still seeing some leak through of that (understandably).

For example I wonder if we need the API to know about things like the database directory, "preserve file times", proxy, verbose, maybe more. Maybe the API could be simpler - give an edition ID and a date and take an io.Writer to write the database to. We'd still need to deal with writing files and such in geoipupdate, but perhaps that does not need to be public.

If we were designing the API from scratch, how would it ideally look? e.g. does the config need to be public? It seems like we are basically redoing it at this point anyway.

Probably needs more thought and discussion though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! yeah I was wondering how much the new changes can drift before I catch some flak. I'll move things around a bit but I think it won't take much to make it look good since most of what we need is already implemented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkg/geoipupdate/download/download.go Outdated Show resolved Hide resolved
servers and includes two new api calls to the `metadata` and `download` endpoints.
* `pkg/goipupdate/writer` which is responsible for writing databases to various targets.
It only supports writing to disk out of the box.
* `pkg/goipupdate/lock` which is responsible for synchronizing concurrent access to the tool.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about taking more things private? I was thinking the public package API could include only the minimum required. That would be, I think, retrieving metadata about edition ID(s) and downloading an edition. I don't think we need to be providing stuff like the config, locking files, writing files, etc publicly. That kind of stuff could be internal to geoipupdate. Basically anything that could be private, I was hoping we could make private.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would look at it the opposite way. The maxmind api is the only thing that doesn't need to be public as we most probably are the only ones controlling it and people using it don't have an alternative for now.

Some people might be interested in writing files to targets other than disk, they would benefit from having a public writer api.
Some people might not have write access where they're running geoipupdate and might be interested in having a non flock lock type.
All they need to do in this case is provide their own implementation and use it cmd/geoipupdate/main instead of the one we provide right now. They won't have to touch or fiddle with any other piece of code other than their own custom implementation.

Copy link
Contributor Author

@NajiObeid NajiObeid Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But keeping the maxmind api public too is beneficial to people wanting to use it as a standalone in their own fully custom build.
TLDR I don't think it's beneficial to private any of these packages

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which part do you mean when you say the "maxmind api"?

The main benefit is that it would make maintenance simpler. From experience working on it, it has become more complicated and error prone once we began sharing much of the internals.

If someone wants to write to places other than disk, wouldn't they still be able to do that given we'd be working with an io.Reader/io.Writer now? Same with other custom things. I'm not sure I would expect an HTTP API client to include that kind of stuff. We've basically provided the building blocks of re-creating geoipupdate, whereas I think it would be reasonable for us to only provide the HTTP API client part and let users figure out the rest if they need something custom.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry what I meant by "maxmind api" is the current http api for downloads and metadata.

The main benefit is that it would make maintenance simpler. From experience working on it, it has become more complicated and error prone once we began sharing much of the internals.

I am sorry I don't fully get the point. We'd still have to maintain practically the same components/code whether they're made public or not.

reasonable for us to only provide the HTTP API

Yes I agree it's reasonable, but it'd be much nicer if everything is exposed so people requiring minor customization won't have to recreate geoipupdate from scratch, and still have the benefits of locks, retries...out of the box. They'd be able to fast forward their forks, enjoy the benefits of the bug fixes and features we've added while maintaining their custom implementations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry I don't fully get the point. We'd still have to maintain practically the same components/code whether they're made public or not.

Differently though. We are constrained by backwards compatibility and the existing interface. And it has made the code significantly more complicated due to the abstractions. What geoipupdate does is pretty simple but the code has become painful to work with IMO. And that is largely due to the constraints imposed by providing a public API together with its poor design (not talking about yours - I haven't looked at it in detail yet).

nicer if everything is exposed so people requiring minor customization won't have to recreate geoipupdate from scratch

That's true, but at this point my preference would be to step back from that given the drag on development and the quality of the public interface. I don't think a lot of people are depending on it, but they could continue to use older versions for now too and pull in what they need from those versions if necessary. That kind of stuff could go into a separate package/project too.

Maybe we could see if anyone else has any thoughts on the direction here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't read through all the threads, but I am generally in favor of making as much internal as possible. As Will mentioned, this code has become pretty hard to maintain as we are quite restricted by providing a stable API for library users. Many things that would have been trivial changes turn out to be complicated PRs or require unnecessary major version releases, which confuse the primary CLI users.

func (h *httpDownloader) GetEdition(
ctx context.Context,
edition Metadata,
) (reader io.Reader, cleanupCallback func(), err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, having to return a callback seems unfortunate. Would there be a drawback to taking an io.Writer as a parameter instead of this? I don't feel that strongly about it and I'm not sure if one way is more idiomatic or better than the other, but it'd be nice to not have to have the callback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I'll play with the idea a bit and see what changes in regards to your comment here #286 (comment)

@NajiObeid NajiObeid closed this Mar 5, 2024
@NajiObeid NajiObeid deleted the nobeid/metadata-endpoint branch March 5, 2024 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants