
Gracefully handle differences in s3 API support #133

Open
kallisti5 opened this issue Jul 12, 2019 · 13 comments

kallisti5 commented Jul 12, 2019

We're using MinIO for our S3 bucket storage, and this resource fails to connect because it assumes the bucket-versioning API is supported.

```
checking failed
resource script '/opt/resource/check []' failed: exit status 1

stderr:
error finding versions: NotImplemented: A header you provided implies functionality that is not implemented
	status code: 501, request id: , host id: 
```
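For illustration, a source configured in `versioned_file` mode (the endpoint, bucket name, and credential vars below are hypothetical) is what drives the resource to call the bucket-versioning API during `check`, which MinIO answers with the 501 above:

```yaml
resources:
- name: artifact
  type: s3
  source:
    endpoint: https://minio.example.com      # hypothetical MinIO endpoint
    bucket: my-artifacts                     # hypothetical bucket
    access_key_id: ((s3_access_key))
    secret_access_key: ((s3_secret_key))
    # versioned_file relies on S3 object versioning, which MinIO
    # (at the time of this issue) did not implement
    versioned_file: build/artifact.tar.gz
```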
kallisti5 (Author) commented:

The same applies if the server supports versioning, but it is disabled:

```
stderr:
error finding versions: bucket is not versioned
```

vito (Member) commented Aug 1, 2019

Does this still happen when using regexp:?
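For reference, a `regexp:`-mode source derives versions from a capture group in the object name rather than from the bucket-versioning API, so it can sidestep the 501 (all names below are hypothetical):

```yaml
resources:
- name: release
  type: s3
  source:
    endpoint: https://minio.example.com      # hypothetical
    bucket: releases                         # hypothetical
    access_key_id: ((s3_access_key))
    secret_access_key: ((s3_secret_key))
    # the capture group supplies the version,
    # e.g. my-app-1.2.3.tar.gz -> 1.2.3
    regexp: my-app-(.*).tar.gz
```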

What can be done here? IIRC we just use the AWS S3 SDK and hope that S3-compatible blobstores are, well, S3-compatible. But if we're making some extra call that breaks things unnecessarily, I'd be happy to review a PR to fix it. (We don't have a whole lot of bandwidth to proactively fix these things ourselves.)

Thanks!

kallisti5 (Author) commented:

regexp works with unversioned buckets... however, if anything in the bucket (even an unrelated object) doesn't match the regex, the check errors out:

(screenshot: regex_versioning_error)

Reeeeeeeeally wish this resource just allowed "uploading things, no versioning". Versioning is not really useful in most cases :-)

vito (Member) commented Dec 2, 2019

@kallisti5 Without versioning there's no way to really do anything continuously, as there's no guarantee that artifacts propagating from one build to the next won't just be clobbered by a broken artifact produced by a new upstream build.

versioned_file is there for when you don't care about semver, which I agree is not always what you want. The resource is probably just using an extra header to perform uploads in parallel or something, and needs to be adjusted to fall back gracefully.

kallisti5 (Author) commented Dec 2, 2019

> @kallisti5 Without versioning there's no way to really do anything continuously as there's no guarantee that artifacts propagating from one build to the next don't just become clobbered by a broken artifact produced by a new build upstream.

I want artifacts clobbered. Our build system versions artifacts with a unique version number that changes every commit. That's why this issue is so frustrating to so many people. "Just do what I say Concourse, don't get in my way"

kallisti5 (Author) commented:

Another example: we generate a package repository with 40+ packages. We just want to upload the contents of a folder (key, prefix, etc.) to a folder (key, prefix, etc.) in a bucket.

The versioning prevents this. The only use case I see for S3 versioning is "you want to upload the same file over and over", which almost nobody really does.

vito (Member) commented Dec 9, 2019

If you don't want versioning, you probably shouldn't be using an abstraction (resources) whose central purpose is versioning of artifacts. 😅 Maybe you should use a task for these "just put something in s3" use cases instead?

> I want artifacts clobbered. Our build system versions artifacts with a unique version number that changes every commit. That's why this issue is so frustrating to so many people. "Just do what I say Concourse, don't get in my way"

I'm confused by this; it sounds like you don't want clobbering in this case, you just want to be able to push an artifact under some unique name and pass that along through your pipeline, relying on passed constraints to know which artifact to propagate instead of versioning of the artifacts in isolation. (edit: A totally valid use case, and one I'd like better support for, I'm just clarifying here to make sure it's more evidence of that kind of thing. 🙂 )
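The passed-constraint pattern mentioned above can be sketched like this (job and resource names are hypothetical): a version of the artifact only propagates to `publish` once it has gone through `build`.

```yaml
jobs:
- name: build
  plan:
  - get: source
    trigger: true
  - put: artifact          # pushes a uniquely named blob

- name: publish
  plan:
  - get: artifact
    passed: [build]        # only versions that made it through 'build' propagate
```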

kallisti5 (Author) commented Dec 9, 2019

> If you don't want versioning, you probably shouldn't be using an abstraction (resources) whose central purpose is versioning of artifacts.

I'm not sure where this comes from. The core functionality of S3 isn't versioning resources. Versioning is an optional add-on to the S3 specification. It's common for non-AWS object storage with S3 compatibility (Wasabi, MinIO, GCP Storage with interoperability enabled) to not offer versioning, or to only allow it to be selectively enabled (there have been a few tickets with folks confused by this limitation: #111, #98, #97, #61, #55, #140, #23, #133).

Folks use S3 in a lot of ways; most expect "basic uploading of objects" to work without worrying about versioning. At minimum, if there is zero interest in supporting "basic uploading of objects without factoring in S3 versioning", you might want to document that.

vito (Member) commented Dec 10, 2019

> I'm not sure where this comes from. The core functionality of s3 isn't versioning resources.

Sorry, I'm not talking about S3 in general, I'm talking about the concept of a Concourse resource. Concourse resources represent versioned objects, fundamentally. It's not at all wrong to use S3 in a way that doesn't involve versioning, but it's odd to use the S3 Concourse resource and not want versioning, because that's what it's for, and that's what all Concourse resources are for.

The language in the documentation isn't very precise around this, which is totally our bad. For a while we built on resources as a "general thing to be used for all kinds of integrations" which turned out to be a bad idea. Now we define them more precisely around how they were originally designed to be used - versioning of artifacts - and I'm focusing on classifying these other use cases (e.g. notifications, triggers, and cases like this one which are more like arbitrary blobs) and finding better ways to represent them in Concourse pipelines.

If you're interested and have the time, I wrote a rather lengthy blog post about all this recently. 🙂

We actually have pretty much the same use case as yours, so I'm feeling this annoyance too. If you look at our pipeline we actually have a linux-rc resource whose filenames don't have any independently orderable versioning. It ends up working because we rely on passed constraints, but it's super confusing because what looks like the "latest" version on that page isn't actually the latest, it's literally the filename with the lowest alphabetical sort order. Super janky!

So at the end of the day I'm suggesting that this resource shouldn't be viewed as "the way to do everything with S3". It's specifically built around versioning individual files in a bucket, either by filename or with versioned_file. That's a very common use case that a lot of people use. I don't disagree that there are other use cases, like uploading a bunch of files to a bucket or uploading blobs with unique filenames - I'm just suggesting that those should not be done with this resource, as they don't fit the intent of Concourse resources at all.

If you just need to upload a bunch of things to S3 and don't need to be able to propagate it through your pipeline, I would just do that with a task for now. Maybe in the future when Prototypes land people can start to develop re-usable S3 toolkits so you don't have to write the task yourself. Maybe this resource even gets converted into a prototype which supports all those extra actions in addition to supporting the Concourse resource interface that it does today. We're just not there yet.
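As a sketch of that task-based approach (the image, paths, endpoint, and variable names are all assumptions, not something this resource provides), a "just upload a directory" task could look like:

```yaml
# hypothetical upload.yml task config
platform: linux

image_resource:
  type: registry-image
  source: {repository: amazon/aws-cli}

inputs:
- name: build-output                      # artifacts from an earlier step

params:
  AWS_ACCESS_KEY_ID: ((s3_access_key))
  AWS_SECRET_ACCESS_KEY: ((s3_secret_key))
  ENDPOINT: https://minio.example.com     # hypothetical endpoint

run:
  path: sh
  args:
  - -ec
  - aws s3 cp --recursive --endpoint-url "$ENDPOINT" build-output/ s3://my-artifacts/
```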

kallisti5 (Author) commented:

Ok. Thanks for explaining it; this definitely helps. The biggest (and, to be honest, most frustrating) confusion on my end was expecting an "s3 resource" to just push objects to S3 like any normal S3 client/API.

We already worked around this by adding an S3 client into a container and setting it up on every job execution; we then template the S3 client configuration out via environment variables per pipeline. It's all getting pretty janky, but it lets us upload artifacts within a task. I am a bit worried about exposing S3 credentials this way... one stray log line from Concourse and our artifact repository could be compromised (which is a big deal).

vito (Member) commented Dec 11, 2019

Glad all that makes sense. 🙂

Yeah, writing tasks which handle credentials is pretty risky. Lots of people are in the habit of slapping set -x up top and calling it a day. You could enable secret redaction as a sort of safety net at least: https://concourse-ci.org/creds-redacting.html - this was added recently, and will automatically redact any credentials that come from your credential manager.

If they don't come from a credential manager right now, you could consider using var_sources and the "dummy" credential manager, both of which are coming in v5.8. Here's the RFC for var_sources: concourse/rfcs#39 - we'll have docs out for it soon too.
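For reference, a "dummy" var source (syntax per the var_sources RFC; the names and values here are placeholders) lets statically defined values flow through the credential-manager path so that redaction applies to them:

```yaml
var_sources:
- name: stub
  type: dummy
  config:
    vars:
      s3_access_key: AKIA-not-real              # placeholder value
      s3_secret_key: definitely-not-a-secret    # placeholder value
```

These would then be referenced as `((stub:s3_access_key))` in resources and tasks.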

kallisti5 (Author) commented Dec 1, 2022

For any future lurkers: the concourse rclone resource works well for putting artifacts to S3 and a variety of other remote cloud destinations without worrying about versioning.

I kinda wish this resource was renamed s3-versioned-resource or something... it's a critical distinction.

vito (Member) commented Dec 6, 2022

> I kinda wish this resource was renamed s3-versioned-resource or something... it's a critical distinction.

All resources are versioned by design (doc), so I'd wager it would be clearer if the rclone resource called out that it's "unversioned" instead. Otherwise nearly every resource that fits the intended mold would be called "versioned-foo." 😄
