Content consumption metrics ignore content when cached #5817

Open
jlsherrill opened this issue Sep 20, 2024 · 5 comments

@jlsherrill
Contributor

Version
3.61.0

Describe the bug
From what I've been told, the download bytes metrics emitted by the content app ignore content that is fetched after the cache has warmed up and includes that content.

To Reproduce
With caching turned on in the content app, fetch a file 1000 times. You'd expect the content metrics to indicate 1000x the size of the file; instead, they reflect only the first fetch.

Expected behavior
Every fetch of the file is counted in the download metrics regardless of pulp-content caching
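
A minimal sketch of the client side of this reproduction, assuming a plain HTTP fetch loop is enough to trigger it. The helper name `download_total` is hypothetical. It sums the bytes the client actually receives, which is the number the download bytes metric would be expected to match.

```python
import urllib.request


def download_total(url, count, opener=urllib.request.urlopen):
    """Fetch `url` `count` times and return the total number of body bytes read.

    `opener` is injectable so the loop can be exercised without a running server.
    """
    total = 0
    for _ in range(count):
        with opener(url) as response:
            total += len(response.read())
    return total


# Example call against a local content app (URL taken from later in this thread):
# download_total("http://localhost:5001/pulp/content/default/test/3.iso", 1000)
```

Comparing the returned total with the metric before and after the loop should show whether cached fetches are counted.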

Additional context

@gerrod3
Contributor

gerrod3 commented Sep 20, 2024

I don't think I am following what you are saying. Are you saying that once a file is cached, the download bytes metric should no longer increase if that file is requested again? Or are you saying the opposite, that once a file is cached the download bytes metric no longer increases on repeated requests even though it should?

The cache is poorly named, as it isn't a store of recently requested files; it's a lookup table of where the recently requested files are stored.

@jlsherrill
Contributor Author

@gerrod3 I will admit that this is based entirely on what @lubosmj told me. It seems that he may not be sure, and it just needs to be tested.

But from what I was told, if caching is enabled, the metrics only represent the initial fetch of the file; until that cache entry is evicted, further requests are not reflected in the download bytes metrics.

@lubosmj
Member

lubosmj commented Sep 27, 2024

I can officially confirm that once we "cache" a requested file, we no longer report the content consumption for the file.
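
A toy model of the behavior described here, not pulpcore code. It assumes that a cache hit returns before the reporting call is reached; the thread does not confirm that this is the actual mechanism. The class `ToyContentApp` and the flag `report_on_cache_hit` are hypothetical names.

```python
class ToyContentApp:
    """Toy handler whose cache can short-circuit metric reporting."""

    def __init__(self, files, report_on_cache_hit):
        self.files = files  # path -> size in bytes
        self.cache = {}  # path -> size; stands in for a cached response
        self.report_on_cache_hit = report_on_cache_hit
        self.reported_bytes = 0

    def fetch(self, path):
        if path in self.cache:
            size = self.cache[path]
            # The reported behavior corresponds to skipping this increment.
            if self.report_on_cache_hit:
                self.reported_bytes += size
            return size
        # Cache miss: resolve the file, report it, and remember it.
        size = self.files[path]
        self.reported_bytes += size
        self.cache[path] = size
        return size


MB = 1024 * 1024
observed = ToyContentApp({"3.iso": MB}, report_on_cache_hit=False)
expected = ToyContentApp({"3.iso": MB}, report_on_cache_hit=True)
for _ in range(1000):
    observed.fetch("3.iso")
    expected.fetch("3.iso")

print(observed.reported_bytes)  # 1048576 (only the first fetch)
print(expected.reported_bytes)  # 1048576000 (every fetch)
```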

@lubosmj
Member

lubosmj commented Sep 27, 2024

Tested locally. I synced a file repository containing 3 files, 1MB each (https://fixtures.pulpproject.org/file/PULP_MANIFEST).

pulp file remote create --name test --url https://fixtures.pulpproject.org/file-many/PULP_MANIFEST --policy immediate
pulp file repository create --name test --remote test
pulp file repository sync --name test
pulp file publication create --repository test
pulp file distribution create --name test --base-path test --repository test

Then, I manually issued GET requests against the distributed source:

http http://localhost:5001/pulp/content/default/test/3.iso & http http://localhost:5001/pulp/content/default/test/3.iso & http http://localhost:5001/pulp/content/default/test/3.iso & 

Instead of showing a growing trend, the curve remains steady once all three files are cached.

[image: graph of the metric curve described above]

@lubosmj
Member

lubosmj commented Sep 27, 2024

Since we have reopened this topic, I would like to clarify expectations regarding redirect handling. Are we comfortable with a scenario where a user requests a content file but chooses not to follow the redirect? In this case, we would still report the consumption as if the user had followed the redirect and downloaded the content. Is this approach acceptable, @jlsherrill?

self._report_served_artifact_size(content_length)
if domain.storage_class == "pulpcore.app.models.storage.FileSystem":
    path = storage.path(artifact_name)
    if not os.path.exists(path):
        raise Exception(_("Expected path '{}' is not found").format(path))
    return FileResponse(path, headers=headers)
elif not domain.redirect_to_object_storage:
    return ArtifactResponse(content_artifact.artifact, headers=headers)
elif domain.storage_class == "storages.backends.s3boto3.S3Boto3Storage":
    raise HTTPFound(_build_url(http_method=request.method))
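
The ordering in the snippet above can be illustrated with a small standalone sketch. It is a toy, not the handler itself: `Redirect` stands in for the `HTTPFound` raised above, and `serve` and the example URL are hypothetical. Because the size is reported before the redirect is raised, it is counted even if the client never follows the redirect.

```python
class Redirect(Exception):
    """Stand-in for the HTTPFound redirect in the snippet above."""

    def __init__(self, location):
        super().__init__(location)
        self.location = location


def serve(size, redirect_url, report):
    """Report the artifact size first, then either redirect or return the body."""
    report(size)
    if redirect_url is not None:
        raise Redirect(redirect_url)
    return b"\0" * size


reported = []
try:
    serve(1024, "https://objects.example.invalid/3.iso", reported.append)
except Redirect:
    pass  # the client chooses not to follow the redirect and downloads nothing

print(sum(reported))  # 1024, although no content bytes were transferred
```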
