Testing version doesn't sync with archive after migration #181

Open
nc7s opened this issue Oct 21, 2024 · 3 comments

@nc7s
Member

nc7s commented Oct 21, 2024

Well after dh-shell-completions 0.0.3 had migrated to testing (migrated on 22 Sep, problem found on 18 Oct), manpages.debian.org still showed its testing version as 0.0.2. Now that 0.0.4 has been uploaded, the manpages.d.o versions are in sync (testing 0.0.3, unstable 0.0.4), so I suspect that only uploads trigger updates, not migrations.

@stapelberg stapelberg added the bug label Oct 21, 2024
@stapelberg
Contributor

Hey, thanks for your report.

You’re right that something seems off here, but your suspicion is not correct: debiman does not know about uploads or migrations, it always goes through the list of packages currently in the Debian archive.

However, I think there is a bug in cache invalidation that I have now tracked down based on this timeline:

This is what the Debian package tracker lists:

This is what the debiman logfiles say, annotated for clarity with the resulting state on disk:

TZ=Europe/Zurich journalctl --root=2024-09-19 --since 2024-09-15 -u debiman --grep dh_shell_completions | cat
# rendering both versions because 0.0.2 migrated to testing
Sep 16 05:03:05 ex622 run-debiman.bash[1701967]: 2024/09/16 05:03:05 render.go:296: /srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.html.gz invalidated by /srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.gz
Sep 16 05:03:05 ex622 run-debiman.bash[1701967]: 2024/09/16 05:03:05 rendermanpage.go:322: rendering "/srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.html.gz"
Sep 16 05:03:05 ex622 run-debiman.bash[1701967]: 2024/09/16 05:03:05 rendermanpage.go:322: rendering "/srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.html.gz"
# -rw-r--r-- 1 root root 2,0K 2024-09-09 00:38 testing/dh_shell_completions.1.en.gz
# -rw-r--r-- 1 root root 4,8K 2024-09-16 05:03 testing/dh_shell_completions.1.en.html.gz
# -rw-r--r-- 1 root root 4,8K 2024-09-16 05:03 unstable/dh_shell_completions.1.en.html.gz

# rendering both versions because 0.0.3 entered unstable
Sep 17 05:03:25 ex622 run-debiman.bash[1813534]: 2024/09/17 05:03:25 render.go:296: /srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.html.gz invalidated by /srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.gz
Sep 17 05:03:25 ex622 run-debiman.bash[1813534]: 2024/09/17 05:03:25 rendermanpage.go:322: rendering "/srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.html.gz"
Sep 17 05:03:25 ex622 run-debiman.bash[1813534]: 2024/09/17 05:03:25 rendermanpage.go:322: rendering "/srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.html.gz"
# -rw-r--r-- 1 root root 2,0K 2024-09-16 21:43 unstable/dh_shell_completions.1.en.gz
# -rw-r--r-- 1 root root 5,5K 2024-09-17 05:03 unstable/dh_shell_completions.1.en.html.gz
# -rw-r--r-- 1 root root 5,5K 2024-09-17 05:03 testing/dh_shell_completions.1.en.html.gz

# NOTE: The log for 2024-09-22 does not contain any mention of dh_shell_completions!
# most likely cause: 
# 1. debiman extracts the manpage to testing/dh_shell_completions.1.en.gz with modtime 2024-09-16 21:43
# 2. because the mod time of the raw manpage (2024-09-16 21:43) is older than the HTML version (2024-09-17 05:03), debiman assumes the HTML version is up to date and does not need to be re-generated.

# rendering both versions because 0.0.4 entered unstable
Oct 19 23:03:32 ex622 run-debiman.bash[1822969]: 2024/10/19 23:03:32 render.go:296: /srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.html.gz invalidated by /srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.gz
Oct 19 23:03:32 ex622 run-debiman.bash[1822969]: 2024/10/19 23:03:32 rendermanpage.go:322: rendering "/srv/man/www/testing/dh-shell-completions/dh_shell_completions.1.en.html.gz"
Oct 19 23:03:32 ex622 run-debiman.bash[1822969]: 2024/10/19 23:03:32 rendermanpage.go:322: rendering "/srv/man/www/unstable/dh-shell-completions/dh_shell_completions.1.en.html.gz"

This is the state on disk:

% ls -hltr /srv/man/www/unstable/dh-shell-completions/ && head /srv/man/www/unstable/dh-shell-completions/VERSION 
total 20K
-rw-r--r-- 1 root root 2,0K 2024-10-19 16:38 dh_shell_completions.1.en.gz
-rw-r--r-- 1 root root    5 2024-10-19 23:02 VERSION
-rw-r--r-- 1 root root 3,5K 2024-10-19 23:03 index.html.gz
-rw-r--r-- 1 root root 5,5K 2024-10-19 23:03 dh_shell_completions.1.en.html.gz
0.0.4#

% ls -hltr /srv/man/www/testing/dh-shell-completions/ && head /srv/man/www/testing/dh-shell-completions/VERSION        
total 20K
-rw-r--r-- 1 root root 2,0K 2024-09-16 21:43 dh_shell_completions.1.en.gz
-rw-r--r-- 1 root root    5 2024-09-22 05:00 VERSION
-rw-r--r-- 1 root root 3,5K 2024-09-22 05:03 index.html.gz
-rw-r--r-- 1 root root 4,8K 2024-10-19 23:03 dh_shell_completions.1.en.html.gz
0.0.3#

So the problem consists of multiple parts:

  1. When extracting, we override the modtime of the raw manpage with the one stored in the archive (i.e. the modtime set by the uploader):
    if err := os.Chtimes(destPath, header.ModTime, header.ModTime); err != nil {
  2. Invalidating other versions of a manpage updates the HTML modtime, but re-uses the content (as an optimization).
  3. This breaks the assumption we make here: If the HTML version is more recent than the raw manpage, it must also contain the contents of that raw manpage:
    if err != nil || *forceRerender || htmlst.ModTime().Before(st.ModTime()) {
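The interaction of these three parts can be reproduced in isolation. The following is a minimal sketch, not debiman's actual code: `needsRerender` and `simulateMigration` are hypothetical helpers, the file contents are placeholders, and the timestamps are copied from the listings above. It applies the same `Before` comparison to a raw manpage whose modtime was reset to the uploader's time and to an HTML file that was touched later by an invalidation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// needsRerender is a hypothetical helper that applies the mtime comparison
// quoted above: the HTML counts as stale only if it is missing, a re-render
// is forced, or it is older than the raw manpage.
func needsRerender(rawPath, htmlPath string, force bool) (bool, error) {
	st, err := os.Stat(rawPath)
	if err != nil {
		return false, err
	}
	htmlst, err := os.Stat(htmlPath)
	return err != nil || force || htmlst.ModTime().Before(st.ModTime()), nil
}

// writeWithModTime writes placeholder content and then overrides the modtime,
// similar in spirit to the os.Chtimes call made during extraction.
func writeWithModTime(path, content string, modTime time.Time) error {
	if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
		return err
	}
	return os.Chtimes(path, modTime, modTime)
}

// simulateMigration recreates the testing directory after 0.0.3 migrated and
// reports whether the mtime check would trigger a re-render.
func simulateMigration() (bool, error) {
	dir, err := os.MkdirTemp("", "debiman-mtime")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	raw := filepath.Join(dir, "dh_shell_completions.1.en.gz")
	html := filepath.Join(dir, "dh_shell_completions.1.en.html.gz")

	// HTML modtime was bumped on 2024-09-17 05:03 by the invalidation run,
	// while the content was re-used from the older version.
	htmlTime := time.Date(2024, 9, 17, 5, 3, 0, 0, time.UTC)
	if err := writeWithModTime(html, "content of 0.0.2", htmlTime); err != nil {
		return false, err
	}

	// Raw manpage of 0.0.3 is extracted later, but its modtime is reset to
	// the uploader's 2024-09-16 21:43.
	rawTime := time.Date(2024, 9, 16, 21, 43, 0, 0, time.UTC)
	if err := writeWithModTime(raw, "content of 0.0.3", rawTime); err != nil {
		return false, err
	}

	return needsRerender(raw, html, false)
}

func main() {
	stale, err := simulateMigration()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("re-render needed:", stale) // prints "re-render needed: false"
}
```

The check reports that no re-render is needed even though the raw manpage on disk is newer content than what the HTML contains, which matches the missing log entries for 2024-09-22.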

So, what can we do to fix the issue?

  • If we just delete the Chtimes call, the modtime would match the extraction time, which will fix the issue, but break the “Source last updated” line in user-facing HTML versions.
  • Another way to fix the issue would be to stop relying on the mtime and instead inspect the HTML file to figure out which version is contained in the HTML. However, during my investigation I realized that we incorrectly store a reference to the version of the current package, even when incorrectly re-using older content: the footer says “dh_shell_completions.1.en.gz (from dh-shell-completions 0.0.3)”, even though the content is from 0.0.2. So we would first need to fix that.
  • We could disable the content re-use optimization entirely. I haven’t checked how much slower that would make a typical run.
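To make option 2 more concrete, here is a rough sketch of a version-based staleness check. It rests on assumptions that are not part of debiman today: the `<!-- source-version: ... -->` marker, `renderedVersion`, and `staleByVersion` are all hypothetical, and the sketch works on already-decompressed HTML. It also only helps if the marker records the version whose content was actually rendered, which is exactly the footer problem described above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// versionMarker matches a hypothetical marker that the renderer would have to
// embed in each HTML page. debiman does not emit this marker today.
var versionMarker = regexp.MustCompile(`<!-- source-version: (\S+) -->`)

// renderedVersion extracts the package version recorded in the HTML, if any.
func renderedVersion(html string) (string, bool) {
	m := versionMarker.FindStringSubmatch(html)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// staleByVersion reports whether the HTML must be re-rendered: either it has
// no marker at all, or the marker differs from the package version (for
// example the contents of the VERSION file next to the manpage).
func staleByVersion(html, packageVersion string) bool {
	v, ok := renderedVersion(html)
	return !ok || v != strings.TrimSpace(packageVersion)
}

func main() {
	html := "<html><!-- source-version: 0.0.2 --><body>...</body></html>"
	fmt.Println(staleByVersion(html, "0.0.3")) // true: content is from 0.0.2
	fmt.Println(staleByVersion(html, "0.0.2")) // false: versions match
}
```

With a check like this, modtimes could keep reflecting the uploader's time for the “Source last updated” line, since they would no longer take part in cache invalidation.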

I’m not sure yet which path I like most. Maybe option 2 deserves a shot, and if it turns out to be too hard for some reason, we can resort to option 3.

I can kick off a run with a forced full re-rendering to get the current manpage archive fixed (will take a few days to complete and propagate, though).

@stapelberg
Contributor

> I can kick off a run with a forced full re-rendering to get the current manpage archive fixed (will take a few days to complete and propagate, though).

Looks like this was a bit quicker than expected: the corrected version now seems to be live.

@nc7s
Member Author

nc7s commented Oct 23, 2024

Thanks for the detailed analysis. Some rough thoughts:

  • "Encode" the package version info in the HTML page's creation time and compare that with the man page's "last updated" time, leaving the modification time free to change and to represent modification of the HTML page itself.
  • Link unstable/foo to foo/$ver, where $ver is the current version in unstable, so that file creation/modification times are decoupled from package versions.
  • Split out sections whose changes do not depend on the actual content (e.g. "other versions", footer timestamps, etc.) into partials, and <iframe> them into the content pages.
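A rough sketch of the second idea, under assumptions of my own: the directory layout, `pointSuite`, and `demo` are hypothetical and not how debiman stores pages today, and replacing a symlink via rename is assumed to run on a Unix-like system. Content is rendered once per package version, and a migration only re-points the suite's symlink.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// pointSuite makes <root>/<suite>/<pkg> a symlink to ../<pkg>/<ver>.
// It creates the link under a temporary name and renames it into place, so
// an existing link is replaced in a single step.
func pointSuite(root, suite, pkg, ver string) error {
	if err := os.MkdirAll(filepath.Join(root, suite), 0o755); err != nil {
		return err
	}
	link := filepath.Join(root, suite, pkg)
	tmp := link + ".tmp"
	_ = os.Remove(tmp) // ignore error: clears a leftover from an earlier run, if any
	if err := os.Symlink(filepath.Join("..", pkg, ver), tmp); err != nil {
		return err
	}
	return os.Rename(tmp, link)
}

// demo replays the timeline from this issue in a temporary directory and
// returns the versions that testing and unstable point at afterwards.
func demo() (string, string, error) {
	root, err := os.MkdirTemp("", "debiman-links")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(root)

	const pkg = "dh-shell-completions"
	for _, ver := range []string{"0.0.2", "0.0.3", "0.0.4"} {
		if err := os.MkdirAll(filepath.Join(root, pkg, ver), 0o755); err != nil {
			return "", "", err
		}
	}
	steps := []struct{ suite, ver string }{
		{"testing", "0.0.2"},
		{"unstable", "0.0.4"},
		{"testing", "0.0.3"}, // migration: only the link changes
	}
	for _, s := range steps {
		if err := pointSuite(root, s.suite, pkg, s.ver); err != nil {
			return "", "", err
		}
	}
	testingTarget, err := os.Readlink(filepath.Join(root, "testing", pkg))
	if err != nil {
		return "", "", err
	}
	unstableTarget, err := os.Readlink(filepath.Join(root, "unstable", pkg))
	if err != nil {
		return "", "", err
	}
	return filepath.Base(testingTarget), filepath.Base(unstableTarget), nil
}

func main() {
	testingVer, unstableVer, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("testing:", testingVer, "unstable:", unstableVer)
}
```

This leaves open how cross-version sections such as "other versions" would be kept current, which is what the third idea is about.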
