
Broken links issue tracker #386

Open
@hchandad

Overview

The old documentation was built with Hugo. After the migration to Contentlayer, many links no longer point to the right resource. I tried to collect and list as many of them as I could; some have already been mentioned in the issue tracker.

related: #360 #372 #370 #384 #363 #341 #189 #188

The table below was generated by running the linkchecker tool:

$ linkchecker -o csv http://localhost:3000 1>link-checker.csv

The CSV output was then converted to a markdown table.

URL List

| URL | Parent URL | Real URL | Fixed |
| --- | --- | --- | --- |
| /fonts/Inter.woff2 | Parent URL | Real URL | |
| /community/team | Parent URL | Real URL | |
| /blog/2024-02-16-unikraft-releases-v0.16.2 | Parent URL | Real URL | |
| /docs/usage/advanced/kconfig/ | Parent URL | Real URL | #389 |
| /docs/develop/booting/ | Parent URL | Real URL | #389 |
| /docs/cli/reference/kraft/pkg/ls | Parent URL | Real URL | #389 |
| /docs/cli/reference/kraft/pkg/rm | Parent URL | Real URL | #389 |
| /docs/cli/reference/kraft/rm | Parent URL | Real URL | #389 |
| /docs/cli/reference/kraft/net/ls | Parent URL | Real URL | #389 |
| /docs/cli/reference/kraft/net/rm | Parent URL | Real URL | #389 |
| /docs/contributing/kraftkit | Parent URL | Real URL | #380 |
| docs | Parent URL | Real URL | #389 |
| /docs/internals/testing | Parent URL | Real URL | #389 |
| /guides/catalog-internals | Parent URL | Real URL | #389 |
| unikraft.org/docs/contributing/review-process/ | Parent URL | Real URL | |
| /assets/files/eurosys2021-slides.pdf | Parent URL | Real URL | #389 |
| Multiple Image Assets | Parent URL | Real URL | #389 |
| %5B#link-to-commit%5D(unikraft/unikraft@142e842) | Parent URL | Real URL | |
| /assets/imgs/unikraft-arch.jpg | Parent URL | Real URL | #389 |
| /docs/operations/plats/kvm/ | Parent URL | Real URL | |
| /docs/operations/plats/xen/ | Parent URL | Real URL | |
| /docs/operations/plats/linuxu/ | Parent URL | Real URL | |
| /docs/cli/rootfs | Parent URL | Real URL | #389 |
| docs/develop/porting/#makefileuk | Parent URL | Real URL | #389 |
| /docs/cli/reference/kraft/cloud/deploy | Parent URL | Real URL | |
| /guides/bincompat | Parent URL | Real URL | |
| docs/contributing/suggest-changes | Parent URL | Real URL | #389 |
| docs/contributing/coding-conventions#definitions | Parent URL | Real URL | #389 |

Getting the context of where a link is used

To find where a URL is referenced in the source files, git grep is useful. For example:

$ git grep -n 'docs/contributing/coding-conventions#definitions'
content/docs/contributing/coding-conventions.mdx:748:This is referenced in [the definitions](docs/contributing/coding-conventions#definitions) section for prefixes (i.e. the use of `uk_`, `ukplat_`, `ukarch_` prefixes).
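
When many URLs have to be located, the same lookup can be scripted. The following is a minimal sketch, not part of the original workflow: `find_references` is a hypothetical helper, and the `content` directory and `.mdx` suffix are assumptions taken from the path in the output above. It searches for a literal string and prints matches in the same `path:line:text` shape as `git grep -n`.

```python
from pathlib import Path


def find_references(root, needle, suffix=".mdx"):
    """Yield (path, line_number, line) for every line under root containing needle."""
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    yield path, number, line.rstrip("\n")


if __name__ == "__main__":
    # Two of the broken URLs from the list above; "content" is an assumed directory.
    broken = [
        "docs/contributing/coding-conventions#definitions",
        "docs/contributing/suggest-changes",
    ]
    for url in broken:
        for path, number, line in find_references("content", url):
            print(f"{path}:{number}:{line}")
```

Unlike `git grep`, this sketch also reads untracked files, which may or may not be what you want.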

Converting the CSV output to a markdown table

The following Python script was used:

convert.py
import argparse
import csv
import sys

# Names assigned to the CSV fields, in order.
COLUMNS = (
    "urlname",
    "parentname",
    "base",
    "result",
    "warningstring",
    "infostring",
    "valid",
    "url",
    "line",
    "column",
    "name",
    "dltime",
    "size",
    "checktime",
    "cached",
    "level",
    "modified",
)


def skip(iterator, n):
    """Advance the iterator past its first n items."""
    for _ in range(n):
        next(iterator, None)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-f", "--file", type=argparse.FileType())
    args = parser.parse_args()

    if args.file:
        csvreader = csv.reader(args.file, delimiter=";")
        # Skip the first four lines, which precede the data rows.
        skip(csvreader, 4)
        for row in csvreader:
            # zip() stops at the shorter input, so over-long rows cannot raise IndexError.
            line = dict(zip(COLUMNS, row))
            try:
                mk_row = f"|{line['urlname']}|[Parent URL]({line['parentname']})|[Real URL]({line['url']})| |"
                print(mk_row)
            except KeyError:
                # Rows with too few fields go to stderr instead of the table.
                print(line, file=sys.stderr)
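
The script prints only the data rows, so the table header has to be added separately. A minimal sketch of how the header and rows could be produced together is shown here. `rows_to_table` and `escape_cell` are hypothetical helpers, the column titles are copied from the URL List table, and escaping `|` in the URL cell is an addition that the original script does not perform.

```python
HEADER = ("URL", "Parent URL", "Real URL", "Fixed")


def escape_cell(text):
    """Escape pipe characters so a value cannot break out of its table cell."""
    return text.replace("|", "\\|")


def rows_to_table(rows):
    """Build a markdown table from (urlname, parentname, url) tuples."""
    lines = [
        "|" + "|".join(HEADER) + "|",
        "|" + "|".join("---" for _ in HEADER) + "|",
    ]
    for urlname, parentname, url in rows:
        lines.append(
            f"|{escape_cell(urlname)}|[Parent URL]({parentname})|[Real URL]({url})| |"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only; the parent and real URLs are not from the real report.
    sample = [
        ("/docs/cli/rootfs", "http://localhost:3000/docs", "http://localhost:3000/docs/cli/rootfs"),
    ]
    print(rows_to_table(sample))
```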

Further notes

  • It would be nice to integrate a URL checking tool into the development workflow, for example:
    • Add an entry to the contributing docs describing how to run a URL checker on the changes made in a pull request.
    • Integrate a URL checker as a pre-commit hook. Ideally it would determine which pages are affected by a commit diff and run the checker only on those pages, which would allow faster run times.
  • Some URL checker tools already provide GitHub Action definitions, for example lychee, which supports markdown directly.
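
The idea of checking only the pages touched by a commit could be sketched as follows. This is an illustration under assumptions, not a description of the site's actual routing: it assumes that a file such as `content/docs/contributing/coding-conventions.mdx` is served at `/docs/contributing/coding-conventions` (as the git grep output suggests), that the dev server runs at `http://localhost:3000`, and it ignores index files and other special cases. `page_url` and `changed_pages` are hypothetical names.

```python
from pathlib import PurePosixPath


def page_url(path, content_root="content", base="http://localhost:3000"):
    """Map a content file path to the URL of the page it renders, or None."""
    p = PurePosixPath(path)
    if p.suffix != ".mdx":
        return None
    try:
        relative = p.relative_to(content_root)
    except ValueError:
        # The file lives outside the content directory.
        return None
    return f"{base}/{relative.with_suffix('')}"


def changed_pages(changed_files):
    """Return the page URLs for the changed files, in order and without duplicates."""
    urls = []
    for path in changed_files:
        url = page_url(path)
        if url is not None and url not in urls:
            urls.append(url)
    return urls


if __name__ == "__main__":
    # In a hook, this list would come from something like `git diff --name-only`.
    sample = ["content/docs/contributing/coding-conventions.mdx", "package.json"]
    for url in changed_pages(sample):
        print(url)
```

The resulting URLs could then be handed to the link checker instead of the site root.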
