Prevent duplicates in the database when multiple pages request the same remote resource #13

Open
pvgenuchten opened this issue May 31, 2024 · 0 comments
pvgenuchten commented May 31, 2024

Some resources, such as favicon.ico, are requested from every page. It would be good to store each URL only once and add a derived table that links it to the parent pages referencing it. A broken link is then stored once, but you still know which parent pages are affected by it (and may need to be updated).

  • a broken link is identified
  • check whether the link is already in the database; if not, insert it
  • insert the parent page into the database, referencing the link
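
The steps above could be sketched roughly as follows. This is a minimal sketch, not the project's implementation: it assumes SQLite via Python's `sqlite3` module, the SQL types and `UNIQUE` constraints are assumptions, and `record_broken_link` is a hypothetical helper name. Table and column names follow the diagram in this issue.

```python
import sqlite3

# Hypothetical schema: names follow the diagram in this issue, but the SQL
# types and constraints are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS links (
    id_link    INTEGER PRIMARY KEY,
    tested_url TEXT NOT NULL UNIQUE,
    status     TEXT
);
CREATE TABLE IF NOT EXISTS parent (
    parent_url TEXT NOT NULL,
    fk_link    INTEGER NOT NULL REFERENCES links(id_link),
    UNIQUE (parent_url, fk_link)
);
"""


def record_broken_link(conn, tested_url, parent_url, status):
    """Store a broken URL once and link it to the page that references it."""
    # Insert the link only if it is not in the database yet.
    conn.execute(
        "INSERT OR IGNORE INTO links (tested_url, status) VALUES (?, ?)",
        (tested_url, status),
    )
    (id_link,) = conn.execute(
        "SELECT id_link FROM links WHERE tested_url = ?", (tested_url,)
    ).fetchone()
    # Insert the parent page, referencing the link.
    conn.execute(
        "INSERT OR IGNORE INTO parent (parent_url, fk_link) VALUES (?, ?)",
        (parent_url, id_link),
    )
    conn.commit()
    return id_link


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Two pages reference the same broken resource.
for page in ("https://example.org/a", "https://example.org/b"):
    record_broken_link(conn, "https://example.org/favicon.ico", page, "404")
```

In this sketch the `UNIQUE` constraints do the deduplication, so repeated calls for the same URL and parent page add no rows. It does not update `status` when the link already exists; that would need a separate `UPDATE`.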

At some point the checker can be smarter and skip links that have already been tested earlier in the same run.
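
One possible way to skip already-tested links is an in-memory cache that lives for a single run. The sketch below is an assumption about how this could look, not existing checker code: `make_cached_checker` and `fake_check` are hypothetical names, and `fake_check` stands in for the real HTTP request.

```python
from typing import Callable, Dict


def make_cached_checker(check: Callable[[str], int]) -> Callable[[str], int]:
    """Wrap a URL check so each URL is requested at most once per run."""
    results: Dict[str, int] = {}  # lives only for the duration of one run

    def cached(url: str) -> int:
        if url not in results:
            results[url] = check(url)
        return results[url]

    return cached


# Stand-in for the real HTTP request; it records calls so they can be counted.
calls = []


def fake_check(url: str) -> int:
    calls.append(url)
    return 404 if url.endswith("favicon.ico") else 200


check_once = make_cached_checker(fake_check)
# The same URL is encountered on three different pages.
statuses = [check_once("https://example.org/favicon.ico") for _ in range(3)]
```

Each page still gets a status for the link, but only the first lookup triggers a request.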

```mermaid
classDiagram
    Links <|-- test
    Links <|-- parent

    class Links{
      - id_link
      - tested_url
      - average-availability
      - status
    }
    class parent{
      - parent_url
      - fk_link
    }
    class test{
      - timestamp
      - test_result
      - fk_link
    }
```
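
To illustrate how the three tables in the diagram could be queried, here is a small sketch using SQLite through Python's `sqlite3` module. Several points are assumptions: the SQL types, the sample rows, `test_result` being stored as 1 (reachable) or 0 (broken), and `average-availability` meaning the share of successful tests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE links (id_link INTEGER PRIMARY KEY, tested_url TEXT UNIQUE);
CREATE TABLE parent (parent_url TEXT, fk_link INTEGER REFERENCES links(id_link));
CREATE TABLE test (
    timestamp   TEXT,
    test_result INTEGER,  -- assumption: 1 = reachable, 0 = broken
    fk_link     INTEGER REFERENCES links(id_link)
);
-- Sample rows, invented for this sketch.
INSERT INTO links VALUES (1, 'https://example.org/favicon.ico');
INSERT INTO parent VALUES ('https://example.org/a', 1), ('https://example.org/b', 1);
INSERT INTO test VALUES ('2024-05-30T00:00:00Z', 1, 1), ('2024-05-31T00:00:00Z', 0, 1);
""")

# Parent pages impacted by a given URL.
impacted = [row[0] for row in conn.execute(
    "SELECT p.parent_url FROM parent p JOIN links l ON l.id_link = p.fk_link "
    "WHERE l.tested_url = ? ORDER BY p.parent_url",
    ("https://example.org/favicon.ico",),
)]

# Availability derived from the test history (assumed meaning of the field).
(availability,) = conn.execute(
    "SELECT AVG(test_result) FROM test WHERE fk_link = ?", (1,)
).fetchone()
```

Here availability is derived from the `test` rows on demand. Storing it on `Links`, as the diagram suggests, would mean updating it after each test.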
Status: Discussed