Tech Report: Technologies - major.minor versions granularity #48

Open
wants to merge 22 commits into
base: main

max-ostapenko
Contributor

@max-ostapenko max-ostapenko commented Jan 12, 2025

Related to HTTPArchive/httparchive.org#984

Because the aggregation changes, the tech report needs new schemas and new tables.

I placed them in the reports dataset:

  • tech_crux (successor of core_web_vitals.technologies)
  • tech_report_adoption
  • tech_report_categories
  • tech_report_core_web_vitals
  • tech_report_lighthouse
  • tech_report_page_weight
  • tech_report_technologies
  • tech_report_versions

Notes:

  • removed a few columns from tech_crux (as compared to core_web_vitals.technologies):

    • category
    • origins_with_good_cwv_2023 and origins_with_good_cwv_2024 - deduplicated in origins_with_good_cwv
  • all the metrics have an 'ALL' version that aggregates at the technology level and is expected to match the current values:
    Screenshot 2025-01-26 at 23 55 37

  • consistent with the current approach, tech_report_versions has the full adoption data from crawl.pages, while tech_report_adoption has smaller absolute values because of the JOIN with CrUX.
    Screenshot 2025-01-27 at 00 05 01

  • example of the technology versions:

SELECT
  version,
  origins
FROM `reports.tech_report_versions`
WHERE technology = 'WordPress' AND
  client = 'mobile'
ORDER BY
  origins DESC

Screenshot 2025-01-26 at 23 44 13
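
To make the 'ALL' version from the notes above concrete, here is a minimal in-memory sketch. It assumes the metric is a count of distinct origins per technology and version; the `detections` rows and the `count_origins` helper are hypothetical and are not taken from the PR's queries.

```python
from collections import defaultdict

# Hypothetical detections: (technology, major.minor version, origin).
detections = [
    ("WordPress", "6.7", "https://a.example"),
    ("WordPress", "6.7", "https://b.example"),
    ("WordPress", "6.6", "https://c.example"),
    ("WordPress", "6.6", "https://a.example"),  # same origin seen with two versions
]


def count_origins(rows):
    """Count distinct origins per (technology, version), plus an 'ALL' rollup row."""
    origins = defaultdict(set)
    for technology, version, origin in rows:
        origins[(technology, version)].add(origin)
        # Assumption: 'ALL' collects every origin of the technology, whatever the version.
        origins[(technology, "ALL")].add(origin)
    return {key: len(value) for key, value in origins.items()}


counts = count_origins(detections)
for key in sorted(counts):
    print(key, counts[key])
```

Under this assumption the 'ALL' row is not the sum of the per-version rows, because an origin detected with two versions is counted only once at the technology level.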

@max-ostapenko max-ostapenko changed the title from "Major versions granularity for Tech Reports" to "Technology major versions granularity for Tech Reports" Jan 12, 2025
@max-ostapenko max-ostapenko changed the title from "Technology major versions granularity for Tech Reports" to "Tech Report: Technologies - major versions granularity" Jan 12, 2025
@max-ostapenko max-ostapenko marked this pull request as ready for review January 26, 2025 22:46
@max-ostapenko
Contributor Author

max-ostapenko commented Jan 27, 2025

@tunetheweb @rviscomi FYI
I removed a few columns from tech_crux (compared to core_web_vitals.technologies, which is to be deprecated):

  • category
  • origins_with_good_cwv_2023 and origins_with_good_cwv_2024 - deduplicated in origins_with_good_cwv

@max-ostapenko max-ostapenko changed the title from "Tech Report: Technologies - major versions granularity" to "Tech Report: Technologies - major.minor versions granularity" Jan 30, 2025
@max-ostapenko
Contributor Author

Some test cases for the version pattern:

-- Extract the leading major[.minor] part of each version string.
SELECT
  REGEXP_EXTRACT(version, r'(?:(?:0|[1-9])\d*)(?:\.(?:0|[1-9])\d*)?') AS major_minor
FROM UNNEST(['1.2.3', '01976.2.83', '0003.3.4', '0.0.1', '1.2', 'version 5.1.2', '8']) AS version

1.2
01976.2
0003.3
0.0
1.2
5.1
8
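
The same test cases can be checked outside BigQuery. The sketch below uses Python's `re` module, which accepts this particular pattern unchanged; the `major_minor` helper is a hypothetical name used only for illustration.

```python
import re

# Same pattern as in the query above.
VERSION_PATTERN = re.compile(r"(?:(?:0|[1-9])\d*)(?:\.(?:0|[1-9])\d*)?")


def major_minor(version):
    """Return the first major[.minor] match in the string, or None if it has no digits."""
    match = VERSION_PATTERN.search(version)
    return match.group(0) if match else None


cases = ["1.2.3", "01976.2.83", "0003.3.4", "0.0.1", "1.2", "version 5.1.2", "8"]
results = [major_minor(case) for case in cases]
for case, result in zip(cases, results):
    print(f"{case!r} -> {result!r}")
```

Leading zeros are kept because `(?:0|[1-9])\d*` matches any run of digits. A string without digits yields `None` here, which corresponds to `REGEXP_EXTRACT` returning NULL when nothing matches.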
