Creates a datestamped HTML report and a corresponding Excel file listing all Wikipedia articles (in all languages) in which (one or more) images from a given category tree on Wikimedia Commons are used.

KBNLwikimedia/GLAMorousToHTML

GLAMorousToHTML

Creates a datestamped HTML report and a corresponding Excel file listing all Wikipedia articles (in all languages) in which (one or more) images from a given category tree on Wikimedia Commons are used.

Latest update: 18 September 2024


What does it do?

This repo creates datestamped HTML reports with corresponding Excel files listing all Wikipedia articles (in all languages) in which (one or more) images/media from a given category tree on Wikimedia Commons are used.

Quick example

Here is a quick example of such an HTML report and its corresponding Excel file for images from the collection of the KB, national library of the Netherlands. It is datestamped 04-09-2024.


What problem does it solve?

The KB uses the 'classical' GLAMorous tool to measure the use of KB media files (as stored in Wikimedia Commons) in Wikipedia articles. This tool reports 4 things:

  • 1 - The total number of KB media files in Category:Media contributed by Koninklijke Bibliotheek (Category "Media contributed by Koninklijke Bibliotheek" has XXXX files.)
  • 2 - The number of Wikipedia language versions in which KB media files are used (length of the table, omitting non-language Wikipedias, such as 'outreach.wikipedia', 'simple.wikipedia' or 'incubator.wikipedia')
  • 3 - The total number of times that these images show up in Wikipedia articles, in all language versions. (Total image usages).
  • 4 - The number of unique KB media files that are used in Wikipedia articles in all those languages. (Distinct images used)

Please note: 'Total image usages' does NOT equal the number of unique Wikipedia articles! A single image can illustrate multiple articles, and conversely a single article can contain multiple distinct images. In other words, images and articles have a many-to-many relationship.
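To make the difference between these figures concrete, here is a minimal sketch in Python. The usage records, file names and article titles are invented for illustration and do not come from an actual GLAMorous run.

```python
# Hypothetical usage records: (image file, Wikipedia language code, article title).
usages = [
    ("Atlas.jpg", "nl", "Atlas"),
    ("Atlas.jpg", "en", "Atlas"),
    ("Atlas.jpg", "nl", "Cartografie"),
    ("Map.jpg", "nl", "Cartografie"),
]

# Every (image, article) pairing counts once towards 'Total image usages'.
total_usages = len(usages)

# 'Distinct images used': each image counts once, however often it is used.
distinct_images = len({image for image, _, _ in usages})

# Unique articles: an article is identified by its language plus its title.
unique_articles = len({(lang, title) for _, lang, title in usages})

print(total_usages, distinct_images, unique_articles)
```

With this sample data the three figures all differ: four usages, two distinct images and three unique articles.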

What was still missing was functionality to provide:

  • 5 - The number of unique Wikipedia articles in which KB media files are used,
  • 6 - A manifest overview of those articles, grouped per Wikipedia language version,
  • 7 - A structured output format that can be easily processed by tools, such as CSV or Excel files.
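As an illustration of points 5 to 7, the sketch below groups articles per language version and writes them out as CSV using only the Python standard library. It is not the code used in this repo. The sample records and the column names `language` and `article` are assumptions.

```python
import csv
import io
from collections import defaultdict

# Hypothetical usage records: (image file, Wikipedia language code, article title).
usages = [
    ("Atlas.jpg", "nl", "Atlas"),
    ("Atlas.jpg", "en", "Atlas"),
    ("Atlas.jpg", "nl", "Cartografie"),
    ("Map.jpg", "nl", "Cartografie"),
]

# Group article titles per language; a set removes duplicate titles.
articles_per_language = defaultdict(set)
for _image, lang, title in usages:
    articles_per_language[lang].add(title)

# Point 5: the number of unique articles across all languages.
unique_article_count = sum(len(titles) for titles in articles_per_language.values())

# Points 6 and 7: one CSV row per article, sorted by language and then title.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["language", "article"])
for lang in sorted(articles_per_language):
    for title in sorted(articles_per_language[lang]):
        writer.writerow([lang, title])
csv_text = buffer.getvalue()

print(unique_article_count)
print(csv_text)
```

Writing to an in-memory buffer keeps the example self-contained. To produce a real file, the same writer can wrap a file opened with `newline=""`.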

Bulk/group functionalities:

  • 8 - A method to generate these reports in bulk, so for multiple Commons category trees at once (with one report per category tree).
  • 9 - Aggregated data and key figure statistics for sets of reports, e.g. for grouped reports from a specific country.

That is why we developed the GLAMorousToHTML tool. It takes the XML output of the GLAMorous tool and processes that data into HTML reports and Excel files.
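The general idea of turning XML into an HTML listing can be sketched as follows. Please note that the XML layout below (`usage`, `image` and `page` elements with `name`, `wiki` and `title` attributes) is a made-up stand-in and NOT the actual GLAMorous schema. The function names are hypothetical as well and do not refer to code in this repo.

```python
import html
import xml.etree.ElementTree as ET

# Invented XML layout for illustration only; the real GLAMorous output differs.
SAMPLE_XML = """
<usage>
  <image name="Atlas.jpg">
    <page wiki="nl.wikipedia" title="Atlas"/>
    <page wiki="en.wikipedia" title="Atlas"/>
  </image>
  <image name="Map.jpg">
    <page wiki="nl.wikipedia" title="Cartografie &amp; kaarten"/>
  </image>
</usage>
"""


def parse_usages(xml_text):
    """Return a list of (image name, wiki, article title) tuples."""
    root = ET.fromstring(xml_text.strip())
    return [
        (image.get("name"), page.get("wiki"), page.get("title"))
        for image in root.iter("image")
        for page in image.iter("page")
    ]


def to_html_list(records):
    """Render the records as an HTML list, escaping all text values."""
    items = [
        f"<li>{html.escape(wiki)}: {html.escape(title)} ({html.escape(image)})</li>"
        for image, wiki, title in records
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


records = parse_usages(SAMPLE_XML)
report_fragment = to_html_list(records)
print(report_fragment)
```

Escaping each value with `html.escape` matters here, because article titles can contain characters such as `&` or `<` that would otherwise break the generated HTML.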

GLAM reports

The GLAMorousToHTML tool has so far produced GLAM reports for the following heritage institutions, countries and regions:

When interpreting these reports, take note of

Publications

Technical notes

The technical notes give more info about

  1. The structure of this repo, its files and folders
  2. Short description of their functions
  3. How to run this repo yourself
  4. Change log
  5. Features to be added

Please note that this page is still under construction and is therefore messy and incomplete.

Licensing

All original materials in this repo, except for the flags, logos and publications, are released under the CC0 1.0 Universal license, effectively donating all original content to the public domain.

For the publications listed above, see each article for its exact licensing conditions.

Contact

This tool is developed and maintained by Olaf Janssen, Wikimedia coordinator @KB, national library of the Netherlands. You can find his contact details on his KB expert page or via his Wikimedia user page.

If you are interested in getting reports for your own GLAM institution, please send me a message.
