Fix: Prevents `MediawikiApi::ApiError Could not normalize image parameters filename.pdf. (urlparamnormal)` error #6187

empty-codes · 2025-02-08T17:45:14Z

What this PR does

Refactors the `get_urls` method to assign placeholder thumbnails to `.pdf` and `.djvu` files and to query only file formats that MediaWiki accepts. Also adds a `save_placeholder_thumbnail` method that assigns the placeholder values.

Doing this resolves the `MediawikiApi::ApiError: Could not normalize image parameters filename.pdf. (urlparamnormal)` error.

The logic is derived from @ragesoss's logic in this commit, which solved issue #330: 'Upload importer attempts to get thumbnails over and over for invalid files'. It was later removed in this commit, per issue #699: 'Commons spec is failing', because MediaWiki handled the case itself. The problem has since resurfaced.
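As a rough illustration of the refactor described above, the following is a minimal sketch of splitting file titles by extension before any thumbnail query is made. The constant and method names are hypothetical and are not the code from this PR; the placeholder values are copied from the `UPDATE` statements in the log excerpt further down.

```ruby
# Hypothetical names throughout; this is not the code from the PR.
UNSUPPORTED_EXTENSIONS = /\.(pdf|djvu)\z/i

# Placeholder values copied from the UPDATE statements in the log excerpt.
PLACEHOLDER_THUMBNAIL = {
  thumburl: 'https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/' \
            'No_image_3x4.svg/200px-No_image_3x4.svg.png',
  thumbwidth: 200,
  thumbheight: 150
}.freeze

# Returns [titles that get the placeholder, titles that are safe to query].
def partition_titles(titles)
  titles.partition { |title| title.match?(UNSUPPORTED_EXTENSIONS) }
end

needs_placeholder, queryable = partition_titles(
  ["File:Van Dorp's Officieele Reisgids voor Spoor- en Tramswegen op Java (1900).pdf",
   'File:Aerial view of Halte Bendo Kediri.tiff']
)
```

Only the titles in `queryable` would then be sent to the thumbnail query, while each title in `needs_placeholder` would receive `PLACEHOLDER_THUMBNAIL`.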

Screenshots

Before:

I tested a failing set of pageids:

I then cloned the course that experienced the error (https://outreachdashboard.wmflabs.org/courses/Wikimedia_Indonesia/1Lib1Ref_di_Indonesia_Januari_2025/uploads), altered the code so that the upload thumbnails would not be added, and ran a manual update:

After refactoring:

Proof (logs / info) it works (I extracted snippets):

File:Van Dorp's Officieele Reisgids voor Spoor- en Tramswegen op Java (1900).pdf
[2025-02-08 17:05:36.119 DEBUG] CommonsUpload Load (5.5ms)  SELECT `commons_uploads`.* FROM `commons_uploads` WHERE `commons_uploads`.`file_name` = 'File:Van Dorp\'s Officieele Reisgids voor Spoor- en Tramswegen op Java (1900).pdf' LIMIT 1
#<CommonsUpload:0x00007f8958cc87e0>
[2025-02-08 17:05:36.209 DEBUG] TRANSACTION (0.3ms)  BEGIN
[2025-02-08 17:05:36.216 DEBUG] CommonsUpload Update (0.5ms)  UPDATE `commons_uploads` SET `commons_uploads`.`updated_at` = '2025-02-08 16:05:36', `commons_uploads`.`thumburl` = 'https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/No_image_3x4.svg/200px-No_image_3x4.svg.png', `commons_uploads`.`thumbwidth` = '200', `commons_uploads`.`thumbheight` = '150' WHERE `commons_uploads`.`id` = 158163184
[2025-02-08 17:05:36.250 DEBUG] TRANSACTION (2.6ms)  COMMIT
File:Van Dorp's Officieele Reisgids voor Spoor- en Tramswegen op Java (1898).pdf
[2025-02-08 17:05:36.268 DEBUG] CommonsUpload Load (0.9ms)  SELECT `commons_uploads`.* FROM `commons_uploads` WHERE `commons_uploads`.`file_name` = 'File:Van Dorp\'s Officieele Reisgids voor Spoor- en Tramswegen op Java (1898).pdf' LIMIT 1
#<CommonsUpload:0x00007f8958c743e8>
[2025-02-08 17:05:36.286 DEBUG] TRANSACTION (0.2ms)  BEGIN
[2025-02-08 17:05:36.292 DEBUG] CommonsUpload Update (0.4ms)  UPDATE `commons_uploads` SET `commons_uploads`.`updated_at` = '2025-02-08 16:05:36', `commons_uploads`.`thumburl` = 'https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/No_image_3x4.svg/200px-No_image_3x4.svg.png', `commons_uploads`.`thumbwidth` = '200', `commons_uploads`.`thumbheight` = '150' WHERE `commons_uploads`.`id` = 158163185
[2025-02-08 17:05:36.313 DEBUG] TRANSACTION (1.5ms)  COMMIT
File:Officieele reisgids der spoor- en tramwegen en aansluitende automobieldiensten op Java en Madoera (1935).pdf
[2025-02-08 17:05:36.329 DEBUG] CommonsUpload Load (0.9ms)  SELECT `commons_uploads`.* FROM `commons_uploads` WHERE `commons_uploads`.`file_name` = 'File:Officieele reisgids der spoor- en tramwegen en aansluitende automobieldiensten op Java en Madoera (1935).pdf' LIMIT 1
#<CommonsUpload:0x00007f8958c7d4e8>
[2025-02-08 17:05:36.345 DEBUG] TRANSACTION (0.2ms)  BEGIN
[2025-02-08 17:05:36.354 DEBUG] CommonsUpload Update (0.7ms)  UPDATE `commons_uploads` SET `commons_uploads`.`updated_at` = '2025-02-08 16:05:36', `commons_uploads`.`thumburl` = 'https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/No_image_3x4.svg/200px-No_image_3x4.svg.png', `commons_uploads`.`thumbwidth` = '200', `commons_uploads`.`thumbheight` = '150' WHERE `commons_uploads`.`id` = 158172982
[2025-02-08 17:05:36.380 DEBUG] TRANSACTION (1.6ms)  COMMIT

=> [{"pageid"=>158196457,
  "ns"=>6,
  "title"=>"File:Aerial view of Halte Bendo Kediri.tiff",
  "imagerepository"=>"local",
  "imageinfo"=>
   [{"thumburl"=>
      "https://upload.wikimedia.org/wikipedia/commons/thumb/0/0d/Aerial_view_of_Halte_Bendo_Kediri.tiff/lossy-page1-480px-Aerial_view_of_Halte_Bendo_Kediri.tiff.jpg",
     "thumbwidth"=>480,
     "thumbheight"=>480,
     "responsiveUrls"=>
      {"1.5"=>
        "https://upload.wikimedia.org/wikipedia/commons/thumb/0/0d/Aerial_view_of_Halte_Bendo_Kediri.tiff/lossy-page1-720px-Aerial_view_of_Halte_Bendo_Kediri.tiff.jpg",
       "2"=>
        "https://upload.wikimedia.org/wikipedia/commons/thumb/0/0d/Aerial_view_of_Halte_Bendo_Kediri.tiff/lossy-page1-960px-Aerial_view_of_Halte_Bendo_Kediri.tiff.jpg"},
     "url"=>"https://upload.wikimedia.org/wikipedia/commons/0/0d/Aerial_view_of_Halte_Bendo_Kediri.tiff",
     "descriptionurl"=>"https://commons.wikimedia.org/wiki/File:Aerial_view_of_Halte_Bendo_Kediri.tiff",
     "descriptionshorturl"=>"https://commons.wikimedia.org/w/index.php?curid=158196457"}]},
 {"pageid"=>158445924,
  "ns"=>6,
  "title"=>"File:Foto uit een album over de suikeronderneming Pesantren - (cropped).png",
  "imagerepository"=>"local",
  "imageinfo"=>
   [{"thumburl"=>
      "https://upload.wikimedia.org/wikipedia/commons/thumb/a/ac/Foto_uit_een_album_over_de_suikeronderneming_Pesantren_-_%28cropped%29.png/854px-Foto_uit_een_album_over_de_suikeronderneming_Pesantren_-_%28cropped%29.png",
     "thumbwidth"=>854,
     "thumbheight"=>480,
     "responsiveUrls"=>
      {"1.5"=>
        "https://upload.wikimedia.org/wikipedia/commons/thumb/a/ac/Foto_uit_een_album_over_de_suikeronderneming_Pesantren_-_%28cropped%29.png/1281px-Foto_uit_een_album_over_de_suikeronderneming_Pesantren_-_%28cropped%29.png",
       "2"=>"https://upload.wikimedia.org/wikipedia/commons/a/ac/Foto_uit_een_album_over_de_suikeronderneming_Pesantren_-_%28cropped%29.png"},
     "url"=>"https://upload.wikimedia.org/wikipedia/commons/a/ac/Foto_uit_een_album_over_de_suikeronderneming_Pesantren_-_%28cropped%29.png",
     "descriptionurl"=>"https://commons.wikimedia.org/wiki/File:Foto_uit_een_album_over_de_suikeronderneming_Pesantren_-_(cropped).png",
     "descriptionshorturl"=>"https://commons.wikimedia.org/w/index.php?curid=158445924"},
    {"thumburl"=>
      "https://upload.wikimedia.org/wikipedia/commons/archive/a/ac/20250124113940%21Foto_uit_een_album_over_de_suikeronderneming_Pesantren_-_%28cropped%29.png",
     "thumbwidth"=>190,
     "thumbheight"=>135,
     "url"=>"https://upload.wikimedia.org/wikipedia/commons/archive/a/ac/20250124113940%21Foto_uit_een_album_over_de_suikeronderneming_Pesantren_-_%28cropped%29.png",
     "descriptionurl"=>"https://commons.wikimedia.org/wiki/File:Foto_uit_een_album_over_de_suikeronderneming_Pesantren_-_(cropped).png",
     "descriptionshorturl"=>"https://commons.wikimedia.org/w/index.php?curid=158445924"}]},
 {"pageid"=>158659677,
  "ns"=>6,
  "title"=>"File:LL-Q13324 (min)-Zhilal Darma-nabu.wav",
  "imagerepository"=>"local",
  "imageinfo"=>
   [{"thumburl"=>"https://commons.wikimedia.org/w/resources/assets/file-type-icons/fileicon-ogg.png",
     "thumbwidth"=>400,
     "thumbheight"=>400,
     "url"=>"https://upload.wikimedia.org/wikipedia/commons/5/5d/LL-Q13324_%28min%29-Zhilal_Darma-nabu.wav",
     "descriptionurl"=>"https://commons.wikimedia.org/wiki/File:LL-Q13324_(min)-Zhilal_Darma-nabu.wav",
     "descriptionshorturl"=>"https://commons.wikimedia.org/w/index.php?curid=158659677"}]},

Test

Before:

After:


ragesoss · 2025-02-10T19:26:11Z

lib/commons.rb

+      # mediawiki cannot generate thumbnails for pdf,djvu files
+      if title.match?(/\.(pdf|djvu)$/i)
+        bad_file = CommonsUpload.find_by(file_name: title)
+        save_placeholder_thumbnail(bad_file) if bad_file


I don't think this should be done within the get_urls method. The method name (get_something) implies that it's a read-only method that will find and return data without changing the database, so this would be an unexpected side effect.

The UploadImporter#import_urls method would be a more appropriate place to handle setting a placeholder thumbnail.
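To make the suggested separation concrete, here is a minimal sketch in which the lookup stays read-only and the placeholder is applied only in the import step. `Upload` is a hypothetical stand-in for the `CommonsUpload` model, and the method names are illustrative rather than the Dashboard's actual `UploadImporter` code.

```ruby
# Hypothetical stand-in for the CommonsUpload model.
Upload = Struct.new(:file_name, :thumburl, :thumbwidth, :thumbheight)

FALLBACK_THUMBNAIL = {
  thumburl: 'https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/' \
            'No_image_3x4.svg/200px-No_image_3x4.svg.png',
  thumbwidth: 200,
  thumbheight: 150
}.freeze

# Read-only: returns thumbnail attributes for a title, or nil if none exist.
def thumbnail_attributes(api_pages, title)
  info = api_pages.find { |page| page['title'] == title }&.dig('imageinfo', 0)
  return nil unless info && info['thumburl']

  { thumburl: info['thumburl'],
    thumbwidth: info['thumbwidth'],
    thumbheight: info['thumbheight'] }
end

# Write step: the only place where a record is modified.
def import_thumbnail(upload, api_pages)
  attrs = thumbnail_attributes(api_pages, upload.file_name) || FALLBACK_THUMBNAIL
  attrs.each { |name, value| upload[name] = value }
  upload
end
```

With this split, the placeholder decision depends on whether the response contains a `thumburl`, and nothing in the lookup method touches stored data.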

ragesoss · 2025-02-10T19:29:37Z

lib/commons.rb

+
+      # mediawiki can generate thumbnails for jpg,png,tiff,wav files
+      # mediawiki cannot generate thumbnails for pdf,djvu files
+      if title.match?(/\.(pdf|djvu)$/i)


Is there something in the response value that can be used, rather than simply matching on the file suffix? There may be cases where the error happens even with a different filetype, and the set of unsupported filetypes might change over time. It would be better to look for the error/warning within the response, instead of assuming based on file type. If we just wanted to do this based on file type, we could instead set the placeholder at the time the file gets saved, before even making a query for the thumbnail. But I think relying on mediawiki to determine whether a thumbnail can be made is better.

empty-codes · 2025-02-11

The file names are obtained with the build_info_query method, which returns only the page ids and titles; this is an example response.

I've tried changing the parameters of the image_info_query. If the iiurlheight parameter is removed, for example, the error goes away, but the response then contains only the url of each file and not its thumburl, for both image and non-image files. That means this line in import_urls:
`file.thumburl = file_url['imageinfo'][0]['thumburl']` would not work, as the imageinfo array would not have the thumburl property.

It seems that to get the thumburl in the response, iiurlheight / iiurlwidth must be passed as a parameter. But apparently the non-image files don't have these properties, hence the error.

Another possible way could be to query for mediatype as an iiprop and, if a file has the OFFICE mediatype (which is what .pdf and .djvu fall under, as seen here), add it to the non-image files. This is a list of mediawiki media types.
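A minimal sketch of that mediatype idea follows. It assumes a first query made with `iiprop=mediatype` (and no `iiurlheight`) returns pages shaped like the sample below; the sample data and the method name are illustrative, not taken from a real response.

```ruby
# Assumed response shape for a query made with iiprop=mediatype.
sample_pages = [
  { 'pageid' => 158163184,
    'title' => "File:Van Dorp's Officieele Reisgids voor Spoor- en Tramswegen op Java (1900).pdf",
    'imageinfo' => [{ 'mediatype' => 'OFFICE' }] },
  { 'pageid' => 158196457,
    'title' => 'File:Aerial view of Halte Bendo Kediri.tiff',
    'imageinfo' => [{ 'mediatype' => 'BITMAP' }] }
]

# Returns [pages to give a placeholder, pages to query thumbnails for].
# The list of unsupported media types is an assumption and can be extended.
def split_by_mediatype(pages, unsupported = ['OFFICE'])
  pages.partition do |page|
    unsupported.include?(page.dig('imageinfo', 0, 'mediatype'))
  end
end

non_image, image = split_by_mediatype(sample_pages)
thumbnail_pageids = image.map { |page| page['pageid'] }
```

The trade-off is an extra API request per batch, in exchange for not hard-coding file extensions.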

To my knowledge, the only way to match `Could not normalize image parameters for...` is to let the error happen and then rescue it. My concern is that the error is raised and sent to Sentry in wiki_api at this line: `@mediawiki.send(action, query)`. So if we want to match `Could not normalize image parameters for...` directly, that means rescuing and handling the error in wiki_api's `mediawiki` method, then in commons's `api_get`, which is what you did here.
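For reference, the rescue-based option would look roughly like the sketch below. `StubApiError` is a stand-in so the example runs without the mediawiki_api gem, and `fetch_thumbnails` is a hypothetical wrapper, not the existing `api_get` method.

```ruby
# Stand-in for MediawikiApi::ApiError so the sketch runs without the gem.
class StubApiError < StandardError
  attr_reader :code

  def initialize(code, info)
    @code = code
    super("#{info} (#{code})")
  end
end

# Runs the given block; swallows only the 'urlparamnormal' error and
# re-raises everything else so unrelated failures are still reported.
def fetch_thumbnails
  yield
rescue StubApiError => e
  raise unless e.code == 'urlparamnormal'

  [] # no thumbnail data for this batch; the caller can fall back to a placeholder
end

result = fetch_thumbnails do
  raise StubApiError.new('urlparamnormal',
                         'Could not normalize image parameters for filename.pdf.')
end
```

Because one unsupported file fails the whole request, the caller would still need a way to find which file in the batch caused the error.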

So, please, which approach do you think is most suitable?

ragesoss · 2025-02-11

Okay. I'd really like to avoid this kind of complex error handling, if possible. Do you have a query that triggers it? I'd like to look at the mediawiki response.

empty-codes · 2025-02-11

This one does: https://commons.wikimedia.org/w/api.php?action=query&format=json&prop=imageinfo&continue=&iilimit=50&iiprop=url&iiurlheight=480&pageids=158659677%7C158782155%7C158782222%7C158782224%7C158163184%7C158163185%7C158172982%7C158196457%7C158445883%7C158445924

This one too: https://commons.wikimedia.org/w/api.php?action=query&format=json&prop=imageinfo&continue=&iilimit=50&iiprop=url&iiurlheight=480&pageids=158568398%7C158598249%7C158600123%7C158600513%7C158605286%7C158606782%7C158607510%7C158638796%7C158639184%7C159318962
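To reproduce queries like these for other batches, the URL can be rebuilt from its parameters. This is a small sketch using only Ruby's standard library; `imageinfo_url` is a hypothetical helper, and the parameter values mirror the two URLs above.

```ruby
require 'uri'

# Builds a Commons imageinfo query URL with the same parameters as the
# failing requests above. Page ids are joined with '|', which
# URI.encode_www_form percent-encodes as %7C.
def imageinfo_url(pageids, thumb_height: 480)
  params = {
    action: 'query',
    format: 'json',
    prop: 'imageinfo',
    continue: '',
    iilimit: 50,
    iiprop: 'url',
    iiurlheight: thumb_height,
    pageids: pageids.join('|')
  }
  "https://commons.wikimedia.org/w/api.php?#{URI.encode_www_form(params)}"
end

url = imageinfo_url([158659677, 158782155])
```

Removing ids from the list one at a time is one way to narrow down which file triggers the error.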
