Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mitigate corruption of scores on save #24908

Open
4 tasks done
krasko78 opened this issue Sep 24, 2024 · 14 comments · Fixed by #24911 · May be fixed by #25458
Open
4 tasks done

Mitigate corruption of scores on save #24908

krasko78 opened this issue Sep 24, 2024 · 14 comments · Fixed by #24911 · May be fixed by #25458
Assignees
Labels
corruption Issues involving file corruption feature request Used to suggest improvements or new capabilities

Comments

@krasko78
Copy link
Contributor

Issue type

Opening/saving issue

Description with steps to reproduce

As you know, there have been occasional but not rare reports of corrupted files after saving where the saved files are full of zeros and therefore unusable. Many users have lost work due to this problem with either backups not being available or corrupted as well. Let's try to mitigate this and find out what is causing it.

What we have so far:

  • As far as I have read, the issue happens on Windows only.
  • We don't know when, how and why it happens.
  • The corrupted files are always all-zeroes. it is not about a corrupt score in the sense of a corrupt xml or beats in a measure.
  • Since the saved file (albeit corrupt) causes MSS to think the save was successful, the issue goes unnoticed until the user reopens the score. As a result, we don't know if retrying the save would see the same corruption or succeed.
  • Currently, we directly write the output file in zip format to a separate file ending in "saving". At the end of the process, we simply replace the original score file with this "saving" one. We don't know if the corruption happens before the replacement or upon the replacement. TBH, upon the replacement seems very unlikely to me but since we don't know anything, we cannot rule out anything.

I am proposing the following changes:

  • When saving the file, MSS will check first the "saving" file for validity before replacing the original mscz file with it (and only if the validation passes of course)
  • after the replacement, MSS will check the replaced file (i.e. the original mscz file) for validity as well.
  • if any of the validation check fails, MSS will show a proper dialog to the user and the file will remain marked as unsaved.

The validation check could vary. I am thinking that just reading the main score file from the zipped file without parsing it should be enough. (Again, we are dealing with only one scenario where the entire file is full of 0-s).

For the testing, I propose that we have MSS deliberately write a file full of zeros if the file name ends with "- CORRUPTED.mscz" for example (at least this is what I have already implemented). For the release, we will remove this code.

For the dialog to show to the user, I am thinking of a series of dialogs but as @cbjeukendrup mentioned, this might be too much. My reasoning for a series of small dialogs (like a wizard to walk the user through the corruption situation) is this:

  • this is a serious situation that the user must be made well aware of
  • we should ensure the user will possibly be able to keep their changes
  • we could make MSS retry the save automatically, also save the score as uncompressed folder and/or XML
  • if all fails, we should ask the user to report the problem to us in the forum
  • in any case, we'd like to have at least some users post their uncompressed scores to us, tell us which file got corrupted (before or after the replacement), etc. so that we hopefully know more about the issue.

This seems a lot of information to me that will hardly fit in one dialog and even if it does fit, it might be overwhelming to the user.

P.S. I have implemented most of the propsed changes already except for the dialogs (a simple dialog for now).

Supporting files, videos and screenshots

n/a

What is the latest version of MuseScore Studio where this issue is present?

4.4.2

Regression

No.

Operating system

Windows

Additional context

No response

Checklist

  • This report follows the guidelines for reporting bugs and issues
  • I have verified that this issue has not been logged before, by searching the issue tracker for similar issues
  • I have attached all requested files and information to this report
  • I have attempted to identify the root problem as concisely as possible, and have used minimal reproducible examples where possible
@wizofaus
Copy link
Contributor

Surely we should just fix the underlying problem, I don't know of any other application, Windows or otherwise, that somehow ends up writing all 0s to a file when it's meant to actually store useful data.
Is there really no set of steps to replicate at all?

@krasko78
Copy link
Contributor Author

Well... The issue goes unnoticed until the user opens a corrupt score file. So, no, we don't have any steps for replication, at least not that I am aware of. With the proposed changes, we might shed some light if we get the users to cooperate. There are too many unknowns. It will be very interesting first to know whether the save will suceed or not if the user retries it immediately. We also need to know if the corruption happens on the temp file or after the temp file is moved over the original file. And receive some uncompressed folders at the moment of the failed save.

@krasko78
Copy link
Contributor Author

I've added a Draft pull request for people to look at what I have so far and help me refine it. So far I've only covered the save the entire score locally scenario.
Try saving any score with " - CORRUPTED" added at the end of the filename before the mscz extension (do not include the double quotes, case is important. The characters are: space, hyphen, space, CORRUPTED

krasko78 added a commit to krasko78/MuseScore that referenced this issue Sep 24, 2024
@cbjeukendrup
Copy link
Contributor

cbjeukendrup commented Sep 25, 2024

Is there really no set of steps to replicate at all?

No; that's the problem. This issue has existed for years now, even in MS3 times, and thus probably also before. See also #16339 (which also states that there are more variants of this issue than the "only zeros" variant).

I've looked at zipcontainer.cpp (which is based on Qt, iirc) multiple times, but haven't really spotted anything that looks wrong. In the hypothetical event that anyone ever has too much time, they could scrutinise every line of that file in detail, to find out if that might be the culprit. But we don't even know if the culprit will be in that file or somewhere else. So the search space for a theoretical analysis is at the moment just too big. Therefore we can't really do anything else than resorting to solutions like the proposed one, and adding some logging and calls-to-actions to users, in the hope that this will bring us closer to the core of the problem and reduce the search space.

@MarcSabatella
Copy link
Contributor

FWIW, I actually don’t remember there were any reports of this prior to MU3. And not even immediately upon release of 3.0 - maybe not for another year or so. I could be wrong, maybe @Jojo-Schmitz has more insight. But it’s been frustrating as indeed no one has any clue how to reproduce this.

@wizofaus
Copy link
Contributor

wizofaus commented Sep 25, 2024

When was the mscz format first introduced? I assume it definitely only happens with that, and not mscx files...

@MarcSabatella
Copy link
Contributor

MarcSabatella commented Sep 25, 2024

Before my time. And yes, I've never heard of this reported with MSCX< although that's surprising because only a handful of people in the world use that with any regularity. But again, we had no problem with MSCZ for years, then suddenly started seeing it pretty regularly - a report every few weeks or so.

A possible clue is that while it's exceedingly rare ovewall, there are people who say it has happened multiple times. That doesn't seem likely to be random chance. So I'm guessing it's some interaction involving a system dependency. Like some backup service being run, or a particular device driver, etc.

@Jojo-Schmitz
Copy link
Contributor

When was the mscz format first introduced? I assume it definitely only happens with that, and not mscx files...

See
https://musescore.org/blog/2008/09/21/musescore-093-released

So since 0.9.3, September 2008

@Jojo-Schmitz
Copy link
Contributor

Jojo-Schmitz commented Sep 25, 2024

FWIW, I actually don’t remember there were any reports of this prior to MU3. And not even immediately upon release of 3.0 - maybe not for another year or so. I could be wrong, maybe @Jojo-Schmitz has more insight. But it’s been frustrating as indeed no one has any clue how to reproduce this.

Seems indeed to have started with 3.0, even with the Alpha, see https://musescore.org/en/node/271185

Nah, much longer, see https://musescore.org/en/node/38096, November 2014

See also #16339

@mercuree
Copy link
Contributor

I assume that Musescore does not write null bytes to disk due to a software bug in Musescore.

  1. The problem occurs not only in Musescore. For example, notepad++ encountered such a problem Notepad++ is corrupting files and opening them with null values. notepad-plus-plus/notepad-plus-plus#4655 and there was long discussion about such issue, which may be useful.
  2. I myself encountered a similar problem many years ago, when after a blue screen my internet browser configuration file was filled with null bytes instead of settings.
  3. The problem may be associated with some kind of failure, reboot or sleep mode, unexpected shutdown, or battery discharge.

So I have doubts that checking the file after writing will help avoid the issue or improve the situation.

@krasko78
Copy link
Contributor Author

Thank you all for the discussion and the links. And for your review comments, I'll address those shortly.
Shouldn't #23807 be closed?

Wow, I didn't know there were corruptions other than the all zeros scenario and that this issue is much older than I thought!

@mercuree Checking the file won't avoid the issue but will significantly improve the situation. First of all, the user will be alerted so the issue won't be going unnoticed any more. Then, they will be able to retry the save which might succeed: if it does, great for the user, great for us as we will then know it. If the retry doesn't fix it, and the corruption happens before the original file is overwritten (which I very much hope and believe), the users will still have their last successful save intact. If the retry doesn't fix the issue and the corruption happens after the move, at least we will alert the user (and know it). At this point they can try saving to a different location, save as an uncompressed dir to try to save their score, contact us and send us something, etc. If the retry doesn't fix it, it might be a repeatable and reproducible problem with the users' scores at that moment, meaning if they find a way to send us what they are trying to save, we might be able to reprodice it and catch it. If retrying the save doesn't fix the issue, a user could also undo their last change in MSS and try saving again, then undo further and see if at some point the file will save (after ensuring they have a backup of their original file).

What are the issues with saving to an uncompressed folder?

@krasko78
Copy link
Contributor Author

Do we want to verify anything besides the main score file without parsing it (which is what I have so far)? Like enumerate all the files in the zip and read them?

@wizofaus
Copy link
Contributor

I looked at the Notepad++ fix for what ostensibly may be the same issue (though it seems likely there's more than one root cause by the many bug reports received). Basically they add an extra flag "c" to the fopen mode parameter, which seems to be a Windows-specific thing to change the commit behaviour on calling fflush, and then an explicit call to fflush before fclose. MuseScore doesn't use fopen/fflush though, rather QFile (part of Qt), which internally doesn't even use fopen/fflush for Windows, and even explicitly has a comment (in qfsfileengine_win.cpp, nativeFlush) about not needing to flush under Windows. QFile does seem to support working in "stdlib" mode, whereby the application calls fopen and passes the file handle in though, not sure if it's worth trying out.

The other thing I'd note that may be relevant is that when running under the debugger I've noticed that MuseScore quite often "crashes" on shutdown in a background thread - but the default Windows behaviour is such that if you don't have a debugger attached it's not even obvious to a user that such a crash has occurred. I don't know if MacOS or Linux is similar in this regard. It seems possible that if the main thread is trying to save your score but a crash happens in a background thread you're likely to end up with a corrupted file...but in that case it would be while writing to .mscz_saving, and presumably the attempt to rename it to .mscz would never happen. So it's doubtful that's a root cause.

It does seem that having some sort of sanity check before renaming the .mscz_saving file to .mscz (in NotationProject::doSave) would be worth considering though.

krasko78 added a commit to krasko78/MuseScore that referenced this issue Sep 25, 2024
krasko78 added a commit to krasko78/MuseScore that referenced this issue Sep 26, 2024
krasko78 added a commit to krasko78/MuseScore that referenced this issue Sep 27, 2024
…e and only if unsuccessful to not degrade save times + code review changes
krasko78 added a commit to krasko78/MuseScore that referenced this issue Sep 27, 2024
…e and only if unsuccessful to not degrade save times + code review changes
krasko78 added a commit to krasko78/MuseScore that referenced this issue Sep 27, 2024
…e and only if unsuccessful to not degrade save times + code review changes
krasko78 added a commit to krasko78/MuseScore that referenced this issue Oct 5, 2024
krasko78 added a commit to krasko78/MuseScore that referenced this issue Oct 5, 2024
krasko78 added a commit to krasko78/MuseScore that referenced this issue Oct 5, 2024
… out in the prescribed order (not on Windows)
krasko78 added a commit to krasko78/MuseScore that referenced this issue Oct 6, 2024
…iginal file and not the temporary "_saving" file
krasko78 added a commit to krasko78/MuseScore that referenced this issue Oct 7, 2024
krasko78 added a commit to krasko78/MuseScore that referenced this issue Oct 14, 2024
krasko78 added a commit to krasko78/MuseScore that referenced this issue Oct 14, 2024
krasko78 added a commit to krasko78/MuseScore that referenced this issue Oct 18, 2024
krasko78 added a commit to krasko78/MuseScore that referenced this issue Oct 19, 2024
@RomanPudashkin RomanPudashkin added feature request Used to suggest improvements or new capabilities corruption Issues involving file corruption labels Oct 21, 2024
RomanPudashkin added a commit that referenced this issue Oct 22, 2024
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
…e and only if unsuccessful to not degrade save times + code review changes
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
… out in the prescribed order (not on Windows)
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
…iginal file and not the temporary "_saving" file
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
Leo-Cal pushed a commit to Leo-Cal/MuseScore that referenced this issue Oct 24, 2024
@krasko78
Copy link
Contributor Author

krasko78 commented Nov 4, 2024

Hi all! I've just realized something. When we show the dialog about the corrupted save, MSS has already corrupted the original score but there should be a healthy backup file in the backup folder. Now, if the user decides to retry the save by clicking "Retry" on the dialog, which is the expected behavior, then if the retried save fails too, this time the healthy backup file will be overwritten with the now corrupted original score. I am wondering if we want to reopen this issue and quickly make a change such that the retried saves DO NOT create a backup file (unless there are other ideas)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
corruption Issues involving file corruption feature request Used to suggest improvements or new capabilities
Projects
Status: In Progress
7 participants