Add comment about setting a default prefix that isn't just meta.id #2608
✅ Deploy Preview for nf-core-main-site ready!
Failing playwright tests can be ignored for now.
```nextflow
script:
if ("$bam" == "${prefix}.bam") error "Input and output names are the same, set prefix in module configuration to disambiguate!"
```

- If the input and output files are likely to have the same name, then an appropriate default prefix may be set, for example:
- If the input and output files are likely to have the same name, then an appropriate default prefix may be set, for example:
- If the input and output files are likely to have the same name, then an appropriate default prefix MAY be set, for example:
I feel this should be left for the developer to decide how the resulting file should be called. The error should make them aware of this.
Setting the default prefix does give the control to the developer, as they can overwrite it in the modules.config. The problem is when it's hard-coded into the output path.
I also added the `-C` bash flag to the shell directive in the template, so that should also prevent accidental clobbering.
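For readers unfamiliar with the flag, here is a minimal sketch of what such a shell setting might look like in a Nextflow config. Only the `-C` (noclobber) flag is taken from the comment above; the rest of the option list is an assumption based on common bash strict-mode settings, not a copy of the template.

```nextflow
// nextflow.config (sketch; the exact option list in the template is an assumption)
process.shell = [
    "bash",
    "-C",        // noclobber: `>` redirection refuses to overwrite an existing file
    "-e",        // exit as soon as a command returns a non-zero status
    "-u",        // treat unset variables as an error
    "-o",        // the next element names a shell option to enable
    "pipefail"   // a pipeline fails if any command in it fails
]
```

Note that noclobber only guards shell redirection. A tool that opens its own output file (for example via an `-o` argument) can still overwrite the input, so the explicit name check in the module remains useful.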
> The problem is when it's hard-coded into the output path.

I don't think I follow... isn't the suggestion here to technically embed a hardcoded string?
No, the example clearly shows prefix having a default of `"${meta.id}_sorted"`. If I wanted something different, I just update it in the config:
`ext.prefix = { "${meta.id}_mysorted" }`
The command should still look like:
`mycommand --input $file > ${prefix}.out`
but the file goes from `<meta.id>_sorted.out` to `<meta.id>_mysorted.out`.
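Put together, the pattern described in this comment could look roughly like the sketch below. The process name `MYCOMMAND`, the input name `infile` (standing in for `$file`), the `.out` extension and the channel name `out` are placeholders for illustration, not taken from a real nf-core module.

```nextflow
// main.nf of a hypothetical module
process MYCOMMAND {
    input:
    tuple val(meta), path(infile)

    output:
    tuple val(meta), path("*.out"), emit: out

    script:
    // The default applies only when ext.prefix is not set in the pipeline config
    def prefix = task.ext.prefix ?: "${meta.id}_sorted"
    """
    mycommand --input $infile > ${prefix}.out
    """
}
```

A pipeline developer who wants a different name would then override the default in the usual place, without touching the module:

```nextflow
// conf/modules.config of the pipeline
process {
    withName: 'MYCOMMAND' {
        ext.prefix = { "${meta.id}_mysorted" }
    }
}
```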
Yes, it is setting a default, but can be entirely overwritten in the usual way
Disagree on both points (1. it should, pipeline devs should be very aware of the output files, 2. I don't think we should sacrifice flexibility just to avoid a small config file).
But like I said - 'small mound' 😆 Maybe ask for one more opinion and you can merge.
I mean, we could say that pipeline devs should make each module from scratch because that would be purer. I don't understand your objection here. This sacrifices no flexibility whatsoever, it defines a default prefix that can then be overwritten in exactly the usual way.
I also don't understand the objections. Is there any way you can clarify in a toy example of what the objection is?
I follow James here. IMO it's better to have a common way of doing things in modules so you don't have to check how it's done in every module. If I know every module has the `meta.id` as default, then I don't have to check the default every time. Adding different defaults could become confusing for some developers. But also not a hill I'm willing to die on :)
My experience with using nf-core modules written by someone else is usually that I'll have to check the code anyway to know, at the least, what the output channels are named and what inputs are expected. If I then see the `if ( file.name == "${prefix}.ext" ){` I'll know there's a possibility of a filename collision. However, this is where I would expect the module to have a sensible default that avoids a filename collision if I just plug and play. More often than not though, the current state is that once I use it, I'll discover the default is not actually sensible and have to modify my modules.config to deal with it. This to me is a time waster. If we did use "sensible defaults", rather than "meta.id", then I'll see how to change `prefix` when I inspect the module if I wanted it differently, but I'd rather not have to assume that `${prefix}.ext` will result in a filename collision if I don't set my own prefix anyway. That's at least my experience, which is why I'm for this update.
Maybe we should move this discussion to the next maintainers meeting? This seems like a topic that could fit very well in the meetings
What is the state of this docs update, @SPPearce?
Ah, yes, I remember now. Let's try to bring it up in the next meeting then.
Sorry for being difficult 😁 let's talk about it in the next meeting indeed :)
Co-authored-by: Nicolas Vannieuwkerke <[email protected]>
✅ Deploy Preview for nf-core-docs ready!
#small-mound-i-feel-strongly-about-but-not-enough-to-fight-everyone
The issue is more that we still don't understand the objection. Since you're experienced, I feel there is something we're not understanding about your viewpoint. We've established, though, that the developer still has control.
I can't remember the depths of the discussion anymore, but mostly it's a development style. Firstly, it irks me because I hate pipelines that tag loads of extra suffixes at the end ( But the main reason is: I strongly feel that developers should think very carefully about how output is presented to users. Automating output names with defaults to make the development experience 'easier' at the expense of user experience is not good practice in my personal opinion.

Sure, it can be overridden by an organised pipeline developer, however most people are too busy and will take any opportunity to skip steps if they can. Allowing default prefixes will make people not think about this, and thus not think carefully about what goes in the results directory etc., and just result in a big mess where the output is harder and less attractive to use. This of course would be mitigated with good documentation, but that still remains poor across all of bioinformatics (/rant).

I personally prefer having a hard error on a name conflict, as it forces the developer to think carefully about the name, then whether it should even be presented to the user, and then logically where it should go in the output directory etc., rather than 'hoping' they'll do some TLC. But I do recognise that this is a personal opinion/development style about a relatively minor point, so I won't block it for that reason when I've been outvoted ;).
Agreed. Perhaps this should be a pipeline linting check (warning) or an nf-test check. Split filenames on underscores/periods/non-alpha-numeric characters and check the number of unique parts against total (duplicate words) and that the total is not more than say 7 (ultra-long names).
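A rough sketch of the check described here, written as a plain Groovy function inside a Nextflow script. The function name `checkOutputName`, the returned map keys and the example filename are all hypothetical; nothing like this is claimed to exist in nf-core tools or nf-test, and the limit of 7 parts is just the figure floated above.

```nextflow
// Hypothetical helper: flags repeated words and over-long names in an output filename
def checkOutputName(String name, int maxParts = 7) {
    // Split on runs of non-alphanumeric characters and drop empty pieces
    def parts = (name.split(/[^A-Za-z0-9]+/) as List).findAll { it }
    // Words that occur more than once, e.g. "sorted" in "sample1_sorted_sorted.bam"
    def duplicates = parts.countBy { it }.findAll { word, n -> n > 1 }.keySet() as List
    return [parts: parts, duplicates: duplicates, tooLong: parts.size() > maxParts]
}

workflow {
    def result = checkOutputName("sample1_sorted_sorted.bam")
    assert result.parts == ["sample1", "sorted", "sorted", "bam"]
    assert result.duplicates == ["sorted"]
    assert !result.tooLong
}
```

Whether this should warn or fail, and whether the comparison should ignore case, are open design choices that the comment leaves unspecified.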
I think we all agree on the first sentence. I disagree that this change is at expense of user experience though. While it makes developer experience better by not automatically resulting in a filename collision, the user should see at most one Conversely though, are we currently making the user experience better by how we're setting the
I think this is the key part though. Putting in developer roadblocks means the developer has to directly act. I don't think it'll force them to put TLC into it though. Workflow design (including naming outputs, etc.) is still a skill imo. Sorry, I guess this wall of text wasn't necessary, but maybe there is one thing we still need to do, and that's define how much the prefix can differ from
@netlify /docs/guidelines/components/modules