Skip to content
This repository has been archived by the owner on Aug 20, 2024. It is now read-only.

Add samplesheet examples #82

Merged
merged 7 commits into from
Aug 17, 2023

Conversation

adamrtalbot
Copy link
Collaborator

  • Changes samplesheet examples to start simple and build up
  • Moves splitting example to middle region
  • Tries to harmonise examples to use the same or similar channels. This idea should be taken further.

adamrtalbot and others added 2 commits July 26, 2023 11:10
- Changes samplesheet examples to start simple and build up
- Moves splitting example to middle region
- Tries to harmonise examples to use the same or similar channels. This idea should be taken further.
@adamrtalbot
Copy link
Collaborator Author

Relates to #81

Copy link
Collaborator

@nvnieuwk nvnieuwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! This will really help, I have one comment :)

Might create the channel:

```nextflow
tuple val(meta), path(fastq_1), path(fastq_2), path(bed)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely convinced this is the cleanest and most clear way of representing the channel structure, maybe something like [meta, fastq_1.fastq, fastq_2.fastq, sample1.bed] will be more clear for beginners?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to match an input/output declaration for a process. Perhaps I could include both. What does the .fastq or .bed mean in your example?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was to emulate a filename, but yeah this is just a matter of representation. I'm not sure which way is the best or most clear to beginners. (although both should be fine I think)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take a look now, I've added a clarifying statement about what the channel is and how you would use it.

Copy link
Collaborator

@nvnieuwk nvnieuwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I'm happy with this! Just one small comment and then it LGTM


// Resulting in:
[ [ id: "sample" ], fastq1.R1.fq.gz, fastq1.R2.fq.gz, sample1.bed]
[ [ id: "sample2" ], fastq2.R1.fq.gz, fastq2.R2.fq.gz, null ] // A missing value from the samplesheet is null
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[ [ id: "sample2" ], fastq2.R1.fq.gz, fastq2.R2.fq.gz, null ] // A missing value from the samplesheet is null
[ [ id: "sample2" ], fastq2.R1.fq.gz, fastq2.R2.fq.gz, [] ] // A missing value from the samplesheet is `[]`

Files will default to [] when they aren't given (null will cause errors in processes where the files are used)

@adamrtalbot
Copy link
Collaborator Author

One update worth reviewing - I've added an introduction and glossary section to the top.

@nvnieuwk
Copy link
Collaborator

One update worth reviewing - I've added an introduction and glossary section to the top.

Nice addition! Thanks for contributing :p Let me know when you are fully ready with this so I can merge it

@adamrtalbot
Copy link
Collaborator Author

One last thing I'm unsure about - should we harmonise so all examples use the same example channel(s)? That way it's more consistent.

@nvnieuwk
Copy link
Collaborator

That's a good idea! Although we probably won't be able to apply everything logically on the fastq input channel here, but could be good for the examples that it does make sense for

Copy link
Collaborator

@mirpedrol mirpedrol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of suggestions, but overall looks very good, thanks!!

docs/samplesheets/examples.md Outdated Show resolved Hide resolved
[ [ id: "sample2" ], fastq2.R1.fq.gz, fastq2.R2.fq.gz, [] ] // A missing value from the samplesheet is an empty list
```

This could be used in the Nextflow process as:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence sounds a bit confusing to me, as I would expect seeing the use of the channel instead of the process input declaration. Could we rephrase with something like "This channel can be used as input of a process where the input declaration is:"

docs/samplesheets/examples.md Outdated Show resolved Hide resolved
docs/samplesheets/examples.md Outdated Show resolved Hide resolved
@adamrtalbot
Copy link
Collaborator Author

@mirpedrol @nvnieuwk I don't have permission to merge so I'll need one of you to close this PR.

@mirpedrol
Copy link
Collaborator

Thank you @adamrtalbot!

@mirpedrol mirpedrol merged commit df5a42b into nextflow-io:master Aug 17, 2023
3 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants