Assignments Week 7

Introduction

The goal of this assignment is to perform a protein database search for one of the datasets that you are working with for your group project. This search should identify a range of peptides and their corresponding proteins.

Input data

You can use the .raw files from any of the Neisseria datasets, which can be downloaded from PRIDE. In addition, you will need a .fasta file for the corresponding reference proteome (downloaded from UniProt).

Tasks and output files

Generate a target-decoy database from the reference proteome that you are working with, combined with the cRAP database of common contaminants.
Convert the downloaded .raw files into .mzML files.
Use the .mzML files together with the target-decoy database as input for a protein database search (using a tool of your choice, e.g. Ursgal, SearchGUI, MaxQuant, FragPipe, …).
Use statistical postprocessing (e.g. Percolator, Philosopher, MaxQuant, …) to calculate FDRs and/or PEPs for the peptide-spectrum matches generated by the database search. Filter the results at a PEP (or FDR) threshold of 1%.
Count the number of identified peptides and their corresponding proteins based on the result file generated.
Make sure to comment your code, so that others can read and understand it easily.
Create a README file describing how to run your code. Include requirements (e.g. Python packages that need to be installed) in that description, or as a separate requirements.txt file.
Commit your scripts and final result files to your GitHub Classroom repository. Do NOT commit large input files (such as .raw or .mzML files).
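
As an illustration of the first task, the following is a minimal sketch of one common way to build a target-decoy database: each target sequence is reversed and written back with a prefixed header. The function names and the `decoy_` prefix are assumptions made for this example; search tools often ship their own decoy generators with their own expected prefixes, so check your tool's documentation before using a hand-made database.

```python
from typing import Iterable, Iterator, List, Tuple

Entry = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield (header, sequence) pairs from the lines of a FASTA file."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def make_target_decoy(entries: Iterable[Entry],
                      decoy_prefix: str = "decoy_") -> List[Entry]:
    """Return all targets followed by one reversed-sequence decoy per target.

    The prefix is a hypothetical choice; it must match whatever your
    search and postprocessing tools expect.
    """
    targets = list(entries)
    decoys = [(decoy_prefix + header, seq[::-1]) for header, seq in targets]
    return targets + decoys


def format_fasta(entries: Iterable[Entry], width: int = 60) -> str:
    """Serialize entries as FASTA text with fixed-width sequence lines."""
    out = []
    for header, seq in entries:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Toy input standing in for the proteome and cRAP entries.
    toy = [">prot1 example", "MKTAYIAK", ">contaminant1", "GASPVT"]
    print(format_fasta(make_target_decoy(parse_fasta(toy))))
```

To include the contaminants, the entries parsed from the proteome and from the cRAP file can be concatenated before calling `make_target_decoy`, so that decoys are generated for both.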

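The filtering and counting tasks above can be sketched as follows. This example estimates q-values with the simple target-decoy ratio (decoys divided by targets above each score threshold), keeps target PSMs at or below 1%, and counts unique peptides and proteins. The dictionary keys (`score`, `is_decoy`, `peptide`, `protein`) are hypothetical; the real column names depend on the result file your tool produces, and tools such as Percolator already report q-values or PEPs that you can filter on directly.

```python
from typing import Dict, List, Tuple


def compute_q_values(psms: List[Dict]) -> List[Dict]:
    """Return PSMs sorted by descending score with a 'q_value' key added.

    FDR at each rank is estimated as (#decoys / #targets) among all PSMs
    scoring at least as well; the q-value is the minimum FDR at which a
    PSM would still be accepted.
    """
    ranked = sorted(psms, key=lambda p: p["score"], reverse=True)
    targets = decoys = 0
    fdrs = []
    for psm in ranked:
        if psm["is_decoy"]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # Walk from the worst score upwards so q-values never decrease
    # as the score gets worse.
    q_values = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()

    return [dict(psm, q_value=q) for psm, q in zip(ranked, q_values)]


def count_identifications(psms: List[Dict],
                          threshold: float = 0.01) -> Tuple[int, int]:
    """Count unique peptides and proteins among accepted target PSMs."""
    accepted = [
        p for p in compute_q_values(psms)
        if not p["is_decoy"] and p["q_value"] <= threshold
    ]
    peptides = {p["peptide"] for p in accepted}
    proteins = {p["protein"] for p in accepted}
    return len(peptides), len(proteins)


if __name__ == "__main__":
    # Toy PSM list; real data would be read from the search result file.
    toy_psms = [
        {"score": 10, "is_decoy": False, "peptide": "PEPA", "protein": "protX"},
        {"score": 9, "is_decoy": False, "peptide": "PEPB", "protein": "protX"},
        {"score": 7, "is_decoy": True, "peptide": "APEP", "protein": "decoy_protX"},
        {"score": 6, "is_decoy": False, "peptide": "PEPC", "protein": "protY"},
    ]
    n_peptides, n_proteins = count_identifications(toy_psms)
    print(f"{n_peptides} peptides, {n_proteins} proteins at 1% FDR")
```

Note that this counts every protein accession attached to an accepted PSM. Peptides shared between several proteins need a protein inference step, which this sketch does not attempt.
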
Additional MS student tasks (bonus credit for BS students)

Plot an annotated spectrum for any of the identified peptide spectrum matches.

Submission

You must submit the assignment through GitHub Classroom by 8 am on Mar 7 to get full credit.
