Schulze-lab/Homework-2-Accessing-PRIDE

Assignments Week 2

Introduction

The aim of this assignment is for you to become familiar with proteomics databases such as PRIDE. Programmatic access to these databases lets you screen for datasets from the organism(s) you are interested in. Details about sample processing will still need to be extracted manually, but this exercise will also give you additional starting points for the group project.

Input data

No input data is needed. You can start directly using the programmatic access to the database.

Tasks and output files

  1. Use programmatic access to PRIDE (e.g. pridepy or ppx) to find all datasets corresponding to the genus Neisseria.
  2. For each species in that genus that has datasets in PRIDE, count the number of datasets.
  3. Report your findings as a CSV table, a graph, or both.
  4. Make sure to comment your code, so that others can read and understand it easily.
  5. Create a README file describing how to run your code. Include requirements (e.g. Python packages that need to be installed) in that description, or as a separate requirements.txt file.
  6. Commit all your input files, scripts, and result files to your GitHub Classroom repository.
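
As a starting point for tasks 2 and 3, here is a minimal sketch of the counting step. It assumes you have already downloaded the project metadata (for example with pridepy or ppx) into a list of dictionaries, and that each dictionary has an `organisms` list of name strings. That field name, the helper names, and the example records are assumptions for illustration only, so check them against what your chosen tool actually returns.

```python
import csv
import sys
from collections import Counter


def species_name(organism, genus):
    """Return 'Genus species' if the organism belongs to the genus, else None."""
    words = organism.split()
    if len(words) >= 2 and words[0] == genus:
        # Keep only the first two words so strains and serogroups
        # are merged into their species.
        return " ".join(words[:2])
    return None


def count_by_species(records, genus="Neisseria"):
    """Count datasets per species; each dataset counts once per species."""
    counts = Counter()
    for record in records:
        # A set avoids double counting a dataset that lists two strains
        # of the same species.
        species = {species_name(o, genus) for o in record.get("organisms", [])}
        species.discard(None)
        counts.update(species)
    return counts


def write_counts_csv(counts, handle):
    """Write the counts as a two-column CSV table to an open file handle."""
    writer = csv.writer(handle)
    writer.writerow(["species", "n_datasets"])
    for species, n in sorted(counts.items()):
        writer.writerow([species, n])


# Made-up placeholder records (assumed layout, not real PRIDE entries).
example_records = [
    {"accession": "EXAMPLE-1", "organisms": ["Neisseria meningitidis"]},
    {"accession": "EXAMPLE-2",
     "organisms": ["Neisseria meningitidis serogroup B", "Homo sapiens (Human)"]},
    {"accession": "EXAMPLE-3", "organisms": ["Neisseria gonorrhoeae"]},
]

counts = count_by_species(example_records)
write_counts_csv(counts, sys.stdout)
```

To save the table to a file instead of printing it, open the file with `open("counts.csv", "w", newline="")` and pass that handle to `write_counts_csv`.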

Additional MS student tasks (bonus credit for BS students)

  1. Search for at least two additional species from different genera.
  2. Count the number of datasets per year of publication for each species, and display the results in a graph.
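
For the per-year count, one possible approach is sketched below. It assumes each record carries a `publicationDate` string beginning with a four-digit year and an `organisms` list of name strings. Both field names, the function name, and the example records are assumptions for illustration; adapt them to the output of the tool you use.

```python
from collections import Counter, defaultdict


def datasets_per_year(records, species):
    """Return {species: Counter({year: n_datasets})} for the given species."""
    per_year = defaultdict(Counter)
    for record in records:
        # Assumes a date string such as '2020-05-17'; only the year is used.
        year = int(record["publicationDate"][:4])
        organisms = record.get("organisms", [])
        for name in species:
            # Prefix match so that strain-level names count toward the species.
            if any(o.startswith(name) for o in organisms):
                per_year[name][year] += 1
    return per_year


# Made-up placeholder records (assumed layout, not real PRIDE entries).
example_records = [
    {"publicationDate": "2019-03-02", "organisms": ["Escherichia coli"]},
    {"publicationDate": "2020-05-17", "organisms": ["Escherichia coli K-12"]},
    {"publicationDate": "2020-11-30",
     "organisms": ["Escherichia coli", "Bacillus subtilis"]},
]

per_year = datasets_per_year(
    example_records, ["Escherichia coli", "Bacillus subtilis"]
)
for name, years in per_year.items():
    for year in sorted(years):
        print(name, year, years[year])
```

The nested counts can then be passed to a plotting library of your choice, for example as one line or one group of bars per species with the year on the x-axis.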

Submission

You must submit the assignment by 8 am on Feb 1 to get full credit.
