Python for busy scientists. From "Hello World" to machine learning.
Our mascot: Hazel the Green Tree Python.
This is a Python crash course for complete beginners. In the first two weeks, we will learn the fundamentals of Python using the freely available book Automate the Boring Stuff with Python (ATBS). In the third and fourth weeks, we will cover the basics of Python's data science ecosystem, enough to set you up to analyze and visualize data independently.
This is designed to be a "flipped" class rather than a traditional lecture-style course. That is, each week you will spend time learning about, and practicing, new material outside of class. Class time will be spent reinforcing key points, going over any remaining questions you have about the material, and reviewing code.
Here is how the class is structured by week:
In Week 0, you will set up your programming environment. We will be using Jupyter notebooks, which have become a de facto standard for sharing code. For this week, just go to getting_started.md and follow the instructions there.
In Week 1, now that you can use Python and run Jupyter notebooks, it's time to start digging into the language itself. We will cover the basics of Python, as well as the important topic of flow control (for example, using for loops). This material corresponds to Chapters 1 and 2 of Automate the Boring Stuff, and is covered in week1.ipynb.
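As a taste of what flow control looks like, here is a minimal sketch that combines a for loop with an if/else branch. The readings and the 38.0 threshold are made up for illustration and do not come from the course notebooks.

```python
# Hypothetical body-temperature readings in degrees Celsius (illustrative only).
temperatures_c = [36.6, 38.2, 37.0, 39.1]

fever_count = 0
for temp in temperatures_c:
    # Branch on each reading: 38.0 is an assumed cutoff for this example.
    if temp >= 38.0:
        print(f"{temp} C: fever")
        fever_count += 1
    else:
        print(f"{temp} C: normal")

print(f"{fever_count} of {len(temperatures_c)} readings show a fever")
```

The loop body runs once per item in the list, and the if/else decides which message to print each time.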
In Week 2, we will finish our overview of the fundamentals of Python with a review of functions and the main types of data structures (lists, tuples, dictionaries, and strings). These topics correspond to Chapters 3-6 of Automate the Boring Stuff, and can be found in the Jupyter notebook week2.ipynb.
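To give a rough idea of how these pieces fit together, the sketch below defines one function and uses a list, a tuple, a dictionary, and a string. The sample names and values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Dictionary: maps hypothetical sample names to lists of readings.
readings = {
    "control": [1.0, 2.0, 3.0],
    "treated": [2.0, 4.0, 6.0],
}

# Tuple: an immutable sequence, used here to fix the reporting order.
sample_order = ("control", "treated")

# Build a new dictionary of per-sample means.
summary = {}
for name in sample_order:
    summary[name] = mean(readings[name])

# String: join the formatted results into one line of text.
report = ", ".join(f"{name}={summary[name]:.1f}" for name in sample_order)
print(report)
```

Wrapping the calculation in a function means it can be reused on any list of numbers, not just the two shown here.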
In Week 3, once you've learned the fundamentals of Python, you will be ready to start doing things with data. We'll learn about virtual environments, as well as Python's vaunted data science ecosystem. After reviewing the different tools in the ecosystem, we will dive in and start to learn about two of the most important ones: numpy (a powerful numerical computing library) and matplotlib (Python's main plotting library). For this week, see week3.ipynb, which includes links to excellent introductions to these libraries, as well as a chance to use them within the notebook. As always, practice is key!
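For a sense of how numpy and matplotlib work together, here is a minimal sketch that computes a sine curve and plots it. It assumes both libraries are installed; the choice of function and the axis labels are illustrative and not taken from week3.ipynb.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside a Jupyter notebook

import matplotlib.pyplot as plt
import numpy as np

# numpy: build 100 evenly spaced points and apply sin to all of them at once.
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

# matplotlib: draw the curve on a labeled set of axes.
fig, ax = plt.subplots()
ax.plot(x, y, label="sin(x)")
ax.set_xlabel("x (radians)")
ax.set_ylabel("sin(x)")
ax.legend()
```

Note that np.sin operates on the whole array without an explicit for loop, which is the main convenience numpy adds over plain Python lists. In a notebook, the figure is typically displayed inline once the cell finishes running.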
In this final week (Week 4), we will finish our tour of the most important components of the data science ecosystem in Python. First, we will get an overview of pandas (a tool for analyzing tabular data, like the kind you find in Excel). Second, we will learn about the powerful machine learning library scikit-learn. After this week, you will be comfortable not only with the Python standard library, but also with importing and using tools in the data science ecosystem when you need them!
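As a rough preview of these two libraries, the sketch below stores a small table in a pandas DataFrame and fits a straight line to it with scikit-learn. It assumes both packages are installed; the dose and response columns are invented data that follow response = 2 * dose + 1 exactly, so the fitted values are easy to check.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# pandas: a small, hypothetical dose-response table (illustrative data only).
df = pd.DataFrame(
    {
        "dose": [0.0, 1.0, 2.0, 3.0, 4.0],
        "response": [1.0, 3.0, 5.0, 7.0, 9.0],
    }
)

# Summary statistics for every numeric column.
print(df.describe())

# scikit-learn: fit a linear model predicting response from dose.
# Features are passed as a 2-D table (note the double brackets).
model = LinearRegression()
model.fit(df[["dose"]], df["response"])

slope = model.coef_[0]
intercept = model.intercept_
print(f"slope={slope:.2f}, intercept={intercept:.2f}")

# Predict the response for a dose that was not in the table.
new_doses = pd.DataFrame({"dose": [5.0]})
predicted = model.predict(new_doses)[0]
print(f"predicted response at dose 5.0: {predicted:.2f}")
```

The same fit and predict pattern applies to most scikit-learn models, so swapping in a different estimator usually means changing only the line that creates the model.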
The way to learn Python is by doing: coding is mostly muscle memory, so sustained practice matters more than anything else. If you are ready, I recommend jumping in with Week 0: head over to getting_started.md and follow the instructions there. Good luck!
- Developed with support from NIH Bioinformatics and the Neurobehavioral Core at NIEHS.
- Green Tree Python image is from Wikimedia.