DDI session - 3rd November 2021
Click the "launch binder" button below to launch the interactive notebook.
Install Python to run code on your computer:
- If you're interested in doing data science or scientific computing, install Anaconda. This will install Python, Jupyter (what we've used today), Spyder (an IDE), together with lots of useful libraries, like pandas, seaborn, and many others. Installing new packages after that is also straightforward with conda.
- If you just want Python (e.g. for scripting), you can also install it directly. Installing new packages can then be done e.g. with pip.
At the time of this session, the current stable version of Python is 3.10 (released in October 2021). Version 3.8 will also keep receiving security updates until October 2024.
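Once Python is installed, a quick way to confirm which version you are running, and which of the libraries mentioned on this page are already available, is a short script like the one below. It is only a minimal sketch using the standard library: the helper names `python_version` and `installed_status` are made up for illustration, and the package list is just an example you can change.

```python
import sys
from importlib.util import find_spec


def python_version() -> str:
    """Return the running interpreter's version as 'major.minor.micro'."""
    return ".".join(str(part) for part in sys.version_info[:3])


def installed_status(packages):
    """Map each top-level package name to True if it can be found for import."""
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    print("Python", python_version())
    # These are import names, not install names:
    # scikit-learn, for instance, is imported as "sklearn".
    wanted = ["pandas", "seaborn", "numpy", "scipy", "matplotlib", "sklearn"]
    for name, found in installed_status(wanted).items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

Anything reported as missing can then be installed with conda or pip, depending on how you installed Python.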
- The official Python documentation includes a comprehensive tutorial for beginners.
- Two excellent free online books by Jake VanderPlas:
- Software Carpentry is a non-profit which runs regular workshops to teach Python (and other things!) at different levels of experience. All their teaching materials are open-source and freely available online, and they're great to follow along with for self-teaching. The Edinburgh branch is also quite active and holds regular workshops.
- The pandas documentation has excellent Getting Started tutorials and user guides. In particular, I'd recommend the tutorial "10 minutes to pandas".
- The seaborn documentation also has great tutorials and a showcase gallery.
- scikit-learn is a fantastic library for machine learning; it comes with lots of tools and algorithms for preprocessing, classification, regression, etc. TensorFlow is also widely used. PyTorch is great for deep learning, as it can take advantage of parallel hardware such as GPUs.
- For less data-oriented scientific computing, libraries like NumPy and SciPy (also with a great tutorial page) are widely used, together with matplotlib for plotting.