WIP (pre-spellcheck)
Hello
After setting up or reinstalling a couple of machines from scratch in the last few months, I decided once and for all to document my default data science settings and the tools I typically use.
A pro tip: avoid dropping a cup of coffee on your machine.
That includes installing programming languages such as R, Julia, and Python, along with their supporting IDEs, RStudio and VScode. It also covers setting up the terminal and git, and installing supporting tools such as iTerm2, oh-my-zsh, Docker, etc.
Update: These settings are up to date with macOS Ventura. However, most of the tools in this document are OS agnostic and should work on other systems (e.g., Windows, Linux, etc.) with some minor modifications.
This document covers the following:
- Setting Git and SSH
- Install Command Line Tools
- Install Docker
- Setting Terminal
- Setting VScode
- Setting Python
- Install R and RStudio
- Shortcuts
- Setting Postgres
This section focuses on the core git settings, such as global definitions and setting up SSH with your GitHub account.
All the settings in the sections are done through the command line (unless mentioned otherwise).
Let's start by checking the git version by running the following:
git --version
If this is a new computer or you have not set it up before, a window should pop up asking whether you want to install the command line developer tools:
The command line developer tools are required to run git commands. Once installed, we can go back to the terminal and set the global git settings.
Git enables setting both local and global options. The global options are used as the default settings any time you create a new repository with the git init command. You can override the global settings in a specific repo by using local settings. Below, we will define the following global settings:
- Git user name
- Git user email
- Default branch name
- Global git ignore file
- Default editor (for merging comments)
Set the global user name and email with the git config --global command:
git config --global user.name "USER_NAME"
git config --global user.email "YOUR_EMAIL@example.com"
Next, let's set the default branch name to main using the init.defaultBranch option:
git config --global init.defaultBranch main
The global .gitignore file enables you to set general ignore rules that will apply automatically to all repositories on your machine. This is useful for files you repeatedly wish to ignore by default. A good example on Mac is the system file .DS_Store, which is auto-generated in each folder, and which you probably do not want to commit. First, let's create the global .gitignore file using the touch command:
touch ~/.gitignore
Next, let's define this file as global:
git config --global core.excludesFile ~/.gitignore
Once the global ignore file is set, we can start adding the files we want git to ignore systematically. For example, let's add .DS_Store to the global ignore file:
echo .DS_Store >> ~/.gitignore
Note: Be careful about the files you add to the global ignore file. Unless a rule applies to all cases, as in the .DS_Store example, do not add it to the global settings; define it locally instead to avoid a git disaster.
Git enables you to set the default command line editor for creating and editing your commit messages with the core.editor option. Git supports the main command line editors such as vim, emacs, nano, etc. I set vim as my default:
git config --global core.editor "vim"
By default, all the global settings are saved to the .gitconfig file in your home folder. You can review the saved settings, and modify or add new ones manually, by editing the file:
vim ~/.gitconfig
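To double-check the settings above without touching your real configuration, a minimal sketch like the following may help. It repeats the same commands against a throwaway home folder, then confirms that a new repository starts on main and that .DS_Store is ignored. The sandbox variable and the g helper are illustrative names, not part of git.

```shell
# Throwaway home folder, so the real ~/.gitconfig stays untouched
sandbox="$(mktemp -d)"

# Hypothetical helper: run git with HOME pointing at the sandbox
g() { HOME="$sandbox" XDG_CONFIG_HOME="$sandbox/.config" git "$@"; }

# Same global settings as above, written to $sandbox/.gitconfig
g config --global user.name "USER_NAME"
g config --global init.defaultBranch main
touch "$sandbox/.gitignore"
# The tilde is quoted on purpose: git expands it relative to HOME
g config --global core.excludesFile "~/.gitignore"
echo .DS_Store >> "$sandbox/.gitignore"

# A new repository should start on main and ignore .DS_Store
g init --quiet "$sandbox/demo"
g -C "$sandbox/demo" symbolic-ref --short HEAD
g -C "$sandbox/demo" check-ignore --verbose .DS_Store

# Review everything that was saved
g config --global --list
```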
An SSH key is required to sync your local git repositories with the origin over SSH. By default, ssh-keygen saves the key files under the .ssh folder in your home directory, but if you enter only a file name at its prompt, the files are written to the current working folder. It is "cleaner" to keep the keys under the .ssh folder; therefore, my settings below assume this folder exists.
Let's start by creating the .ssh folder (the -p flag avoids an error if it already exists):
mkdir -p ~/.ssh
The ssh-keygen command creates the SSH key files. To set up an SSH key on your local machine, run:
ssh-keygen -t ed25519 -C "YOUR_EMAIL@example.com"
Note: The -t argument defines the algorithm type for the authentication key (in this case, I used ed25519), and the -C argument adds a comment (in this case, the user email, for reference).
After running the ssh-keygen command, it will prompt for a file name and a passphrase (optional). If you enter only a file name, the key files are saved in the current working folder.
Note: This process will generate two files:
- your_ssh_key is the private key; you should not expose it
- your_ssh_key.pub is the public key, which will be used to set up SSH on GitHub
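To see the two files without touching your real keys, the sketch below generates a key pair in a throwaway folder. The -f (output file) and -N (passphrase) options are standard ssh-keygen options; the demo_key name, the placeholder comment, and the empty passphrase are assumptions for illustration only, and a real key should get a passphrase.

```shell
# Throwaway folder, so nothing under ~/.ssh is overwritten
keydir="$(mktemp -d)"

# -f sets the output file and -N the passphrase (empty here, demo only)
ssh-keygen -q -t ed25519 -C "demo-comment" -f "$keydir/demo_key" -N ""

# Two files are created: the private key and the .pub public key
ls "$keydir"

# Print the fingerprint of the public key
ssh-keygen -l -f "$keydir/demo_key.pub"
```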
The next step is to register the key on your GitHub account. On your account main page, go to the Settings menu, select SSH and GPG keys on the main menu (purple rectangle), and click on New SSH key (yellow rectangle):
Next, set the key name under the title text box (purple rectangle), and paste your public key into the key box (turquoise rectangle):
Note: I set the machine nickname (e.g., MacBook Pro 2017, Mac Pro, etc.) as the key title to easily identify the relevant key in the future.
The next step is to update the config file in the ~/.ssh folder. You can edit the config file with vim:
vim ~/.ssh/config
and add the following code somewhere in the file:
Host *
AddKeysToAgent yes
UseKeychain yes
IdentityFile ~/.ssh/your_ssh_key
Where your_ssh_key is the private key file name.
Last, run the following to load the key:
ssh-add --apple-use-keychain ~/.ssh/your_ssh_key
- GitHub documentation - https://docs.github.com/en/[email protected]/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account
- ssh-keygen arguments - https://www.ssh.com/academy/ssh/keygen
- A great video tutorial about setting SSH - https://www.youtube.com/watch?v=RGOj5yH7evk&t=1230s&ab_channel=freeCodeCamp.org
- Setting Git ignore - https://www.atlassian.com/git/tutorials/saving-changes/gitignore
- Initial Git setup - https://git-scm.com/book/en/v2/Getting-Started-First-Time-Git-Setup
This section covers core command line tools.
Homebrew (or brew) enables you to install command line packages and tools on Mac. To install brew, run the following from the terminal:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
After finishing the installation, you may need to run the following commands (follow the instructions at the end of the installation):
(echo; echo 'eval "$(/opt/homebrew/bin/brew shellenv)"') >> /Users/USER_NAME/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"
More info available: https://brew.sh/
jq is a lightweight and flexible command-line JSON processor. You can install it with brew:
brew install jq
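As a quick illustration of what jq does, the sketch below extracts a few fields from a small JSON document. The JSON content and the variable names are made up for the example.

```shell
# Made-up JSON document for illustration
json='{"tool": "jq", "tags": ["json", "cli"], "meta": {"stars": 3}}'

# -r prints raw strings (without the surrounding quotes)
tool="$(printf '%s' "$json" | jq -r '.tool')"

# Index into an array and into a nested object
first_tag="$(printf '%s' "$json" | jq -r '.tags[0]')"
stars="$(printf '%s' "$json" | jq '.meta.stars')"

echo "$tool $first_tag $stars"
```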
There are multiple ways to spin up a VM locally to run Docker. I typically use Docker Desktop, and for learning purposes (e.g., Kubernetes) I also install Minikube.
Go to the Docker website and follow the installation instructions for your OS:
Minikube enables you to set up a virtual environment to run Docker. This is mainly relevant if you are using macOS or Windows and want to run Docker via the CLI. To install Minikube, you first need to install kubectl and hyperkit. We will use brew to install all those components:
brew install kubectl
brew install hyperkit
brew install docker
brew install minikube
Launch minikube with the start command, setting the memory and CPU allocation:
> minikube start --memory 4096 --cpus 2 --driver hyperkit
😄 minikube v1.24.0 on Darwin 12.0.1
▪ MINIKUBE_ACTIVE_DOCKERD=minikube
✨ Using the hyperkit driver based on user configuration
👍 Starting control plane node minikube in cluster minikube
🔥 Creating hyperkit VM (CPUs=2, Memory=4096MB, Disk=20000MB) ...
🐳 Preparing Kubernetes v1.22.3 on Docker 20.10.8 ...
▪ Generating certificates and keys ...
▪ Booting up control plane ...
▪ Configuring RBAC rules ...
🔎 Verifying Kubernetes components...
▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟 Enabled addons: storage-provisioner, default-storageclass
🏄 Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
Launch Docker:
eval $(minikube -p minikube docker-env)
Check the Docker status:
> docker info
Client:
Context: default
Debug Mode: false
Server:
Containers: 15
Running: 14
Paused: 0
Stopped: 1
Images: 10
Server Version: 20.10.8
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: e25210fe30a0a703442421b0f60afac609f950a3
runc version: 4144b63817ebcc5b358fc2c8ef95f7cddd709aa7
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 4.19.202
Operating System: Buildroot 2021.02.4
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.847GiB
Name: minikube
ID: 2IME:DJBF:L32S:HA4Q:DFCX:2LRI:JBCQ:6ORQ:RHUE:Q4S6:7WYE:PUD7
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
provider=hyperkit
Experimental: false
Live Restore Enabled: false
Product License: Community Engine
- Minikube documentation - https://minikube.sigs.k8s.io/docs/start/
- Installation guide - https://www.youtube.com/watch?v=zwmjzU62LWQ&ab_channel=AutomationStepbyStep
- Kubectl - https://kubernetes.io/docs/reference/kubectl/overview/
- hyperkit - https://github.com/moby/hyperkit
This section focuses on installing and setting tools for working on the terminal.
Terminal is the built-in terminal emulator on Mac. I personally love to work with iTerm2, as it provides additional functionality and customization options. iTerm2 is available only for Mac, and can be installed directly from the iTerm2 website or via homebrew:
> brew install --cask iterm2
.
.
.
==> Installing Cask iterm2
==> Moving App 'iTerm.app' to '/Applications/iTerm.app'
🍺 iterm2 was successfully installed!
The next step is to install Z shell, or zsh. zsh is a Unix shell, similar to bash, that supports a variety of add-on tools for the terminal. We will use homebrew again to install zsh:
> brew install zsh
.
.
.
==> Installing zsh
==> Pouring zsh--5.8_1.monterey.bottle.tar.gz
🍺 /usr/local/Cellar/zsh/5.8_1: 1,531 files, 14.7MB
After installing zsh, we will install oh-my-zsh, an open-source framework for managing zsh configuration. We will install it with the curl command:
sh -c "$(curl -fsSL https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
You may notice that your terminal view has changed (you may need to restart your terminal to see the changes) and the default command line prompt looks like:
➜ ~
The default settings of Oh My Zsh are stored in ~/.zshrc, and you can modify the default theme by editing the file:
vim ~/.zshrc
I use the powerlevel10k theme, which can be installed by cloning the GitHub repository (for oh-my-zsh):
git clone --depth=1 https://github.com/romkatv/powerlevel10k.git ${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}/themes/powerlevel10k
And then change the theme setting in ~/.zshrc to ZSH_THEME="powerlevel10k/powerlevel10k". After restarting the terminal, you will see a sequence of questions that enable you to configure the theme:
Install Meslo Nerd Font?
(y) Yes (recommended).
(n) No. Use the current font.
(q) Quit and do nothing.
Choice [ynq]:
Note: The Meslo Nerd font is required to display the symbols used by the powerlevel10k theme
You can always modify your selection by using:
p10k configure
The terminal after adding the powerlevel10k theme looks like:
Install zsh-syntax-highlighting to add code highlighting on the terminal:
brew install zsh-syntax-highlighting
After the installation is done, you will need to clone the source code. I set the destination to the home folder and made the target folder hidden:
git clone https://github.com/zsh-users/zsh-syntax-highlighting.git $HOME/.zsh-syntax-highlighting
echo "source $HOME/.zsh-syntax-highlighting/zsh-syntax-highlighting.zsh" >> ${ZDOTDIR:-$HOME}/.zshrc
After you reset your terminal, you should be able to see the syntax highlighting in green (in my case):
- iTerm2 - https://iterm2.com/index.html
- oh my zsh - https://ohmyz.sh/
- freeCodeCamp blog post - https://www.freecodecamp.org/news/how-to-configure-your-macos-terminal-with-zsh-like-a-pro-c0ab3f3c1156/
- powerlevel10k theme - https://github.com/romkatv/powerlevel10k
- zsh-syntax-highlighting - https://github.com/zsh-users/zsh-syntax-highlighting/blob/master/INSTALL.md#in-your-zshrc
VScode is a general-purpose IDE and my favorite development environment. VScode supports multiple platforms such as Linux, macOS, Windows, and Raspberry Pi.
Installing VScode is straightforward: go to the VScode website, https://code.visualstudio.com/, and click on the Download button (purple rectangle):
Download the installation file and follow the instructions. Here are the default extensions settings:
"extensions": [
"quarto.quarto",
"ms-azuretools.vscode-docker",
"ms-python.python",
"rdebugger.r-debugger",
"ms-vscode-remote.remote-containers",
"yzhang.markdown-all-in-one",
"reditorsupport.r",
"redhat.vscode-yaml",
"REditorSupport.r",
"REditorSupport.r-lsp",
"RDebugger.r-debugger"
]
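If you prefer to install these extensions from the terminal, a sketch along these lines may help. It assumes the code command line launcher is on your PATH; the install_extensions function and the DRY_RUN variable are hypothetical names used for illustration. By default it only prints the commands it would run.

```shell
# Hypothetical helper: install each extension ID passed as an argument.
# With DRY_RUN=1 (the default) it only prints the commands.
install_extensions() {
  for ext in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "code --install-extension $ext"
    else
      code --install-extension "$ext"
    fi
  done
}

# A few of the extension IDs from the list above
install_extensions quarto.quarto ms-python.python redhat.vscode-yaml
```

Setting DRY_RUN=0 before calling the function runs the installs for real.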
This section focuses on setting up a Python environment.
Miniconda is a great tool for setting up local Python environments. Go to the Miniconda installer page and download the installer for your operating system and Python version to install the most recent version. Once Miniconda is installed, you can install Python packages with conda:
conda install pandas
Likewise, you can use conda to create an environment:
conda create -n myenv python
Get a list of environments:
conda info --envs
Create an environment and set the Python version:
conda create --name myenv python=3.9
Get package available versions:
conda search pandas
Activate an environment:
conda activate myenv
Get a list of installed packages in the environment:
conda list
Deactivate the environment:
conda deactivate
This section covers the installation and setting of additional tools and features such as screen splitting, shortcuts, etc.
To set up R and RStudio on your machine, start by installing R from CRAN. Go to https://cran.r-project.org/, select Download R for macOS, then select the release you wish to install and download it.
Note: For macOS, there are two versions, depending on your machine's CPU type: one for Apple silicon (arm64) and one for Intel 64-bit.
Once the download finishes, open the pkg file and start the installation:
Note: Older releases are available on the CRAN Archive.
Once R is installed, you can install RStudio. Go to the https://posit.co website, select RStudio IDE under the Products tab, then select the version and download it:
After the download finishes, move the application into the Applications folder.
Next, let's set the global options: go to Tools, select Global options, and update the following:
- General:
  - Workspace - select Never for the Save workspace to .RData on exit option
  - History - untick the first option (Always save history...). This avoids saving the session on quit
- Code:
  - Code snippet - under the Editing tab -> Snippet menu -> tick the Enable code snippets option and select the Edit Snippets button to edit your snippets. My default snippets are available here
  - Rainbow parentheses - under the Display tab, tick the Rainbow parentheses box
- Appearance:
  - Select the font type and size, and the editor theme (Merbivore Soft):
- Clear console - Ctrl+L
- Close current document - Cmd+W
- Move focus to the Source panel - Cmd+1
- Move focus to the Console panel - Cmd+2
- Move tab left - Cmd+]
- Move tab right - Cmd+[
- Move tab to first - Cmd+P
- Move tab to last - Cmd+\
- New Rmarkdown notebook - Cmd+R
XQuartz is an open-source project that provides the X Window System (X11) for macOS, which some graphical applications require (similar in functionality to the X.Org X Window System). To install it, go to https://www.xquartz.org/, then download and install it.
Orca is an application for converting plotly graphs into images. To install the app on macOS:
- Go to the project GitHub page and download the most recent release (i.e., mac-release.zip)
- Unzip the mac-release.zip file.
- Double-click on the orca-X.Y.Z.dmg file. This will open an installation window.
- Drag the orca icon into the Applications folder.
- Open Finder and navigate to the Applications/ folder.
- Right-click on the orca icon and select Open from the context menu.
- A password dialog will appear asking for permission to add orca to your system PATH.
- Enter your password and click OK.
- This should open an Installation Succeeded window.
- Open a new terminal and verify that the orca executable is available on your PATH.
> which orca
/usr/local/bin/orca
To install Julia, go to https://julialang.org/downloads/ to download the current stable version of Julia or an older release. On Mac, the next step after opening the dmg file and moving the app to the Applications folder is to add Julia to PATH:
sudo mkdir -p /usr/local/bin
sudo rm -f /usr/local/bin/julia
sudo ln -s /Applications/Julia-1.7.app/Contents/Resources/julia/bin/julia /usr/local/bin/julia
Note: The Julia version in the code above should match the one installed on your local machine. More info is available here.
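To confirm that the symbolic link points to an existing Julia binary (for example, after upgrading Julia), a small check like the following can be used. The check_link function name is hypothetical; the path is the one used above.

```shell
# Hypothetical helper: report whether a symbolic link resolves to an executable
check_link() {
  if [ ! -L "$1" ]; then
    echo "not a symlink"
  elif [ -x "$1" ]; then
    echo "ok"
  else
    echo "broken"
  fi
}

check_link /usr/local/bin/julia
```

A "broken" result usually means the version in the link target no longer matches the installed app, in which case the ln -s command needs to be rerun with the new version.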
WIP
Rectangle is a free and open-source tool for moving and resizing windows on Mac with keyboard shortcuts. To install it, go to https://rectangleapp.com and download it. Once installed, you can modify the default settings:
- Change language - if you are using more than one language, you can add a keyboard shortcut for switching between them. Go to System Preferences... -> Keyboard and select the Shortcuts tab. Under Input Sources, tick the Select the previous input source option:
Note: You can modify the keyboard shortcut by clicking the shortcut definition in that row
PostgreSQL supports most of the common operating systems, such as Windows, macOS, Linux, etc.
To download it, go to the Postgres project website, navigate to the Download tab, and select your OS, which will take you to the OS download page. Then follow the instructions:
On Mac, I highly recommend installing PostgreSQL through Postgres.app:
When opening the app, you should have a default server set to port 5432 (make sure that this port is available):
To launch the server, click on the start button:
By default, the server will create three databases - postgres, YOUR_USER_NAME, and template1. You can add (or remove) a server by clicking the + or - symbols on the bottom left.
To run Postgres from the terminal, you will have to add the path of the app to your .zshrc file (on Mac) by adding the following line:
export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/14/bin/
Where /Applications/Postgres.app/Contents/Versions/14/bin/ is the local path on my machine.
Alternatively, you can add the path from the terminal by running the following (the single quotes keep $PATH from being expanded before it is written to the file):
echo 'export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/14/bin/' >> ${ZDOTDIR:-$HOME}/.zshrc
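Running an echo ... >> command like the one above more than once appends the same line again. A minimal sketch of an idempotent alternative is shown below; append_once is a hypothetical helper, demonstrated on a temporary file so that it has no side effects. The single quotes keep $PATH unexpanded, so it is resolved each time the shell starts.

```shell
# Hypothetical helper: append a line to a file only if it is not there yet
append_once() {
  line="$1"
  file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo on a temporary file: the second call adds nothing
demo_rc="$(mktemp)"
append_once 'export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/14/bin/' "$demo_rc"
append_once 'export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/14/bin/' "$demo_rc"
cat "$demo_rc"

# Real usage (commented out on purpose):
# append_once 'export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/14/bin/' "${ZDOTDIR:-$HOME}/.zshrc"
```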
If the port you set for the Postgres server is in use you should expect to get the following message when trying to start the server:
This means that the port is used either by another Postgres server or by another application. To check which ports are in use and by which applications, you can use the lsof command on the terminal:
> sudo lsof -i :5432
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
postgres 124 postgres 7u IPv6 0xc250a5ea155736fb 0t0 TCP *:postgresql (LISTEN)
postgres 124 postgres 8u IPv4 0xc250a5ea164aa3b3 0t0 TCP *:postgresql (LISTEN)
Where the -i argument enables searching by port number, in the example above 5432. As you can see from the output, the port is used by another Postgres server. You can clear the port by using the pkill command:
sudo pkill -u postgres
Where the -u argument selects the processes to stop by the USER field, in this case postgres.
Note: Before clearing the port, make sure you do not need the applications running on it.
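Instead of stopping every process owned by the postgres user, you may prefer to first list only the process IDs that listen on the port. The sketch below assumes lsof is available; port_listeners is a hypothetical helper name, while -t (PIDs only) and -sTCP:LISTEN (listening sockets only) are standard lsof options. Note that lsof only lists processes you own unless it is run with sudo.

```shell
# Hypothetical helper: print the PIDs listening on a TCP port (default 5432).
# Returns non-zero when nothing is listening or lsof is not installed.
port_listeners() {
  port="${1:-5432}"
  command -v lsof >/dev/null 2>&1 || return 2
  lsof -nP -iTCP:"$port" -sTCP:LISTEN -t
}

if pids="$(port_listeners 5432)"; then
  echo "Port 5432 is in use by PID(s): $pids"
else
  echo "Port 5432 looks free (or lsof is unavailable)"
fi
```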
- Tutorial - https://www.youtube.com/watch?v=qw--VYLpxG4&t=1073s&ab_channel=freeCodeCamp.org
- PostgreSQL - https://en.wikipedia.org/wiki/PostgreSQL
- Documentation - https://www.postgresql.org/docs/
drawio-desktop is a desktop version of the diagrams app for creating diagrams and workflow charts. The desktop version, per the project repository, is designed to be completely isolated from the Internet, apart from the update process.
Image credit: https://www.diagrams.net/
To install the desktop version go to the project repository and select the version you wish to install under the releases section:
For macOS users, once you download the dmg file and open it, move the build to the Applications folder:
- Draw.io documentation - https://www.diagrams.net/
- drawio-desktop repository - https://github.com/jgraph/drawio-desktop
- Online version - https://app.diagrams.net/