Set‐up

Minchan Kim edited this page May 31, 2024 · 10 revisions

Repository: https://github.com/dsc-courses/bpd-reference

Before you start

Using GitHub Desktop is highly recommended! It shows changes clearly and makes it easy to fork, pull, commit, and push to a repository. When your local repository is behind the remote, sorting out the differences from the command line can take several troubleshooting steps (stashing, etc.), while GitHub Desktop provides a clean and understandable interface for handling them.

In addition, be sure to pull the repository every time before you work on it (assuming it has already been cloned). Others may have made changes, even to the file you plan to work on (styling, direct code, etc.), so keep your local repo up to date!

Step 1: Setting up local device for development

  1. Set global Git username: git config --global user.name "Your Name"
  2. Set your global Git email: git config --global user.email "your_email@example.com"
  3. Install Node.js: https://nodejs.org/en/download
  4. Clone the Repository: git clone https://github.com/dsc-courses/bpd-reference.git
  5. Navigate to the repository folder: cd bpd-reference
  6. Set Up the Repository: npm install
  7. Start the localhost development server: npm run start
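
Before running the steps above, it can help to confirm that the required tools resolve on your PATH. The sketch below is not part of the repository: find_tools is a hypothetical helper name, and it only reports whether git, node, and npm can be found, not whether their versions are suitable.

```python
import shutil


def find_tools(names):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # git, node, and npm are the tools used in Step 1.
    for name, path in find_tools(["git", "node", "npm"]).items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

If any tool is reported as NOT FOUND, install it (or reopen your terminal so PATH is refreshed) before cloning and running npm install.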

Step 2: Understanding the repository

A simplified file structure of the repository:

bpd-reference/
├── .docusaurus                  # Build artifacts, caches; DO NOT TOUCH
├── blog                         # NOT USED
├── build                        # Production; DO NOT TOUCH
├── components                   # React components for displaying bpd data types and google slides.
├── docs                         # Site markdowns; MOST OF THE WORK IS DONE HERE <-----------------IMPORTANT!
│ ├── documentation              # Markdowns for documentation.
│ └── statistical-inference      # Markdowns for statistical-inference
├── node_modules                 # All node packages that the project depends on; DO NOT TOUCH
├── src                          # Contains additional source code
│ └── css                        # .css styling
├── static                       # Images and SVGs
└── docusaurus.config.js         # Configuration for Docusaurus site

As seen above, most of the work is done in the bpd-reference/docs folder. Nearly every page seen on the front end is generated from a .md file. To change the main page, edit bpd-reference/src/components/HomepageFeatures and bpd-reference/src/components/pages.

Step 3: Making changes 🛠️

Here's bpd-reference/docs/building-organizing/bpd.read_csv().md as an example:

---
sidebar_position: 2                                                        <- position in this directory on the webpage
---

import DataFrameComponent from '@site/components/DataFrameComponent.jsx';  <- goes to site (bpd-reference), then to the specified path
import '@site/src/css/function.css';                                           

<code>bpd.read_csv(filepath)</code>                                        <- function we want to show - include the default parameters

<div className='base'>
    <!-- Description -->
    <p><strong>Read a comma-separated values (csv) file into DataFrame.</strong></p>
    <dl>
        <!-- Input -->
        <dt className='term'>Input:</dt>
        <dd className='parameter'>filepath : <em>string, path object, file-like object.</em></dd>
        <dd className='parameter-description'>Any valid string path is acceptable. The string could also be a URL.</dd>
        <!-- Returns -->
        <dt className='term'>Returns:</dt>
        <dd>df - DataFrame with read csv file.</dd>
        <!-- Return Type -->
        <dt className='term'>Return Type:</dt>
        <dd>DataFrame</dd>
    </dl>
</div>

---

```python                                                                                                      <- code block
pets = bpd.read_csv('pets.csv')
pets
```

<DataFrameComponent data={'{"columns":["ID","Species","Color","Weight","Age","Is_Cat","Owner_Comment"],"index":[0,1,2,3,4,5,6,7],"data":[["dog_001","dog","black",40.0,5.0,false,"      There are no bad dogs, only bad owners."],["cat_001","cat","golden",1.5,0.2,true,"My best birthday present ever!!!"],["cat_002","cat","black",15.0,9.0,true,"****All you need is love and a cat.****"],["dog_002","dog","white",80.0,2.0,false,"Love is a wet nose and a wagging tail."],["dog_003","dog","black",25.0,0.5,false,"Be the person your dog thinks you are."],["ham_001","hamster","black",1.0,3.0,false,"No, thank you!"],["ham_002","hamster","golden",0.25,0.2,false,"No, thank you!"],["cat_003","cat","black",10.0,0.0,true,"No, thank you!"]]}'} />
                               ↑
                               | DataFrameComponent object that displays DataFrame from json

For the most part, every .md file deployed to the website follows the same structure. To document additional functions or methods, copy an existing .md file from the same folder and use it as a template.

⚠️⚠️To see changes on localhost:3000, save the file you edited and reload the page. Refer back to number 7 of Step 1 for starting the localhost server.⚠️⚠️

If you need to insert a new DataFrame:

  1. Make sure you have import DataFrameComponent from '@site/components/DataFrameComponent.jsx'; at the top of the .md file.
  2. Turn the BabyPandas DataFrame into a usable JSON string. (Use the helper function df2json(df) defined in the notebook, or df.to_df().to_json(orient='split').)
  3. Copy and paste it into the DataFrameComponent in the corresponding .md file. (e.g. <DataFrameComponent data={'[INSERT JSON]'} />)
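
The last step above can be sketched as a small helper that checks the JSON string and wraps it in the component tag. This is a minimal sketch, not part of the repository: dataframe_component_tag is a hypothetical name, and it assumes the tag takes a single-quoted string as in the read_csv example, so a JSON string containing a single quote is rejected rather than escaped.

```python
import json


def dataframe_component_tag(split_json: str) -> str:
    """Wrap a split-orient JSON string in a DataFrameComponent tag (hypothetical helper)."""
    parsed = json.loads(split_json)  # raises ValueError if the JSON is malformed
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object with columns, index, and data")
    missing = {"columns", "index", "data"} - set(parsed)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if "'" in split_json:
        # A single quote would end the data={'...'} string literal early.
        raise ValueError("JSON contains a single quote; edit the text or escape it by hand")
    return "<DataFrameComponent data={'" + split_json + "'} />"


if __name__ == "__main__":
    # Small hand-written sample in the same shape as the pets example.
    sample = '{"columns":["ID","Species"],"index":[0,1],"data":[["dog_001","dog"],["cat_001","cat"]]}'
    print(dataframe_component_tag(sample))
```

Paste the printed line into the .md file below the corresponding code block.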

If you need to insert a new Series:

  1. Make sure you have import SeriesComponent from '@site/components/SeriesComponent.jsx'; at the top of the .md file.
  2. Turn the BabyPandas Series into a pandas Series and run Series.to_json(orient='split') on it. (Or use the helper function s2json(Series, Series_name) defined in the notebook. Series_name is the column name used when getting the Series from a DataFrame with df.get(Series_name).)
  3. Copy and paste it into the SeriesComponent in the corresponding .md file. (e.g. <SeriesComponent data={'[INSERT JSON]'} />)
  4. Manually add the key-value pair "dtype":"[INSERT TYPE]" after the "name" key-value pair. (Refer to the SeriesComponent example in bpd-reference/docs/arrays-and-numpy/arr[].md)
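
The manual "dtype" edit in the last step can also be done programmatically. The following is a minimal sketch under stated assumptions: add_dtype is a hypothetical helper (it is not in the repository or the notebook), the input is a split-orient JSON string containing a "name" key, and "object" is only an example type; use whatever type the existing SeriesComponent examples show.

```python
import json


def add_dtype(series_json: str, dtype: str) -> str:
    """Return the JSON string with a "dtype" entry placed right after "name"."""
    parsed = json.loads(series_json)
    result = {}
    for key, value in parsed.items():
        if key == "dtype":
            continue  # drop any existing dtype so the new one wins
        result[key] = value
        if key == "name":
            result["dtype"] = dtype
    if "dtype" not in result:
        result["dtype"] = dtype  # no "name" key was present; append at the end
    # Compact separators keep the string short enough to paste comfortably.
    return json.dumps(result, separators=(",", ":"), ensure_ascii=False)


if __name__ == "__main__":
    sample = '{"name":"Species","index":[0,1],"data":["dog","cat"]}'
    print(add_dtype(sample, "object"))
    # prints {"name":"Species","dtype":"object","index":[0,1],"data":["dog","cat"]}
```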

Step 4: Deploying local changes to GitHub Pages

Tip: Unless you are confident in your abilities with git, please use GitHub Desktop for merging, stashing, pulling, pushing, and resolving any conflicts you run into.

  1. Add, commit, and push changes to the repository.

For example:

cd ~/bpd-reference

git pull

git status

(you will see something like this after running git status)

On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   <modified file path1>
	modified:   <modified file path2>

(add each file you changed separately)

git add <modified file path1>

git add <modified file path2>

(or git add . if you are sure that you want to add all your changes)

git commit -m '<brief description of changes you made>'

git push origin main

⚠️⚠️IF YOU ALREADY HAVE AN SSH KEY, SKIP STEPS 2 AND 3⚠️⚠️

  2. Create SSH Key:
  • In the home directory of Terminal (cd), make a new folder (mkdir .ssh) if it does not already exist.
  • To generate an SSH key pair: ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
  • When prompted for where to save the key, press Enter.
  • Enter a passphrase you will remember (you'll need it later!)
  3. Save the SSH key to GitHub:
  • After generating the key, add the public key to your GitHub account in the "SSH and GPG keys" section of your account settings.
  • Enter a title, keep the key type as Authentication Key, and paste the public key you created in the previous step. To copy the key to your clipboard on macOS, run pbcopy < ~/.ssh/id_rsa_github.pub or pbcopy < ~/.ssh/id_rsa.pub in Terminal, depending on where you stored your key. On Windows, run cat ~/.ssh/id_rsa.pub to print the key, then copy it from the terminal output.
  4. Go back to the bpd-reference directory in Terminal.

  5. Deploy the site: USE_SSH=true npm run deploy

  • This uses your SSH key to deploy. Enter the passphrase you set when creating the SSH key.

Troubleshooting

"address already in use :::3000"

  • This means that port 3000 is already being used by a different process. Before proceeding, make sure any work in that other process is taken care of (saved, etc.).
  1. Find the process using the port:
  • Linux/Mac: lsof -i :3000
  • Windows (Command Prompt as admin): netstat -ano | findstr :3000
  2. Kill the process (replace PID with the process ID from the previous step):
  • Linux/Mac: kill -9 PID
  • Windows: taskkill /PID PID /F
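
To confirm that the port is free again before restarting the development server, a short check like the one below can help. It is a sketch rather than part of the project: port_in_use is a hypothetical helper, and it only tests whether something accepts connections on the IPv4 loopback address, which is an assumption about how the server is reachable on your machine.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    state = "in use" if port_in_use(3000) else "free"
    print(f"port 3000 is {state}")
```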