diff --git a/.nojekyll b/.nojekyll index 979d47c..d5311f2 100644 --- a/.nojekyll +++ b/.nojekyll @@ -1 +1 @@ -b00e17fb \ No newline at end of file +8f7e717e \ No newline at end of file diff --git a/00_introduction.html b/00_introduction.html index 69e5da8..249b174 100644 --- a/00_introduction.html +++ b/00_introduction.html @@ -346,10 +346,12 @@

Learning modules

  1. Git, Pull Requests, and code reviews
  2. Python functions, classes, and modules
  3. Testing and auto-formatting diff --git a/01_version_control.html b/01_version_control.html index 0bbed02..ffe3b85 100644 --- a/01_version_control.html +++ b/01_version_control.html @@ -461,34 +461,6 @@

    Github flow

    display:block!important; } -
    -   Time for a discussion
    -
    -   Discuss in learning teams (15 minutes):
    -     • Introduce your project briefly
    -     • Think about a project you’ve worked on in the past that involved collaborating with others on code. What challenges did you face, and how do you think Git and GitHub could have helped to address those challenges?
    -
    -   After break out session:
    -     • One person from each team briefly presents their discussion outcomes

    Desktop Application: GitHub Desktop

    diff --git a/01_version_control.qmd b/01_version_control.qmd index a2ce959..7feeff6 100644 --- a/01_version_control.qmd +++ b/01_version_control.qmd @@ -141,24 +141,6 @@ You can use Git from the *command line*, or with a graphical user interface (GUI ::: -## Time for a discussion {.smaller} - -Discuss in learning teams (15 minutes): - -* Introduce your project briefly -* Think about a project you've worked on in the past that involved collaborating with others on code. What challenges did you face, and how do you think Git and GitHub could have helped to address those challenges? - - -After break out session: - -* One person from each team briefly presents their discussion outcomes - -::: {.notes} -* What is the benefit of working in branches? -* What are some **best practices** for collaborating on code with others, and how can Git and GitHub help to support those best practices? -::: - - ## Desktop Application: [GitHub Desktop](https://desktop.github.com/) diff --git a/group_work/index.html b/group_work/index.html index 76d96f9..d86c51a 100644 --- a/group_work/index.html +++ b/group_work/index.html @@ -82,7 +82,10 @@

    On-line Group Discussion

  4. Module 1
  5. Module 2
  6. Module 3
  7. -
  8. Module 4
  9. +
  10. Module 4
  11. +
  12. Module 5
  13. +
  14. Module 6
  15. +
  16. Module 7
  17. diff --git a/group_work/module_01.html b/group_work/module_01.html index c125130..1571281 100644 --- a/group_work/module_01.html +++ b/group_work/module_01.html @@ -78,7 +78,7 @@

    Module 1

    -   • Study this script clean_project_data_v4_final2.py for 3 minutes
    +   • Study this script clean_project_data_v4_final2.py for 3 minutes
    • Consider what you could do to improve it
    • Q1: Discuss in your group how to improve the script.
    • Q2: Version control. What is your experience with version control? diff --git a/group_work/module_01.md b/group_work/module_01.md new file mode 100644 index 0000000..5f8aad0 --- /dev/null +++ b/group_work/module_01.md @@ -0,0 +1,9 @@ +## Module 1 + +- Study this script [`clean_project_data_v4_final2.py`](../projects/data_cleaning/clean_project_data_v4_final2.qmd) for 3 minutes +- Consider what you could do to improve it +- Q1: Discuss in your group how to improve the script. +- Q2: Version control. What is your experience with version control? + - Think about a project you've worked on in the past that involved collaborating with others on code. What challenges did you face, and how do you think Git and GitHub could have helped to address those challenges? + +{{< include _footer.md >}} \ No newline at end of file diff --git a/group_work/module_02.md b/group_work/module_02.md new file mode 100644 index 0000000..b4ebcdc --- /dev/null +++ b/group_work/module_02.md @@ -0,0 +1,7 @@ +## Module 2 + +- Q1: In your course project homework, you refactored the script to use functions. How did it go? +- Q2: Classes. If you should introduce classes to improve the code, which classes should it be and why? +- Q3: [Optional] What are some problems with poorly designed code (based on your own experience or from the book)? + +{{< include _footer.md >}} \ No newline at end of file diff --git a/index.html b/index.html index 9dd8eff..ef85fd8 100644 --- a/index.html +++ b/index.html @@ -96,10 +96,12 @@

      Learning modules

      1. Git, Pull Requests, and code reviews
      2. Python functions, classes, and modules
      3. Testing and auto-formatting diff --git a/projects/data_cleaning/clean_project_data_v4_final2.html b/projects/data_cleaning/clean_project_data_v4_final2.html new file mode 100644 index 0000000..4dc6525 --- /dev/null +++ b/projects/data_cleaning/clean_project_data_v4_final2.html @@ -0,0 +1,542 @@ + + + + + + + + + +Python package development – clean_project_data_v4_final2 + + + + + + + + + + + + + + + + + + + + + + + + + +
        clean_project_data_v4_final.py
        +
        +
        import pandas as pd
        +import numpy as np
        +from datetime import datetime, timedelta
        +import matplotlib.pyplot as plt
        +
        +# Create date range
        +date_rng = pd.date_range(start="1/1/2020", end="1/31/2020", freq="D")
        +
        +# Sample time series data with DateTimeIndex
        +data1 = pd.Series([1, 2, -1, 4, 5, 20, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 
        +                   21, 22, 24, 24, 24, 24, 24, 24, 29, 30, 31], index=date_rng)
        +data2 = pd.Series([5, 6, 200, 8, 9, 10, 11, 12, 300, 14, 15, 16, 17, 18, 19, 20, 21, 22, 
        +                   23, 24, 25, 26, 27, 27, 27, 30, 31, 32, 33, 34, 35], index=date_rng)
        +data3 = pd.Series([15, 16, 11, 18, 400, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 
        +                   32, 33, 34, 35, 36, 37, 38, 39, 45, 45, 45, 45, 45, 45], index=date_rng)
        +
        +
        +# Cleaning data1
        +print("\nCleaning data1")
        +data1_original = data1.copy()
        +
        +# Checking for jumps 
        +print("Checking for jumps in data1")
        +max_jump=10
        +prev_value = data1.iloc[0]
        +for t, value in data1.items():
        +    if abs(value - prev_value) <= max_jump:
        +        # "Value ok"
        +        data1[t] = value
        +        prev_value = value
        +    else:
        +        data1[t] = np.nan
        +        print("Jump detected and value removed on", t, ":", value)
        +print(f"Data removed: {data1_original[~data1_original.isin(data1)]}")
        +# print("Data1 after jump check:", data1)
        +
        +# Checking for values in range 
        +min_val = 0
        +max_val = 50
        +for t, value in data1.items():
        +    # print("Checking value on", t, ":", value)
        +    if min_val <= value <= max_val:
        +        pass
        +        # print("Value ok:", value)
        +    else:
        +        data1[t] = np.nan
        +        print("Value removed:", value)
        +print(f"Data removed: {data1_original[~data1_original.isin(data1)]}")
        +# print("Data1 after range check:", data1)
        +
        +
        +# Checking for flat periods 
        +print("Checking for flat periods in data1")
        +flat_period = 5
        +i = 0
        +while i < len(data1) - flat_period:
        +    if len(set(data1[i: i + flat_period + 1])) == 1: 
        +        print("Removing flat period starting at index:", i)
        +        data1[i: i + flat_period + 1] = np.nan
        +        i += flat_period
        +    else:
        +        i += 1
        +print(f"Data removed: {data1_original[~data1_original.isin(data1)]}")
        +# print("Data1 after flat period check:", data1)
        +
        +
        +# Cleaning data2
        +print("\nCleaning data2")
        +data2_original = data2.copy()
        +
        +# Checking for jumps 
        +print("Checking for jumps in data2")
        +max_jump=10
        +prev_value = data2.iloc[0]
        +for t, value in data2.items():
        +    if abs(value - prev_value) <= max_jump:
        +        # "Value ok"
        +        data2[t] = value
        +        prev_value = value
        +    else:
        +        data2[t] = np.nan
        +        print("Jump detected and value removed on", t, ":", value)
        +print(f"Data removed: {data2_original[~data2_original.isin(data2)]}")
        +# print("data2 after jump check:", data2)
        +
        +# Checking for values in range 
        +min_val = 0
        +max_val = 50
        +for t, value in data2.items():
        +    # print("Checking value on", t, ":", value)
        +    if min_val <= value <= max_val:
        +        pass
        +        # print("Value ok:", value)
        +    else:
        +        data2[t] = np.nan
        +        print("Value removed:", value)
        +print(f"Data removed: {data2_original[~data2_original.isin(data2)]}")
        +# print("data2 after range check:", data2)
        +
        +
        +# Checking for flat periods 
        +print("Checking for flat periods in data2")
        +flat_period = 5
        +i = 0
        +while i < len(data2) - flat_period:
        +    if len(set(data2[i: i + flat_period + 1])) == 1: 
        +        print("Removing flat period starting at index:", i)
        +        data2[i: i + flat_period + 1] = np.nan
        +        i += flat_period
        +    else:
        +        i += 1
        +print(f"Data removed: {data2_original[~data2_original.isin(data2)]}")
        +# print("data2 after flat period check:", data2)
        +
        +# print("Final cleaned data2:", data2)
        +
        +# Cleaning data3
        +print("\nCleaning data3")
        +data3_original = data3.copy()
        +
        +# Checking for jumps 
        +print("Checking for jumps in data3")
        +max_jump=10
        +prev_value = data3.iloc[0]
        +for t, value in data3.items():
        +    if abs(value - prev_value) <= max_jump:
        +        # "Value ok"
        +        data3[t] = value
        +        prev_value = value
        +    else:
        +        data3[t] = np.nan
        +        print("Jump detected and value removed on", t, ":", value)
        +print(f"Data removed: {data3_original[~data3_original.isin(data3)]}")
        +# print("data3 after jump check:", data3)
        +
        +# Checking for values in range 
        +min_val = 0
        +max_val = 50
        +for t, value in data3.items():
        +    # print("Checking value on", t, ":", value)
        +    if min_val <= value <= max_val:
        +        pass
        +        # print("Value ok:", value)
        +    else:
        +        data3[t] = np.nan
        +        print("Value removed:", value)
        +print(f"Data removed: {data3_original[~data3_original.isin(data3)]}")
        +# print("data3 after range check:", data3)
        +
        +
        +# Checking for flat periods 
        +print("Checking for flat periods in data3")
        +flat_period = 5
        +i = 0
        +while i < len(data3) - flat_period:
        +    if len(set(data3[i: i + flat_period + 1])) == 1: 
        +        print("Removing flat period starting at index:", i)
        +        data3[i: i + flat_period + 1] = np.nan
        +        i += flat_period
        +    else:
        +        i += 1
        +print(f"Data removed: {data3_original[~data3_original.isin(data3)]}")
        +# print("data3 after flat period check:", data3)
        +
        +# print("Final cleaned data3:", data3)
        +
        +## plot data showing outliers as red dots
        +plt.figure(figsize=(10, 5))
        +plt.plot(data1_original, '.', color="red")
        +plt.plot(data1, '.', color="green")
        +plt.title("Data1")
        +plt.show()
        +
        +plt.figure(figsize=(10, 5))
        +plt.plot(data2_original, '.', color="red")
        +plt.plot(data2, '.', color="green")
        +plt.title("Data2")
        +plt.show()
        +
        +plt.figure(figsize=(10, 5))
        +plt.plot(data3_original, '.', color="red")
        +plt.plot(data3, '.', color="green")
        +plt.title("Data3")
        +plt.show()
        +
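Reviewer note: the script above repeats the same three checks for data1, data2 and data3. The homework text in this diff names the functions clean_spikes, clean_outofrange and clean_flat for the refactoring. The function bodies below are only a sketch of how that could look, not the course's reference solution. They assume a float copy is acceptable as the return type, and plotting is left out.

```python
import numpy as np
import pandas as pd


def clean_spikes(data: pd.Series, max_jump: float = 10) -> pd.Series:
    """Set a value to NaN when it differs from the last accepted value by more than max_jump."""
    cleaned = data.astype(float)  # float copy, so NaN can be stored
    prev_value = cleaned.iloc[0]
    for t, value in data.items():
        if abs(value - prev_value) <= max_jump:
            prev_value = value  # value accepted, becomes the new reference
        else:
            cleaned.loc[t] = np.nan
    return cleaned


def clean_outofrange(data: pd.Series, min_val: float = 0, max_val: float = 50) -> pd.Series:
    """Set values outside [min_val, max_val] to NaN."""
    cleaned = data.astype(float)
    # between() is inclusive on both ends and returns False for NaN
    return cleaned.where(cleaned.between(min_val, max_val))


def clean_flat(data: pd.Series, flat_period: int = 5) -> pd.Series:
    """Set runs of flat_period + 1 identical consecutive values to NaN."""
    cleaned = data.astype(float)
    window = flat_period + 1
    i = 0
    while i < len(cleaned) - flat_period:
        values = cleaned.iloc[i : i + window]
        if values.notna().all() and values.nunique() == 1:
            cleaned.iloc[i : i + window] = np.nan
            i += flat_period  # same step size as the original script
        else:
            i += 1
    return cleaned


# Same sample values as data1 in the script above
date_rng = pd.date_range(start="2020-01-01", end="2020-01-31", freq="D")
data1 = pd.Series(
    [1, 2, -1, 4, 5, 20, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
     21, 22, 24, 24, 24, 24, 24, 24, 29, 30, 31],
    index=date_rng,
)

data1_original = data1.copy()
spikes = clean_spikes(data1, max_jump=10)
in_range = clean_outofrange(spikes, min_val=0, max_val=50)
cleaned = clean_flat(in_range, flat_period=5)
print(f"Values removed: {int(cleaned.isna().sum())}")
```

Each function returns a new series instead of modifying its input, so the separate copy kept for plotting stays valid and the three steps can be applied to data2 and data3 in a loop.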
        + + + + \ No newline at end of file diff --git a/search.json b/search.json index 12a2e13..8200f2a 100644 --- a/search.json +++ b/search.json @@ -1,189 +1,245 @@ [ { - "objectID": "projects/data_cleaning/Project_module_01.html", - "href": "projects/data_cleaning/Project_module_01.html", - "title": "Python package development", - "section": "", - "text": "1.1 GitHub repo\n\n1.1.1 Create a new GitHub repository “timeseriescleaner”\n\nprivate, no template, add readme, gitignore python, no license\n\n1.1.2 Go to repo settings/Collaborators add your instructors and your “buddy”\n1.1.3 Clone repo to local machine\n[Optional] Create virtual environment for this course project (use venv or mamba/conda environment)\n1.1.4 Download the provided Python script and add it to the repo\n1.1.5 Commit the file and push the changes (Check that the file can be found on GitHub)\n1.1.6 Open the project in vscode and make a single character change to the file (add a comment)\n1.1.7 Commit the changes (Check that it works on GitHub)\n\n1.2 Functions\n\n1.2.1 Create a local branch “refactor-functions”\n1.2.2 Refactor the code to use functions (clean_spikes, clean_outofrange, clean_flat, plot_timeseries)\n\nfor data in [data1, data2, data3]:\n\ndata_original = data.copy()\ndata = clean_spikes(data, max_jump=10)\ndata = clean_outofrange(data, min_val=0, max_val=50)\ndata = clean_flat(data, flat_period=5)\nplot_timeseries(data_original, data)\n\n\n1.2.3 Check that your code and produce the same results as before (you should not change the functionality!)\n1.2.4 Commit your code in 1 or more commits (in the end your code should be approximately 75 lines long)\n\nCreate a pull request in GitHub and “request review” from your reviewers\nWait for feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" - }, - { - "objectID": "projects/data_cleaning/Project_module_01.html#module-1-github-and-basic-functions", - "href": 
"projects/data_cleaning/Project_module_01.html#module-1-github-and-basic-functions", - "title": "Python package development", - "section": "", - "text": "1.1 GitHub repo\n\n1.1.1 Create a new GitHub repository “timeseriescleaner”\n\nprivate, no template, add readme, gitignore python, no license\n\n1.1.2 Go to repo settings/Collaborators add your instructors and your “buddy”\n1.1.3 Clone repo to local machine\n[Optional] Create virtual environment for this course project (use venv or mamba/conda environment)\n1.1.4 Download the provided Python script and add it to the repo\n1.1.5 Commit the file and push the changes (Check that the file can be found on GitHub)\n1.1.6 Open the project in vscode and make a single character change to the file (add a comment)\n1.1.7 Commit the changes (Check that it works on GitHub)\n\n1.2 Functions\n\n1.2.1 Create a local branch “refactor-functions”\n1.2.2 Refactor the code to use functions (clean_spikes, clean_outofrange, clean_flat, plot_timeseries)\n\nfor data in [data1, data2, data3]:\n\ndata_original = data.copy()\ndata = clean_spikes(data, max_jump=10)\ndata = clean_outofrange(data, min_val=0, max_val=50)\ndata = clean_flat(data, flat_period=5)\nplot_timeseries(data_original, data)\n\n\n1.2.3 Check that your code and produce the same results as before (you should not change the functionality!)\n1.2.4 Commit your code in 1 or more commits (in the end your code should be approximately 75 lines long)\n\nCreate a pull request in GitHub and “request review” from your reviewers\nWait for feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" + "objectID": "06_oop.html#object-oriented-design", + "href": "06_oop.html#object-oriented-design", + "title": "Object oriented design in Python", + "section": "Object oriented design", + "text": "Object oriented design\nBenefits of object oriented design:\n\nEncapsulation\nCode reuse (composition, inheritance)\nAbstraction" }, { - "objectID": 
"projects/data_cleaning/Project_module_04.html", - "href": "projects/data_cleaning/Project_module_04.html", - "title": "Python package development", - "section": "", - "text": "Create new branch “action-formatting” (Make sure changes from last module have been merged, and that you start from the main branch)\n4.1 Github Action\n\n4.1.1 Copy the GitHub action “python-app.yml” from the python template https://github.com/DHI/template-python-library to your own library (make sure it sits in the same folder).\n4.1.2 Change all occurrences of “my_library” in the yml file to your package name “tscleaner”\n4.1.3 Comment out the line with “ruff-action” with “#”\n4.1.4 Commit, push and create a pull request; the tests should now run, verify that they all run before you move on\n\n4.2 Ruff\n\n4.2.1 Enable the “ruff-action” be removing the “#” you added above\n4.2.2 Commit and push, your actions will probably fail now - inspect the problems by clicking the red cross (did you also get an email?)\n4.2.3 Install “ruff” on your local machine with mamba/conda/pip\n4.2.4 Navigate to your project root folder and run ruff with “ruff .”\n4.2.5 Add __all__ = [\"SpikeCleaner\", \"FlatPeriodCleaner\", \"OutOfRangeCleaner\", \"plot_timeseries\"] to your __init__.py file and fix remaining issues until ruff passes\n4.2.6 Commit, push and verify that you action now succeeds\n\n4.3 Black\n\n4.3.1 Install “black” on your local machine with mamba/conda/pip\n4.3.2 Run black from your project root folder; inspect the differences; commit\n\n4.4 pyproject.toml\n\nCopy the pyproject.toml from the python template https://github.com/DHI/template-python-library (this file will replace your setup.py)\nModify to fit your package\nRemove the setup.py\nCommit, push and verify that the GitHub action runs\nIf it fails, you probably forgot some dependencies - go back and fix\n[Optional] You should also re-install your local package with “>pip install –upgrade -e .”\n\n4.5 [Optional] Enable black and ruff 
extensions in VSCode; set black to run on save\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" + "objectID": "06_oop.html#encapsulation", + "href": "06_oop.html#encapsulation", + "title": "Object oriented design in Python", + "section": "Encapsulation", + "text": "Encapsulation\nclass Location:\n def __init__(self, name, longitude, latitude):\n self.name = name.upper() # Names are always uppercase\n self.longitude = longitude\n self.latitude = latitude\n\n>>> loc = Location(\"Antwerp\", 4.42, 51.22)\n>>> loc.name\n'ANTWERP'\n>>> loc.name = \"Antwerpen\"\n>>> loc.name\n\"Antwerpen\" 😟" }, { - "objectID": "projects/data_cleaning/Project_module_04.html#module-4-github-actions-and-auto-formatting", - "href": "projects/data_cleaning/Project_module_04.html#module-4-github-actions-and-auto-formatting", - "title": "Python package development", - "section": "", - "text": "Create new branch “action-formatting” (Make sure changes from last module have been merged, and that you start from the main branch)\n4.1 Github Action\n\n4.1.1 Copy the GitHub action “python-app.yml” from the python template https://github.com/DHI/template-python-library to your own library (make sure it sits in the same folder).\n4.1.2 Change all occurrences of “my_library” in the yml file to your package name “tscleaner”\n4.1.3 Comment out the line with “ruff-action” with “#”\n4.1.4 Commit, push and create a pull request; the tests should now run, verify that they all run before you move on\n\n4.2 Ruff\n\n4.2.1 Enable the “ruff-action” be removing the “#” you added above\n4.2.2 Commit and push, your actions will probably fail now - inspect the problems by clicking the red cross (did you also get an email?)\n4.2.3 Install “ruff” on your local machine with mamba/conda/pip\n4.2.4 Navigate to your project root folder and run ruff with “ruff .”\n4.2.5 Add __all__ = 
[\"SpikeCleaner\", \"FlatPeriodCleaner\", \"OutOfRangeCleaner\", \"plot_timeseries\"] to your __init__.py file and fix remaining issues until ruff passes\n4.2.6 Commit, push and verify that you action now succeeds\n\n4.3 Black\n\n4.3.1 Install “black” on your local machine with mamba/conda/pip\n4.3.2 Run black from your project root folder; inspect the differences; commit\n\n4.4 pyproject.toml\n\nCopy the pyproject.toml from the python template https://github.com/DHI/template-python-library (this file will replace your setup.py)\nModify to fit your package\nRemove the setup.py\nCommit, push and verify that the GitHub action runs\nIf it fails, you probably forgot some dependencies - go back and fix\n[Optional] You should also re-install your local package with “>pip install –upgrade -e .”\n\n4.5 [Optional] Enable black and ruff extensions in VSCode; set black to run on save\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" + "objectID": "06_oop.html#encapsulation---attributes", + "href": "06_oop.html#encapsulation---attributes", + "title": "Object oriented design in Python", + "section": "Encapsulation - Attributes", + "text": "Encapsulation - Attributes\nVariables prefixed with an underscore (self._name) is a convention to indicate that the instance variable is private.\nclass Location:\n def __init__(self, name, longitude, latitude):\n self._name = name.upper() # Names are always uppercase\n ...\n\n @property\n def name(self):\n return self._name\n\n @name.setter\n def name(self, value):\n self._name = value.upper()\n\n>>> loc = Location(\"Antwerp\", 4.42, 51.22)\n>>> loc.name = \"Antwerpen\"\n>>> loc.name\n\"ANTWERPEN\" 😊" }, { - "objectID": "projects/data_cleaning/Project_module_06.html", - "href": "projects/data_cleaning/Project_module_06.html", - "title": "Python package development", - "section": "", - "text": "Create new branch 
“oop-dataclasses” (Make sure changes from last module have been merged, and that you start from the main branch)\n5.1 Type Hints\n\nAdd type hints to all functions and methods. Commit\n\n5.2 Data class\n\nMake all the cleaner classes dataclasses.\n\nremove the init method (not needed anymore)\nCheck that the notebook still runs and that the classes indeed work as data classes (e.g. have a string representation and support equality testing etc)\nCommit\n\n5.3 Module level function\n\nMake a private module function _print_stats() that prints the number of cleaned values\ncall from each of the clean methods\n\n5.4 Composition or inheritance\n\nCreate a new cleaner class called CleanerWorkflow that takes a list of cleaners when constructed and has a clean method that run all the cleaners’ clean methods.\nModify the notebook to use the CleanerWorkflow instead of looping over the cleaners\nConsider what type of validation you would want CleanerWorkflow\nConsider whether it would be better to create a base class BaseCleaner - write down your considerations as a comment in the pull request, refer to specific lines of code\n\ne.g. how would you handle e.g. 
common plotting functionality in the cleaner classes?\n\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" + "objectID": "06_oop.html#composition", + "href": "06_oop.html#composition", + "title": "Object oriented design in Python", + "section": "Composition", + "text": "Composition\n\n\nComposition in object oriented design is a way to combine objects or data types into more complex objects.\n\n\n\n\n\nclassDiagram\n\n class Grid{\n + nx\n + dx\n + ny\n + dy\n + find_index()\n }\n\n class ItemInfo{\n + name\n + type\n + unit\n }\n\n class DataArray{\n + data\n + time\n + item\n + geometry\n + plot()\n }\n\n DataArray --* Grid\n DataArray --* ItemInfo" }, { - "objectID": "projects/data_cleaning/Project_module_06.html#module-6-object-oriented-design", - "href": "projects/data_cleaning/Project_module_06.html#module-6-object-oriented-design", - "title": "Python package development", - "section": "", - "text": "Create new branch “oop-dataclasses” (Make sure changes from last module have been merged, and that you start from the main branch)\n5.1 Type Hints\n\nAdd type hints to all functions and methods. Commit\n\n5.2 Data class\n\nMake all the cleaner classes dataclasses.\n\nremove the init method (not needed anymore)\nCheck that the notebook still runs and that the classes indeed work as data classes (e.g. 
have a string representation and support equality testing etc)\nCommit\n\n5.3 Module level function\n\nMake a private module function _print_stats() that prints the number of cleaned values\ncall from each of the clean methods\n\n5.4 Composition or inheritance\n\nCreate a new cleaner class called CleanerWorkflow that takes a list of cleaners when constructed and has a clean method that run all the cleaners’ clean methods.\nModify the notebook to use the CleanerWorkflow instead of looping over the cleaners\nConsider what type of validation you would want CleanerWorkflow\nConsider whether it would be better to create a base class BaseCleaner - write down your considerations as a comment in the pull request, refer to specific lines of code\n\ne.g. how would you handle e.g. common plotting functionality in the cleaner classes?\n\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" + "objectID": "06_oop.html#composition---example", + "href": "06_oop.html#composition---example", + "title": "Object oriented design in Python", + "section": "Composition - Example", + "text": "Composition - Example\nclass Grid:\n def __init__(self, nx, dx, ny, dy):\n self.nx = nx\n self.dx = dx\n self.ny = ny\n self.dy = dy\n \n def find_index(self, x,y):\n ...\n\nclass DataArray:\n def __init__(self, data, time, item, geometry):\n self.data = data\n self.time = time\n self.item = item\n self.geometry = geometry\n\n def plot(self):\n ..." 
}, { - "objectID": "projects/data_cleaning/Project_module_07.html", - "href": "projects/data_cleaning/Project_module_07.html", - "title": "Python package development", - "section": "", - "text": "Add a license\nChange version number to 0.1.0\nBuild the package\nPublish the package to the PyPI Test Server.\n\nBack to homework overview" + "objectID": "06_oop.html#inheritance", + "href": "06_oop.html#inheritance", + "title": "Object oriented design in Python", + "section": "Inheritance", + "text": "Inheritance" }, { - "objectID": "projects/data_cleaning/Project_module_07.html#module-7-publishing", - "href": "projects/data_cleaning/Project_module_07.html#module-7-publishing", - "title": "Python package development", - "section": "", - "text": "Add a license\nChange version number to 0.1.0\nBuild the package\nPublish the package to the PyPI Test Server.\n\nBack to homework overview" + "objectID": "06_oop.html#inheritance---example", + "href": "06_oop.html#inheritance---example", + "title": "Object oriented design in Python", + "section": "Inheritance - Example", + "text": "Inheritance - Example\n\n\n\n\n\nclassDiagram\n\nclass _GeometryFM{\n+ node_coordinates\n+ element_table\n}\n\nclass GeometryFM2D{\n+ interp2d()\n+ get_element_area()\n+ plot()\n}\n\nclass _GeometryFMLayered{\n- _n_layers\n- _n_sigma\n+ to_2d_geometry()\n}\n\nclass GeometryFM3D{\n+ plot()\n}\n\nclass GeometryFMVerticalProfile{\n+ plot()\n}\n _GeometryFM <|-- GeometryFM2D\n _GeometryFM <|-- _GeometryFMLayered\n _GeometryFMLayered <|-- GeometryFM3D\n _GeometryFMLayered <|-- GeometryFMVerticalProfile" }, { - "objectID": "projects/data_cleaning/notebook_A.html", - "href": "projects/data_cleaning/notebook_A.html", - "title": "Clean data from a file", - "section": "", - "text": "# useful if your change your modules after starting the kernel\n%load_ext autoreload\n%autoreload 2\nimport pandas as pd\nfrom cleaning import SpikeCleaner, OutOfRangeCleaner, FlatPeriodCleaner\nfrom plotting import 
plot_timeseries\nfn = \"./example_data1.csv\"\ndf = pd.read_csv(fn, index_col=0, parse_dates=True, dtype=float)\ndf.head(10)\n\n\n\n\n\n\n\n\nseries1\n\n\n\n\n2020-01-01\n1.0\n\n\n2020-01-02\n2.0\n\n\n2020-01-03\n-1.0\n\n\n2020-01-04\n4.0\n\n\n2020-01-05\n5.0\n\n\n2020-01-06\n20.0\n\n\n2020-01-07\n7.0\n\n\n2020-01-08\n8.0\n\n\n2020-01-09\n9.0\n\n\n2020-01-10\n10.0" + "objectID": "06_oop.html#inheritance---example-2", + "href": "06_oop.html#inheritance---example-2", + "title": "Object oriented design in Python", + "section": "Inheritance - Example (2)", + "text": "Inheritance - Example (2)\nclass _GeometryFMLayered(_GeometryFM):\n def __init__(self, nodes, elements, n_layers, n_sigma):\n # call the parent class init method\n super().__init__(\n nodes=nodes,\n elements=elements,\n )\n self._n_layers = n_layers\n self._n_sigma = n_sigma" }, { - "objectID": "projects/data_cleaning/notebook_A.html#try-out-one-cleaner-first", - "href": "projects/data_cleaning/notebook_A.html#try-out-one-cleaner-first", - "title": "Clean data from a file", - "section": "Try out one cleaner first", - "text": "Try out one cleaner first\n\ncleaner = SpikeCleaner(max_jump=10)\n\n\ndf[\"clean1\"] = cleaner.clean(df.series1)\n\n\ndf.head(10)\n\n\n\n\n\n\n\n\nseries1\nclean1\n\n\n\n\n2020-01-01\n1.0\n1.0\n\n\n2020-01-02\n2.0\n2.0\n\n\n2020-01-03\n-1.0\n-1.0\n\n\n2020-01-04\n4.0\n4.0\n\n\n2020-01-05\n5.0\n5.0\n\n\n2020-01-06\n20.0\nNaN\n\n\n2020-01-07\n7.0\n7.0\n\n\n2020-01-08\n8.0\n8.0\n\n\n2020-01-09\n9.0\n9.0\n\n\n2020-01-10\n10.0\n10.0\n\n\n\n\n\n\n\n\nplot_timeseries(df.series1, df.clean1)" + "objectID": "06_oop.html#composition-vs-inheritance", + "href": "06_oop.html#composition-vs-inheritance", + "title": "Object oriented design in Python", + "section": "Composition vs inheritance", + "text": "Composition vs inheritance\n\n\nInheritance is often used to reuse code, but this is not the main purpose of inheritance.\nInheritance is used to specialize behavior.\nIn most cases, composition is a 
better choice than inheritance.\nSome recent programming languages (e.g. Go & Rust) do not support this style of inheritance.\nUse inheritance only when it makes sense.\n\n\n\n\nHillard, 2020, Ch. 8 “The rules (and exceptions) of inheritance”" }, { - "objectID": "projects/data_cleaning/notebook_A.html#apply-all-cleaners", - "href": "projects/data_cleaning/notebook_A.html#apply-all-cleaners", - "title": "Clean data from a file", - "section": "Apply all cleaners", - "text": "Apply all cleaners\n\ncleaners = [\n SpikeCleaner(max_jump=10),\n OutOfRangeCleaner(min_val=0, max_val=50),\n FlatPeriodCleaner(flat_period=5),\n]\n\n\ncleaned_data = df.series1.copy()\nfor cleaner in cleaners:\n cleaned_data = cleaner.clean(cleaned_data)\n # plot_timeseries(df.series1, cleaned_data) # check for each step if something is not working\nplot_timeseries(df.series1, cleaned_data)" + "objectID": "06_oop.html#types", + "href": "06_oop.html#types", + "title": "Object oriented design in Python", + "section": "Types", + "text": "Types\nC#\nint n = 2;\nString s = \"Hello\";\n\npublic String RepeatedString(String s, int n) {\n return Enumerable.Repeat(s, n).Aggregate((a, b) => a + b);\n}\n\nPython\nn = 2\ns = \"Hello\"\n\ndef repeated_string(s, n):\n return s * n" }, { - "objectID": "00_introduction.html#instructors", - "href": "00_introduction.html#instructors", - "title": "Python package development", - "section": "Instructors", - "text": "Instructors\n\nHenrik Andersson - @ecomodeller\nJesper Sandvig Mariegaard - @jsmariegaard" + "objectID": "06_oop.html#types-1", + "href": "06_oop.html#types-1", + "title": "Object oriented design in Python", + "section": "Types", + "text": "Types\n\n\nPython is a dynamically typed language\nTypes are not checked at compile time\nTypes are checked at runtime\n\n\n\nPython with type hints\nn: int = 2\ns: str = \"Hello\"\n\ndef repeated_string(s:str, n:int) -> str:\n return s * n" }, { - "objectID": "00_introduction.html#participants", - "href": 
"00_introduction.html#participants", - "title": "Python package development", - "section": "Participants", - "text": "Participants\nIntroduce yourselves in a break out session later today." + "objectID": "06_oop.html#abstraction", + "href": "06_oop.html#abstraction", + "title": "Object oriented design in Python", + "section": "Abstraction", + "text": "Abstraction\n\n\nVersion A\ntotal = 0.0\nfor x in values:\n total = total +x\n\nVersion B\ntotal = sum(values)\n\n\n\n\n\nUsing functions, e.g. sum() allows us to operate on a higher level of abstraction.\nToo little abstraction will force you to write many lines of boiler-plate code\nToo much abstraction limits the flexibility\n✨Find the right level of abstraction!✨\n\n\n\n\nWhich version is easiest to understand?\nWhich version is easiest to change?" }, { - "objectID": "00_introduction.html#learning-modules", - "href": "00_introduction.html#learning-modules", - "title": "Python package development", - "section": "Learning modules", - "text": "Learning modules\n\nGit, Pull Requests, and code reviews\n\nHomework\n\nPython functions, classes, and modules\n\nHomework\n\nTesting and auto-formatting\n\nHomework\n\nDependencies and GitHub actions\n\nHomework\n\nDocumentation\n\nHomework\n\nObject oriented design in Python\n\nHomework\n\nDistributing your package\n\nHomework" + "objectID": "06_oop.html#collections-abstract-base-classes", + "href": "06_oop.html#collections-abstract-base-classes", + "title": "Object oriented design in Python", + "section": "Collections Abstract Base Classes", + "text": "Collections Abstract Base Classes\n\n\n\n\nclassDiagram\n Container <|-- Collection\n Sized <|-- Collection\n Iterable <|-- Collection\n \n class Container{\n __contains__(self, x)\n }\n\n class Sized{\n __len__(self)\n }\n\n class Iterable{\n __iter__(self)\n }\n\n\n\n\n\n\n\n\nIf a class implements __len__ it is a Sized object.\nIf a class implements __contains__ it is a Container object.\nIf a class implements __iter__ it 
is a Iterable object." }, { - "objectID": "00_introduction.html#learning-objectives", - "href": "00_introduction.html#learning-objectives", - "title": "Python package development", - "section": "Learning objectives", - "text": "Learning objectives\n\nimproved Python skills\nknowledge of how to create reusable Python code\nknow how to share code with others through a Python package" + "objectID": "06_oop.html#collections-abstract-base-classes-1", + "href": "06_oop.html#collections-abstract-base-classes-1", + "title": "Object oriented design in Python", + "section": "Collections Abstract Base Classes", + "text": "Collections Abstract Base Classes\n>>> a = [1, 2, 3]\n>>> 1 in a\nTrue\n>>> a.__contains__(1)\nTrue\n>>> len(a)\n3\n>>> a.__len__()\n3\n>>> for x in a:\n... v.append(x)\n>>> it = a.__iter__()\n>>> next(it)\n1\n>>> next(it)\n2\n>>> next(it)\n3\n>>> next(it)\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nStopIteration" }, { - "objectID": "00_introduction.html#format", - "href": "00_introduction.html#format", - "title": "Python package development", - "section": "Format", - "text": "Format\n\nOnline session (Zoom) Tuesday and Friday\nHomework assignments\nQuiz (learning platform)" + "objectID": "06_oop.html#collections-abstract-base-classes-2", + "href": "06_oop.html#collections-abstract-base-classes-2", + "title": "Object oriented design in Python", + "section": "Collections Abstract Base Classes", + "text": "Collections Abstract Base Classes\n\n\n\n\n\nclassDiagram\n Container <|-- Collection\n Sized <|-- Collection\n Iterable <|-- Collection\n Collection <|-- Sequence\n Collection <|-- Set\n Sequence <|-- MutableSequence\n Mapping <|-- MutableMapping\n Collection <|-- Mapping\n\n MutableSequence <|-- List\n Sequence <|-- Tuple\n MutableMapping <|-- Dict" }, { - "objectID": "00_introduction.html#course-material", - "href": "00_introduction.html#course-material", - "title": "Python package development", - "section": "Course 
material", - "text": "Course material\n\nHillard, 2020, Practices of the Python Pro, Manning\nSlides" + "objectID": "06_oop.html#pythonic", + "href": "06_oop.html#pythonic", + "title": "Object oriented design in Python", + "section": "Pythonic", + "text": "Pythonic\nIf you want your code to be Pythonic, you have to be familiar with these types and their methods.\nDundermethods:\n\n__getitem__\n__setitem__\n__len__\n__contains__\n…" }, { - "objectID": "00_introduction.html#poll", - "href": "00_introduction.html#poll", - "title": "Python package development", - "section": "Poll", - "text": "Poll\n\n\n\nPython package development" + "objectID": "06_oop.html#duck-typing", + "href": "06_oop.html#duck-typing", + "title": "Object oriented design in Python", + "section": "Duck typing", + "text": "Duck typing\n\n\n“If it walks like a duck and quacks like a duck, it’s a duck”\nFrom the perspective of the caller, it doesn’t matter if it is a rubber duck or a real duck.\nThe type of the object is not important, as long as it has the right methods.\nPython is different than C# or Java, where you would have to create an interface IToolbox and implement it for Toolbox." }, { - "objectID": "group_work/module_03.html", - "href": "group_work/module_03.html", - "title": "Python package development", - "section": "", - "text": "Q1: In your course project homework from last module, you implemented modules and classes, how did it go? Any reflections?\n\nQ2: What is you past experience with testing code?\nQ3: Things can go wrong when executing your code, how should you handle that? Check inputs? try-catch statements? 
What are pros and cons between different approaches?\n\nBack to overview" + "objectID": "06_oop.html#duck-typing---example", + "href": "06_oop.html#duck-typing---example", + "title": "Object oriented design in Python", + "section": "Duck typing - Example", + "text": "Duck typing - Example\nAn example is a Scikit learn transformers\n\nfit\ntransform\nfit_transform\n\nIf you want to make a transformer compatible with sklearn, you have to implement these methods." }, { - "objectID": "group_work/module_03.html#module-3", - "href": "group_work/module_03.html#module-3", - "title": "Python package development", - "section": "", - "text": "Q1: In your course project homework from last module, you implemented modules and classes, how did it go? Any reflections?\n\nQ2: What is you past experience with testing code?\nQ3: Things can go wrong when executing your code, how should you handle that? Check inputs? try-catch statements? What are pros and cons between different approaches?\n\nBack to overview" + "objectID": "06_oop.html#duck-typing---example-1", + "href": "06_oop.html#duck-typing---example-1", + "title": "Object oriented design in Python", + "section": "Duck typing - Example", + "text": "Duck typing - Example\nclass PositiveNumberTransformer:\n\n def fit(self, X, y=None):\n # no need to fit (still need to have the method!)\n return self\n\n def transform(self, X):\n return np.abs(X)\n\n def fit_transform(self, X, y=None):\n return self.fit(X, y).transform(X)" }, { - "objectID": "group_work/module_01.html", - "href": "group_work/module_01.html", - "title": "Python package development", - "section": "", - "text": "Study this script clean_project_data_v4_final2.py for 3 minutes\nConsider what you could do to improve it\nQ1: Discuss in your group how to improve the script.\nQ2: Version control. What is your experience with version control?\n\nThink about a project you’ve worked on in the past that involved collaborating with others on code. 
What challenges did you face, and how do you think Git and GitHub could have helped to address those challenges?\n\n\nBack to overview" + "objectID": "06_oop.html#duck-typing---mixins", + "href": "06_oop.html#duck-typing---mixins", + "title": "Object oriented design in Python", + "section": "Duck typing - Mixins", + "text": "Duck typing - Mixins\nWe can inherit some behavior from sklearn.base.TransformerMixin\nfrom sklearn.base import TransformerMixin\n\nclass RemoveOutliersTransformer(TransformerMixin):\n\n def __init__(self, lower_bound, upper_bound):\n self.lower_bound = lower_bound\n self.upper_bound = upper_bound\n self.lower_ = None\n self.upper_ = None\n\n def fit(self, X, y=None):\n self.lower_ = np.quantile(X, self.lower_bound)\n self.upper_ = np.quantile(X, self.upper_bound)\n\n def transform(self, X):\n return np.clip(X, self.lower_, self.upper_)\n\n # def fit_transform(self, X, y=None):\n # we get this for free, from TransformerMixin" }, { - "objectID": "group_work/module_01.html#module-1", - "href": "group_work/module_01.html#module-1", - "title": "Python package development", - "section": "", - "text": "Study this script clean_project_data_v4_final2.py for 3 minutes\nConsider what you could do to improve it\nQ1: Discuss in your group how to improve the script.\nQ2: Version control. What is your experience with version control?\n\nThink about a project you’ve worked on in the past that involved collaborating with others on code. 
What challenges did you face, and how do you think Git and GitHub could have helped to address those challenges?\n\n\nBack to overview" + "objectID": "06_oop.html#lets-revisit-the-date-interval", + "href": "06_oop.html#lets-revisit-the-date-interval", + "title": "Object oriented design in Python", + "section": "Let’s revisit the (date) Interval", + "text": "Let’s revisit the (date) Interval\nThe Interval class represent an interval in time.\nclass Interval:\n def __init__(self, start, end):\n self.start = start\n self.end = end\n\n def __contains__(self, x):\n return self.start < x < self.end\n\n>>> dr = Interval(date(2020, 1, 1), date(2020, 1, 31))\n\n>>> date(2020,1,15) in dr\nTrue\n>>> date(1970,1,1) in dr\nFalse\n\nWhat if we want to make another type of interval, e.g. a interval of numbers \\([1.0, 2.0]\\)?" }, { - "objectID": "group_work/module_02.html", - "href": "group_work/module_02.html", - "title": "Python package development", - "section": "", - "text": "Q1: In your course project homework, you refactored the script to use functions. How did it go?\nQ2: Classes. 
If you should introduce classes to improve the code, which classes should it be and why?\nQ3: [Optional] What are some problems with poorly designed code (based on your own experience or from the book)?\n\nBack to overview" + "objectID": "06_oop.html#a-number-interval", + "href": "06_oop.html#a-number-interval", + "title": "Object oriented design in Python", + "section": "A number interval", + "text": "A number interval\nclass Interval:\n def __init__(self, start, end):\n self.start = start\n self.end = end\n\n def __contains__(self, x):\n return self.start < x < self.end\n \n>>> interval = Interval(5, 10)\n\n>>> 8 in interval\nTrue\n>>> 12 in interval\nFalse\n\nAs long as the start, end and x are comparable, the Interval class is a generic class able to handle integers, floats, dates, datetimes, strings …" }, { - "objectID": "group_work/module_02.html#module-2", - "href": "group_work/module_02.html#module-2", - "title": "Python package development", - "section": "", - "text": "Q1: In your course project homework, you refactored the script to use functions. How did it go?\nQ2: Classes. If you should introduce classes to improve the code, which classes should it be and why?\nQ3: [Optional] What are some problems with poorly designed code (based on your own experience or from the book)?\n\nBack to overview" + "objectID": "06_oop.html#postels-law", + "href": "06_oop.html#postels-law", + "title": "Object oriented design in Python", + "section": "Postel’s law", + "text": "Postel’s law\na.k.a. 
the Robustness principle of software design\n\nBe liberal in what you accept\nBe conservative in what you send\n\n\ndef process(number: Union[int,str,float]) -> int:\n # make sure number is an int from now on\n number = int(number)\n\n result = number * 2\n return result" }, { - "objectID": "04_dependencies_ci.html#section", - "href": "04_dependencies_ci.html#section", - "title": "Dependencies and Continuous Integration", + "objectID": "06_oop.html#section", + "href": "06_oop.html#section", + "title": "Object oriented design in Python", "section": "", - "text": "Application\nA program that is run by a user\n\ncommand line tool\nscript\nweb application\n\nPin versions to ensure reproducibility, e.g. numpy==1.11.0\n\nLibrary\nA program that is used by another program\n\nPython package\nLow level library (C, Fortran, Rust, …)\n\nMake the requirements as loose as possible, e.g. numpy>=1.11.0\n\n\n\nMake the requirements loose, to avoid conflicts with other packages." + "text": "The consumers of your package (future self), will be grateful if you are not overly restricitive in what types you accept as input." }, { - "objectID": "04_dependencies_ci.html#dependency-management", - "href": "04_dependencies_ci.html#dependency-management", - "title": "Dependencies and Continuous Integration", - "section": "Dependency management", - "text": "Dependency management\nExample of pinning versions:\n\n\nrequirements.txt\n\nnumpy==1.11.0\nscipy==0.17.0\nmatplotlib==1.5.1\n\n\nOr using a range of versions:\n\n\nrequirements.txt\n\nnumpy>=1.11.0\nscipy>=0.17.0\nmatplotlib>=1.5.1,<=2.0.0\n\n\n\nInstall dependencies:\n$ pip install -r requirements.txt\n\nA common way to declare dependencies is to use a requirements.txt file." 
+ "objectID": "06_oop.html#refactoring", + "href": "06_oop.html#refactoring", + "title": "Object oriented design in Python", + "section": "Refactoring", + "text": "Refactoring\n\n\nRefactoring is a way to improve the design of existing code\nChanging a software system in such a way that it does not alter the external behavior of the code, yet improves its internal structure\nRefactoring is a way to make code more readable and maintainable\nHousekeeping" }, { - "objectID": "04_dependencies_ci.html#creating-an-installable-package", - "href": "04_dependencies_ci.html#creating-an-installable-package", + "objectID": "06_oop.html#common-refactoring-techniques", + "href": "06_oop.html#common-refactoring-techniques", + "title": "Object oriented design in Python", + "section": "Common refactoring techniques:", + "text": "Common refactoring techniques:\n\nExtract method\nExtract variable\nRename method\nRename variable\nRename class\nInline method\nInline variable\nInline class" + }, + { + "objectID": "06_oop.html#rename-variable", + "href": "06_oop.html#rename-variable", + "title": "Object oriented design in Python", + "section": "Rename variable", + "text": "Rename variable\nBefore\nn = 0\nfor v in y:\n if v < 0:\n n = n + 1\n\nAfter\nFREEZING_POINT = 0.0\nn_freezing_days = 0\nfor temp in daily_max_temperatures:\n if temp < FREEZING_POINT:\n n_freezing_days = n_freezing_days + 1" + }, + { + "objectID": "06_oop.html#extract-variable", + "href": "06_oop.html#extract-variable", + "title": "Object oriented design in Python", + "section": "Extract variable", + "text": "Extract variable\nBefore\ndef predict(x):\n return min(0.0, 0.5 + 2.0 * min(0,x) + (random.random() - 0.5) / 10.0)\n\nAfter\ndef predict(x):\n scale = 10.0\n error = (random.random() - 0.5) / scale)\n a = 0.5\n b = 2.0 \n draft = a + b * x + error\n return min(0.0, draft)" + }, + { + "objectID": "06_oop.html#extract-method", + "href": "06_oop.html#extract-method", + "title": "Object oriented design in Python", + 
"section": "Extract method", + "text": "Extract method\ndef error(scale):\n return (random.random() - 0.5) / scale)\n\ndef linear_model(x, *, a=0.0, b=1.0):\n return a + b * x\n\ndef clip(x, *, min_value=0.0):\n return min(min_value, x)\n\ndef predict(x): \n draft = linear_model(x, a=0.5, b=2.0) + error(scale=10.0)\n return clip(draft, min_value=0.)" + }, + { + "objectID": "06_oop.html#inline-method", + "href": "06_oop.html#inline-method", + "title": "Object oriented design in Python", + "section": "Inline method", + "text": "Inline method\nOpposite of extract mehtod.\ndef predict(x): \n draft = linear_model(x, a=0.5, b=2.0) + error(scale=10.0)\n return min(0.0, x)" + }, + { + "objectID": "06_oop.html#composed-method", + "href": "06_oop.html#composed-method", + "title": "Object oriented design in Python", + "section": "Composed method", + "text": "Composed method\nBreak up a long method into smaller methods." + }, + { + "objectID": "06_oop.html#composed-method-1", + "href": "06_oop.html#composed-method-1", + "title": "Object oriented design in Python", + "section": "Composed method", + "text": "Composed method\n\nDivide your program into methods that perform one identifiable task\nKeep all of the operations in a method at the same level of abstraction.\nThis will naturally result in programs with many small methods, each a few lines long.\nWhen you use Extract method a bunch of times on a method the original method becomes a Composed method." + }, + { + "objectID": "04_dependencies_ci.html#section", + "href": "04_dependencies_ci.html#section", + "title": "Dependencies and Continuous Integration", + "section": "", + "text": "Application\nA program that is run by a user\n\ncommand line tool\nscript\nweb application\n\nPin versions to ensure reproducibility, e.g. numpy==1.11.0\n\nLibrary\nA program that is used by another program\n\nPython package\nLow level library (C, Fortran, Rust, …)\n\nMake the requirements as loose as possible, e.g. 
numpy>=1.11.0\n\n\n\nMake the requirements loose, to avoid conflicts with other packages." + }, + { + "objectID": "04_dependencies_ci.html#dependency-management", + "href": "04_dependencies_ci.html#dependency-management", + "title": "Dependencies and Continuous Integration", + "section": "Dependency management", + "text": "Dependency management\nExample of pinning versions:\n\n\nrequirements.txt\n\nnumpy==1.11.0\nscipy==0.17.0\nmatplotlib==1.5.1\n\n\nOr using a range of versions:\n\n\nrequirements.txt\n\nnumpy>=1.11.0\nscipy>=0.17.0\nmatplotlib>=1.5.1,<=2.0.0\n\n\n\nInstall dependencies:\n$ pip install -r requirements.txt\n\nA common way to declare dependencies is to use a requirements.txt file." + }, + { + "objectID": "04_dependencies_ci.html#creating-an-installable-package", + "href": "04_dependencies_ci.html#creating-an-installable-package", "title": "Dependencies and Continuous Integration", "section": "Creating an installable package", "text": "Creating an installable package" @@ -287,1249 +343,1193 @@ "text": "GitHub Releases\n\n\n\nGitHub releases are a way to publish software releases.\nYou can upload files, write release notes and tag the release.\nAs a minimum, the release will contain the source code at the time of the release.\nCreating a release can trigger other workflows, e.g. 
publishing a package to PyPI.\n\n\n\n\n\nhttps://github.com/pydata/xarray/releases/tag/v2022.12.0\n\n\n\nPython package development" }, { - "objectID": "01_version_control.html#why-use-version-control", - "href": "01_version_control.html#why-use-version-control", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Why use version control?", - "text": "Why use version control?\n\n\n\n\n\nManage changes to code over time\nKeep track of changes and revert to previous versions if needed.\nCollaborate and merge changes from different people\nEnsure code stability\nBest practice for software development" + "objectID": "07_packaging.html#packaging", + "href": "07_packaging.html#packaging", + "title": "Distributing your Python package", + "section": "Packaging", + "text": "Packaging\nPackaging means creating a package that can be installed by pip.\nThere are many ways to create an installable package, and many ways to distribute it.\nWe will show how to create a package using hatchling, and how to distribute it on GitHub, PyPI and a private PyPI server." }, { - "objectID": "01_version_control.html#centralized-version-control", - "href": "01_version_control.html#centralized-version-control", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Centralized version control", - "text": "Centralized version control\n\nSingle source with the entire history\nLocal copy with latest version . . .\nExamples: SVN, Surround" + "objectID": "07_packaging.html#benefits-of-packaging", + "href": "07_packaging.html#benefits-of-packaging", + "title": "Distributing your Python package", + "section": "Benefits of packaging", + "text": "Benefits of packaging\n\n\nDistribute your package to others\nInstall your package with pip\nSpecify dependencies\nReproducibility\nSpecify version\nRelease vs. 
development versions" }, { - "objectID": "01_version_control.html#distributed-version-control", - "href": "01_version_control.html#distributed-version-control", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Distributed version control", - "text": "Distributed version control\n\nLocal copy has the entire history\nCommit changes to code offline\nAuthorative source (origin) . . .\nExamples: Git, Mercurial" + "objectID": "07_packaging.html#packaging-workflow", + "href": "07_packaging.html#packaging-workflow", + "title": "Distributing your Python package", + "section": "Packaging workflow", + "text": "Packaging workflow\n\nCreate a pyproject.toml in the root folder of the project\nBuild a package (e.g. myproject-0.1.0-py3-none-any.whl)\nUpload the package to location, where others can find it" }, { - "objectID": "01_version_control.html#git", - "href": "01_version_control.html#git", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Git", - "text": "Git\nGit is a powerful tool for managing code changes and collaborating with others on a project.\n\nYou can use Git from the command line, or with a graphical user interface (GUI).\n\n\n> git add foo.py\n\n\n> git commit -m \"Nailed it\"\n\n\n> git push" + "objectID": "07_packaging.html#pyproject.toml", + "href": "07_packaging.html#pyproject.toml", + "title": "Distributing your Python package", + "section": "pyproject.toml", + "text": "pyproject.toml\n[build-system]\nrequires = [\"hatchling\"]\nbuild-backend = \"hatchling.build\"\n\n[project]\nname = \"my_library\"\nversion = \"0.0.1\"\ndependencies = [\n \"numpy\"\n]\n\nauthors = [\n { name=\"First Last\", email=\"initials@dhigroup.com\" },\n]\ndescription = \"Useful library\"\nreadme = \"README.md\"\nrequires-python = \">=3.7\"\nclassifiers = [\n \"Programming Language :: Python :: 3\",\n \"License :: OSI Approved :: MIT License\",\n \"Development Status :: 2 - Pre-Alpha\",\n \"Operating System :: OS Independent\",\n 
\"Topic :: Scientific/Engineering\",\n]\n\n[project.optional-dependencies]\ndev = [\"pytest\",\"flake8\",\"black\",\"sphinx\", \"myst-parser\",\"sphinx-book-theme\"]\ntest= [\"pytest\"]\n\n[project.urls]\n\"Homepage\" = \"https://github.com/DHI/my_library\"\n\"Bug Tracker\" = \"https://github.com/DHI/my_library/issues\"" }, { - "objectID": "01_version_control.html#basic-git-commands", - "href": "01_version_control.html#basic-git-commands", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Basic Git commands", - "text": "Basic Git commands\n\n\ngit add: adds a file to the staging area\ngit commit: creates a new commit with the changes in the staging area\ngit status: shows the current status of your repository\ngit log: shows the commit history of your repository\ngit stash: temporarily save changes that are not ready to be committed" + "objectID": "07_packaging.html#versioning", + "href": "07_packaging.html#versioning", + "title": "Distributing your Python package", + "section": "Versioning", + "text": "Versioning\nVersioning your package is important for reproducibility and to avoid breaking changes.\n\n\n\nSemantic versioning use three numbers {major}.{minor}.{patch}, e.g. 1.1.0\n\n\nA new major version indicates breaking changes\nA new minor version indicates new features, without breaking changes\nA new patch version indicates a small change, e.g. a bug fix\nEach of the numbers can be higher than 9, e.g. 1.0.0 is more recent than 0.24.12" }, { - "objectID": "01_version_control.html#working-with-remote-repositories", - "href": "01_version_control.html#working-with-remote-repositories", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Working with remote repositories", - "text": "Working with remote repositories\n\n\ngit clone: creates a copy of the codebase on your local machine.\ngit push: pushes changes back to the remote repository.\ngit pull: pulls changes from the remote repository." 
+ "objectID": "07_packaging.html#version-1.0", + "href": "07_packaging.html#version-1.0", + "title": "Distributing your Python package", + "section": "Version 1.0", + "text": "Version 1.0\n\n\nA version number of 1.0 indicates that the package is ready for production\nThe API is stable, and breaking changes will only be introduced in new major versions\nThe package is well tested, and the documentation is complete\nStart with version 0.1.0 and increase the version number as you add features" }, { - "objectID": "01_version_control.html#branching-and-merging", - "href": "01_version_control.html#branching-and-merging", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Branching and Merging", - "text": "Branching and Merging\n\nA branch is a separate version of your code that you can work on independently from the main branch.\ngit merge: merges changes back into the main branch (we will do this from GitHub)" + "objectID": "07_packaging.html#breaking-changes", + "href": "07_packaging.html#breaking-changes", + "title": "Distributing your Python package", + "section": "Breaking changes", + "text": "Breaking changes\nWhat is a breaking change?\n\n\nRemoving a function\nChanging the name of a function\nChanging the signature of a function (arguments, types, return value)\n\n\n\nTry to avoid breaking changes, if possible, but if you do, increase the major version number!" 
}, { - "objectID": "01_version_control.html#git-hosting-platforms", - "href": "01_version_control.html#git-hosting-platforms", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Git hosting platforms", - "text": "Git hosting platforms" + "objectID": "07_packaging.html#installing-specific-versions", + "href": "07_packaging.html#installing-specific-versions", + "title": "Distributing your Python package", + "section": "Installing specific versions", + "text": "Installing specific versions\n\npip install my_library will install the latest version\npip install my_library==1.0.0 will install version 1.0.0\npip install my_library>=1.0.0 will install version 1.0.0 or higher" }, { - "objectID": "01_version_control.html#github", - "href": "01_version_control.html#github", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "GitHub", - "text": "GitHub\n\n\nGit repository hosting service\nCollaborate with others on codebase\nFork a repository to work on your own version\nPull requests for code review and merging changes\nIssue tracking and project management tools\nGitHub Pages for hosting websites" + "objectID": "07_packaging.html#pre-release-versions", + "href": "07_packaging.html#pre-release-versions", + "title": "Distributing your Python package", + "section": "Pre-release versions", + "text": "Pre-release versions\n\n\n\nVersions that are not ready for production\nIndicated by a suffix, e.g. 
1.0.0rc1\nWill not be installed by default\nCan be installed with pip install my_library==1.0.0rc1\nListed on PyPI, but not on the search page" }, { - "objectID": "01_version_control.html#github-flow", - "href": "01_version_control.html#github-flow", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Github flow", - "text": "Github flow\n\n\n\nCreate a branch\nMake changes\nCreate a pull request\nReview\nMerge\n\n\n\n\nClone a repository to work on a copy (optionally: fork first)\nCreate a branch for each new feature or fix\nCommit changes and push to remote repository\nOpen a pull request to propose changes and request code review\nMerge changes back into the main branch" + "objectID": "07_packaging.html#license", + "href": "07_packaging.html#license", + "title": "Distributing your Python package", + "section": "License", + "text": "License\n\n\nA license is a legal agreement between you and others who use your package\nIf you do not specify a license, others cannot use your package legally\nThe license is specified in the pyproject.toml file\nRead more about licenses on https://choosealicense.com/\nCheck if your package is compatible with the license of the dependencies" }, { - "objectID": "01_version_control.html#time-for-a-discussion", - "href": "01_version_control.html#time-for-a-discussion", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Time for a discussion", - "text": "Time for a discussion\nDiscuss in learning teams (15 minutes):\n\nIntroduce your project briefly\nThink about a project you’ve worked on in the past that involved collaborating with others on code. 
What challenges did you face, and how do you think Git and GitHub could have helped to address those challenges?\n\nAfter break out session:\n\nOne person from each team briefly presents their discussion outcomes\n\n\n\nWhat is the benefit of working in branches?\nWhat are some best practices for collaborating on code with others, and how can Git and GitHub help to support those best practices?" + "objectID": "07_packaging.html#dependencies", + "href": "07_packaging.html#dependencies", + "title": "Distributing your Python package", + "section": "Dependencies", + "text": "Dependencies\n\n\nApplication\nA program that is run by a user\n\ncommand line tool\nscript\nweb application\n\nPin versions to ensure reproducibility, e.g. numpy==1.11.0\n\nLibrary\nA program that is used by another program\n\nPython package\nLow level library (C, Fortran, Rust, …)\n\nMake the requirements as loose as possible, e.g. numpy>=1.11.0\n\n\n\nMake the requirements loose, to avoid conflicts with other packages." 
}, { - "objectID": "01_version_control.html#desktop-application-github-desktop", - "href": "01_version_control.html#desktop-application-github-desktop", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Desktop Application: GitHub Desktop", - "text": "Desktop Application: GitHub Desktop" + "objectID": "07_packaging.html#pyproject.toml-1", + "href": "07_packaging.html#pyproject.toml-1", + "title": "Distributing your Python package", + "section": "pyproject.toml", + "text": "pyproject.toml\n[build-system]\nrequires = [\"hatchling\"]\nbuild-backend = \"hatchling.build\"\n\n[project]\nname = \"my_library\"\nversion = \"0.0.1\"\ndependencies = [\n \"numpy\"\n]\n\nauthors = [\n { name=\"First Last\", email=\"initials@dhigroup.com\" },\n]\ndescription = \"Useful library\"\nreadme = \"README.md\"\nrequires-python = \">=3.7\"\nclassifiers = [\n \"Programming Language :: Python :: 3\",\n \"License :: OSI Approved :: MIT License\",\n \"Development Status :: 2 - Pre-Alpha\",\n \"Operating System :: OS Independent\",\n \"Topic :: Scientific/Engineering\",\n]\n\n[project.optional-dependencies]\ndev = [\"pytest\",\"flake8\",\"black\",\"sphinx\", \"myst-parser\",\"sphinx-book-theme\"]\ntest= [\"pytest\"]\n\n[project.urls]\n\"Homepage\" = \"https://github.com/DHI/my_library\"\n\"Bug Tracker\" = \"https://github.com/DHI/my_library/issues\"\n\n\nMandatory dependencies are specified in the dependencies section.\nOptional dependencies are specified in the optional-dependencies section." 
}, { - "objectID": "01_version_control.html#demo", - "href": "01_version_control.html#demo", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Demo", - "text": "Demo" + "objectID": "07_packaging.html#classifiers", + "href": "07_packaging.html#classifiers", + "title": "Distributing your Python package", + "section": "Classifiers", + "text": "Classifiers\nclassifiers = [\n \"Programming Language :: Python :: 3\",\n \"License :: OSI Approved :: MIT License\",\n \"Development Status :: 2 - Pre-Alpha\",\n \"Operating System :: OS Independent\",\n \"Topic :: Scientific/Engineering\",\n]\n\nClassifiers are used to categorize your package\nLess relevant for internal packages\nOperating system (Windows, Linux, MacOS)\nDevelopment status (Alpha, Beta, Production/Stable)" }, { - "objectID": "01_version_control.html#github-best-practices", - "href": "01_version_control.html#github-best-practices", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Github best practices", - "text": "Github best practices\n\n\nCommit often\nUse descriptive commit messages\nKeep pull requests small and focused\nUse “issues” to track work\nReview code regularly" + "objectID": "07_packaging.html#packaging-non-python-files", + "href": "07_packaging.html#packaging-non-python-files", + "title": "Distributing your Python package", + "section": "Packaging non-Python files", + "text": "Packaging non-Python files\n\nIncluding non-Python files can be useful for e.g. machine learning models.\nIf you use hatchling, you can include non-Python files in your package.\nhatchling uses .gitignore to determine which files to include." 
}, { - "objectID": "01_version_control.html#resources", - "href": "01_version_control.html#resources", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Resources", - "text": "Resources\n\nGitHub: quickstart\nRealPython: git and github intro\nDatacamp: introduction to Git" + "objectID": "07_packaging.html#github-secrets", + "href": "07_packaging.html#github-secrets", + "title": "Distributing your Python package", + "section": "GitHub secrets", + "text": "GitHub secrets\n\nStore sensitive information, e.g. passwords, in your repository.\nSecrets are encrypted, and only visible to you and GitHub Actions.\nAdd secrets in the repository settings.\n\nTo use secrets as environment variables in GitHub Actions, add them to the env section of the workflow:\nenv:\n TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}" }, { - "objectID": "01_version_control.html#word-list", - "href": "01_version_control.html#word-list", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Word list", - "text": "Word list\n\nClone\n\nmaking a local copy of a remote repository on your computer.\n\nRemote\n\na reference to a Git repository that is hosted on a remote server, typically on a service like GitHub.\n\nCommit\n\na record of changes made to a repository, including the changes themselves and a message describing what was changed.\n\nStage\n\nselecting changes that you want to include in the next commit.\n\nPush\n\nsending changes from your local repository to a remote repository.\n\nPull\n\nretrieving changes from a remote repository and merging them into your local repository.\n\nBranch\n\na separate line of development that can be used to work on new features or bug fixes without affecting the main codebase.\n\nPull request\n\na way to propose changes to a repository by asking the repository owner to “pull” in the changes from a branch or fork.\n\nStash\n\ntemporarily save changes that are not ready to be 
committed (bring them back later when needed).\n\nMerge\n\nthe process of combining changes from one branch or fork into another, typically the main codebase.\n\nRebase\n\na way to integrate changes from one branch into another by applying the changes from the first branch to the second branch as if they were made there all along.\n\nMerge conflict\n\nwhen Git is unable to automatically merge changes from two different branches, because the changes overlap or conflict.\n\nCheckout\n\nswitching between different branches or commits in a repository.\n\nFork\n\na copy of a repository that you create on your own account, which you can modify without affecting the original repository." + "objectID": "07_packaging.html#github-actions", + "href": "07_packaging.html#github-actions", + "title": "Distributing your Python package", + "section": "GitHub Actions", + "text": "GitHub Actions\n\n\n.github/workflows/python-package.yml\n\nname: Publish Python Package\non:\n release:\n types: [created]\njobs:\n deploy:\n runs-on: ubuntu-latest\n steps:\n - uses: actions/checkout@v2\n - name: Set up Python\n uses: actions/setup-python@v2\n with:\n python-version: '3.10'\n - name: Install dependencies\n run: |\n python -m pip install --upgrade pip\n pip install build twine\n - name: Build package\n run: python -m build\n \n - name: Publish to PyPI\n env:\n TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n run: |\n twine upload dist/*" }, { - "objectID": "01_version_control.html#summary", - "href": "01_version_control.html#summary", - "title": "Git, GitHub, Pull Requests, and code reviews", - "section": "Summary", - "text": "Summary\n\n\nVersion control is a tool for managing changes to code\nGit is a distributed version control system (software)\nGitHub is a platform for hosting and collaborating on Git repositories\nGitHub Desktop is a GUI for Git (and GitHub)\nPull requests are a way to propose changes to a repository\n\n\n\n\n\nPython package 
development" + "objectID": "07_packaging.html#private-pypi-server", + "href": "07_packaging.html#private-pypi-server", + "title": "Distributing your Python package", + "section": "Private PyPI server", + "text": "Private PyPI server\n\nPrivate packages can be hosted on e.g. Azure Artifacts or Posit Package Manager.\nThese servers behave like PyPI, and can be used with pip.\nAccess policies can be used to control who can install packages.\n\n\nExample:\n$ pip install --extra-index-url https://pkgs.dev.azure.com/dhigroup/_packaging/pond/pypi/simple/ sampling\nLooking in indexes: https://pypi.org/simple, https://pkgs.dev.azure.com/dhigroup/_packaging/pond/pypi/simple/\n...\nSuccessfully installed sampling-0.0.1" }, { - "objectID": "03_testing.html#testing", - "href": "03_testing.html#testing", - "title": "Testing, linting and formatting", - "section": "Testing", - "text": "Testing\nVerify code is working as expected.\nSimplest way to test is to run code and check output.\n\nAutomated testing checks output automatically.\nCode changes can break other parts of code.\nAutomatic testing verifies code is still working." + "objectID": "07_packaging.html#installing-a-development-version", + "href": "07_packaging.html#installing-a-development-version", + "title": "Distributing your Python package", + "section": "Installing a development version", + "text": "Installing a development version\n\nInstall latest dev version, e.g. pip install https://github.com/DHI/mikeio/archive/main.zip\nInstall from fix-interp branch, e.g. 
pip install https://github.com/DHI/mikeio/archive/fix-interp.zip" }, { - "objectID": "03_testing.html#testing-workflow", - "href": "03_testing.html#testing-workflow", - "title": "Testing, linting and formatting", - "section": "Testing workflow", - "text": "Testing workflow\n\n\n\n\nflowchart TD\n A[Prepare inputs]\n B[Describe expected output]\n C[Obtain actual output]\n D[Compare actual and\\n expected output]\n\n A --> B --> C --> D" + "objectID": "07_packaging.html#recap", + "href": "07_packaging.html#recap", + "title": "Distributing your Python package", + "section": "Recap", + "text": "Recap\n\nGit, Pull Requests, and code reviews\nPython functions, classes, and modules\nTypes, abstraction, and refactoring\nTesting and auto-formatting\nDependencies and GitHub actions\nDocumentation\nDistributing your package" }, { - "objectID": "03_testing.html#unit-testing", - "href": "03_testing.html#unit-testing", - "title": "Testing, linting and formatting", - "section": "Unit testing", - "text": "Unit testing\n\n\n\n\n\n\nDefinition “Unit”\n\n\n\nA small, fundamental piece of code.\nExecuted in isolation with appropriate inputs.\n\n\n\n\n\n\nA function is typically considered a “unit”\nLines of code within functions are smaller (can’t be isolated)\nClasses are considered bigger (but can be treated as units)" + "objectID": "07_packaging.html#git-pull-requests-and-code-reviews", + "href": "07_packaging.html#git-pull-requests-and-code-reviews", + "title": "Distributing your Python package", + "section": "Git, Pull Requests, and code reviews", + "text": "Git, Pull Requests, and code reviews" }, { - "objectID": "03_testing.html#a-good-unit-test", - "href": "03_testing.html#a-good-unit-test", - "title": "Testing, linting and formatting", - "section": "A good unit test", - "text": "A good unit test\n\n\n\n\nFully automated (next week)\nHas full control over all the pieces running (“fake” external dependencies)\nCan be run in any order\nRuns in memory (no DB or file access, for 
example)\nConsistently returns the same result (no random numbers)\nRuns fast\nTests a single logical concept in the system\nReadable\nMaintainable\nTrustworthy" + "objectID": "07_packaging.html#github-flow", + "href": "07_packaging.html#github-flow", + "title": "Distributing your Python package", + "section": "Github flow", + "text": "Github flow\n\n\nCreate a branch\nMake changes\nCreate a pull request\nReview\nMerge" }, { - "objectID": "03_testing.html#example", - "href": "03_testing.html#example", - "title": "Testing, linting and formatting", - "section": "Example", - "text": "Example\n\nget a timeseries of water levels\nfind the maxiumum water level each year\ncreate a summary report for the subset of data" + "objectID": "07_packaging.html#github-best-practices", + "href": "07_packaging.html#github-best-practices", + "title": "Distributing your Python package", + "section": "Github best practices", + "text": "Github best practices\n\nCommit often\nUse descriptive commit messages\nKeep pull requests small and focused\nUse “issues” to track work\nReview code regularly" }, { - "objectID": "03_testing.html#integration-testing", - "href": "03_testing.html#integration-testing", - "title": "Testing, linting and formatting", - "section": "Integration testing", - "text": "Integration testing\ndef test_integration():\n wl = get_water_level(time=\"2019-01-01\", location=\"Aarhus\")\n max_wls = get_max_water_level(wl, freq=\"Y\")\n report = summary_report(max_wls)\n\n assert report.title == \"Summary report\"\n assert report.text == \"The maximum water level in 2021 was 3.0 m\"" + "objectID": "07_packaging.html#python-functions-classes-and-modules", + "href": "07_packaging.html#python-functions-classes-and-modules", + "title": "Distributing your Python package", + "section": "Python functions, classes, and modules", + "text": "Python functions, classes, and modules" }, { - "objectID": "03_testing.html#testing-in-vs-code", - "href": "03_testing.html#testing-in-vs-code", - 
"title": "Testing, linting and formatting", - "section": "Testing in VS Code", - "text": "Testing in VS Code" + "objectID": "07_packaging.html#functions-as-black-boxes", + "href": "07_packaging.html#functions-as-black-boxes", + "title": "Distributing your Python package", + "section": "Functions as black boxes", + "text": "Functions as black boxes\n\n\n\n\nflowchart LR\n A(Input A) --> F[\"Black box\"]\n B(Input B) --> F\n F --> O(Output)\n\n style F fill:#000,color:#fff,stroke:#333,stroke-width:4px\n\n\n\n\n\n\nA function is a black box that takes some input and produces some output.\nThe input and output can be anything, including other functions.\nAs long as the input and output are the same, the function body can be modified." }, { - "objectID": "03_testing.html#fixtures", - "href": "03_testing.html#fixtures", - "title": "Testing, linting and formatting", - "section": "Fixtures", - "text": "Fixtures\n\n\nA piece of code that is used by multiple tests\nProvide data or services to tests\nDefined with @pytest.fixture\nSet up test environment\nPass fixtures as test arguments" + "objectID": "07_packaging.html#naming-conventions---general", + "href": "07_packaging.html#naming-conventions---general", + "title": "Distributing your Python package", + "section": "Naming conventions - general", + "text": "Naming conventions - general\n\nUse lowercase characters\nSeparate words with underscores\n\nmodel_name = \"NorthSeaModel\"\nn_epochs = 100\n\ndef my_function():\n pass" }, { - "objectID": "03_testing.html#fixture-example", - "href": "03_testing.html#fixture-example", - "title": "Testing, linting and formatting", - "section": "Fixture example", - "text": "Fixture example\n@pytest.fixture\ndef water_level():\n return TimeSeries([1.0, .., 3.0], start = \"2019-01-01\")\n\ndef test_get_max_water_level(water_level):\n max_wls = get_max_water_level(water_level, freq=\"Y\")\n \n assert len(max_wls) == 1\n assert max_wls[0] == 3.0" + "objectID": "07_packaging.html#constants", + 
"href": "07_packaging.html#constants", + "title": "Distributing your Python package", + "section": "Constants", + "text": "Constants\n\nUse all uppercase characters\n\nGRAVITY = 9.81\n\nAVOGADRO_CONSTANT = 6.02214076e23\n\nSECONDS_IN_A_DAY = 86400\n\nN_LEGS_PER_ANIMAL = {\n \"human\": 2,\n \"dog\": 4,\n \"spider\": 8,\n}" }, { - "objectID": "03_testing.html#test-coverage", - "href": "03_testing.html#test-coverage", - "title": "Testing, linting and formatting", - "section": "Test coverage", - "text": "Test coverage\n\n\nA measure of how much of your code is tested\nA good test suite should cover all the code\nInstall pytest-cov\nRun tests with coverage report\n\npytest --cov=myproj\n\nUse coverage report to identify untested code" + "objectID": "07_packaging.html#classes", + "href": "07_packaging.html#classes", + "title": "Distributing your Python package", + "section": "Classes", + "text": "Classes\n\nUse CamelCase for the name of the class\nUse lowercase characters for the name of the methods\nSeparate words with underscores\n\nclass RandomClassifier:\n\n def fit(self, X, y):\n self.classes_ = np.unique(y)\n\n def predict(self, X):\n return np.random.choice(self.classes_, size=len(X))\n\n def fit_predict(self, X, y):\n self.fit(X, y)\n return self.predict(X)" }, { - "objectID": "03_testing.html#test-coverage-report", - "href": "03_testing.html#test-coverage-report", - "title": "Testing, linting and formatting", - "section": "Test coverage report", - "text": "Test coverage report\npytest --cov=myproj tests/\n-------------------- coverage: ... 
---------------------\nName Stmts Miss Cover\n----------------------------------------\nmyproj/__init__ 2 0 100%\nmyproj/myproj 257 13 94%\nmyproj/feature4286 94 7 92%\n----------------------------------------\nTOTAL 353 20 94%" + "objectID": "07_packaging.html#dataclasses", + "href": "07_packaging.html#dataclasses", + "title": "Distributing your Python package", + "section": "Dataclasses", + "text": "Dataclasses\nimport datetime\nfrom dataclasses import dataclass\n\n\n@dataclass\nclass Interval:\n start: datetime.date\n end: datetime.date\n\n>>> dr1 = Interval(start=datetime.date(2020, 1, 1), end=datetime.date(2020, 1, 31))\n>>> dr1\nInterval(start=datetime.date(2020, 1, 1), end=datetime.date(2020, 1, 31))\n>>> dr2 = Interval(start=datetime.date(2020, 1, 1), end=datetime.date(2020, 1, 31))\n>>> dr1 == dr2\nTrue" }, { - "objectID": "03_testing.html#testing-advice", - "href": "03_testing.html#testing-advice", - "title": "Testing, linting and formatting", - "section": "Testing advice", - "text": "Testing advice\n\n\n\n\n\n\nTest edge cases\n\n\n\nempty lists\nlists with a single element\nempty strings\nempty dictionaries\nNone\nnp.nan" + "objectID": "07_packaging.html#types-abstraction-and-refactoring", + "href": "07_packaging.html#types-abstraction-and-refactoring", + "title": "Distributing your Python package", + "section": "Types, abstraction, and refactoring", + "text": "Types, abstraction, and refactoring" }, { - "objectID": "03_testing.html#tests-act-as-specification", - "href": "03_testing.html#tests-act-as-specification", - "title": "Testing, linting and formatting", - "section": "Tests act as specification", - "text": "Tests act as specification\ndef test_operable_period_can_be_missing():\n\n assert is_operable(height=1.0, period=None)\n assert is_operable(height=1.0, period=np.nan)\n assert is_operable(height=1.0)\n assert not is_operable(height=11.0)\n\ndef test_height_can_not_be_missing():\n\n with pytest.raises(ValueError) as excinfo:\n is_operable(height=None)\n 
is_operable(height=np.nan)\n \n assert \"height\" in str(excinfo.value)" + "objectID": "07_packaging.html#pythonic", + "href": "07_packaging.html#pythonic", + "title": "Distributing your Python package", + "section": "Pythonic", + "text": "Pythonic\nIf you want your code to be Pythonic, you have to be familiar with these types and their methods.\nDundermethods:\n\n__getitem__\n__setitem__\n__len__\n__contains__\n…" }, { - "objectID": "03_testing.html#test-driven-development", - "href": "03_testing.html#test-driven-development", - "title": "Testing, linting and formatting", - "section": "Test driven development", - "text": "Test driven development\n\n\nWrite a test that fails ❌\nWrite the code to make the test pass ✅\nRefactor the code ⚒️\n\n\n\nThe benefit of this approach is that you are forced to think about the expected behaviour of your code before you write it.\nIt is also too easy to write a test that passes without actually testing the code." + "objectID": "07_packaging.html#duck-typing", + "href": "07_packaging.html#duck-typing", + "title": "Distributing your Python package", + "section": "Duck typing", + "text": "Duck typing\n\n“If it walks like a duck and quacks like a duck, it’s a duck”\nFrom the perspective of the caller, it doesn’t matter if it is a rubber duck or a real duck.\nThe type of the object is not important, as long as it has the right methods." 
}, { - "objectID": "03_testing.html#section", - "href": "03_testing.html#section", - "title": "Testing, linting and formatting", - "section": "", - "text": "and now for something completely different…" + "objectID": "07_packaging.html#testing-and-auto-formatting", + "href": "07_packaging.html#testing-and-auto-formatting", + "title": "Distributing your Python package", + "section": "Testing and auto-formatting", + "text": "Testing and auto-formatting" }, { - "objectID": "03_testing.html#the-zen-of-python", - "href": "03_testing.html#the-zen-of-python", - "title": "Testing, linting and formatting", - "section": "The Zen of Python", - "text": "The Zen of Python\nBeautiful is better than ugly.\nExplicit is better than implicit.\nSimple is better than complex.\nComplex is better than complicated.\nFlat is better than nested.\nSparse is better than dense.\nReadability counts.\n\n…\nErrors should never pass silently.\nUnless explicitly silenced.\n…" + "objectID": "07_packaging.html#unit-testing", + "href": "07_packaging.html#unit-testing", + "title": "Distributing your Python package", + "section": "Unit testing", + "text": "Unit testing\n\n\n\n\n\n\nDefinition “Unit”\n\n\n\nA small, fundamental piece of code.\nExecuted in isolation with appropriate inputs.\n\n\n\n\n\nA function is typically considered a “unit”\nLines of code within functions are smaller (can’t be isolated)\nClasses are considered bigger (but can be treated as units)" }, { - "objectID": "03_testing.html#exceptions", - "href": "03_testing.html#exceptions", - "title": "Testing, linting and formatting", - "section": "Exceptions", - "text": "Exceptions\n\n\nExceptions are a way to handle errors in your code.\nRaising an exception can prevent propagating bad values.\nExceptions are communication between the programmer and the user.\nThere are many built-in exceptions in Python\n\nIndexError\nKeyError\nValueError\nFileNotFoundError\n\nYou can also create your own custom exceptions, e.g. 
ModelInitialistionError, MissingLicenseError?" + "objectID": "07_packaging.html#a-good-unit-test", + "href": "07_packaging.html#a-good-unit-test", + "title": "Distributing your Python package", + "section": "A good unit test", + "text": "A good unit test\n\nFully automated\nHas full control over all the pieces running (“fake” external dependencies)\nCan be run in any order\nRuns in memory (no DB or file access, for example)\nConsistently returns the same result (no random numbers)\nRuns fast\nTests a single logical concept in the system\nReadable\nMaintainable\nTrustworthy" }, { - "objectID": "03_testing.html#example-1", - "href": "03_testing.html#example-1", - "title": "Testing, linting and formatting", - "section": "Example", - "text": "Example\n\n\nsrc/ops.py\n\ndef is_operable(height:float, period:float) -> bool:\n if height < 0.0:\n raise ValueError(f\"Supplied value of {height=} is unphysical.\")\n\n>>> is_operable(height=-1.0, period=4.0)\n\nTraceback (most recent call last):\n ...\nValueError: Supplied value of height=-1.0 is unphysical.\n\n\nIt is better to raise an exception (that can terminate the program), than to propagate a bad value." 
+ "objectID": "07_packaging.html#thank-you", + "href": "07_packaging.html#thank-you", + "title": "Distributing your Python package", + "section": "Thank you!", + "text": "Thank you!\n\n\n\nPython package development" }, { - "objectID": "03_testing.html#warnings", - "href": "03_testing.html#warnings", - "title": "Testing, linting and formatting", - "section": "Warnings", - "text": "Warnings\nWarnings are a way to alert users of your code to potential issues or usage errors without actually halting the program’s execution.\n\n\nsrc/ops.py\n\nimport warnings\nwarnings.warn(\"This is a warning\")" + "objectID": "index.html", + "href": "index.html", + "title": "Python package development", + "section": "", + "text": "Introduction" }, { - "objectID": "03_testing.html#how-to-test-exceptions", - "href": "03_testing.html#how-to-test-exceptions", - "title": "Testing, linting and formatting", - "section": "How to test exceptions", - "text": "How to test exceptions\n\n\ntests/test_ops.py\n\nimport pytest\nfrom ops import is_operable\n\ndef test_negative_heights_are_not_valid():\n with pytest.raises(ValueError):\n is_operable(height=-1.0, period=4.0)\n\nThe same can be done with warnings." 
+ "objectID": "index.html#learning-modules", + "href": "index.html#learning-modules", + "title": "Python package development", + "section": "Learning modules", + "text": "Learning modules\n\nGit, Pull Requests, and code reviews\n\nDiscussion\nHomework\n\nPython functions, classes, and modules\n\nDiscussion\nHomework\n\nTesting and auto-formatting\n\nHomework\n\nDependencies and GitHub actions\n\nHomework\n\nDocumentation\n\nHomework\n\nObject oriented design in Python\n\nHomework\n\nDistributing your package\n\nHomework\n\n\n©️ DHI 2023" }, { - "objectID": "03_testing.html#linting", - "href": "03_testing.html#linting", - "title": "Testing, linting and formatting", - "section": "Linting", - "text": "Linting\nA way to check your code for common errors and style issues.\nruff is a new tool for linting Python code.\n\nsyntax errors\nunused imports\nunused variables\nundefined names\ncode style (e.g. line length, indentation, whitespace, etc.)" + "objectID": "projects/data_cleaning/notebook_A.html", + "href": "projects/data_cleaning/notebook_A.html", + "title": "Clean data from a file", + "section": "", + "text": "# useful if your change your modules after starting the kernel\n%load_ext autoreload\n%autoreload 2\nimport pandas as pd\nfrom cleaning import SpikeCleaner, OutOfRangeCleaner, FlatPeriodCleaner\nfrom plotting import plot_timeseries\nfn = \"./example_data1.csv\"\ndf = pd.read_csv(fn, index_col=0, parse_dates=True, dtype=float)\ndf.head(10)\n\n\n\n\n\n\n\n\nseries1\n\n\n\n\n2020-01-01\n1.0\n\n\n2020-01-02\n2.0\n\n\n2020-01-03\n-1.0\n\n\n2020-01-04\n4.0\n\n\n2020-01-05\n5.0\n\n\n2020-01-06\n20.0\n\n\n2020-01-07\n7.0\n\n\n2020-01-08\n8.0\n\n\n2020-01-09\n9.0\n\n\n2020-01-10\n10.0" }, { - "objectID": "03_testing.html#linting-with-ruff", - "href": "03_testing.html#linting-with-ruff", - "title": "Testing, linting and formatting", - "section": "Linting with ruff", - "text": "Linting with ruff\n\n\nexamples/04_testing/process.py\n\nimport requests\nimport scipy\n\ndef 
preprocess(x, y, xout):\n\n x = x[~np.isnan(x)] \n method = \"cubic\"\n # interpolate missing values with cubic spline\n return scipy.interpolate.interp1d(x, y)(xout)\n\nRun ruff:\n$ ruff process.py\nprocess.py:1:8: F401 [*] `requests` imported but unused\nprocess.py:6:12: F821 Undefined name `np`\nprocess.py:7:5: F841 [*] Local variable `method` is assigned to but never used\nFound 3 errors.\n[*] 2 potentially fixable with the --fix option.\n\n\nLinting is a fast way to find common errors.\nUnused imports are confusing.\nUnused and undefined variables are usually a typo or a mistake. Fixing them can prevent bugs." + "objectID": "projects/data_cleaning/notebook_A.html#try-out-one-cleaner-first", + "href": "projects/data_cleaning/notebook_A.html#try-out-one-cleaner-first", + "title": "Clean data from a file", + "section": "Try out one cleaner first", + "text": "Try out one cleaner first\n\ncleaner = SpikeCleaner(max_jump=10)\n\n\ndf[\"clean1\"] = cleaner.clean(df.series1)\n\n\ndf.head(10)\n\n\n\n\n\n\n\n\nseries1\nclean1\n\n\n\n\n2020-01-01\n1.0\n1.0\n\n\n2020-01-02\n2.0\n2.0\n\n\n2020-01-03\n-1.0\n-1.0\n\n\n2020-01-04\n4.0\n4.0\n\n\n2020-01-05\n5.0\n5.0\n\n\n2020-01-06\n20.0\nNaN\n\n\n2020-01-07\n7.0\n7.0\n\n\n2020-01-08\n8.0\n8.0\n\n\n2020-01-09\n9.0\n9.0\n\n\n2020-01-10\n10.0\n10.0\n\n\n\n\n\n\n\n\nplot_timeseries(df.series1, df.clean1)" }, { - "objectID": "03_testing.html#formatting", - "href": "03_testing.html#formatting", - "title": "Testing, linting and formatting", - "section": "Formatting", - "text": "Formatting\n\n\nFormatting code for readability and maintainability is essential.\nblack is an opinionated automatic code formatter for Python.\nIt enforces its own rules for formatting, which are not configurable.\nHaving a unified style makes code changes easier to understand and collaborate on." 
+ "objectID": "projects/data_cleaning/notebook_A.html#apply-all-cleaners", + "href": "projects/data_cleaning/notebook_A.html#apply-all-cleaners", + "title": "Clean data from a file", + "section": "Apply all cleaners", + "text": "Apply all cleaners\n\ncleaners = [\n SpikeCleaner(max_jump=10),\n OutOfRangeCleaner(min_val=0, max_val=50),\n FlatPeriodCleaner(flat_period=5),\n]\n\n\ncleaned_data = df.series1.copy()\nfor cleaner in cleaners:\n cleaned_data = cleaner.clean(cleaned_data)\n # plot_timeseries(df.series1, cleaned_data) # check for each step if something is not working\nplot_timeseries(df.series1, cleaned_data)" }, { - "objectID": "03_testing.html#running-black", - "href": "03_testing.html#running-black", - "title": "Testing, linting and formatting", - "section": "Running Black", - "text": "Running Black\n$ black .\nreformatted data_utils.py\nreformatted dfsu/__init__.py\nreformatted dataarray.py\nreformatted dataset.py\nreformatted spatial/geometry.py\nreformatted pfs/pfssection.py\n\nAll done! ✨ 🍰 ✨\n6 files reformatted, 27 files left unchanged." 
+ "objectID": "projects/data_cleaning/Project_module_05.html", + "href": "projects/data_cleaning/Project_module_05.html", + "title": "Python package development", + "section": "", + "text": "Create new branch “docs” (Make sure changes from last module have been merged, and that you start from the main branch)\n6.1 README\n\nWrite a README file with basic information about the project.\n\n6.2 Docstrings\n\nWrite NumPy style docstrings for all functions and classes.\n[Optional] Install the autodocstrings extension in VSCode (set the style to NumPy)\n\n6.3 mkdocs\n\nInstall mkdocs, mkdocstrings and material design mamba/pip install mkdocstrings-python mkdocs-material\nCreate a mkdocs.yml file (copy from https://github.com/DHI/template-python-library and adapt).\nCreate a docs folder and create a markdown file index.md inside.\nCreate API documentation locally using >mkdocs serve.\nCheck the generated HTML documentation.\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "03_testing.html#running-black-1", - "href": "03_testing.html#running-black-1", - "title": "Testing, linting and formatting", - "section": "Running Black", - "text": "Running Black\nVisual Studio Code can be configured to run black automatically when saving a file using the Black extension." 
+ "objectID": "projects/data_cleaning/Project_module_05.html#module-5-documentation", + "href": "projects/data_cleaning/Project_module_05.html#module-5-documentation", + "title": "Python package development", + "section": "", + "text": "Create new branch “docs” (Make sure changes from last module have been merged, and that you start from the main branch)\n6.1 README\n\nWrite a README file with basic information about the project.\n\n6.2 Docstrings\n\nWrite NumPy style docstrings for all functions and classes.\n[Optional] Install the autodocstrings extension in VSCode (set the style to NumPy)\n\n6.3 mkdocs\n\nInstall mkdocs, mkdocstrings and material design mamba/pip install mkdocstrings-python mkdocs-material\nCreate a mkdocs.yml file (copy from https://github.com/DHI/template-python-library and adapt).\nCreate a docs folder and create a markdown file index.md inside.\nCreate API documentation locally using >mkdocs serve.\nCheck the generated HTML documentation.\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "03_testing.html#profiling", - "href": "03_testing.html#profiling", - "title": "Testing, linting and formatting", - "section": "Profiling", - "text": "Profiling\n\n\nProfiling is a way to measure the performance of your code.\nIt can help you identify bottlenecks in your code.\nYour intuition about what is slow is often wrong.\nThe line_profiler package reports the time spent on each line of code.\nIt can be run inside a notebook using the lprun magic command." 
+ "objectID": "projects/data_cleaning/index.html", + "href": "projects/data_cleaning/index.html", + "title": "Course project: Time Series Data Cleaning", + "section": "", + "text": "1.1 GitHub repo\n1.2 Functions\n\n\n\n\n\n2.1 Function arguments\n2.2 Modules\n2.3 Classes\n\n\n\n\n\n3.1 Installable package\n3.2 Pytest\n\n\n\n\n\n4.1 Github Action\n4.2 Ruff\n4.3 Black\n4.4 pyproject.toml\n\n\n\n\n\n5.1 Type Hints\n5.2 Data class\n5.3 Module level function\n5.4 Composition or inheritance\n\n\n\n\n\n6.1 README\n6.2 Docstrings\n6.3 mkdocs\n\n\n\n\n\nAdd a license\nChange version number to 0.1.0\nBuild the package with hatchling.\nPublish the package to the PyPI Test Server." }, { - "objectID": "03_testing.html#profiling---example-code", - "href": "03_testing.html#profiling---example-code", - "title": "Testing, linting and formatting", - "section": "Profiling - example code", - "text": "Profiling - example code\nimport numpy as np\n\ndef top_neighbors(points, radius=\"0.1\"):\n \"\"\"Don't use this function, it's only purpose is to be profiled.\"\"\"\n n = len(points)\n idx = np.array([int(x) for x in str.split(\"0 \"* n)])\n\n for i in range(n):\n for j in range(n):\n if i != j:\n d = np.sqrt(np.sum((points[i] - points[j])**2))\n if d < float(radius): \n idx[i] += 1\n for i in range(n):\n for j in range(n - i - 1):\n if idx[j] < idx[j + 1]:\n idx[j], idx[j + 1] = idx[j + 1], idx[j]\n points[j], points[j + 1] = points[j + 1], points[j]\n return points\n\ndef main():\n points = np.random.rand(1000, 2)\n top = top_neighbors(points)" + "objectID": "projects/data_cleaning/index.html#module-1-github-and-basic-functions", + "href": "projects/data_cleaning/index.html#module-1-github-and-basic-functions", + "title": "Course project: Time Series Data Cleaning", + "section": "", + "text": "1.1 GitHub repo\n1.2 Functions" }, { - "objectID": "03_testing.html#profiling---output", - "href": "03_testing.html#profiling---output", - "title": "Testing, linting and formatting", - 
"section": "Profiling - output", - "text": "Profiling - output\nInvoking the jupyter magic command lprun with:\n\nfunction to profile - top_neighbors\ncode to run - main()\n\n%lprun -f top_neighbors main()\n\n\nLine # Hits Time Per Hit % Time Line Contents\n==============================================================\n 3 def top_neighbors(points, radius=\"0.1\"):\n 4 \"\"\"Don't use this function, it's only purpose is to be profiled.\"\"\"\n 5 1 2800.0 2800.0 0.0 n = len(points)\n 6 1 353300.0 353300.0 0.0 idx = np.array([int(x) for x in str.split(\"0 \"* n)])\n 7 \n 8 1001 345100.0 344.8 0.0 for i in range(n):\n 9 1001000 378191701.0 377.8 2.2 for j in range(n):\n 10 1000000 328387205.0 328.4 1.9 if i != j:\n 11 999000 1e+10 14473.0 83.8 d = np.sqrt(np.sum((points[i] - points[j])**2))\n 12 999000 933778605.0 934.7 5.4 if d < float(radius): \n 13 28952 57010001.0 1969.1 0.3 idx[i] += 1\n 14 1001 367100.0 366.7 0.0 for i in range(n):\n 15 500500 144295203.0 288.3 0.8 for j in range(n - i - 1):\n 16 499500 302166901.0 604.9 1.8 if idx[j] < idx[j + 1]:\n 17 240227 212070500.0 882.8 1.2 idx[j], idx[j + 1] = idx[j + 1], idx[j]\n 18 240227 437538803.0 1821.4 2.5 points[j], points[j + 1] = points[j + 1], points[j]\n 19 1 500.0 500.0 0.0 return points\n\n\n\nPython package development" + "objectID": "projects/data_cleaning/index.html#module-2-modules-and-classes", + "href": "projects/data_cleaning/index.html#module-2-modules-and-classes", + "title": "Course project: Time Series Data Cleaning", + "section": "", + "text": "2.1 Function arguments\n2.2 Modules\n2.3 Classes" }, { - "objectID": "07_packaging.html#packaging", - "href": "07_packaging.html#packaging", - "title": "Distributing your Python package", - "section": "Packaging", - "text": "Packaging\nPackaging means creating a package that can be installed by pip.\nThere are many ways to create an installable package, and many ways to distribute it.\nWe will show how to create a package using hatchling, and how to 
distribute it on GitHub, PyPI and a private PyPI server." - }, - { - "objectID": "07_packaging.html#benefits-of-packaging", - "href": "07_packaging.html#benefits-of-packaging", - "title": "Distributing your Python package", - "section": "Benefits of packaging", - "text": "Benefits of packaging\n\n\nDistribute your package to others\nInstall your package with pip\nSpecify dependencies\nReproducibility\nSpecify version\nRelease vs. development versions" - }, - { - "objectID": "07_packaging.html#packaging-workflow", - "href": "07_packaging.html#packaging-workflow", - "title": "Distributing your Python package", - "section": "Packaging workflow", - "text": "Packaging workflow\n\nCreate a pyproject.toml in the root folder of the project\nBuild a package (e.g. myproject-0.1.0-py3-none-any.whl)\nUpload the package to location, where others can find it" - }, - { - "objectID": "07_packaging.html#pyproject.toml", - "href": "07_packaging.html#pyproject.toml", - "title": "Distributing your Python package", - "section": "pyproject.toml", - "text": "pyproject.toml\n[build-system]\nrequires = [\"hatchling\"]\nbuild-backend = \"hatchling.build\"\n\n[project]\nname = \"my_library\"\nversion = \"0.0.1\"\ndependencies = [\n \"numpy\"\n]\n\nauthors = [\n { name=\"First Last\", email=\"initials@dhigroup.com\" },\n]\ndescription = \"Useful library\"\nreadme = \"README.md\"\nrequires-python = \">=3.7\"\nclassifiers = [\n \"Programming Language :: Python :: 3\",\n \"License :: OSI Approved :: MIT License\",\n \"Development Status :: 2 - Pre-Alpha\",\n \"Operating System :: OS Independent\",\n \"Topic :: Scientific/Engineering\",\n]\n\n[project.optional-dependencies]\ndev = [\"pytest\",\"flake8\",\"black\",\"sphinx\", \"myst-parser\",\"sphinx-book-theme\"]\ntest= [\"pytest\"]\n\n[project.urls]\n\"Homepage\" = \"https://github.com/DHI/my_library\"\n\"Bug Tracker\" = \"https://github.com/DHI/my_library/issues\"" - }, - { - "objectID": "07_packaging.html#versioning", - "href": 
"07_packaging.html#versioning", - "title": "Distributing your Python package", - "section": "Versioning", - "text": "Versioning\nVersioning your package is important for reproducibility and to avoid breaking changes.\n\n\n\nSemantic versioning use three numbers {major}.{minor}.{patch}, e.g. 1.1.0\n\n\nA new major version indicates breaking changes\nA new minor version indicates new features, without breaking changes\nA new patch version indicates a small change, e.g. a bug fix\nEach of the numbers can be higher than 9, e.g. 1.0.0 is more recent than 0.24.12" - }, - { - "objectID": "07_packaging.html#version-1.0", - "href": "07_packaging.html#version-1.0", - "title": "Distributing your Python package", - "section": "Version 1.0", - "text": "Version 1.0\n\n\nA version number of 1.0 indicates that the package is ready for production\nThe API is stable, and breaking changes will only be introduced in new major versions\nThe package is well tested, and the documentation is complete\nStart with version 0.1.0 and increase the version number as you add features" - }, - { - "objectID": "07_packaging.html#breaking-changes", - "href": "07_packaging.html#breaking-changes", - "title": "Distributing your Python package", - "section": "Breaking changes", - "text": "Breaking changes\nWhat is a breaking change?\n\n\nRemoving a function\nChanging the name of a function\nChanging the signature of a function (arguments, types, return value)\n\n\n\nTry to avoid breaking changes, if possible, but if you do, increase the major version number!" 
- }, - { - "objectID": "07_packaging.html#installing-specific-versions", - "href": "07_packaging.html#installing-specific-versions", - "title": "Distributing your Python package", - "section": "Installing specific versions", - "text": "Installing specific versions\n\npip install my_library will install the latest version\npip install my_library==1.0.0 will install version 1.0.0\npip install my_library>=1.0.0 will install version 1.0.0 or higher" - }, - { - "objectID": "07_packaging.html#pre-release-versions", - "href": "07_packaging.html#pre-release-versions", - "title": "Distributing your Python package", - "section": "Pre-release versions", - "text": "Pre-release versions\n\n\n\nVersions that are not ready for production\nIndicated by a suffix, e.g. 1.0.0rc1\nWill not be installed by default\nCan be installed with pip install my_library==1.0.0rc1\nListed on PyPI, but not on the search page" + "objectID": "projects/data_cleaning/index.html#module-3-installable-package-and-pytest", + "href": "projects/data_cleaning/index.html#module-3-installable-package-and-pytest", + "title": "Course project: Time Series Data Cleaning", + "section": "", + "text": "3.1 Installable package\n3.2 Pytest" }, { - "objectID": "07_packaging.html#license", - "href": "07_packaging.html#license", - "title": "Distributing your Python package", - "section": "License", - "text": "License\n\n\nA license is a legal agreement between you and others who use your package\nIf you do not specify a license, others cannot use your package legally\nThe license is specified in the pyproject.toml file\nRead more about licenses on https://choosealicense.com/\nCheck if your package is compatible with the license of the dependencies" + "objectID": "projects/data_cleaning/index.html#module-4-github-actions-and-auto-formatting", + "href": "projects/data_cleaning/index.html#module-4-github-actions-and-auto-formatting", + "title": "Course project: Time Series Data Cleaning", + "section": "", + "text": "4.1 
Github Action\n4.2 Ruff\n4.3 Black\n4.4 pyproject.toml" }, { - "objectID": "07_packaging.html#dependencies", - "href": "07_packaging.html#dependencies", - "title": "Distributing your Python package", - "section": "Dependencies", - "text": "Dependencies\n\n\nApplication\nA program that is run by a user\n\ncommand line tool\nscript\nweb application\n\nPin versions to ensure reproducibility, e.g. numpy==1.11.0\n\nLibrary\nA program that is used by another program\n\nPython package\nLow level library (C, Fortran, Rust, …)\n\nMake the requirements as loose as possible, e.g. numpy>=1.11.0\n\n\n\nMake the requirements loose, to avoid conflicts with other packages." + "objectID": "projects/data_cleaning/index.html#module-5-object-oriented-design", + "href": "projects/data_cleaning/index.html#module-5-object-oriented-design", + "title": "Course project: Time Series Data Cleaning", + "section": "", + "text": "5.1 Type Hints\n5.2 Data class\n5.3 Module level function\n5.4 Composition or inheritance" }, { - "objectID": "07_packaging.html#pyproject.toml-1", - "href": "07_packaging.html#pyproject.toml-1", - "title": "Distributing your Python package", - "section": "pyproject.toml", - "text": "pyproject.toml\n[build-system]\nrequires = [\"hatchling\"]\nbuild-backend = \"hatchling.build\"\n\n[project]\nname = \"my_library\"\nversion = \"0.0.1\"\ndependencies = [\n \"numpy\"\n]\n\nauthors = [\n { name=\"First Last\", email=\"initials@dhigroup.com\" },\n]\ndescription = \"Useful library\"\nreadme = \"README.md\"\nrequires-python = \">=3.7\"\nclassifiers = [\n \"Programming Language :: Python :: 3\",\n \"License :: OSI Approved :: MIT License\",\n \"Development Status :: 2 - Pre-Alpha\",\n \"Operating System :: OS Independent\",\n \"Topic :: Scientific/Engineering\",\n]\n\n[project.optional-dependencies]\ndev = [\"pytest\",\"flake8\",\"black\",\"sphinx\", \"myst-parser\",\"sphinx-book-theme\"]\ntest= [\"pytest\"]\n\n[project.urls]\n\"Homepage\" = 
\"https://github.com/DHI/my_library\"\n\"Bug Tracker\" = \"https://github.com/DHI/my_library/issues\"\n\n\nMandatory dependencies are specified in the dependencies section.\nOptional dependencies are specified in the optional-dependencies section." + "objectID": "projects/data_cleaning/index.html#module-6-documentation", + "href": "projects/data_cleaning/index.html#module-6-documentation", + "title": "Course project: Time Series Data Cleaning", + "section": "", + "text": "6.1 README\n6.2 Docstrings\n6.3 mkdocs" }, { - "objectID": "07_packaging.html#classifiers", - "href": "07_packaging.html#classifiers", - "title": "Distributing your Python package", - "section": "Classifiers", - "text": "Classifiers\nclassifiers = [\n \"Programming Language :: Python :: 3\",\n \"License :: OSI Approved :: MIT License\",\n \"Development Status :: 2 - Pre-Alpha\",\n \"Operating System :: OS Independent\",\n \"Topic :: Scientific/Engineering\",\n]\n\nClassifiers are used to categorize your package\nLess relevant for internal packages\nOperating system (Windows, Linux, MacOS)\nDevelopment status (Alpha, Beta, Production/Stable)" + "objectID": "projects/data_cleaning/index.html#module-7-publishing", + "href": "projects/data_cleaning/index.html#module-7-publishing", + "title": "Course project: Time Series Data Cleaning", + "section": "", + "text": "Add a license\nChange version number to 0.1.0\nBuild the package with hatchling.\nPublish the package to the PyPI Test Server." }, { - "objectID": "07_packaging.html#packaging-non-python-files", - "href": "07_packaging.html#packaging-non-python-files", - "title": "Distributing your Python package", - "section": "Packaging non-Python files", - "text": "Packaging non-Python files\n\nIncluding non-Python files can be useful for e.g. machine learning models.\nIf you use hatchling, you can include non-Python files in your package.\nhatchling uses .gitignore to determine which files to include." 
+ "objectID": "projects/data_cleaning/Project_module_01.html", + "href": "projects/data_cleaning/Project_module_01.html", + "title": "Python package development", + "section": "", + "text": "1.1 GitHub repo\n\n1.1.1 Create a new GitHub repository “timeseriescleaner”\n\nprivate, no template, add readme, gitignore python, no license\n\n1.1.2 Go to repo settings/Collaborators add your instructors and your “buddy”\n1.1.3 Clone repo to local machine\n[Optional] Create virtual environment for this course project (use venv or mamba/conda environment)\n1.1.4 Download the provided Python script and add it to the repo\n1.1.5 Commit the file and push the changes (Check that the file can be found on GitHub)\n1.1.6 Open the project in vscode and make a single character change to the file (add a comment)\n1.1.7 Commit the changes (Check that it works on GitHub)\n\n1.2 Functions\n\n1.2.1 Create a local branch “refactor-functions”\n1.2.2 Refactor the code to use functions (clean_spikes, clean_outofrange, clean_flat, plot_timeseries)\n\nfor data in [data1, data2, data3]:\n\ndata_original = data.copy()\ndata = clean_spikes(data, max_jump=10)\ndata = clean_outofrange(data, min_val=0, max_val=50)\ndata = clean_flat(data, flat_period=5)\nplot_timeseries(data_original, data)\n\n\n1.2.3 Check that your code and produce the same results as before (you should not change the functionality!)\n1.2.4 Commit your code in 1 or more commits (in the end your code should be approximately 75 lines long)\n\nCreate a pull request in GitHub and “request review” from your reviewers\nWait for feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "07_packaging.html#github-secrets", - "href": "07_packaging.html#github-secrets", - "title": "Distributing your Python package", - "section": "GitHub secrets", - "text": "GitHub secrets\n\nStore sensitive information, e.g. 
passwords, in your repository.\nSecrets are encrypted, and only visible to you and GitHub Actions.\nAdd secrets in the repository settings.\n\nTo use secrets as environment variables in GitHub Actions, add them to the env section of the workflow:\nenv:\n TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}" + "objectID": "projects/data_cleaning/Project_module_01.html#module-1-github-and-basic-functions", + "href": "projects/data_cleaning/Project_module_01.html#module-1-github-and-basic-functions", + "title": "Python package development", + "section": "", + "text": "1.1 GitHub repo\n\n1.1.1 Create a new GitHub repository “timeseriescleaner”\n\nprivate, no template, add readme, gitignore python, no license\n\n1.1.2 Go to repo settings/Collaborators add your instructors and your “buddy”\n1.1.3 Clone repo to local machine\n[Optional] Create virtual environment for this course project (use venv or mamba/conda environment)\n1.1.4 Download the provided Python script and add it to the repo\n1.1.5 Commit the file and push the changes (Check that the file can be found on GitHub)\n1.1.6 Open the project in vscode and make a single character change to the file (add a comment)\n1.1.7 Commit the changes (Check that it works on GitHub)\n\n1.2 Functions\n\n1.2.1 Create a local branch “refactor-functions”\n1.2.2 Refactor the code to use functions (clean_spikes, clean_outofrange, clean_flat, plot_timeseries)\n\nfor data in [data1, data2, data3]:\n\ndata_original = data.copy()\ndata = clean_spikes(data, max_jump=10)\ndata = clean_outofrange(data, min_val=0, max_val=50)\ndata = clean_flat(data, flat_period=5)\nplot_timeseries(data_original, data)\n\n\n1.2.3 Check that your code and produce the same results as before (you should not change the functionality!)\n1.2.4 Commit your code in 1 or more commits (in the end your code should be approximately 75 lines long)\n\nCreate a pull request in GitHub and “request review” from your reviewers\nWait for 
feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "07_packaging.html#github-actions", - "href": "07_packaging.html#github-actions", - "title": "Distributing your Python package", - "section": "GitHub Actions", - "text": "GitHub Actions\n\n\n.github/workflows/python-package.yml\n\nname: Publish Python Package\non:\n release:\n types: [created]\njobs:\n deploy:\n runs-on: ubuntu-latest\n steps:\n - uses: actions/checkout@v2\n - name: Set up Python\n uses: actions/setup-python@v2\n with:\n python-version: '3.10'\n - name: Install dependencies\n run: |\n python -m pip install --upgrade pip\n pip install build\n - name: Build package\n run: python -m build\n \n - name: Publish to PyPI\n env:\n TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n run: |\n twine upload dist/*" + "objectID": "projects/data_cleaning/Project_module_06.html", + "href": "projects/data_cleaning/Project_module_06.html", + "title": "Python package development", + "section": "", + "text": "Create new branch “oop-dataclasses” (Make sure changes from last module have been merged, and that you start from the main branch)\n5.1 Type Hints\n\nAdd type hints to all functions and methods. Commit\n\n5.2 Data class\n\nMake all the cleaner classes dataclasses.\n\nremove the init method (not needed anymore)\nCheck that the notebook still runs and that the classes indeed work as data classes (e.g. 
have a string representation and support equality testing etc)\nCommit\n\n5.3 Module level function\n\nMake a private module function _print_stats() that prints the number of cleaned values\ncall from each of the clean methods\n\n5.4 Composition or inheritance\n\nCreate a new cleaner class called CleanerWorkflow that takes a list of cleaners when constructed and has a clean method that run all the cleaners’ clean methods.\nModify the notebook to use the CleanerWorkflow instead of looping over the cleaners\nConsider what type of validation you would want CleanerWorkflow\nConsider whether it would be better to create a base class BaseCleaner - write down your considerations as a comment in the pull request, refer to specific lines of code\n\ne.g. how would you handle e.g. common plotting functionality in the cleaner classes?\n\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "07_packaging.html#private-pypi-server", - "href": "07_packaging.html#private-pypi-server", - "title": "Distributing your Python package", - "section": "Private PyPI server", - "text": "Private PyPI server\n\nPrivate packages can be be hosted on e.g. 
Azure Arfifacts or Posit Package Manager.\nThese servers behaves like PyPI, and can be used with pip\nAccess policies can be used to control who can install packages.\n\n\nExample:\n$ pip install --extra-index-url https://pkgs.dev.azure.com/dhigroup/_packaging/pond/pypi/simple/ sampling\nLooking in indexes: https://pypi.org/simple, https://pkgs.dev.azure.com/dhigroup/_packaging/pond/pypi/simple/\n...\nSuccessfully installed sampling-0.0.1" + "objectID": "projects/data_cleaning/Project_module_06.html#module-6-object-oriented-design", + "href": "projects/data_cleaning/Project_module_06.html#module-6-object-oriented-design", + "title": "Python package development", + "section": "", + "text": "Create new branch “oop-dataclasses” (Make sure changes from last module have been merged, and that you start from the main branch)\n5.1 Type Hints\n\nAdd type hints to all functions and methods. Commit\n\n5.2 Data class\n\nMake all the cleaner classes dataclasses.\n\nremove the init method (not needed anymore)\nCheck that the notebook still runs and that the classes indeed work as data classes (e.g. have a string representation and support equality testing etc)\nCommit\n\n5.3 Module level function\n\nMake a private module function _print_stats() that prints the number of cleaned values\ncall from each of the clean methods\n\n5.4 Composition or inheritance\n\nCreate a new cleaner class called CleanerWorkflow that takes a list of cleaners when constructed and has a clean method that run all the cleaners’ clean methods.\nModify the notebook to use the CleanerWorkflow instead of looping over the cleaners\nConsider what type of validation you would want CleanerWorkflow\nConsider whether it would be better to create a base class BaseCleaner - write down your considerations as a comment in the pull request, refer to specific lines of code\n\ne.g. how would you handle e.g. 
common plotting functionality in the cleaner classes?\n\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "07_packaging.html#installing-a-development-version", - "href": "07_packaging.html#installing-a-development-version", - "title": "Distributing your Python package", - "section": "Installing a development version", - "text": "Installing a development version\n\nInstall latest dev version, e.g. pip install https://github.com/DHI/mikeio/archive/main.zip\nInstall from fix-interp branch, e.g. pip install https://github.com/DHI/mikeio/archive/fix-interp.zip" + "objectID": "05_documentation.html#why-document-your-code", + "href": "05_documentation.html#why-document-your-code", + "title": "Documentation", + "section": "Why document your code?", + "text": "Why document your code?\n\n\n\nMake it easier for others to use your code\nMake it easier for you to use your code" }, { - "objectID": "07_packaging.html#recap", - "href": "07_packaging.html#recap", - "title": "Distributing your Python package", - "section": "Recap", - "text": "Recap\n\nGit, Pull Requests, and code reviews\nPython functions, classes, and modules\nTypes, abstraction, and refactoring\nTesting and auto-formatting\nDependencies and GitHub actions\nDocumentation\nDistributing your package" + "objectID": "05_documentation.html#readme.md", + "href": "05_documentation.html#readme.md", + "title": "Documentation", + "section": "Readme.md", + "text": "Readme.md\n\nA readme file is a text file that introduces and explains a project.\nAlways include a readme file in your project.\nYou can put readme files in any directory, and you can have more than one in a single project." 
}, { - "objectID": "07_packaging.html#git-pull-requests-and-code-reviews", - "href": "07_packaging.html#git-pull-requests-and-code-reviews", - "title": "Distributing your Python package", - "section": "Git, Pull Requests, and code reviews", - "text": "Git, Pull Requests, and code reviews" + "objectID": "05_documentation.html#requirements", + "href": "05_documentation.html#requirements", + "title": "Documentation", + "section": "Requirements", + "text": "Requirements\n\nMention the requirements for your package\n\nOperating system\nPython version\nOther non-Python dependencies, e.g. VC++ redistributables\n\nInclude information on how to install your package\n\npip install my_package\npip install pip install https://github.com/DHI/{repo}/archive/main.zip" }, { - "objectID": "07_packaging.html#github-flow", - "href": "07_packaging.html#github-flow", - "title": "Distributing your Python package", - "section": "Github flow", - "text": "Github flow\n\n\nCreate a branch\nMake changes\nCreate a pull request\nReview\nMerge" + "objectID": "05_documentation.html#notebooks", + "href": "05_documentation.html#notebooks", + "title": "Documentation", + "section": "Notebooks", + "text": "Notebooks\n\nJupyter notebooks are a great way to document your code\nGood for prototyping\nIn a later stage, notebooks can be used to demonstrate how to use your code\nNot a replacement for documentation for a professional package" }, { - "objectID": "07_packaging.html#github-best-practices", - "href": "07_packaging.html#github-best-practices", - "title": "Distributing your Python package", - "section": "Github best practices", - "text": "Github best practices\n\nCommit often\nUse descriptive commit messages\nKeep pull requests small and focused\nUse “issues” to track work\nReview code regularly" + "objectID": "05_documentation.html#docstrings", + "href": "05_documentation.html#docstrings", + "title": "Documentation", + "section": "Docstrings", + "text": "Docstrings\n\"\"\"K-means 
clustering.\"\"\"\n\nclass KMeans(_BaseKMeans):\n \"\"\"K-Means clustering.\n \n Parameters\n ----------\n n_clusters : int, default=8\n The number of clusters to form as well as the number of\n centroids to generate.\n\n Examples\n --------\n >>> X = np.array([[1, 2], [1, 4], [1, 0],\n ... [10, 2], [10, 4], [10, 0]])\n >>> kmeans = KMeans(n_clusters=2, random_state=0, n_init=\"auto\").fit(X)\n >>> kmeans.labels_\n array([1, 1, 1, 0, 0, 0], dtype=int32)\nsklearn.KMeans" }, { - "objectID": "07_packaging.html#python-functions-classes-and-modules", - "href": "07_packaging.html#python-functions-classes-and-modules", - "title": "Distributing your Python package", - "section": "Python functions, classes, and modules", - "text": "Python functions, classes, and modules" + "objectID": "05_documentation.html#docstring---numpy-format", + "href": "05_documentation.html#docstring---numpy-format", + "title": "Documentation", + "section": "Docstring - Numpy format", + "text": "Docstring - Numpy format\ndef function_name(param1, param2, param3):\n \"\"\"Short summary.\n \n Long description.\n \n Parameters\n ----------\n param1 : int\n Description of `param1`.\n param2 : str\n Description of `param2`.\n param3 : list of str\n Description of `param3`.\n \n Returns\n -------\n bool\n Description of return value.\n \"\"\"\n pass\n\nThere are several docstring formats. The most common is the numpy format, used by scikit-learn, pandas, numpy, scipy, etc." 
}, { - "objectID": "07_packaging.html#functions-as-black-boxes", - "href": "07_packaging.html#functions-as-black-boxes", - "title": "Distributing your Python package", - "section": "Functions as black boxes", - "text": "Functions as black boxes\n\n\n\n\nflowchart LR\n A(Input A) --> F[\"Black box\"]\n B(Input B) --> F\n F --> O(Output)\n\n style F fill:#000,color:#fff,stroke:#333,stroke-width:4px\n\n\n\n\n\n\nA function is a black box that takes some input and produces some output.\nThe input and output can be anything, including other functions.\nAs long as the input and output are the same, the function body can be modified." + "objectID": "05_documentation.html#type-hints", + "href": "05_documentation.html#type-hints", + "title": "Documentation", + "section": "Type hints", + "text": "Type hints\nFrom Python 3.6, type hints can be used in addition to the type in the docstring.\ndef remove_outlier(data:pd.DataFrame, column:str, threshold:float=3) -> pd.DataFrame:\n \"\"\"Remove outliers from a dataframe.\n \n Parameters\n ----------\n data : pd.DataFrame\n Dataframe to remove outliers from.\n column : str\n Column to remove outliers from.\n threshold : float, optional\n Number of standard deviations to use as threshold, by default 3" }, { - "objectID": "07_packaging.html#naming-conventions---general", - "href": "07_packaging.html#naming-conventions---general", - "title": "Distributing your Python package", - "section": "Naming conventions - general", - "text": "Naming conventions - general\n\nUse lowercase characters\nSeparate words with underscores\n\nmodel_name = \"NorthSeaModel\"\nn_epochs = 100\n\ndef my_function():\n pass" + "objectID": "05_documentation.html#doctest", + "href": "05_documentation.html#doctest", + "title": "Documentation", + "section": "doctest", + "text": "doctest\nUsing code without documentation is hard, but using code with wrong documentation is even harder.\nHow can you make sure that the documentation is correct?\n\nThe answer is the 
doctest module built in to the Python standard library.\n\n\n\n\n\n\n\n\nTip\n\n\nThe extensive standard library is why Python is described as a language with “batteries included!”" }, { - "objectID": "07_packaging.html#constants", - "href": "07_packaging.html#constants", - "title": "Distributing your Python package", - "section": "Constants", - "text": "Constants\n\nUse all uppercase characters\n\nGRAVITY = 9.81\n\nAVOGADRO_CONSTANT = 6.02214076e23\n\nSECONDS_IN_A_DAY = 86400\n\nN_LEGS_PER_ANIMAL = {\n \"human\": 2,\n \"dog\": 4,\n \"spider\": 8,\n}" + "objectID": "05_documentation.html#documentation-generators", + "href": "05_documentation.html#documentation-generators", + "title": "Documentation", + "section": "Documentation generators", + "text": "Documentation generators\n\nSphinx\nmkdocs\n\n\nSphinx has been around for a long time, has lot’s of functionality but is based on reStructuredText. mkdocs is a new kid on the block, based on markdown and has a lot of functionality." }, { - "objectID": "07_packaging.html#classes", - "href": "07_packaging.html#classes", - "title": "Distributing your Python package", - "section": "Classes", - "text": "Classes\n\nUse CamelCase for the name of the class\nUse lowercase characters for the name of the methods\nSeparate words with underscores\n\nclass RandomClassifier:\n\n def fit(self, X, y):\n self.classes_ = np.unique(y)\n\n def predict(self, X):\n return np.random.choice(self.classes_, size=len(X))\n\n def fit_predict(self, X, y):\n self.fit(X, y)\n return self.predict(X)" + "objectID": "05_documentation.html#mkdocs", + "href": "05_documentation.html#mkdocs", + "title": "Documentation", + "section": "mkdocs", + "text": "mkdocs\n\nText is written in markdown\nEasy to use\nAPI documentation can be generated with mkdocstrings\nThe end result is a static website that can be hosted on e.g. 
GitHub pages" }, { - "objectID": "07_packaging.html#dataclasses", - "href": "07_packaging.html#dataclasses", - "title": "Distributing your Python package", - "section": "Dataclasses", - "text": "Dataclasses\nimport datetime\nfrom dataclasses import dataclass\n\n\n@dataclass\nclass Interval:\n start: date\n end: date\n\n>>> dr1 = Interval(start=datetime.date(2020, 1, 1), end=datetime.date(2020, 1, 31))\n>>> dr1\nInterval(start=datetime.date(2020, 1, 1), end=datetime.date(2020, 1, 31))\n>>> dr2 = Interval(start=datetime.date(2020, 1, 1), end=datetime.date(2020, 1, 31))\n>>> dr1 == dr2\nTrue" + "objectID": "05_documentation.html#configuration", + "href": "05_documentation.html#configuration", + "title": "Documentation", + "section": "Configuration", + "text": "Configuration\n\n\nmkdocs.yml\n\nsite_name: my_library\n\ntheme: \"material\" # or readthedocs, mkdocs, etc.\n\nplugins:\n- mkdocstrings:\n handlers:\n python:\n options:\n show_source: false # change if you want able to show source code\n heading_level: 2\n docstring_style: \"numpy\" # important!, since default is google" }, { - "objectID": "07_packaging.html#types-abstraction-and-refactoring", - "href": "07_packaging.html#types-abstraction-and-refactoring", - "title": "Distributing your Python package", - "section": "Types, abstraction, and refactoring", - "text": "Types, abstraction, and refactoring" + "objectID": "05_documentation.html#api-docs", + "href": "05_documentation.html#api-docs", + "title": "Documentation", + "section": "API docs", + "text": "API docs\n\n\ninstall mkdocstrings\n$ pip install mkdocstrings[python]\nInstall theme, e.g. 
material\n$ pip install mkdocs-material\nAdd plugin to mkdocs.yml (see above)\nCreate index.md in docs folder\nRun mkdocs serve to view locally\n\n\n\ndocs/index.md\n# Reference\n\n::: my_library.simulation" }, { - "objectID": "07_packaging.html#pythonic", - "href": "07_packaging.html#pythonic", - "title": "Distributing your Python package", - "section": "Pythonic", - "text": "Pythonic\nIf you want your code to be Pythonic, you have to be familiar with these types and their methods.\nDundermethods:\n\n__getitem__\n__setitem__\n__len__\n__contains__\n…" + "objectID": "05_documentation.html#github-pages", + "href": "05_documentation.html#github-pages", + "title": "Documentation", + "section": "GitHub pages", + "text": "GitHub pages\n\n\nOnce you have a static website, you need to share it with the world\nGitHub pages allows you to easily host a static website on GitHub\nThe website is available at https://dhi.github.io/<repository>/\nThe website can be created locally by manually editing html pages.\nFor use as documentation, it is easier to use a documentation generator like mkdocs." }, { - "objectID": "07_packaging.html#duck-typing", - "href": "07_packaging.html#duck-typing", - "title": "Distributing your Python package", - "section": "Duck typing", - "text": "Duck typing\n\n“If it walks like a duck and quacks like a duck, it’s a duck”\nFrom the perspective of the caller, it doesn’t matter if it is a rubber duck or a real duck.\nThe type of the object is not important, as long as it has the right methods." 
+ "objectID": "05_documentation.html#github-pages-1", + "href": "05_documentation.html#github-pages-1", + "title": "Documentation", + "section": "GitHub pages", + "text": "GitHub pages" }, { - "objectID": "07_packaging.html#testing-and-auto-formatting", - "href": "07_packaging.html#testing-and-auto-formatting", - "title": "Distributing your Python package", - "section": "Testing and auto-formatting", - "text": "Testing and auto-formatting" + "objectID": "05_documentation.html#private-website", + "href": "05_documentation.html#private-website", + "title": "Documentation", + "section": "“Private” website", + "text": "“Private” website\n\nA GitHub repository can be made private\nThe website is still publicly available\nIn order to “hide” it from search engines, add a robots.txt file to the root of the website\nThis is not a secure way to hide a website, but it is a simple way to hide it from search engines.\n\n\n\nrobots.txt\n\nUser-agent: *\nDisallow: /\n\n\n\n\nPython package development" }, { - "objectID": "07_packaging.html#unit-testing", - "href": "07_packaging.html#unit-testing", - "title": "Distributing your Python package", - "section": "Unit testing", - "text": "Unit testing\n\n\n\n\n\n\nDefinition “Unit”\n\n\n\nA small, fundamental piece of code.\nExecuted in isolation with appropriate inputs.\n\n\n\n\n\nA function is typically considered a “unit”\nLines of code within functions are smaller (can’t be isolated)\nClasses are considered bigger (but can be treated as units)" + "objectID": "group_work/index.html", + "href": "group_work/index.html", + "title": "On-line Group Discussion", + "section": "", + "text": "On-line Group Discussion\nRelated to the time series cleaning project.\n\nModule 1\nModule 2\nModule 3\nModule 4\nModule 5\nModule 6\nModule 7" }, { - "objectID": "07_packaging.html#a-good-unit-test", - "href": "07_packaging.html#a-good-unit-test", - "title": "Distributing your Python package", - "section": "A good unit test", - "text": "A good unit 
test\n\nFully automated\nHas full control over all the pieces running (“fake” external dependencies)\nCan be run in any order\nRuns in memory (no DB or file access, for example)\nConsistently returns the same result (no random numbers)\nRuns fast\nTests a single logical concept in the system\nReadable\nMaintainable\nTrustworthy" + "objectID": "group_work/module_04.html", + "href": "group_work/module_04.html", + "title": "Python package development", + "section": "", + "text": "In progress…\nBack to overview" }, { - "objectID": "07_packaging.html#thank-you", - "href": "07_packaging.html#thank-you", - "title": "Distributing your Python package", - "section": "Thank you!", - "text": "Thank you!\n\n\n\nPython package development" + "objectID": "group_work/module_04.html#module-04", + "href": "group_work/module_04.html#module-04", + "title": "Python package development", + "section": "", + "text": "In progress…\nBack to overview" }, { - "objectID": "course_structure.html", - "href": "course_structure.html", + "objectID": "group_work/module_02.html", + "href": "group_work/module_02.html", "title": "Python package development", "section": "", - "text": "flowchart TD\n\n M1(Git, Pull Requests, and code reviews)\n M2(Python functions, classes, and modules)\n M3(Testing and auto-formatting)\n M4(Dependencies and GitHub actions)\n M5(Documentation)\n M6(Object oriented design in Python)\n M7(Distributing your package)\n\n B1[1. The bigger picture]\n B2[2. Separations of concern]\n B3[3. Abstraction and encapsulation]\n B4[4. Designing for high performance]\n B5[5. Testing your software]\n B6[6. Separations of concerns in practice]\n B7[7. Extensibility and flexibility]\n B8[8. The rules and exceptions of inheritance]\n B9[9. Keeping things lightweight]\n B10[10. 
Achieving loose coupling]\n\n M1 --> M2 --> M3 --> M4 --> M5 --> M6 --> M7\n\n B1 --> M2\n B2 --> M2\n B3 --> M6\n B8 --> M6\n B4 --> M4\n B5 --> M4\n B6 --> M5\n B7 --> M3\n\n B9 --> M7\n B10 --> M7" + "text": "Q1: In your course project homework, you refactored the script to use functions. How did it go?\nQ2: Classes. If you should introduce classes to improve the code, which classes should it be and why?\nQ3: [Optional] What are some problems with poorly designed code (based on your own experience or from the book)?\n\nBack to overview" }, { - "objectID": "06_oop.html#object-oriented-design", - "href": "06_oop.html#object-oriented-design", - "title": "Object oriented design in Python", - "section": "Object oriented design", - "text": "Object oriented design\nBenefits of object oriented design:\n\nEncapsulation\nCode reuse (composition, inheritance)\nAbstraction" + "objectID": "group_work/module_02.html#module-2", + "href": "group_work/module_02.html#module-2", + "title": "Python package development", + "section": "", + "text": "Q1: In your course project homework, you refactored the script to use functions. How did it go?\nQ2: Classes. 
If you should introduce classes to improve the code, which classes should it be and why?\nQ3: [Optional] What are some problems with poorly designed code (based on your own experience or from the book)?\n\nBack to overview" }, { - "objectID": "06_oop.html#encapsulation", - "href": "06_oop.html#encapsulation", - "title": "Object oriented design in Python", - "section": "Encapsulation", - "text": "Encapsulation\nclass Location:\n def __init__(self, name, longitude, latitude):\n self.name = name.upper() # Names are always uppercase\n self.longitude = longitude\n self.latitude = latitude\n\n>>> loc = Location(\"Antwerp\", 4.42, 51.22)\n>>> loc.name\n'ANTWERP'\n>>> loc.name = \"Antwerpen\"\n>>> loc.name\n\"Antwerpen\" 😟" + "objectID": "group_work/module_01.html", + "href": "group_work/module_01.html", + "title": "Python package development", + "section": "", + "text": "Study this script clean_project_data_v4_final2.py for 3 minutes\nConsider what you could do to improve it\nQ1: Discuss in your group how to improve the script.\nQ2: Version control. What is your experience with version control?\n\nThink about a project you’ve worked on in the past that involved collaborating with others on code. 
What challenges did you face, and how do you think Git and GitHub could have helped to address those challenges?\n\n\nBack to overview" }, { - "objectID": "06_oop.html#encapsulation---attributes", - "href": "06_oop.html#encapsulation---attributes", - "title": "Object oriented design in Python", - "section": "Encapsulation - Attributes", - "text": "Encapsulation - Attributes\nVariables prefixed with an underscore (self._name) is a convention to indicate that the instance variable is private.\nclass Location:\n def __init__(self, name, longitude, latitude):\n self._name = name.upper() # Names are always uppercase\n ...\n\n @property\n def name(self):\n return self._name\n\n @name.setter\n def name(self, value):\n self._name = value.upper()\n\n>>> loc = Location(\"Antwerp\", 4.42, 51.22)\n>>> loc.name = \"Antwerpen\"\n>>> loc.name\n\"ANTWERPEN\" 😊" + "objectID": "group_work/module_01.html#module-1", + "href": "group_work/module_01.html#module-1", + "title": "Python package development", + "section": "", + "text": "Study this script clean_project_data_v4_final2.py for 3 minutes\nConsider what you could do to improve it\nQ1: Discuss in your group how to improve the script.\nQ2: Version control. What is your experience with version control?\n\nThink about a project you’ve worked on in the past that involved collaborating with others on code. 
What challenges did you face, and how do you think Git and GitHub could have helped to address those challenges?\n\n\nBack to overview" }, { - "objectID": "06_oop.html#composition", - "href": "06_oop.html#composition", - "title": "Object oriented design in Python", - "section": "Composition", - "text": "Composition\n\n\nComposition in object oriented design is a way to combine objects or data types into more complex objects.\n\n\n\n\n\nclassDiagram\n\n class Grid{\n + nx\n + dx\n + ny\n + dy\n + find_index()\n }\n\n class ItemInfo{\n + name\n + type\n + unit\n }\n\n class DataArray{\n + data\n + time\n + item\n + geometry\n + plot()\n }\n\n DataArray --* Grid\n DataArray --* ItemInfo" + "objectID": "group_work/module_03.html", + "href": "group_work/module_03.html", + "title": "Python package development", + "section": "", + "text": "Q1: In your course project homework from last module, you implemented modules and classes, how did it go? Any reflections?\n\nQ2: What is you past experience with testing code?\nQ3: Things can go wrong when executing your code, how should you handle that? Check inputs? try-catch statements? What are pros and cons between different approaches?\n\nBack to overview" }, { - "objectID": "06_oop.html#composition---example", - "href": "06_oop.html#composition---example", - "title": "Object oriented design in Python", - "section": "Composition - Example", - "text": "Composition - Example\nclass Grid:\n def __init__(self, nx, dx, ny, dy):\n self.nx = nx\n self.dx = dx\n self.ny = ny\n self.dy = dy\n \n def find_index(self, x,y):\n ...\n\nclass DataArray:\n def __init__(self, data, time, item, geometry):\n self.data = data\n self.time = time\n self.item = item\n self.geometry = geometry\n\n def plot(self):\n ..." 
+ "objectID": "group_work/module_03.html#module-3", + "href": "group_work/module_03.html#module-3", + "title": "Python package development", + "section": "", + "text": "Q1: In your course project homework from last module, you implemented modules and classes, how did it go? Any reflections?\n\nQ2: What is you past experience with testing code?\nQ3: Things can go wrong when executing your code, how should you handle that? Check inputs? try-catch statements? What are pros and cons between different approaches?\n\nBack to overview" }, { - "objectID": "06_oop.html#inheritance", - "href": "06_oop.html#inheritance", - "title": "Object oriented design in Python", - "section": "Inheritance", - "text": "Inheritance" + "objectID": "02_function_classes.html#functions-as-black-boxes", + "href": "02_function_classes.html#functions-as-black-boxes", + "title": "Functions, classes and modules", + "section": "Functions as black boxes", + "text": "Functions as black boxes\n\n\n\n\nflowchart LR\n A(Input A) --> F[\"Black box\"]\n B(Input B) --> F\n F --> O(Output)\n\n style F fill:#000,color:#fff,stroke:#333,stroke-width:4px\n\n\n\n\n\n\n\nA function is a black box that takes some input and produces some output.\nThe input and output can be anything, including other functions.\nAs long as the input and output are the same, the function body can be modified." 
}, { - "objectID": "06_oop.html#inheritance---example", - "href": "06_oop.html#inheritance---example", - "title": "Object oriented design in Python", - "section": "Inheritance - Example", - "text": "Inheritance - Example\n\n\n\n\n\nclassDiagram\n\nclass _GeometryFM{\n+ node_coordinates\n+ element_table\n}\n\nclass GeometryFM2D{\n+ interp2d()\n+ get_element_area()\n+ plot()\n}\n\nclass _GeometryFMLayered{\n- _n_layers\n- _n_sigma\n+ to_2d_geometry()\n}\n\nclass GeometryFM3D{\n+ plot()\n}\n\nclass GeometryFMVerticalProfile{\n+ plot()\n}\n _GeometryFM <|-- GeometryFM2D\n _GeometryFM <|-- _GeometryFMLayered\n _GeometryFMLayered <|-- GeometryFM3D\n _GeometryFMLayered <|-- GeometryFMVerticalProfile" + "objectID": "02_function_classes.html#pure-functions", + "href": "02_function_classes.html#pure-functions", + "title": "Functions, classes and modules", + "section": "Pure functions", + "text": "Pure functions\nA pure function returns the same output for the same input.\ndef f(x)\n return x**2\n\n>> f(2)\n4\n>> f(2)\n4" }, { - "objectID": "06_oop.html#inheritance---example-2", - "href": "06_oop.html#inheritance---example-2", - "title": "Object oriented design in Python", - "section": "Inheritance - Example (2)", - "text": "Inheritance - Example (2)\nclass _GeometryFMLayered(_GeometryFM):\n def __init__(self, nodes, elements, n_layers, n_sigma):\n # call the parent class init method\n super().__init__(\n nodes=nodes,\n elements=elements,\n )\n self._n_layers = n_layers\n self._n_sigma = n_sigma" + "objectID": "02_function_classes.html#side-effects", + "href": "02_function_classes.html#side-effects", + "title": "Functions, classes and modules", + "section": "Side effects", + "text": "Side effects\nA function can have side effects (besides returning a value)\ndef f_with_side_effect(x):\n with open(\"output.txt\", \"a\") as f:\n f.write(str(x))\n return x**2\n\nThe function has x as input, returns the square of x, but also appends x to a file. 
If you run the function a second time, the file will contain two lines." }, { - "objectID": "06_oop.html#composition-vs-inheritance", - "href": "06_oop.html#composition-vs-inheritance", - "title": "Object oriented design in Python", - "section": "Composition vs inheritance", - "text": "Composition vs inheritance\n\n\nInheritance is often used to reuse code, but this is not the main purpose of inheritance.\nInheritance is used to specialize behavior.\nIn most cases, composition is a better choice than inheritance.\nSome recent programming languages (e.g. Go & Rust) do not support this style of inheritance.\nUse inheritance only when it makes sense.\n\n\n\n\nHillard, 2020, Ch. 8 “The rules (and exceptions) of inheritance”" + "objectID": "02_function_classes.html#side-effects-1", + "href": "02_function_classes.html#side-effects-1", + "title": "Functions, classes and modules", + "section": "Side effects", + "text": "Side effects\nPure functions without side effects are easier to reason about.\nBut sometimes side effects are necessary.\n\nWriting to a file\nWriting to a database\nPrinting to the screen\nCreating a plot" }, { - "objectID": "06_oop.html#types", - "href": "06_oop.html#types", - "title": "Object oriented design in Python", - "section": "Types", - "text": "Types\nC#\nint n = 2;\nString s = \"Hello\";\n\npublic String RepeatedString(String s, int n) {\n return Enumerable.Repeat(s, n).Aggregate((a, b) => a + b);\n}\n\nPython\nn = 2\ns = \"Hello\"\n\ndef repeated_string(s, n):\n return s * n" + "objectID": "02_function_classes.html#modifying-input-arguments", + "href": "02_function_classes.html#modifying-input-arguments", + "title": "Functions, classes and modules", + "section": "Modifying input arguments", + "text": "Modifying input arguments\ndef difficult_function(values):\n for i in range(len(values)):\n values[i] = min(0, values[i]) # 😟\n return values\n\n>>> x = [1,2,-1]\n>>> difficult_function(x)\n[0, 0, -1]\n>>> x\n[0, 0, -1]\n\nThis function modifies 
the input array, which might come as a surprise. The array is passed by reference, so the function can modify it." }, { - "objectID": "06_oop.html#types-1", - "href": "06_oop.html#types-1", - "title": "Object oriented design in Python", - "section": "Types", - "text": "Types\n\n\nPython is a dynamically typed language\nTypes are not checked at compile time\nTypes are checked at runtime\n\n\n\nPython with type hints\nn: int = 2\ns: str = \"Hello\"\n\ndef repeated_string(s:str, n:int) -> str:\n return s * n" + "objectID": "02_function_classes.html#positional-arguments", + "href": "02_function_classes.html#positional-arguments", + "title": "Functions, classes and modules", + "section": "Positional arguments", + "text": "Positional arguments\ndef f(x, y):\n return x + y\n\n>>> f(1, 2)\n3" }, { - "objectID": "06_oop.html#abstraction", - "href": "06_oop.html#abstraction", - "title": "Object oriented design in Python", - "section": "Abstraction", - "text": "Abstraction\n\n\nVersion A\ntotal = 0.0\nfor x in values:\n total = total +x\n\nVersion B\ntotal = sum(values)\n\n\n\n\n\nUsing functions, e.g. sum() allows us to operate on a higher level of abstraction.\nToo little abstraction will force you to write many lines of boiler-plate code\nToo much abstraction limits the flexibility\n✨Find the right level of abstraction!✨\n\n\n\n\nWhich version is easiest to understand?\nWhich version is easiest to change?" 
+ "objectID": "02_function_classes.html#keyword-arguments", + "href": "02_function_classes.html#keyword-arguments", + "title": "Functions, classes and modules", + "section": "Keyword arguments", + "text": "Keyword arguments\ndef f(x, y):\n return x + y\n\n>>> f(x=1, y=2)\n3" }, { - "objectID": "06_oop.html#collections-abstract-base-classes", - "href": "06_oop.html#collections-abstract-base-classes", - "title": "Object oriented design in Python", - "section": "Collections Abstract Base Classes", - "text": "Collections Abstract Base Classes\n\n\n\n\nclassDiagram\n Container <|-- Collection\n Sized <|-- Collection\n Iterable <|-- Collection\n \n class Container{\n __contains__(self, x)\n }\n\n class Sized{\n __len__(self)\n }\n\n class Iterable{\n __iter__(self)\n }\n\n\n\n\n\n\n\n\nIf a class implements __len__ it is a Sized object.\nIf a class implements __contains__ it is a Container object.\nIf a class implements __iter__ it is a Iterable object." + "objectID": "02_function_classes.html#positional-arguments-1", + "href": "02_function_classes.html#positional-arguments-1", + "title": "Functions, classes and modules", + "section": "Positional arguments", + "text": "Positional arguments\n\n\nVersion 1\ndef is_operable(height, period):\n\n return height < 2.0 and period < 6.0\n\n>>> is_operable(1.0, 3.0)\nTrue\n\nVersion 2\ndef is_operable(period, height=0.0):\n # dont forget, that arguments are swapped 👍\n return height < 2.0 and period < 6.0\n\n>>> is_operable(1.0, 3.0)\nFalse 😟\n\n\n\nThe order of the arguments is swapped, since we want to make height an optional argument (more on that later). This breaks existing code, since the order of the arguments is changed." 
}, { - "objectID": "06_oop.html#collections-abstract-base-classes-1", - "href": "06_oop.html#collections-abstract-base-classes-1", - "title": "Object oriented design in Python", - "section": "Collections Abstract Base Classes", - "text": "Collections Abstract Base Classes\n>>> a = [1, 2, 3]\n>>> 1 in a\nTrue\n>>> a.__contains__(1)\nTrue\n>>> len(a)\n3\n>>> a.__len__()\n3\n>>> for x in a:\n... v.append(x)\n>>> it = a.__iter__()\n>>> next(it)\n1\n>>> next(it)\n2\n>>> next(it)\n3\n>>> next(it)\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nStopIteration" + "objectID": "02_function_classes.html#keyword-only-arguments", + "href": "02_function_classes.html#keyword-only-arguments", + "title": "Functions, classes and modules", + "section": "Keyword-only arguments", + "text": "Keyword-only arguments\ndef f(*, x, y):\n return x + y\n\n>>> f(1,2)\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nTypeError: f() takes 0 positional arguments but 2 were given" }, { - "objectID": "06_oop.html#collections-abstract-base-classes-2", - "href": "06_oop.html#collections-abstract-base-classes-2", - "title": "Object oriented design in Python", - "section": "Collections Abstract Base Classes", - "text": "Collections Abstract Base Classes\n\n\n\n\n\nclassDiagram\n Container <|-- Collection\n Sized <|-- Collection\n Iterable <|-- Collection\n Collection <|-- Sequence\n Collection <|-- Set\n Sequence <|-- MutableSequence\n Mapping <|-- MutableMapping\n Collection <|-- Mapping\n\n MutableSequence <|-- List\n Sequence <|-- Tuple\n MutableMapping <|-- Dict" + "objectID": "02_function_classes.html#optionaldefault-arguments", + "href": "02_function_classes.html#optionaldefault-arguments", + "title": "Functions, classes and modules", + "section": "Optional(=default) arguments", + "text": "Optional(=default) arguments\ndef f(x, n=2):\n return x**n\n\n>>> f(2)\n4\n>>> f(2, n=3)\n8\n\nMakes it easy to use a function with many arguments." 
}, { - "objectID": "06_oop.html#pythonic", - "href": "06_oop.html#pythonic", - "title": "Object oriented design in Python", - "section": "Pythonic", - "text": "Pythonic\nIf you want your code to be Pythonic, you have to be familiar with these types and their methods.\nDundermethods:\n\n__getitem__\n__setitem__\n__len__\n__contains__\n…" + "objectID": "02_function_classes.html#mutable-default-arguments", + "href": "02_function_classes.html#mutable-default-arguments", + "title": "Functions, classes and modules", + "section": "Mutable default arguments", + "text": "Mutable default arguments\nPython’s default arguments are evaluated once when the function is defined, not each time the function is called.\n\ndef add_to_cart(x, cart=[]): # this line is evaluated only once 😮\n cart.append(x)\n return cart\n\n>>> add_to_cart(1, cart=[2])\n[2, 1]\n\n>>> add_to_cart(1)\n[1]\n>>> add_to_cart(2)\n[1, 2]\n\nPython’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well." }, { - "objectID": "06_oop.html#duck-typing", - "href": "06_oop.html#duck-typing", - "title": "Object oriented design in Python", - "section": "Duck typing", - "text": "Duck typing\n\n\n“If it walks like a duck and quacks like a duck, it’s a duck”\nFrom the perspective of the caller, it doesn’t matter if it is a rubber duck or a real duck.\nThe type of the object is not important, as long as it has the right methods.\nPython is different than C# or Java, where you would have to create an interface IToolbox and implement it for Toolbox." 
+ "objectID": "02_function_classes.html#how-to-use-default-mutable-arguments", + "href": "02_function_classes.html#how-to-use-default-mutable-arguments", + "title": "Functions, classes and modules", + "section": "How to use default (mutable) arguments", + "text": "How to use default (mutable) arguments\ndef add_to_cart_safe(x, cart=None):\n if cart is None:\n cart = [] # this line is evaluated each time the function is called\n cart.append(x)\n return cart" }, { - "objectID": "06_oop.html#duck-typing---example", - "href": "06_oop.html#duck-typing---example", - "title": "Object oriented design in Python", - "section": "Duck typing - Example", - "text": "Duck typing - Example\nAn example is a Scikit learn transformers\n\nfit\ntransform\nfit_transform\n\nIf you want to make a transformer compatible with sklearn, you have to implement these methods." + "objectID": "02_function_classes.html#changing-return-types", + "href": "02_function_classes.html#changing-return-types", + "title": "Functions, classes and modules", + "section": "Changing return types", + "text": "Changing return types\nSince Python is a dynamic language, the type of the returned variable is allowed to vary.\ndef foo(x):\n if x >=0:\n return x\n else:\n return \"x is negative\"\n\nBut it usually a bad idea, since you can not tell from reading the code, which type will be returned." 
}, { - "objectID": "06_oop.html#duck-typing---example-1", - "href": "06_oop.html#duck-typing---example-1", - "title": "Object oriented design in Python", - "section": "Duck typing - Example", - "text": "Duck typing - Example\nclass PositiveNumberTransformer:\n\n def fit(self, X, y=None):\n # no need to fit (still need to have the method!)\n return self\n\n def transform(self, X):\n return np.abs(X)\n\n def fit_transform(self, X, y=None):\n return self.fit(X, y).transform(X)" + "objectID": "02_function_classes.html#changing-return-types-1", + "href": "02_function_classes.html#changing-return-types-1", + "title": "Functions, classes and modules", + "section": "Changing return types", + "text": "Changing return types\ndef is_operable(height, period):\n if height < 10:\n return height < 5.0 and period > 4.0\n else:\n return \"No way!\"\n\n>>> if is_operable(height=12.0, period=5.0):\n... print(\"Go ahead!\")\n...\nGo ahead!\n\n\n\n\n\n\n\nImportant\n\n\nIs this the result you expected?\n\n\n\n\n\nA non-empty string or a non-zero value is considered “truthy” in Python!" 
}, { - "objectID": "06_oop.html#duck-typing---mixins", - "href": "06_oop.html#duck-typing---mixins", - "title": "Object oriented design in Python", - "section": "Duck typing - Mixins", - "text": "Duck typing - Mixins\nWe can inherit some behavior from sklearn.base.TransformerMixin\nfrom sklearn.base import TransformerMixin\n\nclass RemoveOutliersTransformer(TransformerMixin):\n\n def __init__(self, lower_bound, upper_bound):\n self.lower_bound = lower_bound\n self.upper_bound = upper_bound\n self.lower_ = None\n self.upper_ = None\n\n def fit(self, X, y=None):\n self.lower_ = np.quantile(X, self.lower_bound)\n self.upper_ = np.quantile(X, self.upper_bound)\n\n def transform(self, X):\n return np.clip(X, self.lower_, self.upper_)\n\n # def fit_transform(self, X, y=None):\n # we get this for free, from TransformerMixin" + "objectID": "02_function_classes.html#type-hints", + "href": "02_function_classes.html#type-hints", + "title": "Functions, classes and modules", + "section": "Type hints", + "text": "Type hints\nPython is a dynamically typed language -> the type of a variable is determined at runtime.\n\nBut we can add type hints to help the reader (and the code editor).\ndef is_operable(height: float, period: float) -> bool:\n ..." }, { - "objectID": "06_oop.html#lets-revisit-the-date-interval", - "href": "06_oop.html#lets-revisit-the-date-interval", - "title": "Object oriented design in Python", - "section": "Let’s revisit the (date) Interval", - "text": "Let’s revisit the (date) Interval\nThe Interval class represent an interval in time.\nclass Interval:\n def __init__(self, start, end):\n self.start = start\n self.end = end\n\n def __contains__(self, x):\n return self.start < x < self.end\n\n>>> dr = Interval(date(2020, 1, 1), date(2020, 1, 31))\n\n>>> date(2020,1,15) in dr\nTrue\n>>> date(1970,1,1) in dr\nFalse\n\nWhat if we want to make another type of interval, e.g. a interval of numbers \\([1.0, 2.0]\\)?" 
+ "objectID": "02_function_classes.html#classes", + "href": "02_function_classes.html#classes", + "title": "Functions, classes and modules", + "section": "Classes", + "text": "Classes\nclass WeirdToolbox:\n tools = [] # class variable ☹️\n\n\n>>> t1 = WeirdToolbox()\n>>> t1.tools.append(\"hammer\")\n>>> t1.tools\n[\"hammer\"]\n\n>>> t2 = WeirdToolbox()\n>>> t2.tools.append(\"screwdriver\")\n>>> t2.tools\n[\"hammer\", \"screwdriver\"]\n\nClass variables are rarely what you want, since they are shared between all instances of the class." }, { - "objectID": "06_oop.html#a-number-interval", - "href": "06_oop.html#a-number-interval", - "title": "Object oriented design in Python", - "section": "A number interval", - "text": "A number interval\nclass Interval:\n def __init__(self, start, end):\n self.start = start\n self.end = end\n\n def __contains__(self, x):\n return self.start < x < self.end\n \n>>> interval = Interval(5, 10)\n\n>>> 8 in interval\nTrue\n>>> 12 in interval\nFalse\n\nAs long as the start, end and x are comparable, the Interval class is a generic class able to handle integers, floats, dates, datetimes, strings …" + "objectID": "02_function_classes.html#classes-1", + "href": "02_function_classes.html#classes-1", + "title": "Functions, classes and modules", + "section": "Classes", + "text": "Classes\nclass Toolbox:\n def __init__(self):\n self.tools = [] # instance variable 😃\n\n>>> t1 = Toolbox()\n>>> t1.tools.append(\"hammer\")\n>>> t1.tools\n[\"hammer\"]\n\n>>> t2 = Toolbox()\n>>> t2.tools.append(\"screwdriver\")\n>>> t2.tools\n[\"screwdriver\"]\n\nInstance variables are created when the instance is created, and are unique to each instance." }, { - "objectID": "06_oop.html#postels-law", - "href": "06_oop.html#postels-law", - "title": "Object oriented design in Python", - "section": "Postel’s law", - "text": "Postel’s law\na.k.a. 
the Robustness principle of software design\n\nBe liberal in what you accept\nBe conservative in what you send\n\n\ndef process(number: Union[int,str,float]) -> int:\n # make sure number is an int from now on\n number = int(number)\n\n result = number * 2\n return result" + "objectID": "02_function_classes.html#static-methods", + "href": "02_function_classes.html#static-methods", + "title": "Functions, classes and modules", + "section": "Static methods", + "text": "Static methods\nfrom datetime import date\n\nclass Interval:\n def __init__(self, start:date, end:date):\n self.start = start\n self.end = end\n\n>>> dr = Interval(date(2020, 1, 1), date(2020, 1, 31))\n>>> dr.start\ndatetime.date(2020, 1, 1)\n>>> dr.end\ndatetime.date(2020, 1, 31)\n\nHere is an example of useful class, but it is a bit cumbersome to create an instance." }, { - "objectID": "06_oop.html#section", - "href": "06_oop.html#section", - "title": "Object oriented design in Python", - "section": "", - "text": "The consumers of your package (future self), will be grateful if you are not overly restricitive in what types you accept as input." + "objectID": "02_function_classes.html#static-methods-1", + "href": "02_function_classes.html#static-methods-1", + "title": "Functions, classes and modules", + "section": "Static methods", + "text": "Static methods\nfrom datetime import date\n\nclass Interval:\n def __init__(self, start:date, end:date):\n self.start = start\n self.end = end\n\n @staticmethod\n def from_string(date_string):\n start_str, end_str = date_string.split(\"|\")\n start = date.fromisoformat(start_str)\n end = date.fromisoformat(end_str)\n return Interval(start, end)\n\n>>> dr = Interval.from_string(\"2020-01-01|2020-01-31\")\n>>> dr\n<__main__.Interval at 0x7fb99efcfb90>\n\nSince we commonly use ISO formatted dates separated by a pipe, we can add a static method to create an instance from a string. This makes it easier to create an instance." 
}, { - "objectID": "06_oop.html#refactoring", - "href": "06_oop.html#refactoring", - "title": "Object oriented design in Python", - "section": "Refactoring", - "text": "Refactoring\n\n\nRefactoring is a way to improve the design of existing code\nChanging a software system in such a way that it does not alter the external behavior of the code, yet improves its internal structure\nRefactoring is a way to make code more readable and maintainable\nHousekeeping" + "objectID": "02_function_classes.html#dataclasses", + "href": "02_function_classes.html#dataclasses", + "title": "Functions, classes and modules", + "section": "Dataclasses", + "text": "Dataclasses\nfrom dataclasses import dataclass\n\n@dataclass\nclass Interval:\n start: date\n end: date\n\n @staticmethod\n def from_string(date_string):\n start_str, end_str = date_string.split(\"|\")\n start = date.fromisoformat(start_str)\n end = date.fromisoformat(end_str)\n return Interval(start, end)\n\n>>> dr = Interval.from_string(\"2020-01-01|2020-01-31\")\n>>> dr\nInterval(start=datetime.date(2020, 1, 1), end=datetime.date(2020, 1, 31))\n\nDataclasses are a new feature in Python 3.7, they are a convenient way to create classes with a few attributes. The variables are instance variables, and the class has a constructor that takes the same arguments as the variables." 
}, { - "objectID": "06_oop.html#common-refactoring-techniques", - "href": "06_oop.html#common-refactoring-techniques", - "title": "Object oriented design in Python", - "section": "Common refactoring techniques:", - "text": "Common refactoring techniques:\n\nExtract method\nExtract variable\nRename method\nRename variable\nRename class\nInline method\nInline variable\nInline class" + "objectID": "02_function_classes.html#equality", + "href": "02_function_classes.html#equality", + "title": "Functions, classes and modules", + "section": "Equality", + "text": "Equality\nOn a regular class, equality is based on the memory address of the object.\nclass Interval:\n def __init__(self, start:date, end:date):\n self.start = start\n self.end = end\n\n>>> dr1 = Interval(start=date(2020, 1, 1), end=date(2020, 1, 31))\n>>> dr2 = Interval(start=date(2020, 1, 1), end=date(2020, 1, 31))\n>>> dr1 == dr2\nFalse\n\nThis is not very useful, since we want to compare the values of the attributes." }, { - "objectID": "06_oop.html#rename-variable", - "href": "06_oop.html#rename-variable", - "title": "Object oriented design in Python", - "section": "Rename variable", - "text": "Rename variable\nBefore\nn = 0\nfor v in y:\n if v < 0:\n n = n + 1\n\nAfter\nFREEZING_POINT = 0.0\nn_freezing_days = 0\nfor temp in daily_max_temperatures:\n if temp < FREEZING_POINT:\n n_freezing_days = n_freezing_days + 1" + "objectID": "02_function_classes.html#equality-1", + "href": "02_function_classes.html#equality-1", + "title": "Functions, classes and modules", + "section": "Equality", + "text": "Equality\nclass Interval:\n def __init__(self, start:date, end:date):\n self.start = start\n self.end = end\n\n def __eq__(self, other):\n return self.start == other.start and self.end == other.end\n\n>>> dr1 = Interval(start=date(2020, 1, 1), end=date(2020, 1, 31))\n>>> dr2 = Interval(start=date(2020, 1, 1), end=date(2020, 1, 31))\n>>> dr1 == dr2\nTrue\n\nWe can override the __eq__ method to compare the values of 
the attributes." }, { - "objectID": "06_oop.html#extract-variable", - "href": "06_oop.html#extract-variable", - "title": "Object oriented design in Python", - "section": "Extract variable", - "text": "Extract variable\nBefore\ndef predict(x):\n return min(0.0, 0.5 + 2.0 * min(0,x) + (random.random() - 0.5) / 10.0)\n\nAfter\ndef predict(x):\n scale = 10.0\n error = (random.random() - 0.5) / scale)\n a = 0.5\n b = 2.0 \n draft = a + b * x + error\n return min(0.0, draft)" + "objectID": "02_function_classes.html#data-classes", + "href": "02_function_classes.html#data-classes", + "title": "Functions, classes and modules", + "section": "Data classes", + "text": "Data classes\nfrom dataclasses import dataclass, field\n\n@dataclass\nclass Quantity:\n unit: str = field(compare=True)\n standard_name: field(compare=True)\n name: str = field(compare=False, default=None)\n\n\n>>> t1 = Quantity(name=\"temp\", unit=\"C\", standard_name=\"air_temperature\")\n>>> t2 = Quantity(name=\"temperature\", unit=\"C\", standard_name=\"air_temperature\")\n\n>>> t1 == t2\nTrue\n\n>>> d1 = Quantity(unit=\"m\", standard_name=\"depth\")\n>>> d1 == t2\nFalse" }, { - "objectID": "06_oop.html#extract-method", - "href": "06_oop.html#extract-method", - "title": "Object oriented design in Python", - "section": "Extract method", - "text": "Extract method\ndef error(scale):\n return (random.random() - 0.5) / scale)\n\ndef linear_model(x, *, a=0.0, b=1.0):\n return a + b * x\n\ndef clip(x, *, min_value=0.0):\n return min(min_value, x)\n\ndef predict(x): \n draft = linear_model(x, a=0.5, b=2.0) + error(scale=10.0)\n return clip(draft, min_value=0.)" + "objectID": "02_function_classes.html#data-classes-1", + "href": "02_function_classes.html#data-classes-1", + "title": "Functions, classes and modules", + "section": "Data classes", + "text": "Data classes\n\n\nCompact notation of fields with type hints\nEquality based on values of fields\nUseful string represenation by default\nIt is still a regular class" 
}, { - "objectID": "06_oop.html#inline-method", - "href": "06_oop.html#inline-method", - "title": "Object oriented design in Python", - "section": "Inline method", - "text": "Inline method\nOpposite of extract mehtod.\ndef predict(x): \n draft = linear_model(x, a=0.5, b=2.0) + error(scale=10.0)\n return min(0.0, x)" + "objectID": "02_function_classes.html#modules", + "href": "02_function_classes.html#modules", + "title": "Functions, classes and modules", + "section": "Modules", + "text": "Modules\nModules are files containing Python code (functions, classes, constants) that belong together.\n$ tree analytics/\nanalytics/\n├── __init__.py\n├── date.py\n└── tools.py\n\nThe analytics package contains two modules:\n\ntools module\ndate module" }, { - "objectID": "06_oop.html#composed-method", - "href": "06_oop.html#composed-method", - "title": "Object oriented design in Python", - "section": "Composed method", - "text": "Composed method\nBreak up a long method into smaller methods." + "objectID": "02_function_classes.html#packages", + "href": "02_function_classes.html#packages", + "title": "Functions, classes and modules", + "section": "Packages", + "text": "Packages\n\n\nA package is a directory containing modules\nEach package in Python is a directory which MUST contain a special file called __init__.py\nThe __init__.py can be empty, and it indicates that the directory containing it is a Python package\n__init__.py can also execute initialization code" }, { - "objectID": "06_oop.html#composed-method-1", - "href": "06_oop.html#composed-method-1", - "title": "Object oriented design in Python", - "section": "Composed method", - "text": "Composed method\n\nDivide your program into methods that perform one identifiable task\nKeep all of the operations in a method at the same level of abstraction.\nThis will naturally result in programs with many small methods, each a few lines long.\nWhen you use Extract method a bunch of times on a method the original method becomes a 
Composed method." + "objectID": "02_function_classes.html#init__.py", + "href": "02_function_classes.html#init__.py", + "title": "Functions, classes and modules", + "section": "__init__.py", + "text": "__init__.py\nExample: mikeio/pfs/__init__.py:\nfrom .pfsdocument import Pfs, PfsDocument\nfrom .pfssection import PfsNonUniqueList, PfsSection\n\ndef read_pfs(filename, encoding=\"cp1252\", unique_keywords=False):\n \"\"\"Read a pfs file for further analysis/manipulation\"\"\"\n \n return PfsDocument(filename, encoding=encoding, unique_keywords=unique_keywords)\n\nThe imports in __init__.py let you separate the implementation into multiple files.\n>>> mikeio.pfs.pfssection.PfsSection\n<class 'mikeio.pfs.pfssection.PfsSection'>\n>>> mikeio.pfs.PfsSection\n<class 'mikeio.pfs.pfssection.PfsSection'>\n\nPfsSection and PfsDocument are imported from the pfssection.py and pfsdocument.py modules into the mikeio.pfs namespace." }, { - "objectID": "index.html", - "href": "index.html", - "title": "Python package development", - "section": "", - "text": "Introduction" + "objectID": "02_function_classes.html#python-naming-conventions", + "href": "02_function_classes.html#python-naming-conventions", + "title": "Functions, classes and modules", + "section": "Python naming conventions", + "text": "Python naming conventions\nBy adhering to the naming conventions, your code will be easier to read for other Python developers.\n\nvariables, functions and methods: lowercase_with_underscores\nclasses: CamelCase\nconstants: UPPERCASE_WITH_UNDERSCORES" }, { - "objectID": "index.html#learning-modules", - "href": "index.html#learning-modules", - "title": "Python package development", - "section": "Learning modules", - "text": "Learning modules\n\nGit, Pull Requests, and code reviews\n\nHomework\n\nPython functions, classes, and modules\n\nHomework\n\nTesting and auto-formatting\n\nHomework\n\nDependencies and GitHub actions\n\nHomework\n\nDocumentation\n\nHomework\n\nObject oriented 
design in Python\n\nHomework\n\nDistributing your package\n\nHomework\n\n\n©️ DHI 2023" + "objectID": "02_function_classes.html#variables-function-and-method-names", + "href": "02_function_classes.html#variables-function-and-method-names", + "title": "Functions, classes and modules", + "section": "Variables, function and method names", + "text": "Variables, function and method names\n\nUse lowercase characters\nSeparate words with underscores\n\n\nmodel_name = \"NorthSeaModel\"\nn_epochs = 100\n\ndef my_function():\n pass" }, { - "objectID": "group_work/index.html", - "href": "group_work/index.html", - "title": "On-line Group Discussion", - "section": "", - "text": "On-line Group Discussion\nRelated to the time series cleaning project.\n\nModule 1\nModule 2\nModule 3\nModule 4" + "objectID": "02_function_classes.html#constants", + "href": "02_function_classes.html#constants", + "title": "Functions, classes and modules", + "section": "Constants", + "text": "Constants\n\nUse all uppercase characters\n\nGRAVITY = 9.81\n\nAVOGADRO_CONSTANT = 6.02214076e23\n\nSECONDS_IN_A_DAY = 86400\n\nN_LEGS_PER_ANIMAL = {\n \"human\": 2,\n \"dog\": 4,\n \"spider\": 8,\n}\n\nPython will not prevent you from changing the value of a constant, but it is a convention to use all uppercase characters for constants." 
}, { - "objectID": "group_work/module_04.html", - "href": "group_work/module_04.html", - "title": "Python package development", - "section": "", - "text": "In progress…\nBack to overview" + "objectID": "02_function_classes.html#classes-2", + "href": "02_function_classes.html#classes-2", + "title": "Functions, classes and modules", + "section": "Classes", + "text": "Classes\n\nUse CamelCase for the name of the class\nUse lowercase characters for the name of the methods\nSeparate words with underscores\n\n\nclass RandomClassifier: # CamelCase ✅\n\n def fit(self, X, y):\n self.classes_ = np.unique(y)\n\n def predict(self, X):\n return np.random.choice(self.classes_, size=len(X))\n\n def fit_predict(self, X, y): # lowercase ✅\n self.fit(X, y)\n return self.predict(X)\n\n\n\nPython package development" }, { - "objectID": "group_work/module_04.html#module-04", - "href": "group_work/module_04.html#module-04", - "title": "Python package development", - "section": "", - "text": "In progress…\nBack to overview" + "objectID": "01_version_control.html#why-use-version-control", + "href": "01_version_control.html#why-use-version-control", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Why use version control?", + "text": "Why use version control?\n\n\n\n\n\nManage changes to code over time\nKeep track of changes and revert to previous versions if needed.\nCollaborate and merge changes from different people\nEnsure code stability\nBest practice for software development" }, { - "objectID": "05_documentation.html#why-document-your-code", - "href": "05_documentation.html#why-document-your-code", - "title": "Documentation", - "section": "Why document your code?", - "text": "Why document your code?\n\n\n\nMake it easier for others to use your code\nMake it easier for you to use your code" + "objectID": "01_version_control.html#centralized-version-control", + "href": "01_version_control.html#centralized-version-control", + "title": "Git, GitHub, Pull Requests, 
and code reviews", + "section": "Centralized version control", + "text": "Centralized version control\n\nSingle source with the entire history\nLocal copy with latest version . . .\nExamples: SVN, Surround" }, { - "objectID": "05_documentation.html#readme.md", - "href": "05_documentation.html#readme.md", - "title": "Documentation", - "section": "Readme.md", - "text": "Readme.md\n\nA readme file is a text file that introduces and explains a project.\nAlways include a readme file in your project.\nYou can put readme files in any directory, and you can have more than one in a single project." + "objectID": "01_version_control.html#distributed-version-control", + "href": "01_version_control.html#distributed-version-control", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Distributed version control", + "text": "Distributed version control\n\nLocal copy has the entire history\nCommit changes to code offline\nAuthoritative source (origin) . . .\nExamples: Git, Mercurial" }, { - "objectID": "05_documentation.html#requirements", - "href": "05_documentation.html#requirements", - "title": "Documentation", - "section": "Requirements", - "text": "Requirements\n\nMention the requirements for your package\n\nOperating system\nPython version\nOther non-Python dependencies, e.g. 
VC++ redistributables\n\nInclude information on how to install your package\n\npip install my_package\npip install pip install https://github.com/DHI/{repo}/archive/main.zip" + "objectID": "01_version_control.html#git", + "href": "01_version_control.html#git", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Git", + "text": "Git\nGit is a powerful tool for managing code changes and collaborating with others on a project.\n\nYou can use Git from the command line, or with a graphical user interface (GUI).\n\n\n> git add foo.py\n\n\n> git commit -m \"Nailed it\"\n\n\n> git push" }, { - "objectID": "05_documentation.html#notebooks", - "href": "05_documentation.html#notebooks", - "title": "Documentation", - "section": "Notebooks", - "text": "Notebooks\n\nJupyter notebooks are a great way to document your code\nGood for prototyping\nIn a later stage, notebooks can be used to demonstrate how to use your code\nNot a replacement for documentation for a professional package" + "objectID": "01_version_control.html#basic-git-commands", + "href": "01_version_control.html#basic-git-commands", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Basic Git commands", + "text": "Basic Git commands\n\n\ngit add: adds a file to the staging area\ngit commit: creates a new commit with the changes in the staging area\ngit status: shows the current status of your repository\ngit log: shows the commit history of your repository\ngit stash: temporarily save changes that are not ready to be committed" }, { - "objectID": "05_documentation.html#docstrings", - "href": "05_documentation.html#docstrings", - "title": "Documentation", - "section": "Docstrings", - "text": "Docstrings\n\"\"\"K-means clustering.\"\"\"\n\nclass KMeans(_BaseKMeans):\n \"\"\"K-Means clustering.\n \n Parameters\n ----------\n n_clusters : int, default=8\n The number of clusters to form as well as the number of\n centroids to generate.\n\n Examples\n --------\n >>> X = 
np.array([[1, 2], [1, 4], [1, 0],\n ... [10, 2], [10, 4], [10, 0]])\n >>> kmeans = KMeans(n_clusters=2, random_state=0, n_init=\"auto\").fit(X)\n >>> kmeans.labels_\n array([1, 1, 1, 0, 0, 0], dtype=int32)\nsklearn.KMeans" + "objectID": "01_version_control.html#working-with-remote-repositories", + "href": "01_version_control.html#working-with-remote-repositories", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Working with remote repositories", + "text": "Working with remote repositories\n\n\ngit clone: creates a copy of the codebase on your local machine.\ngit push: pushes changes back to the remote repository.\ngit pull: pulls changes from the remote repository." }, { - "objectID": "05_documentation.html#docstring---numpy-format", - "href": "05_documentation.html#docstring---numpy-format", - "title": "Documentation", - "section": "Docstring - Numpy format", - "text": "Docstring - Numpy format\ndef function_name(param1, param2, param3):\n \"\"\"Short summary.\n \n Long description.\n \n Parameters\n ----------\n param1 : int\n Description of `param1`.\n param2 : str\n Description of `param2`.\n param3 : list of str\n Description of `param3`.\n \n Returns\n -------\n bool\n Description of return value.\n \"\"\"\n pass\n\nThere are several docstring formats. The most common is the numpy format, used by scikit-learn, pandas, numpy, scipy, etc." 
+ "objectID": "01_version_control.html#branching-and-merging", + "href": "01_version_control.html#branching-and-merging", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Branching and Merging", + "text": "Branching and Merging\n\nA branch is a separate version of your code that you can work on independently from the main branch.\ngit merge: merges changes back into the main branch (we will do this from GitHub)" }, { - "objectID": "05_documentation.html#type-hints", - "href": "05_documentation.html#type-hints", - "title": "Documentation", - "section": "Type hints", - "text": "Type hints\nFrom Python 3.6, type hints can be used in addition to the type in the docstring.\ndef remove_outlier(data:pd.DataFrame, column:str, threshold:float=3) -> pd.DataFrame:\n \"\"\"Remove outliers from a dataframe.\n \n Parameters\n ----------\n data : pd.DataFrame\n Dataframe to remove outliers from.\n column : str\n Column to remove outliers from.\n threshold : float, optional\n Number of standard deviations to use as threshold, by default 3" + "objectID": "01_version_control.html#git-hosting-platforms", + "href": "01_version_control.html#git-hosting-platforms", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Git hosting platforms", + "text": "Git hosting platforms" }, { - "objectID": "05_documentation.html#doctest", - "href": "05_documentation.html#doctest", - "title": "Documentation", - "section": "doctest", - "text": "doctest\nUsing code without documentation is hard, but using code with wrong documentation is even harder.\nHow can you make sure that the documentation is correct?\n\nThe answer is the doctest module built in to the Python standard library.\n\n\n\n\n\n\n\n\nTip\n\n\nThe extensive standard library is why Python is described as a language with “batteries included!”" + "objectID": "01_version_control.html#github", + "href": "01_version_control.html#github", + "title": "Git, GitHub, Pull Requests, and code reviews", + 
"section": "GitHub", + "text": "GitHub\n\n\nGit repository hosting service\nCollaborate with others on codebase\nFork a repository to work on your own version\nPull requests for code review and merging changes\nIssue tracking and project management tools\nGitHub Pages for hosting websites" }, { - "objectID": "05_documentation.html#documentation-generators", - "href": "05_documentation.html#documentation-generators", - "title": "Documentation", - "section": "Documentation generators", - "text": "Documentation generators\n\nSphinx\nmkdocs\n\n\nSphinx has been around for a long time, has lot’s of functionality but is based on reStructuredText. mkdocs is a new kid on the block, based on markdown and has a lot of functionality." + "objectID": "01_version_control.html#github-flow", + "href": "01_version_control.html#github-flow", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Github flow", + "text": "Github flow\n\n\n\nCreate a branch\nMake changes\nCreate a pull request\nReview\nMerge\n\n\n\n\nClone a repository to work on a copy (optionally: fork first)\nCreate a branch for each new feature or fix\nCommit changes and push to remote repository\nOpen a pull request to propose changes and request code review\nMerge changes back into the main branch" }, { - "objectID": "05_documentation.html#mkdocs", - "href": "05_documentation.html#mkdocs", - "title": "Documentation", - "section": "mkdocs", - "text": "mkdocs\n\nText is written in markdown\nEasy to use\nAPI documentation can be generated with mkdocstrings\nThe end result is a static website that can be hosted on e.g. 
GitHub pages" + "objectID": "01_version_control.html#desktop-application-github-desktop", + "href": "01_version_control.html#desktop-application-github-desktop", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Desktop Application: GitHub Desktop", + "text": "Desktop Application: GitHub Desktop" }, { - "objectID": "05_documentation.html#configuration", - "href": "05_documentation.html#configuration", - "title": "Documentation", - "section": "Configuration", - "text": "Configuration\n\n\nmkdocs.yml\n\nsite_name: my_library\n\ntheme: \"material\" # or readthedocs, mkdocs, etc.\n\nplugins:\n- mkdocstrings:\n handlers:\n python:\n options:\n show_source: false # change if you want able to show source code\n heading_level: 2\n docstring_style: \"numpy\" # important!, since default is google" + "objectID": "01_version_control.html#demo", + "href": "01_version_control.html#demo", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Demo", + "text": "Demo" }, { - "objectID": "05_documentation.html#api-docs", - "href": "05_documentation.html#api-docs", - "title": "Documentation", - "section": "API docs", - "text": "API docs\n\n\ninstall mkdocstrings\n$ pip install mkdocstrings[python]\nInstall theme, e.g. 
material\n$ pip install mkdocs-material\nAdd plugin to mkdocs.yml (see above)\nCreate index.md in docs folder\nRun mkdocs serve to view locally\n\n\n\ndocs/index.md\n# Reference\n\n::: my_library.simulation" + "objectID": "01_version_control.html#github-best-practices", + "href": "01_version_control.html#github-best-practices", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Github best practices", + "text": "Github best practices\n\n\nCommit often\nUse descriptive commit messages\nKeep pull requests small and focused\nUse “issues” to track work\nReview code regularly" }, { - "objectID": "05_documentation.html#github-pages", - "href": "05_documentation.html#github-pages", - "title": "Documentation", - "section": "GitHub pages", - "text": "GitHub pages\n\n\nOnce you have a static website, you need to share it with the world\nGitHub pages allows you to easily host a static website on GitHub\nThe website is available at https://dhi.github.io/<repository>/\nThe website can be created locally by manually editing html pages.\nFor use as documentation, it is easier to use a documentation generator like mkdocs." 
+ "objectID": "01_version_control.html#resources", + "href": "01_version_control.html#resources", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Resources", + "text": "Resources\n\nGitHub: quickstart\nRealPython: git and github intro\nDatacamp: introduction to Git" }, { - "objectID": "05_documentation.html#github-pages-1", - "href": "05_documentation.html#github-pages-1", - "title": "Documentation", - "section": "GitHub pages", - "text": "GitHub pages" + "objectID": "01_version_control.html#word-list", + "href": "01_version_control.html#word-list", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Word list", + "text": "Word list\n\nClone\n\nmaking a local copy of a remote repository on your computer.\n\nRemote\n\na reference to a Git repository that is hosted on a remote server, typically on a service like GitHub.\n\nCommit\n\na record of changes made to a repository, including the changes themselves and a message describing what was changed.\n\nStage\n\nselecting changes that you want to include in the next commit.\n\nPush\n\nsending changes from your local repository to a remote repository.\n\nPull\n\nretrieving changes from a remote repository and merging them into your local repository.\n\nBranch\n\na separate line of development that can be used to work on new features or bug fixes without affecting the main codebase.\n\nPull request\n\na way to propose changes to a repository by asking the repository owner to “pull” in the changes from a branch or fork.\n\nStash\n\ntemporarily save changes that are not ready to be committed (bring them back later when needed).\n\nMerge\n\nthe process of combining changes from one branch or fork into another, typically the main codebase.\n\nRebase\n\na way to integrate changes from one branch into another by applying the changes from the first branch to the second branch as if they were made there all along.\n\nMerge conflict\n\nwhen Git is unable to automatically merge changes 
from two different branches, because the changes overlap or conflict.\n\nCheckout\n\nswitching between different branches or commits in a repository.\n\nFork\n\na copy of a repository that you create on your own account, which you can modify without affecting the original repository." }, { - "objectID": "05_documentation.html#private-website", - "href": "05_documentation.html#private-website", - "title": "Documentation", - "section": "“Private” website", - "text": "“Private” website\n\nA GitHub repository can be made private\nThe website is still publicly available\nIn order to “hide” it from search engines, add a robots.txt file to the root of the website\nThis is not a secure way to hide a website, but it is a simple way to hide it from search engines.\n\n\n\nrobots.txt\n\nUser-agent: *\nDisallow: /\n\n\n\n\nPython package development" + "objectID": "01_version_control.html#summary", + "href": "01_version_control.html#summary", + "title": "Git, GitHub, Pull Requests, and code reviews", + "section": "Summary", + "text": "Summary\n\n\nVersion control is a tool for managing changes to code\nGit is a distributed version control system (software)\nGitHub is a platform for hosting and collaborating on Git repositories\nGitHub Desktop is a GUI for Git (and GitHub)\nPull requests are a way to propose changes to a repository\n\n\n\n\n\nPython package development" }, { - "objectID": "02_function_classes.html#functions-as-black-boxes", - "href": "02_function_classes.html#functions-as-black-boxes", - "title": "Functions, classes and modules", - "section": "Functions as black boxes", - "text": "Functions as black boxes\n\n\n\n\nflowchart LR\n A(Input A) --> F[\"Black box\"]\n B(Input B) --> F\n F --> O(Output)\n\n style F fill:#000,color:#fff,stroke:#333,stroke-width:4px\n\n\n\n\n\n\n\nA function is a black box that takes some input and produces some output.\nThe input and output can be anything, including other functions.\nAs long as the input and output are the same, the 
function body can be modified." + "objectID": "projects/data_cleaning/Project_module_03.html", + "href": "projects/data_cleaning/Project_module_03.html", + "title": "Python package development", + "section": "", + "text": "Create new branch “package-test” (Make sure changes from last module have been merged, and that you start from the main branch)\nMake sure pytest and pytest-cov are installed\n3.1 Installable package\n\n3.1.1 Organize the files into folders and add setup.py. Call your package tscleaner.\n\nsubfolders: tscleaner, scripts, notebooks, tests\ncreate an __init__.py file in tscleaner with\n\nfrom .cleaning import SpikeCleaner, FlatPeriodCleaner, OutOfRangeCleaner\nfrom .plotting import plot_timeseries\n\ncreate a setup.py in the root with the following content (replace with your own details):\n\nfrom setuptools import setup, find_packages\nsetup( name='MyPackageName',\nversion='0.0.1',\nurl='https://github.com/mypackage.git',\nauthor='Author Name',\nauthor_email='author@gmail.com',\ndescription='Description of my package',\npackages=find_packages(),\ninstall_requires=['numpy', 'matplotlib'],\n)\n\n\n\n3.1.2 Install the package in editable mode.\n\n>pip install -e .\n\n3.1.3 Modify import statements in notebook_A and script main.py and make sure they run.\n3.1.4 Modify cleaner tools by raising exceptions for invalid inputs.\n3.1.5 Move the csv file to /tests/testdata and update notebook with relative path to the file\n\n3.2 Pytest\n\n3.2.1 Write unit tests with pytest in the /tests folder. Create an empty __init__.py file in the folder. 
Create a file test_cleaning.py and create at least five tests that verify that the cleaning tools work as intended\n[Optional] Consider making a fixture that reads the csv file so that it can be reused in all tests\n3.2.2 Run the tests from the command line by writing >pytest in the project root (can you also run the tests from VSCode?)\n3.2.3 Assess the test coverage with >pytest --cov=tscleaner tests\nOptional: Get coverage as html with >pytest --cov=tscleaner --cov-report html (check the index.html in the htmlcov subfolder afterwards)\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "02_function_classes.html#pure-functions", - "href": "02_function_classes.html#pure-functions", - "title": "Functions, classes and modules", - "section": "Pure functions", - "text": "Pure functions\nA pure function returns the same output for the same input.\ndef f(x)\n return x**2\n\n>> f(2)\n4\n>> f(2)\n4" + "objectID": "projects/data_cleaning/Project_module_03.html#module-3-installable-package-and-pytest", + "href": "projects/data_cleaning/Project_module_03.html#module-3-installable-package-and-pytest", + "title": "Python package development", + "section": "", + "text": "Create new branch “package-test” (Make sure changes from last module have been merged, and that you start from the main branch)\nMake sure pytest and pytest-cov are installed\n3.1 Installable package\n\n3.1.1 Organize the files into folders and add setup.py. 
Call your package tscleaner.\n\nsubfolders: tscleaner, scripts, notebooks, tests\ncreate an __init__.py file in tscleaner with\n\nfrom .cleaning import SpikeCleaner, FlatPeriodCleaner, OutOfRangeCleaner\nfrom .plotting import plot_timeseries\n\ncreate a setup.py in the root with the following content (replace with your own details):\n\nfrom setuptools import setup, find_packages\nsetup( name='MyPackageName',\nversion='0.0.1',\nurl='https://github.com/mypackage.git',\nauthor='Author Name',\nauthor_email='author@gmail.com',\ndescription='Description of my package',\npackages=find_packages(),\ninstall_requires=['numpy', 'matplotlib'],\n)\n\n\n\n3.1.2 Install the package in editable mode.\n\n>pip install -e .\n\n3.1.3 Modify import statements in notebook_A and script main.py and make sure they run.\n3.1.4 Modify cleaner tools by raising exceptions for invalid inputs.\n3.1.5 Move the csv file to /tests/testdata and update notebook with relative path to the file\n\n3.2 Pytest\n\n3.2.1 Write unit tests with pytest in the /tests folder. Create an empty __init__.py file in the folder. 
Create a file test_cleaning.py and create at least five tests that verify that the cleaning tools work as intended\n[Optional] Consider making a fixture that reads the csv file so that it can be reused in all tests\n3.2.2 Run the tests from the command line by writing >pytest in the project root (can you also run the tests from VSCode?)\n3.2.3 Assess the test coverage with >pytest --cov=tscleaner tests\nOptional: Get coverage as html with >pytest --cov=tscleaner --cov-report html (check the index.html in the htmlcov subfolder afterwards)\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "02_function_classes.html#side-effects", - "href": "02_function_classes.html#side-effects", - "title": "Functions, classes and modules", - "section": "Side effects", - "text": "Side effects\nA function can have side effects (besides returning a value)\ndef f_with_side_effect(x):\n with open(\"output.txt\", \"a\") as f:\n f.write(str(x))\n return x**2\n\nThe function has x as input, returns the square of x, but also appends x to a file. If you run the function a second time, the file will contain two lines." + "objectID": "projects/data_cleaning/Project_module_02.html", + "href": "projects/data_cleaning/Project_module_02.html", + "title": "Python package development", + "section": "", + "text": "Create new branch “modules-classes” (Make sure changes from last module have been merged, and that you start from the main branch)\n2.1 Function arguments\n\nAdd default arguments to the functions. Commit.\nMake sure that you only use positional arguments where there is only one argument. Use keyword arguments everywhere else. Commit.\n\n2.2 Modules\n\nMove cleaner functions into a separate module “cleaning.py”. Commit.\nMove the plotting function into a separate module “plotting.py”. 
Commit.\nRename the script main.py and execute the cleaning and plotting.\n\nfrom cleaning import …\nfrom plotting import …\nCheck that it runs!\n\n\n2.3 Classes\n\nOrganize the cleaning functions into classes that all have the same structure (an init method and a clean method)\n\nSpikeCleaner\n\ndef __init__(max_jump)\ndef clean(data)\n\nmodify main.py and check that it runs\n\ncleaners = [\nSpikeCleaner(max_jump=10),\nOutOfRangeCleaner(min_val=0, max_val=50),\nFlatPeriodCleaner(flat_period=5),\n]\nfor cleaner in cleaners:\ndata = cleaner.clean(data)\n\n\nDownload notebook_A and csv file and make sure it runs. (remove any remaining print statements)\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "02_function_classes.html#side-effects-1", - "href": "02_function_classes.html#side-effects-1", - "title": "Functions, classes and modules", - "section": "Side effects", - "text": "Side effects\nPure functions without side effects are easier to reason about.\nBut sometimes side effects are necessary.\n\nWriting to a file\nWriting to a database\nPrinting to the screen\nCreating a plot" + "objectID": "projects/data_cleaning/Project_module_02.html#module-2-modules-and-classes", + "href": "projects/data_cleaning/Project_module_02.html#module-2-modules-and-classes", + "title": "Python package development", + "section": "", + "text": "Create new branch “modules-classes” (Make sure changes from last module have been merged, and that you start from the main branch)\n2.1 Function arguments\n\nAdd default arguments to the functions. Commit.\nMake sure that you only use positional arguments where there is only one argument. Use keyword arguments everywhere else. Commit.\n\n2.2 Modules\n\nMove cleaner functions into a separate module “cleaning.py”. Commit.\nMove the plotting function into a separate module “plotting.py”. 
Commit.\nRename the script main.py and execute the cleaning and plotting.\n\nfrom cleaning import …\nfrom plotting import …\nCheck that it runs!\n\n\n2.3 Classes\n\nOrganize the cleaning functions into classes that all have the same structure (an init method and a clean method)\n\nSpikeCleaner\n\ndef __init__(max_jump)\ndef clean(data)\n\nmodify main.py and check that it runs\n\ncleaners = [\nSpikeCleaner(max_jump=10),\nOutOfRangeCleaner(min_val=0, max_val=50),\nFlatPeriodCleaner(flat_period=5),\n]\nfor cleaner in cleaners:\ndata = cleaner.clean(data)\n\n\nDownload notebook_A and csv file and make sure it runs. (remove any remaining print statements)\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "02_function_classes.html#modifying-input-arguments", - "href": "02_function_classes.html#modifying-input-arguments", - "title": "Functions, classes and modules", - "section": "Modifying input arguments", - "text": "Modifying input arguments\ndef difficult_function(values):\n for i in range(len(values)):\n values[i] = min(0, values[i]) # 😟\n return values\n\n>>> x = [1,2,-1]\n>>> difficult_function(x)\n[0, 0, -1]\n>>> x\n[0, 0, -1]\n\nThis function modifies the input array, which might come as a surprise. The array is passed by reference, so the function can modify it." 
+ "objectID": "projects/data_cleaning/clean_project_data_v4_final2.html", + "href": "projects/data_cleaning/clean_project_data_v4_final2.html", + "title": "Python package development", + "section": "", + "text": "clean_project_data_v4_final.py\n\nimport pandas as pd\nimport numpy as np\nfrom datetime import datetime, timedelta\nimport matplotlib.pyplot as plt\n\n# Create date range\ndate_rng = pd.date_range(start=\"1/1/2020\", end=\"1/31/2020\", freq=\"D\")\n\n# Sample time series data with DateTimeIndex\ndata1 = pd.Series([1, 2, -1, 4, 5, 20, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, \n 21, 22, 24, 24, 24, 24, 24, 24, 29, 30, 31], index=date_rng)\ndata2 = pd.Series([5, 6, 200, 8, 9, 10, 11, 12, 300, 14, 15, 16, 17, 18, 19, 20, 21, 22, \n 23, 24, 25, 26, 27, 27, 27, 30, 31, 32, 33, 34, 35], index=date_rng)\ndata3 = pd.Series([15, 16, 11, 18, 400, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, \n 32, 33, 34, 35, 36, 37, 38, 39, 45, 45, 45, 45, 45, 45], index=date_rng)\n\n\n# Cleaning data1\nprint(\"\\nCleaning data1\")\ndata1_original = data1.copy()\n\n# Checking for jumps \nprint(\"Checking for jumps in data1\")\nmax_jump=10\nprev_value = data1.iloc[0]\nfor t, value in data1.items():\n if abs(value - prev_value) <= max_jump:\n # \"Value ok\"\n data1[t] = value\n prev_value = value\n else:\n data1[t] = np.nan\n print(\"Jump detected and value removed on\", t, \":\", value)\nprint(f\"Data removed: {data1_original[~data1_original.isin(data1)]}\")\n# print(\"Data1 after jump check:\", data1)\n\n# Checking for values in range \nmin_val = 0\nmax_val = 50\nfor t, value in data1.items():\n # print(\"Checking value on\", t, \":\", value)\n if min_val <= value <= max_val:\n pass\n # print(\"Value ok:\", value)\n else:\n data1[t] = np.nan\n print(\"Value removed:\", value)\nprint(f\"Data removed: {data1_original[~data1_original.isin(data1)]}\")\n# print(\"Data1 after range check:\", data1)\n\n\n# Checking for flat periods \nprint(\"Checking for flat periods in 
data1\")\nflat_period = 5\ni = 0\nwhile i < len(data1) - flat_period:\n if len(set(data1[i: i + flat_period + 1])) == 1: \n print(\"Removing flat period starting at index:\", i)\n data1[i: i + flat_period + 1] = np.nan\n i += flat_period\n else:\n i += 1\nprint(f\"Data removed: {data1_original[~data1_original.isin(data1)]}\")\n# print(\"Data1 after flat period check:\", data1)\n\n\n# Cleaning data2\nprint(\"\\nCleaning data2\")\ndata2_original = data2.copy()\n\n# Checking for jumps \nprint(\"Checking for jumps in data2\")\nmax_jump=10\nprev_value = data2.iloc[0]\nfor t, value in data2.items():\n if abs(value - prev_value) <= max_jump:\n # \"Value ok\"\n data2[t] = value\n prev_value = value\n else:\n data2[t] = np.nan\n print(\"Jump detected and value removed on\", t, \":\", value)\nprint(f\"Data removed: {data2_original[~data2_original.isin(data2)]}\")\n# print(\"data2 after jump check:\", data2)\n\n# Checking for values in range \nmin_val = 0\nmax_val = 50\nfor t, value in data2.items():\n # print(\"Checking value on\", t, \":\", value)\n if min_val <= value <= max_val:\n pass\n # print(\"Value ok:\", value)\n else:\n data2[t] = np.nan\n print(\"Value removed:\", value)\nprint(f\"Data removed: {data2_original[~data2_original.isin(data2)]}\")\n# print(\"data2 after range check:\", data2)\n\n\n# Checking for flat periods \nprint(\"Checking for flat periods in data2\")\nflat_period = 5\ni = 0\nwhile i < len(data2) - flat_period:\n if len(set(data2[i: i + flat_period + 1])) == 1: \n print(\"Removing flat period starting at index:\", i)\n data2[i: i + flat_period + 1] = np.nan\n i += flat_period\n else:\n i += 1\nprint(f\"Data removed: {data2_original[~data2_original.isin(data2)]}\")\n# print(\"data2 after flat period check:\", data2)\n\n# print(\"Final cleaned data2:\", data2)\n\n# Cleaning data3\nprint(\"\\nCleaning data3\")\ndata3_original = data3.copy()\n\n# Checking for jumps \nprint(\"Checking for jumps in data3\")\nmax_jump=10\nprev_value = data3.iloc[0]\nfor 
t, value in data3.items():\n if abs(value - prev_value) <= max_jump:\n # \"Value ok\"\n data3[t] = value\n prev_value = value\n else:\n data3[t] = np.nan\n print(\"Jump detected and value removed on\", t, \":\", value)\nprint(f\"Data removed: {data3_original[~data3_original.isin(data3)]}\")\n# print(\"data3 after jump check:\", data3)\n\n# Checking for values in range \nmin_val = 0\nmax_val = 50\nfor t, value in data3.items():\n # print(\"Checking value on\", t, \":\", value)\n if min_val <= value <= max_val:\n pass\n # print(\"Value ok:\", value)\n else:\n data3[t] = np.nan\n print(\"Value removed:\", value)\nprint(f\"Data removed: {data3_original[~data3_original.isin(data3)]}\")\n# print(\"data3 after range check:\", data3)\n\n\n# Checking for flat periods \nprint(\"Checking for flat periods in data3\")\nflat_period = 5\ni = 0\nwhile i < len(data3) - flat_period:\n if len(set(data3[i: i + flat_period + 1])) == 1: \n print(\"Removing flat period starting at index:\", i)\n data3[i: i + flat_period + 1] = np.nan\n i += flat_period\n else:\n i += 1\nprint(f\"Data removed: {data3_original[~data3_original.isin(data3)]}\")\n# print(\"data3 after flat period check:\", data3)\n\n# print(\"Final cleaned data3:\", data3)\n\n## plot data showing outliers as red dots\nplt.figure(figsize=(10, 5))\nplt.plot(data1_original, '.', color=\"red\")\nplt.plot(data1, '.', color=\"green\")\nplt.title(\"Data1\")\nplt.show()\n\nplt.figure(figsize=(10, 5))\nplt.plot(data2_original, '.', color=\"red\")\nplt.plot(data2, '.', color=\"green\")\nplt.title(\"Data2\")\nplt.show()\n\nplt.figure(figsize=(10, 5))\nplt.plot(data3_original, '.', color=\"red\")\nplt.plot(data3, '.', color=\"green\")\nplt.title(\"Data3\")\nplt.show()" }, { - "objectID": "02_function_classes.html#positional-arguments", - "href": "02_function_classes.html#positional-arguments", - "title": "Functions, classes and modules", - "section": "Positional arguments", - "text": "Positional arguments\ndef f(x, y):\n return x + 
y\n\n>>> f(1, 2)\n3" + "objectID": "projects/data_cleaning/Project_module_07.html", + "href": "projects/data_cleaning/Project_module_07.html", + "title": "Python package development", + "section": "", + "text": "Add a license\nChange version number to 0.1.0\nBuild the package\nPublish the package to the PyPI Test Server.\n\nBack to homework overview" }, { - "objectID": "02_function_classes.html#keyword-arguments", - "href": "02_function_classes.html#keyword-arguments", - "title": "Functions, classes and modules", - "section": "Keyword arguments", - "text": "Keyword arguments\ndef f(x, y):\n return x + y\n\n>>> f(x=1, y=2)\n3" + "objectID": "projects/data_cleaning/Project_module_07.html#module-7-publishing", + "href": "projects/data_cleaning/Project_module_07.html#module-7-publishing", + "title": "Python package development", + "section": "", + "text": "Add a license\nChange version number to 0.1.0\nBuild the package\nPublish the package to the PyPI Test Server.\n\nBack to homework overview" }, { - "objectID": "02_function_classes.html#positional-arguments-1", - "href": "02_function_classes.html#positional-arguments-1", - "title": "Functions, classes and modules", - "section": "Positional arguments", - "text": "Positional arguments\n\n\nVersion 1\ndef is_operable(height, period):\n\n return height < 2.0 and period < 6.0\n\n>>> is_operable(1.0, 3.0)\nTrue\n\nVersion 2\ndef is_operable(period, height=0.0):\n # dont forget, that arguments are swapped 👍\n return height < 2.0 and period < 6.0\n\n>>> is_operable(1.0, 3.0)\nFalse 😟\n\n\n\nThe order of the arguments is swapped, since we want to make height an optional argument (more on that later). This breaks existing code, since the order of the arguments is changed." 
+ "objectID": "projects/data_cleaning/Project_module_04.html", + "href": "projects/data_cleaning/Project_module_04.html", + "title": "Python package development", + "section": "", + "text": "Create new branch “action-formatting” (Make sure changes from last module have been merged, and that you start from the main branch)\n4.1 Github Action\n\n4.1.1 Copy the GitHub action “python-app.yml” from the python template https://github.com/DHI/template-python-library to your own library (make sure it sits in the same folder).\n4.1.2 Change all occurrences of “my_library” in the yml file to your package name “tscleaner”\n4.1.3 Comment out the line with “ruff-action” with “#”\n4.1.4 Commit, push and create a pull request; the tests should now run, verify that they all run before you move on\n\n4.2 Ruff\n\n4.2.1 Enable the “ruff-action” be removing the “#” you added above\n4.2.2 Commit and push, your actions will probably fail now - inspect the problems by clicking the red cross (did you also get an email?)\n4.2.3 Install “ruff” on your local machine with mamba/conda/pip\n4.2.4 Navigate to your project root folder and run ruff with “ruff .”\n4.2.5 Add __all__ = [\"SpikeCleaner\", \"FlatPeriodCleaner\", \"OutOfRangeCleaner\", \"plot_timeseries\"] to your __init__.py file and fix remaining issues until ruff passes\n4.2.6 Commit, push and verify that you action now succeeds\n\n4.3 Black\n\n4.3.1 Install “black” on your local machine with mamba/conda/pip\n4.3.2 Run black from your project root folder; inspect the differences; commit\n\n4.4 pyproject.toml\n\nCopy the pyproject.toml from the python template https://github.com/DHI/template-python-library (this file will replace your setup.py)\nModify to fit your package\nRemove the setup.py\nCommit, push and verify that the GitHub action runs\nIf it fails, you probably forgot some dependencies - go back and fix\n[Optional] You should also re-install your local package with “>pip install –upgrade -e .”\n\n4.5 [Optional] Enable black 
and ruff extensions in VSCode; set black to run on save\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "02_function_classes.html#keyword-only-arguments", - "href": "02_function_classes.html#keyword-only-arguments", - "title": "Functions, classes and modules", - "section": "Keyword-only arguments", - "text": "Keyword-only arguments\ndef f(*, x, y):\n return x + y\n\n>>> f(1,2)\nTraceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\nTypeError: f() takes 0 positional arguments but 2 were given" + "objectID": "projects/data_cleaning/Project_module_04.html#module-4-github-actions-and-auto-formatting", + "href": "projects/data_cleaning/Project_module_04.html#module-4-github-actions-and-auto-formatting", + "title": "Python package development", + "section": "", + "text": "Create new branch “action-formatting” (Make sure changes from last module have been merged, and that you start from the main branch)\n4.1 Github Action\n\n4.1.1 Copy the GitHub action “python-app.yml” from the python template https://github.com/DHI/template-python-library to your own library (make sure it sits in the same folder).\n4.1.2 Change all occurrences of “my_library” in the yml file to your package name “tscleaner”\n4.1.3 Comment out the line with “ruff-action” with “#”\n4.1.4 Commit, push and create a pull request; the tests should now run, verify that they all run before you move on\n\n4.2 Ruff\n\n4.2.1 Enable the “ruff-action” be removing the “#” you added above\n4.2.2 Commit and push, your actions will probably fail now - inspect the problems by clicking the red cross (did you also get an email?)\n4.2.3 Install “ruff” on your local machine with mamba/conda/pip\n4.2.4 Navigate to your project root folder and run ruff with “ruff .”\n4.2.5 Add __all__ = [\"SpikeCleaner\", \"FlatPeriodCleaner\", \"OutOfRangeCleaner\", 
\"plot_timeseries\"] to your __init__.py file and fix remaining issues until ruff passes\n4.2.6 Commit, push and verify that you action now succeeds\n\n4.3 Black\n\n4.3.1 Install “black” on your local machine with mamba/conda/pip\n4.3.2 Run black from your project root folder; inspect the differences; commit\n\n4.4 pyproject.toml\n\nCopy the pyproject.toml from the python template https://github.com/DHI/template-python-library (this file will replace your setup.py)\nModify to fit your package\nRemove the setup.py\nCommit, push and verify that the GitHub action runs\nIf it fails, you probably forgot some dependencies - go back and fix\n[Optional] You should also re-install your local package with “>pip install –upgrade -e .”\n\n4.5 [Optional] Enable black and ruff extensions in VSCode; set black to run on save\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" }, { - "objectID": "02_function_classes.html#optionaldefault-arguments", - "href": "02_function_classes.html#optionaldefault-arguments", - "title": "Functions, classes and modules", - "section": "Optional(=default) arguments", - "text": "Optional(=default) arguments\ndef f(x, n=2):\n return x**n\n\n>>> f(2)\n4\n>>> f(2, n=3)\n8\n\nMakes it easy to use a function with many arguments." + "objectID": "03_testing.html#testing", + "href": "03_testing.html#testing", + "title": "Testing, linting and formatting", + "section": "Testing", + "text": "Testing\nVerify code is working as expected.\nSimplest way to test is to run code and check output.\n\nAutomated testing checks output automatically.\nCode changes can break other parts of code.\nAutomatic testing verifies code is still working." 
}, { - "objectID": "02_function_classes.html#mutable-default-arguments", - "href": "02_function_classes.html#mutable-default-arguments", - "title": "Functions, classes and modules", - "section": "Mutable default arguments", - "text": "Mutable default arguments\nPython’s default arguments are evaluated once when the function is defined, not each time the function is called.\n\ndef add_to_cart(x, cart=[]): # this line is evaluated only once 😮\n cart.append(x)\n return cart\n\n>>> add_to_cart(1, cart=[2])\n[2, 1]\n\n>>> add_to_cart(1)\n[1]\n>>> add_to_cart(2)\n[1, 2]\n\nPython’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well." + "objectID": "03_testing.html#testing-workflow", + "href": "03_testing.html#testing-workflow", + "title": "Testing, linting and formatting", + "section": "Testing workflow", + "text": "Testing workflow\n\n\n\n\nflowchart TD\n A[Prepare inputs]\n B[Describe expected output]\n C[Obtain actual output]\n D[Compare actual and\\n expected output]\n\n A --> B --> C --> D" }, { - "objectID": "02_function_classes.html#how-to-use-default-mutable-arguments", - "href": "02_function_classes.html#how-to-use-default-mutable-arguments", - "title": "Functions, classes and modules", - "section": "How to use default (mutable) arguments", - "text": "How to use default (mutable) arguments\ndef add_to_cart_safe(x, cart=None):\n if cart is None:\n cart = [] # this line is evaluated each time the function is called\n cart.append(x)\n return cart" + "objectID": "03_testing.html#unit-testing", + "href": "03_testing.html#unit-testing", + "title": "Testing, linting and formatting", + "section": "Unit testing", + "text": "Unit testing\n\n\n\n\n\n\nDefinition “Unit”\n\n\n\nA small, fundamental piece of code.\nExecuted in isolation with 
appropriate inputs.\n\n\n\n\n\n\nA function is typically considered a “unit”\nLines of code within functions are smaller (can’t be isolated)\nClasses are considered bigger (but can be treated as units)" }, { - "objectID": "02_function_classes.html#changing-return-types", - "href": "02_function_classes.html#changing-return-types", - "title": "Functions, classes and modules", - "section": "Changing return types", - "text": "Changing return types\nSince Python is a dynamic language, the type of the returned variable is allowed to vary.\ndef foo(x):\n if x >=0:\n return x\n else:\n return \"x is negative\"\n\nBut it usually a bad idea, since you can not tell from reading the code, which type will be returned." + "objectID": "03_testing.html#a-good-unit-test", + "href": "03_testing.html#a-good-unit-test", + "title": "Testing, linting and formatting", + "section": "A good unit test", + "text": "A good unit test\n\n\n\n\nFully automated (next week)\nHas full control over all the pieces running (“fake” external dependencies)\nCan be run in any order\nRuns in memory (no DB or file access, for example)\nConsistently returns the same result (no random numbers)\nRuns fast\nTests a single logical concept in the system\nReadable\nMaintainable\nTrustworthy" }, { - "objectID": "02_function_classes.html#changing-return-types-1", - "href": "02_function_classes.html#changing-return-types-1", - "title": "Functions, classes and modules", - "section": "Changing return types", - "text": "Changing return types\ndef is_operable(height, period):\n if height < 10:\n return height < 5.0 and period > 4.0\n else:\n return \"No way!\"\n\n>>> if is_operable(height=12.0, period=5.0):\n... print(\"Go ahead!\")\n...\nGo ahead!\n\n\n\n\n\n\n\nImportant\n\n\nIs this the result you expected?\n\n\n\n\n\nA non-empty string or a non-zero value is considered “truthy” in Python!" 
+ "objectID": "03_testing.html#example", + "href": "03_testing.html#example", + "title": "Testing, linting and formatting", + "section": "Example", + "text": "Example\n\nget a timeseries of water levels\nfind the maxiumum water level each year\ncreate a summary report for the subset of data" }, { - "objectID": "02_function_classes.html#type-hints", - "href": "02_function_classes.html#type-hints", - "title": "Functions, classes and modules", - "section": "Type hints", - "text": "Type hints\nPython is a dynamically typed language -> the type of a variable is determined at runtime.\n\nBut we can add type hints to help the reader (and the code editor).\ndef is_operable(height: float, period: float) -> bool:\n ..." + "objectID": "03_testing.html#integration-testing", + "href": "03_testing.html#integration-testing", + "title": "Testing, linting and formatting", + "section": "Integration testing", + "text": "Integration testing\ndef test_integration():\n wl = get_water_level(time=\"2019-01-01\", location=\"Aarhus\")\n max_wls = get_max_water_level(wl, freq=\"Y\")\n report = summary_report(max_wls)\n\n assert report.title == \"Summary report\"\n assert report.text == \"The maximum water level in 2021 was 3.0 m\"" }, { - "objectID": "02_function_classes.html#classes", - "href": "02_function_classes.html#classes", - "title": "Functions, classes and modules", - "section": "Classes", - "text": "Classes\nclass WeirdToolbox:\n tools = [] # class variable ☹️\n\n\n>>> t1 = WeirdToolbox()\n>>> t1.tools.append(\"hammer\")\n>>> t1.tools\n[\"hammer\"]\n\n>>> t2 = WeirdToolbox()\n>>> t2.tools.append(\"screwdriver\")\n>>> t2.tools\n[\"hammer\", \"screwdriver\"]\n\nClass variables are rarely what you want, since they are shared between all instances of the class." 
+ "objectID": "03_testing.html#testing-in-vs-code", + "href": "03_testing.html#testing-in-vs-code", + "title": "Testing, linting and formatting", + "section": "Testing in VS Code", + "text": "Testing in VS Code" }, { - "objectID": "02_function_classes.html#classes-1", - "href": "02_function_classes.html#classes-1", - "title": "Functions, classes and modules", - "section": "Classes", - "text": "Classes\nclass Toolbox:\n def __init__(self):\n self.tools = [] # instance variable 😃\n\n>>> t1 = Toolbox()\n>>> t1.tools.append(\"hammer\")\n>>> t1.tools\n[\"hammer\"]\n\n>>> t2 = Toolbox()\n>>> t2.tools.append(\"screwdriver\")\n>>> t2.tools\n[\"screwdriver\"]\n\nInstance variables are created when the instance is created, and are unique to each instance." + "objectID": "03_testing.html#fixtures", + "href": "03_testing.html#fixtures", + "title": "Testing, linting and formatting", + "section": "Fixtures", + "text": "Fixtures\n\n\nA piece of code that is used by multiple tests\nProvide data or services to tests\nDefined with @pytest.fixture\nSet up test environment\nPass fixtures as test arguments" }, { - "objectID": "02_function_classes.html#static-methods", - "href": "02_function_classes.html#static-methods", - "title": "Functions, classes and modules", - "section": "Static methods", - "text": "Static methods\nfrom datetime import date\n\nclass Interval:\n def __init__(self, start:date, end:date):\n self.start = start\n self.end = end\n\n>>> dr = Interval(date(2020, 1, 1), date(2020, 1, 31))\n>>> dr.start\ndatetime.date(2020, 1, 1)\n>>> dr.end\ndatetime.date(2020, 1, 31)\n\nHere is an example of useful class, but it is a bit cumbersome to create an instance." 
+ "objectID": "03_testing.html#fixture-example", + "href": "03_testing.html#fixture-example", + "title": "Testing, linting and formatting", + "section": "Fixture example", + "text": "Fixture example\n@pytest.fixture\ndef water_level():\n return TimeSeries([1.0, .., 3.0], start = \"2019-01-01\")\n\ndef test_get_max_water_level(water_level):\n max_wls = get_max_water_level(water_level, freq=\"Y\")\n \n assert len(max_wls) == 1\n assert max_wls[0] == 3.0" }, { - "objectID": "02_function_classes.html#static-methods-1", - "href": "02_function_classes.html#static-methods-1", - "title": "Functions, classes and modules", - "section": "Static methods", - "text": "Static methods\nfrom datetime import date\n\nclass Interval:\n def __init__(self, start:date, end:date):\n self.start = start\n self.end = end\n\n @staticmethod\n def from_string(date_string):\n start_str, end_str = date_string.split(\"|\")\n start = date.fromisoformat(start_str)\n end = date.fromisoformat(end_str)\n return Interval(start, end)\n\n>>> dr = Interval.from_string(\"2020-01-01|2020-01-31\")\n>>> dr\n<__main__.Interval at 0x7fb99efcfb90>\n\nSince we commonly use ISO formatted dates separated by a pipe, we can add a static method to create an instance from a string. This makes it easier to create an instance." 
+ "objectID": "03_testing.html#test-coverage", + "href": "03_testing.html#test-coverage", + "title": "Testing, linting and formatting", + "section": "Test coverage", + "text": "Test coverage\n\n\nA measure of how much of your code is tested\nA good test suite should cover all the code\nInstall pytest-cov\nRun tests with coverage report\n\npytest --cov=myproj\n\nUse coverage report to identify untested code" }, { - "objectID": "02_function_classes.html#dataclasses", - "href": "02_function_classes.html#dataclasses", - "title": "Functions, classes and modules", - "section": "Dataclasses", - "text": "Dataclasses\nfrom dataclasses import dataclass\n\n@dataclass\nclass Interval:\n start: date\n end: date\n\n @staticmethod\n def from_string(date_string):\n start_str, end_str = date_string.split(\"|\")\n start = date.fromisoformat(start_str)\n end = date.fromisoformat(end_str)\n return Interval(start, end)\n\n>>> dr = Interval.from_string(\"2020-01-01|2020-01-31\")\n>>> dr\nInterval(start=datetime.date(2020, 1, 1), end=datetime.date(2020, 1, 31))\n\nDataclasses are a new feature in Python 3.7, they are a convenient way to create classes with a few attributes. The variables are instance variables, and the class has a constructor that takes the same arguments as the variables." + "objectID": "03_testing.html#test-coverage-report", + "href": "03_testing.html#test-coverage-report", + "title": "Testing, linting and formatting", + "section": "Test coverage report", + "text": "Test coverage report\npytest --cov=myproj tests/\n-------------------- coverage: ... 
---------------------\nName Stmts Miss Cover\n----------------------------------------\nmyproj/__init__ 2 0 100%\nmyproj/myproj 257 13 94%\nmyproj/feature4286 94 7 92%\n----------------------------------------\nTOTAL 353 20 94%" }, { - "objectID": "02_function_classes.html#equality", - "href": "02_function_classes.html#equality", - "title": "Functions, classes and modules", - "section": "Equality", - "text": "Equality\nOn a regular class, equality is based on the memory address of the object.\nclass Interval:\n def __init__(self, start:date, end:date):\n self.start = start\n self.end = end\n\n>>> dr1 = Interval(start=date(2020, 1, 1), end=date(2020, 1, 31))\n>>> dr2 = Interval(start=date(2020, 1, 1), end=date(2020, 1, 31))\n>>> dr1 == dr2\nFalse\n\nThis is not very useful, since we want to compare the values of the attributes." + "objectID": "03_testing.html#testing-advice", + "href": "03_testing.html#testing-advice", + "title": "Testing, linting and formatting", + "section": "Testing advice", + "text": "Testing advice\n\n\n\n\n\n\nTest edge cases\n\n\n\nempty lists\nlists with a single element\nempty strings\nempty dictionaries\nNone\nnp.nan" }, { - "objectID": "02_function_classes.html#equality-1", - "href": "02_function_classes.html#equality-1", - "title": "Functions, classes and modules", - "section": "Equality", - "text": "Equality\nclass Interval:\n def __init__(self, start:date, end:date):\n self.start = start\n self.end = end\n\n def __eq__(self, other):\n return self.start == other.start and self.end == other.end\n\n>>> dr1 = Interval(start=date(2020, 1, 1), end=date(2020, 1, 31))\n>>> dr2 = Interval(start=date(2020, 1, 1), end=date(2020, 1, 31))\n>>> dr1 == dr2\nTrue\n\nWe can override the __eq__ method to compare the values of the attributes." 
+ "objectID": "03_testing.html#tests-act-as-specification", + "href": "03_testing.html#tests-act-as-specification", + "title": "Testing, linting and formatting", + "section": "Tests act as specification", + "text": "Tests act as specification\ndef test_operable_period_can_be_missing():\n\n assert is_operable(height=1.0, period=None)\n assert is_operable(height=1.0, period=np.nan)\n assert is_operable(height=1.0)\n assert not is_operable(height=11.0)\n\ndef test_height_can_not_be_missing():\n\n with pytest.raises(ValueError) as excinfo:\n is_operable(height=None)\n is_operable(height=np.nan)\n \n assert \"height\" in str(excinfo.value)" }, { - "objectID": "02_function_classes.html#data-classes", - "href": "02_function_classes.html#data-classes", - "title": "Functions, classes and modules", - "section": "Data classes", - "text": "Data classes\nfrom dataclasses import dataclass, field\n\n@dataclass\nclass Quantity:\n unit: str = field(compare=True)\n standard_name: field(compare=True)\n name: str = field(compare=False, default=None)\n\n\n>>> t1 = Quantity(name=\"temp\", unit=\"C\", standard_name=\"air_temperature\")\n>>> t2 = Quantity(name=\"temperature\", unit=\"C\", standard_name=\"air_temperature\")\n\n>>> t1 == t2\nTrue\n\n>>> d1 = Quantity(unit=\"m\", standard_name=\"depth\")\n>>> d1 == t2\nFalse" + "objectID": "03_testing.html#test-driven-development", + "href": "03_testing.html#test-driven-development", + "title": "Testing, linting and formatting", + "section": "Test driven development", + "text": "Test driven development\n\n\nWrite a test that fails ❌\nWrite the code to make the test pass ✅\nRefactor the code ⚒️\n\n\n\nThe benefit of this approach is that you are forced to think about the expected behaviour of your code before you write it.\nIt is also too easy to write a test that passes without actually testing the code." 
}, { - "objectID": "02_function_classes.html#data-classes-1", - "href": "02_function_classes.html#data-classes-1", - "title": "Functions, classes and modules", - "section": "Data classes", - "text": "Data classes\n\n\nCompact notation of fields with type hints\nEquality based on values of fields\nUseful string represenation by default\nIt is still a regular class" + "objectID": "03_testing.html#section", + "href": "03_testing.html#section", + "title": "Testing, linting and formatting", + "section": "", + "text": "and now for something completely different…" }, { - "objectID": "02_function_classes.html#modules", - "href": "02_function_classes.html#modules", - "title": "Functions, classes and modules", - "section": "Modules", - "text": "Modules\nModules are files containing Python code (functions, classes, constants) that belong together.\n$tree analytics/\nanalytics/\n├── __init__.py\n├── date.py\n└── tools.py\n\nThe analytics package contains two modules:\n\ntools module\ndate module" + "objectID": "03_testing.html#the-zen-of-python", + "href": "03_testing.html#the-zen-of-python", + "title": "Testing, linting and formatting", + "section": "The Zen of Python", + "text": "The Zen of Python\nBeautiful is better than ugly.\nExplicit is better than implicit.\nSimple is better than complex.\nComplex is better than complicated.\nFlat is better than nested.\nSparse is better than dense.\nReadability counts.\n\n…\nErrors should never pass silently.\nUnless explicitly silenced.\n…" }, { - "objectID": "02_function_classes.html#packages", - "href": "02_function_classes.html#packages", - "title": "Functions, classes and modules", - "section": "Packages", - "text": "Packages\n\n\nA package is a directory containing modules\nEach package in Python is a directory which MUST contain a special file called __init__.py\nThe __init__.py can be empty, and it indicates that the directory it contains is a Python package\n__init__.py can also execute initialization code" + "objectID": 
"03_testing.html#exceptions", + "href": "03_testing.html#exceptions", + "title": "Testing, linting and formatting", + "section": "Exceptions", + "text": "Exceptions\n\n\nExceptions are a way to handle errors in your code.\nRaising an exception can prevent propagating bad values.\nExceptions are communication between the programmer and the user.\nThere are many built-in exceptions in Python\n\nIndexError\nKeyError\nValueError\nFileNotFoundError\n\nYou can also create your own custom exceptions, e.g. ModelInitialistionError, MissingLicenseError?" }, { - "objectID": "02_function_classes.html#init__.py", - "href": "02_function_classes.html#init__.py", - "title": "Functions, classes and modules", - "section": "__init__.py", - "text": "__init__.py\nExample: mikeio/pfs/__init__.py:\nfrom .pfsdocument import Pfs, PfsDocument\nfrom .pfssection import PfsNonUniqueList, PfsSection\n\ndef read_pfs(filename, encoding=\"cp1252\", unique_keywords=False):\n \"\"\"Read a pfs file for further analysis/manipulation\"\"\"\n \n return PfsDocument(filename, encoding=encoding, unique_keywords=unique_keywords)\n\nThe imports in __init__.py let’s you separate the implementation into multiple files.\n>>> mikeio.pfs.pfssection.PfsSection\n<class 'mikeio.pfs.pfssection.PfsSection'>\n>>> mikeio.pfs.PfsSection\n<class 'mikeio.pfs.pfssection.PfsSection'>\n\nThe PfsSection and PfsDocument are imported from the pfssection.py and pfsdocument.py modules. to the mikeio.pfs namespace." 
+ "objectID": "03_testing.html#example-1", + "href": "03_testing.html#example-1", + "title": "Testing, linting and formatting", + "section": "Example", + "text": "Example\n\n\nsrc/ops.py\n\ndef is_operable(height:float, period:float) -> bool:\n if height < 0.0:\n raise ValueError(f\"Supplied value of {height=} is unphysical.\")\n\n>>> is_operable(height=-1.0, period=4.0)\n\nTraceback (most recent call last):\n ...\nValueError: Supplied value of height=-1.0 is unphysical.\n\n\nIt is better to raise an exception (that can terminate the program), than to propagate a bad value." }, { - "objectID": "02_function_classes.html#python-naming-conventions", - "href": "02_function_classes.html#python-naming-conventions", - "title": "Functions, classes and modules", - "section": "Python naming conventions", - "text": "Python naming conventions\nBy adhering to the naming conventions, your code will be easier to read for other Python developers.\n\nvariables, functions and methods: lowercase_with_underscores\nclasses: CamelCase\nconstants: UPPERCASE_WITH_UNDERSCORES" + "objectID": "03_testing.html#warnings", + "href": "03_testing.html#warnings", + "title": "Testing, linting and formatting", + "section": "Warnings", + "text": "Warnings\nWarnings are a way to alert users of your code to potential issues or usage errors without actually halting the program’s execution.\n\n\nsrc/ops.py\n\nimport warnings\nwarnings.warn(\"This is a warning\")" }, { - "objectID": "02_function_classes.html#variables-function-and-method-names", - "href": "02_function_classes.html#variables-function-and-method-names", - "title": "Functions, classes and modules", - "section": "Variables, function and method names", - "text": "Variables, function and method names\n\nUse lowercase characters\nSeparate words with underscores\n\n\nmodel_name = \"NorthSeaModel\"\nn_epochs = 100\n\ndef my_function():\n pass" + "objectID": "03_testing.html#how-to-test-exceptions", + "href": 
"03_testing.html#how-to-test-exceptions", + "title": "Testing, linting and formatting", + "section": "How to test exceptions", + "text": "How to test exceptions\n\n\ntests/test_ops.py\n\nimport pytest\nfrom ops import is_operable\n\ndef test_negative_heights_are_not_valid():\n with pytest.raises(ValueError):\n is_operable(height=-1.0, period=4.0)\n\nThe same can be done with warnings." }, { - "objectID": "02_function_classes.html#constants", - "href": "02_function_classes.html#constants", - "title": "Functions, classes and modules", - "section": "Constants", - "text": "Constants\n\nUse all uppercase characters\n\nGRAVITY = 9.81\n\nAVOGADRO_CONSTANT = 6.02214076e23\n\nSECONDS_IN_A_DAY = 86400\n\nN_LEGS_PER_ANIMAL = {\n \"human\": 2,\n \"dog\": 4,\n \"spider\": 8,\n}\n\nPython will not prevent you from changing the value of a constant, but it is a convention to use all uppercase characters for constants." + "objectID": "03_testing.html#linting", + "href": "03_testing.html#linting", + "title": "Testing, linting and formatting", + "section": "Linting", + "text": "Linting\nA way to check your code for common errors and style issues.\nruff is a new tool for linting Python code.\n\nsyntax errors\nunused imports\nunused variables\nundefined names\ncode style (e.g. 
line length, indentation, whitespace, etc.)" }, { - "objectID": "02_function_classes.html#classes-2", - "href": "02_function_classes.html#classes-2", - "title": "Functions, classes and modules", - "section": "Classes", - "text": "Classes\n\nUse CamelCase for the name of the class\nUse lowercase characters for the name of the methods\nSeparate words with underscores\n\n\nclass RandomClassifier: # CamelCase ✅\n\n def fit(self, X, y):\n self.classes_ = np.unique(y)\n\n def predict(self, X):\n return np.random.choice(self.classes_, size=len(X))\n\n def fit_predict(self, X, y): # lowercase ✅\n self.fit(X, y)\n return self.predict(X)\n\n\n\nPython package development" + "objectID": "03_testing.html#linting-with-ruff", + "href": "03_testing.html#linting-with-ruff", + "title": "Testing, linting and formatting", + "section": "Linting with ruff", + "text": "Linting with ruff\n\n\nexamples/04_testing/process.py\n\nimport requests\nimport scipy\n\ndef preprocess(x, y, xout):\n\n x = x[~np.isnan(x)] \n method = \"cubic\"\n # interpolate missing values with cubic spline\n return scipy.interpolate.interp1d(x, y)(xout)\n\nRun ruff:\n$ ruff process.py\nprocess.py:1:8: F401 [*] `requests` imported but unused\nprocess.py:6:12: F821 Undefined name `np`\nprocess.py:7:5: F841 [*] Local variable `method` is assigned to but never used\nFound 3 errors.\n[*] 2 potentially fixable with the --fix option.\n\n\nLinting is a fast way to find common errors.\nUnused imports are confusing.\nUnused and undefined variables are usually a typo or a mistake. Fixing them can prevent bugs." 
}, { - "objectID": "projects/data_cleaning/Project_module_03.html", - "href": "projects/data_cleaning/Project_module_03.html", - "title": "Python package development", - "section": "", - "text": "Create new branch “package-test” (Make sure changes from last module have been merged, and that you start from the main branch)\nMake sure pytest and pytest-cov are installed\n3.1 Installable package\n\n3.1.1 Organize the files into folders and add setup.py. Call your package tscleaner.\n\nsubfolders: tscleaner, scripts, notebooks, tests\nmake init-file in tscleaner with\n\nfrom .cleaning import SpikeCleaner, FlatPeriodCleaner, OutOfRangeCleaner\nfrom .plotting import plot_timeseries\n\ncreate a setup.py in the root with the following content (change with your data):\n\nfrom setuptools import setup, find_packages\nsetup( name=‘MyPackageName’,\nversion=‘0.0.1’,\nurl=‘https://github.com/mypackage.git’,\nauthor=‘Author Name’,\nauthor_email=‘author@gmail.com’,\ndescription=‘Description of my package’,\npackages=find_packages(),\ninstall_requires=[‘numpy’, ‘matplotlib’],\n)\n\n\n\n3.1.2 Install the package in editable mode.\n\n>pip install -e .\n\n3.1.3 Modify import statements in notebook_A and script main.py and make sure they run.\n3.1.4 Modify cleaner tools by raising exceptions for invalid inputs.\n3.1.5 Move the csv file to /tests/testdata and update notebook with relative path to the file\n\n3.2 Pytest\n\n3.2.1 Write unit tests with pytest in the /tests folder. Create an empty init-py file in the folder. 
Create a file test_cleaning.py and create at least five tests that verify that the cleaning tools work as intended\n[Optional] Consider to make a fixture that reads the csv file and you can read in all tests\n3.2.2 Run the tests from the commandline by writting >pytest in the project root (can you also run the tests from VSCode?)\n3.2.3 Assess the test coverage with >pytest --cov=tscleaner tests\nOptional: Get coverage as html with >pytest --cov=tscleaner --cov-report html (check the index.html in the htmlcov subfolder afterwards)\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" + "objectID": "03_testing.html#formatting", + "href": "03_testing.html#formatting", + "title": "Testing, linting and formatting", + "section": "Formatting", + "text": "Formatting\n\n\nFormatting code for readability and maintainability is essential.\nblack is an opinionated automatic code formatter for Python.\nIt enforces its own rules for formatting, which are not configurable.\nHaving a unified style makes code changes easier to understand and collaborate on." }, { - "objectID": "projects/data_cleaning/Project_module_03.html#module-3-installable-package-and-pytest", - "href": "projects/data_cleaning/Project_module_03.html#module-3-installable-package-and-pytest", - "title": "Python package development", - "section": "", - "text": "Create new branch “package-test” (Make sure changes from last module have been merged, and that you start from the main branch)\nMake sure pytest and pytest-cov are installed\n3.1 Installable package\n\n3.1.1 Organize the files into folders and add setup.py. 
Call your package tscleaner.\n\nsubfolders: tscleaner, scripts, notebooks, tests\nmake init-file in tscleaner with\n\nfrom .cleaning import SpikeCleaner, FlatPeriodCleaner, OutOfRangeCleaner\nfrom .plotting import plot_timeseries\n\ncreate a setup.py in the root with the following content (change with your data):\n\nfrom setuptools import setup, find_packages\nsetup( name=‘MyPackageName’,\nversion=‘0.0.1’,\nurl=‘https://github.com/mypackage.git’,\nauthor=‘Author Name’,\nauthor_email=‘author@gmail.com’,\ndescription=‘Description of my package’,\npackages=find_packages(),\ninstall_requires=[‘numpy’, ‘matplotlib’],\n)\n\n\n\n3.1.2 Install the package in editable mode.\n\n>pip install -e .\n\n3.1.3 Modify import statements in notebook_A and script main.py and make sure they run.\n3.1.4 Modify cleaner tools by raising exceptions for invalid inputs.\n3.1.5 Move the csv file to /tests/testdata and update notebook with relative path to the file\n\n3.2 Pytest\n\n3.2.1 Write unit tests with pytest in the /tests folder. Create an empty init-py file in the folder. 
Create a file test_cleaning.py and create at least five tests that verify that the cleaning tools work as intended\n[Optional] Consider to make a fixture that reads the csv file and you can read in all tests\n3.2.2 Run the tests from the commandline by writting >pytest in the project root (can you also run the tests from VSCode?)\n3.2.3 Assess the test coverage with >pytest --cov=tscleaner tests\nOptional: Get coverage as html with >pytest --cov=tscleaner --cov-report html (check the index.html in the htmlcov subfolder afterwards)\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" + "objectID": "03_testing.html#running-black", + "href": "03_testing.html#running-black", + "title": "Testing, linting and formatting", + "section": "Running Black", + "text": "Running Black\n$ black .\nreformatted data_utils.py\nreformatted dfsu/__init__.py\nreformatted dataarray.py\nreformatted dataset.py\nreformatted spatial/geometry.py\nreformatted pfs/pfssection.py\n\nAll done! ✨ 🍰 ✨\n6 files reformatted, 27 files left unchanged." }, { - "objectID": "projects/data_cleaning/Project_module_02.html", - "href": "projects/data_cleaning/Project_module_02.html", - "title": "Python package development", - "section": "", - "text": "Create new branch “modules-classes” (Make sure changes from last module have been merged, and that you start from the main branch)\n2.1 Function arguments\n\nAdd default arguments to the functions. Commit.\nMake sure that you only use positional arguments where there is only one argument. Use keyword arguments everywhere else. Commit.\n\n2.2 Modules\n\nMove cleaner functions into a separate module “cleaning.py”. Commit.\nMove the plotting function into a separate module “plotting.py”. 
Commit.\nRename the script main.py and execute the cleaning and plotting.\n\nfrom cleaning import …\nfrom plotting import …\nCheck that it runs!\n\n\n2.3 Classes\n\nOrganize the cleaning functions into classes that all have the same structure (an init method and a clean method)\n\nSpikeCleaner\n\ndef __init__(max_jump)\ndef clean(data)\n\nmodify main.py and check that it runs\n\ncleaners = [\nSpikeCleaner(max_jump=10),\nOutOfRangeCleaner(min_val=0, max_val=50),\nFlatPeriodCleaner(flat_period=5),\n]\nfor cleaner in cleaners:\ndata = cleaner.clean(data)\n\n\nDownload notebook_A and csv file and make sure it runs. (remove any remaining print statements)\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" + "objectID": "03_testing.html#running-black-1", + "href": "03_testing.html#running-black-1", + "title": "Testing, linting and formatting", + "section": "Running Black", + "text": "Running Black\nVisual Studio Code can be configured to run black automatically when saving a file using the Black extension." }, { - "objectID": "projects/data_cleaning/Project_module_02.html#module-2-modules-and-classes", - "href": "projects/data_cleaning/Project_module_02.html#module-2-modules-and-classes", - "title": "Python package development", - "section": "", - "text": "Create new branch “modules-classes” (Make sure changes from last module have been merged, and that you start from the main branch)\n2.1 Function arguments\n\nAdd default arguments to the functions. Commit.\nMake sure that you only use positional arguments where there is only one argument. Use keyword arguments everywhere else. Commit.\n\n2.2 Modules\n\nMove cleaner functions into a separate module “cleaning.py”. Commit.\nMove the plotting function into a separate module “plotting.py”. 
Commit.\nRename the script main.py and execute the cleaning and plotting.\n\nfrom cleaning import …\nfrom plotting import …\nCheck that it runs!\n\n\n2.3 Classes\n\nOrganize the cleaning functions into classes that all have the same structure (an init method and a clean method)\n\nSpikeCleaner\n\ndef __init__(max_jump)\ndef clean(data)\n\nmodify main.py and check that it runs\n\ncleaners = [\nSpikeCleaner(max_jump=10),\nOutOfRangeCleaner(min_val=0, max_val=50),\nFlatPeriodCleaner(flat_period=5),\n]\nfor cleaner in cleaners:\ndata = cleaner.clean(data)\n\n\nDownload notebook_A and csv file and make sure it runs. (remove any remaining print statements)\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" + "objectID": "03_testing.html#profiling", + "href": "03_testing.html#profiling", + "title": "Testing, linting and formatting", + "section": "Profiling", + "text": "Profiling\n\n\nProfiling is a way to measure the performance of your code.\nIt can help you identify bottlenecks in your code.\nYour intuition about what is slow is often wrong.\nThe line_profiler package reports the time spent on each line of code.\nIt can be run inside a notebook using the lprun magic command." 
}, { - "objectID": "projects/data_cleaning/index.html", - "href": "projects/data_cleaning/index.html", - "title": "Course project: Time Series Data Cleaning", - "section": "", - "text": "1.1 GitHub repo\n1.2 Functions\n\n\n\n\n\n2.1 Function arguments\n2.2 Modules\n2.3 Classes\n\n\n\n\n\n3.1 Installable package\n3.2 Pytest\n\n\n\n\n\n4.1 Github Action\n4.2 Ruff\n4.3 Black\n4.4 pyproject.toml\n\n\n\n\n\n5.1 Type Hints\n5.2 Data class\n5.3 Module level function\n5.4 Composition or inheritance\n\n\n\n\n\n6.1 README\n6.2 Docstrings\n6.3 mkdocs\n\n\n\n\n\nAdd a license\nChange version number to 0.1.0\nBuild the package with hatchling.\nPublish the package to the PyPI Test Server." + "objectID": "03_testing.html#profiling---example-code", + "href": "03_testing.html#profiling---example-code", + "title": "Testing, linting and formatting", + "section": "Profiling - example code", + "text": "Profiling - example code\nimport numpy as np\n\ndef top_neighbors(points, radius=\"0.1\"):\n \"\"\"Don't use this function, it's only purpose is to be profiled.\"\"\"\n n = len(points)\n idx = np.array([int(x) for x in str.split(\"0 \"* n)])\n\n for i in range(n):\n for j in range(n):\n if i != j:\n d = np.sqrt(np.sum((points[i] - points[j])**2))\n if d < float(radius): \n idx[i] += 1\n for i in range(n):\n for j in range(n - i - 1):\n if idx[j] < idx[j + 1]:\n idx[j], idx[j + 1] = idx[j + 1], idx[j]\n points[j], points[j + 1] = points[j + 1], points[j]\n return points\n\ndef main():\n points = np.random.rand(1000, 2)\n top = top_neighbors(points)" }, { - "objectID": "projects/data_cleaning/index.html#module-1-github-and-basic-functions", - "href": "projects/data_cleaning/index.html#module-1-github-and-basic-functions", - "title": "Course project: Time Series Data Cleaning", - "section": "", - "text": "1.1 GitHub repo\n1.2 Functions" + "objectID": "03_testing.html#profiling---output", + "href": "03_testing.html#profiling---output", + "title": "Testing, linting and formatting", + 
"section": "Profiling - output", + "text": "Profiling - output\nInvoking the jupyter magic command lprun with:\n\nfunction to profile - top_neighbors\ncode to run - main()\n\n%lprun -f top_neighbors main()\n\n\nLine # Hits Time Per Hit % Time Line Contents\n==============================================================\n 3 def top_neighbors(points, radius=\"0.1\"):\n 4 \"\"\"Don't use this function, it's only purpose is to be profiled.\"\"\"\n 5 1 2800.0 2800.0 0.0 n = len(points)\n 6 1 353300.0 353300.0 0.0 idx = np.array([int(x) for x in str.split(\"0 \"* n)])\n 7 \n 8 1001 345100.0 344.8 0.0 for i in range(n):\n 9 1001000 378191701.0 377.8 2.2 for j in range(n):\n 10 1000000 328387205.0 328.4 1.9 if i != j:\n 11 999000 1e+10 14473.0 83.8 d = np.sqrt(np.sum((points[i] - points[j])**2))\n 12 999000 933778605.0 934.7 5.4 if d < float(radius): \n 13 28952 57010001.0 1969.1 0.3 idx[i] += 1\n 14 1001 367100.0 366.7 0.0 for i in range(n):\n 15 500500 144295203.0 288.3 0.8 for j in range(n - i - 1):\n 16 499500 302166901.0 604.9 1.8 if idx[j] < idx[j + 1]:\n 17 240227 212070500.0 882.8 1.2 idx[j], idx[j + 1] = idx[j + 1], idx[j]\n 18 240227 437538803.0 1821.4 2.5 points[j], points[j + 1] = points[j + 1], points[j]\n 19 1 500.0 500.0 0.0 return points\n\n\n\nPython package development" }, { - "objectID": "projects/data_cleaning/index.html#module-2-modules-and-classes", - "href": "projects/data_cleaning/index.html#module-2-modules-and-classes", - "title": "Course project: Time Series Data Cleaning", - "section": "", - "text": "2.1 Function arguments\n2.2 Modules\n2.3 Classes" + "objectID": "00_introduction.html#instructors", + "href": "00_introduction.html#instructors", + "title": "Python package development", + "section": "Instructors", + "text": "Instructors\n\nHenrik Andersson - @ecomodeller\nJesper Sandvig Mariegaard - @jsmariegaard" }, { - "objectID": "projects/data_cleaning/index.html#module-3-installable-package-and-pytest", - "href": 
"projects/data_cleaning/index.html#module-3-installable-package-and-pytest", - "title": "Course project: Time Series Data Cleaning", - "section": "", - "text": "3.1 Installable package\n3.2 Pytest" + "objectID": "00_introduction.html#participants", + "href": "00_introduction.html#participants", + "title": "Python package development", + "section": "Participants", + "text": "Participants\nIntroduce yourselves in a break out session later today." }, { - "objectID": "projects/data_cleaning/index.html#module-4-github-actions-and-auto-formatting", - "href": "projects/data_cleaning/index.html#module-4-github-actions-and-auto-formatting", - "title": "Course project: Time Series Data Cleaning", - "section": "", - "text": "4.1 Github Action\n4.2 Ruff\n4.3 Black\n4.4 pyproject.toml" + "objectID": "00_introduction.html#learning-modules", + "href": "00_introduction.html#learning-modules", + "title": "Python package development", + "section": "Learning modules", + "text": "Learning modules\n\nGit, Pull Requests, and code reviews\n\nDiscussion\nHomework\n\nPython functions, classes, and modules\n\nDiscussion\nHomework\n\nTesting and auto-formatting\n\nHomework\n\nDependencies and GitHub actions\n\nHomework\n\nDocumentation\n\nHomework\n\nObject oriented design in Python\n\nHomework\n\nDistributing your package\n\nHomework" }, { - "objectID": "projects/data_cleaning/index.html#module-5-object-oriented-design", - "href": "projects/data_cleaning/index.html#module-5-object-oriented-design", - "title": "Course project: Time Series Data Cleaning", - "section": "", - "text": "5.1 Type Hints\n5.2 Data class\n5.3 Module level function\n5.4 Composition or inheritance" + "objectID": "00_introduction.html#learning-objectives", + "href": "00_introduction.html#learning-objectives", + "title": "Python package development", + "section": "Learning objectives", + "text": "Learning objectives\n\nimproved Python skills\nknowledge of how to create reusable Python code\nknow how to share code with 
others through a Python package" }, { - "objectID": "projects/data_cleaning/index.html#module-6-documentation", - "href": "projects/data_cleaning/index.html#module-6-documentation", - "title": "Course project: Time Series Data Cleaning", - "section": "", - "text": "6.1 README\n6.2 Docstrings\n6.3 mkdocs" + "objectID": "00_introduction.html#format", + "href": "00_introduction.html#format", + "title": "Python package development", + "section": "Format", + "text": "Format\n\nOnline session (Zoom) Tuesday and Friday\nHomework assignments\nQuiz (learning platform)" }, { - "objectID": "projects/data_cleaning/index.html#module-7-publishing", - "href": "projects/data_cleaning/index.html#module-7-publishing", - "title": "Course project: Time Series Data Cleaning", - "section": "", - "text": "Add a license\nChange version number to 0.1.0\nBuild the package with hatchling.\nPublish the package to the PyPI Test Server." + "objectID": "00_introduction.html#course-material", + "href": "00_introduction.html#course-material", + "title": "Python package development", + "section": "Course material", + "text": "Course material\n\nHillard, 2020, Practices of the Python Pro, Manning\nSlides" }, { - "objectID": "projects/data_cleaning/Project_module_05.html", - "href": "projects/data_cleaning/Project_module_05.html", + "objectID": "00_introduction.html#poll", + "href": "00_introduction.html#poll", "title": "Python package development", - "section": "", - "text": "Create new branch “docs” (Make sure changes from last module have been merged, and that you start from the main branch)\n6.1 README\n\nWrite a README file with basic information about the project.\n\n6.2 Docstrings\n\nWrite NumPy style docstrings for all functions and classes.\n[Optional] Install the autodocstrings extension in VSCode (set the style to NumPy)\n\n6.3 mkdocs\n\nInstall mkdocs, mkdocstrings and material design mamba/pip install mkdocstrings-python mkdocs-material\nCreate a mkdocs.yml file (copy from 
https://github.com/DHI/template-python-library and adapt).\nCreate a docs folder and create a markdown file index.md inside.\nCreate API documentation locally using >mkdocs serve.\nCheck the generated HTML documentation.\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" + "section": "Poll", + "text": "Poll\n\n\n\nPython package development" }, { - "objectID": "projects/data_cleaning/Project_module_05.html#module-5-documentation", - "href": "projects/data_cleaning/Project_module_05.html#module-5-documentation", + "objectID": "course_structure.html", + "href": "course_structure.html", "title": "Python package development", "section": "", - "text": "Create new branch “docs” (Make sure changes from last module have been merged, and that you start from the main branch)\n6.1 README\n\nWrite a README file with basic information about the project.\n\n6.2 Docstrings\n\nWrite NumPy style docstrings for all functions and classes.\n[Optional] Install the autodocstrings extension in VSCode (set the style to NumPy)\n\n6.3 mkdocs\n\nInstall mkdocs, mkdocstrings and material design mamba/pip install mkdocstrings-python mkdocs-material\nCreate a mkdocs.yml file (copy from https://github.com/DHI/template-python-library and adapt).\nCreate a docs folder and create a markdown file index.md inside.\nCreate API documentation locally using >mkdocs serve.\nCheck the generated HTML documentation.\n\nCreate pull request in GitHub and “request review” from your reviewers\nGet feedback, Adjust code until approval, then merge (and delete branch)\n\nBack to homework overview" + "text": "flowchart TD\n\n M1(Git, Pull Requests, and code reviews)\n M2(Python functions, classes, and modules)\n M3(Testing and auto-formatting)\n M4(Dependencies and GitHub actions)\n M5(Documentation)\n M6(Object oriented design in Python)\n M7(Distributing your package)\n\n B1[1. 
The bigger picture]\n B2[2. Separations of concern]\n B3[3. Abstraction and encapsulation]\n B4[4. Designing for high performance]\n B5[5. Testing your software]\n B6[6. Separations of concerns in practice]\n B7[7. Extensibility and flexibility]\n B8[8. The rules and exceptions of inheritance]\n B9[9. Keeping things lightweight]\n B10[10. Achieving loose coupling]\n\n M1 --> M2 --> M3 --> M4 --> M5 --> M6 --> M7\n\n B1 --> M2\n B2 --> M2\n B3 --> M6\n B8 --> M6\n B4 --> M4\n B5 --> M4\n B6 --> M5\n B7 --> M3\n\n B9 --> M7\n B10 --> M7" } ] \ No newline at end of file diff --git a/sitemap.xml b/sitemap.xml index 51a9377..c15277e 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -1,99 +1,103 @@ - https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_01.html - 2023-11-01T14:16:54.517Z + https://github.com/DHI/python-package-development/06_oop.html + 2023-11-03T08:12:21.888Z - https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_04.html - 2023-11-01T14:16:53.777Z + https://github.com/DHI/python-package-development/04_dependencies_ci.html + 2023-11-03T08:12:20.968Z - https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_06.html - 2023-11-01T14:16:53.129Z + https://github.com/DHI/python-package-development/07_packaging.html + 2023-11-03T08:12:20.096Z - https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_07.html - 2023-11-01T14:16:52.497Z + https://github.com/DHI/python-package-development/index.html + 2023-11-03T08:12:19.000Z https://github.com/DHI/python-package-development/projects/data_cleaning/notebook_A.html - 2023-11-01T14:16:51.857Z + 2023-11-03T08:12:18.404Z - https://github.com/DHI/python-package-development/00_introduction.html - 2023-11-01T14:16:50.841Z + https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_05.html + 2023-11-03T08:12:17.848Z - 
https://github.com/DHI/python-package-development/group_work/module_03.html - 2023-11-01T14:16:49.893Z + https://github.com/DHI/python-package-development/projects/data_cleaning/index.html + 2023-11-03T08:12:17.328Z - https://github.com/DHI/python-package-development/group_work/module_01.html - 2023-11-01T14:16:49.385Z + https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_01.html + 2023-11-03T08:12:16.776Z - https://github.com/DHI/python-package-development/group_work/module_02.html - 2023-11-01T14:16:48.837Z + https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_06.html + 2023-11-03T08:12:16.204Z - https://github.com/DHI/python-package-development/04_dependencies_ci.html - 2023-11-01T14:16:48.141Z + https://github.com/DHI/python-package-development/05_documentation.html + 2023-11-03T08:12:15.264Z - https://github.com/DHI/python-package-development/01_version_control.html - 2023-11-01T14:16:46.793Z + https://github.com/DHI/python-package-development/group_work/index.html + 2023-11-03T08:12:14.044Z - https://github.com/DHI/python-package-development/03_testing.html - 2023-11-01T14:16:43.401Z + https://github.com/DHI/python-package-development/group_work/module_04.html + 2023-11-03T08:12:13.588Z - https://github.com/DHI/python-package-development/07_packaging.html - 2023-11-01T14:16:41.697Z + https://github.com/DHI/python-package-development/group_work/module_02.html + 2023-11-03T08:12:12.816Z - https://github.com/DHI/python-package-development/course_structure.html - 2023-11-01T14:16:45.225Z + https://github.com/DHI/python-package-development/group_work/module_01.html + 2023-11-03T08:12:13.356Z - https://github.com/DHI/python-package-development/06_oop.html - 2023-11-01T14:16:47.441Z + https://github.com/DHI/python-package-development/group_work/module_03.html + 2023-11-03T08:12:13.828Z - https://github.com/DHI/python-package-development/index.html - 2023-11-01T14:16:48.549Z + 
https://github.com/DHI/python-package-development/02_function_classes.html + 2023-11-03T08:12:14.676Z - https://github.com/DHI/python-package-development/group_work/index.html - 2023-11-01T14:16:49.117Z + https://github.com/DHI/python-package-development/01_version_control.html + 2023-11-03T08:12:15.660Z - https://github.com/DHI/python-package-development/group_work/module_04.html - 2023-11-01T14:16:49.629Z + https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_03.html + 2023-11-03T08:12:16.504Z - https://github.com/DHI/python-package-development/05_documentation.html - 2023-11-01T14:16:50.561Z + https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_02.html + 2023-11-03T08:12:17.048Z - https://github.com/DHI/python-package-development/02_function_classes.html - 2023-11-01T14:16:51.481Z + https://github.com/DHI/python-package-development/projects/data_cleaning/clean_project_data_v4_final2.html + 2023-11-03T08:12:17.604Z - https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_03.html - 2023-11-01T14:16:52.233Z + https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_07.html + 2023-11-03T08:12:18.064Z - https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_02.html - 2023-11-01T14:16:52.813Z + https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_04.html + 2023-11-03T08:12:18.692Z - https://github.com/DHI/python-package-development/projects/data_cleaning/index.html - 2023-11-01T14:16:53.429Z + https://github.com/DHI/python-package-development/03_testing.html + 2023-11-03T08:12:19.504Z - https://github.com/DHI/python-package-development/projects/data_cleaning/Project_module_05.html - 2023-11-01T14:16:54.185Z + https://github.com/DHI/python-package-development/00_introduction.html + 2023-11-03T08:12:20.356Z + + + 
https://github.com/DHI/python-package-development/course_structure.html + 2023-11-03T08:12:21.348Z