[DOC] documentation updates June 2024 (#48)
Documentation updates

## Description
- Fixed spelling of and type annotations for Parameters across
docstrings
- Fixed broken links and unrecognized relative links
- Fixed indentation for continuation lines in docstrings
- Refactored README for legibility and clarity with a more defined
structure and better formatting
- General style and formatting improvements
- Minor change to the layout of the pages (to better accommodate the
reader)
- No code changes were made as part of this PR
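
Broken relative links like the ones fixed here could be caught with a small check along these lines. This is only an illustrative sketch and not part of the PR: the `broken_relative_links` helper and its regex are hypothetical, and it assumes plain inline `[text](target)` links resolved relative to the file that contains them.

```python
import re
from pathlib import Path

# Matches inline Markdown links and images: [text](target) or ![alt](target)
LINK_RE = re.compile(r"!?\[[^\]]*\]\(([^)\s]+)[^)]*\)")
# Matches targets that start with a URL scheme such as https: or mailto:
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(markdown: str, source_file: Path) -> list[str]:
    """Return relative link targets in `markdown` that do not exist on disk."""
    broken = []
    for target in LINK_RE.findall(markdown):
        # Skip external links, mailto links, and pure in-page anchors.
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue
        # Drop a trailing "#anchor" before checking the file itself.
        path_part = target.split("#", 1)[0]
        if not (source_file.parent / path_part).exists():
            broken.append(target)
    return broken
```

Calling it as `broken_relative_links(path.read_text(), path)` for each Markdown file under `docs/` would list the targets that need attention. Reference-style links and heading anchors are deliberately left out of this sketch.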

## Testing
To build the docs: 
- check out the branch
- run `make docs`
- docsite will be hosted on your local machine

## Related Issue
#38 
#42 

---------

Co-authored-by: Danny Meijer <[email protected]>
dannymeijer and dannymeijer authored Jun 21, 2024
1 parent 2d1449e commit 4b2e015
Showing 42 changed files with 509 additions and 350 deletions.
25 changes: 8 additions & 17 deletions CONTRIBUTING.md
@@ -2,26 +2,18 @@

There are a few guidelines that we need contributors to follow so that we are able to process requests as efficiently as possible.

<!-- uncomment the below once Open Source is ready -->
[//]: # (If you have any questions or concerns please feel free to contact us at [[email protected]]&#40;mailto:[email protected]&#41;.)
If you have any questions or concerns please feel free to contact us at [[email protected]](mailto:[email protected]).

[//]: # ()
[//]: # (## Getting Started)

[//]: # ()
[//]: # (* Review our [Code of Conduct]&#40;https://github.com/Nike-Inc/nike-inc.github.io/blob/master/CONDUCT.md&#41;)
## Getting Started

[//]: # (* Submit the [Individual Contributor License Agreement]&#40;https://www.clahub.com/agreements/Nike-Inc/fastbreak&#41;)

[//]: # (* Make sure you have a [GitHub account]&#40;https://github.com/signup/free&#41;)

[//]: # (* Submit a ticket for your issue, assuming one does not already exist.)

[//]: # ( * Clearly describe the issue including steps to reproduce when it is a bug.)

[//]: # ( * Make sure you fill in the earliest version that you know has the issue.)

[//]: # (* Fork the repository on GitHub)
* Review our [Code of Conduct](https://github.com/Nike-Inc/nike-inc.github.io/blob/master/CONDUCT.md)
* Make sure you have a [GitHub account](https://github.com/signup/free)
* Submit a ticket for your issue, assuming one does not already exist.
* Clearly describe the issue including steps to reproduce when it is a bug.
* Make sure you fill in the earliest version that you know has the issue.
* Fork the repository on GitHub

## Making Changes

@@ -98,6 +90,5 @@ At the moment, the release process is manual. We try to make frequent releases.
* [GitHub pull request documentation](https://help.github.com/send-pull-requests/)
* [Nike's Code of Conduct](https://github.com/Nike-Inc/nike-inc.github.io/blob/master/CONDUCT.md)

[//]: # (* [Nike's Individual Contributor License Agreement]&#40;https://www.clahub.com/agreements/Nike-Inc/fastbreak&#41;)

[//]: # (* [Nike OSS]&#40;https://nike-inc.github.io/&#41;)
2 changes: 1 addition & 1 deletion Makefile
@@ -1,6 +1,6 @@
.PHONY: help ## Display this message
help:
	@python koheesio/__about__.py
	@python src/koheesio/__about__.py
	@echo "\nAvailable \033[34m'make'\033[0m commands:"
	@echo "\n\033[1mSetup:\033[0m"
	@grep -E '^.PHONY: .*?## setup - .*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ".PHONY: |## (setup|hatch) - "}; {printf " \033[36m%-22s\033[0m %s\n", $$2, $$3}'
210 changes: 155 additions & 55 deletions README.md

Large diffs are not rendered by default.

Binary file added docs/assets/documentation-system-overview.png
7 changes: 1 addition & 6 deletions docs/community/approach-documentation.md
@@ -1,8 +1,3 @@
---
tags:
- doctype/explanation
---

## Scope

<!-- TODO: add Koheesio specifics -->
@@ -21,7 +16,7 @@ From [documentation.divio.com](https://documentation.divio.com):
> They are: _tutorials, how-to guides, technical reference and explanation_. They represent four different purposes or functions, and require four different approaches to their creation. Understanding the implications of this will help improve most documentation - often immensely.
>
> **About the system**
> ![Documentation System Overview](assets/../../assets/documentation-system-overview.png)
> ![Documentation System Overview](../assets/documentation-system-overview.png)
> The documentation system outlined here is a simple, comprehensive and nearly universally-applicable scheme. It is proven in practice across a wide variety of fields and applications.
>
> There are some very simple principles that govern documentation that are very rarely if ever spelled out. They seem to be a secret, though they shouldn’t be.
11 changes: 5 additions & 6 deletions docs/community/contribute.md
@@ -17,15 +17,15 @@ There are a few guidelines that we need contributors to follow so that we are ab

* Create a feature branch off of `main` before you start your work.
* Please avoid working directly on the `main` branch.
* Setup the required package manager [hatch](#-package-manager)
* Setup the dev environment [see below](#-dev-environment-setup)
* Setup the required package manager [hatch](#package-manager)
* Setup the dev environment [see below](#dev-environment-setup)
* Make commits of logical units.
* You may be asked to squash unnecessary commits down to logical units.
* Check for unnecessary whitespace with `git diff --check` before committing.
* Write meaningful, descriptive commit messages.
* Please follow existing code conventions when working on a file
* Make sure to check the standards on the code, [see below](#-linting-and-standards)
* Make sure to test the code before you push changes [see below](#-testing)
* Make sure to check the standards on the code, [see below](#linting-and-standards)
* Make sure to test the code before you push changes [see below](#testing)

## 🤝 Submitting Changes

@@ -66,7 +66,7 @@ make hatch-install

This will install hatch using brew if you are on a Mac.

If you are on a different OS, you can follow the instructions [here]( https://hatch.pypa.io/latest/install/)
If you are on a different OS, you can follow the instructions [here](https://hatch.pypa.io/latest/install/)


### 📌 Dev Environment Setup
@@ -119,5 +119,4 @@ Make sure that all tests pass and that you have adequate coverage before submitt
* [General GitHub documentation](https://help.github.com/)
* [GitHub pull request documentation](https://help.github.com/send-pull-requests/)
* [Nike's Code of Conduct](https://github.com/Nike-Inc/nike-inc.github.io/blob/master/CONDUCT.md)
* [Nike's Individual Contributor License Agreement](https://www.clahub.com/agreements/Nike-Inc/fastbreak)
* [Nike OSS](https://nike-inc.github.io/)
26 changes: 1 addition & 25 deletions docs/css/custom.css
@@ -35,31 +35,7 @@
border-left: .05rem solid var(--md-typeset-table-color);
}

/* Mark external links as such. */
.md-content a.autorefs-external::after,
.md-content a[href^="http"]:after {
/* https://primer.style/octicons/arrow-up-right-24 */
background-image: url('data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path fill="rgb(0, 0, 0)" d="M18.25 15.5a.75.75 0 00.75-.75v-9a.75.75 0 00-.75-.75h-9a.75.75 0 000 1.5h7.19L6.22 16.72a.75.75 0 101.06 1.06L17.5 7.56v7.19c0 .414.336.75.75.75z"></path></svg>');
content: ' ';

display: inline-block;
position: relative;
top: 0.1em;
margin-left: 0.2em;
margin-right: 0.1em;

height: 0.6em;
width: 0.6em;
border-radius: 100%;
background-color: var(--md-typeset-a-color);
}

.md-content a.autorefs-external:hover::after,
.md-content a[href^="http"]:hover::after {
background-color: var(--md-accent-fg-color);
background-image: url('data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path fill="rgb(255, 255, 255)" d="M18.25 15.5a.75.75 0 00.75-.75v-9a.75.75 0 00-.75-.75h-9a.75.75 0 000 1.5h7.19L6.22 16.72a.75.75 0 101.06 1.06L17.5 7.56v7.19c0 .414.336.75.75.75z"></path></svg>');
}

/* Gradient banner */
.md-header {
background: linear-gradient(142deg, rgba(229,119,39,1) 3%, rgba(172,56,56,1) 31%, rgba(133,59,96,1) 51%, rgba(31,67,103,1) 79%, rgba(31,99,120,1) 94%, rgba(32,135,139,1) 100%);
}
33 changes: 15 additions & 18 deletions docs/reference/concepts/concepts.md
@@ -5,7 +5,14 @@ The core components are the following:
> <small>*Note:* click on the 'Concept' to take you to the corresponding module. The module documentation will have
greater detail on the specifics of the implementation</small>
## [**Step**](steps.md)

[//]: # (References)
[Context]: context.md
[Logging]: logger.md
[Step]: step.md


## [Step]

A custom unit of logic that can be executed. A Step is an atomic operation and serves as the building block of data
pipelines built with the framework. A step can be seen as an operation on a set of inputs, and returns a set of
@@ -53,39 +60,29 @@ Step ---> O3["Output 3"]
Step is the core abstraction of the framework. Meaning, that it is the core building block of the framework and is used
to define all the operations that can be executed.

Please see the [Step](steps.md) documentation for more details.

## [**Task**](tasks.md)

The unit of work of one execution of the framework.
Please see the [Step] documentation for more details.

An execution usually consists of an `Extract - Transform - Load` approach of one data object.
Tasks typically consist of a series of Steps.

Please see the [Task](tasks.md) documentation for more details.

## [**Context**](context.md)
## [Context]

The Context is used to configure the environment where a Task or Step runs.

It is often based on configuration files and can be used to adapt behaviour of a Task or Step based on the environment
it runs in.

Please see the [Context](context.md) documentation for more details.
Please see the [Context] documentation for more details.

## [**logger**](logging.md)

A logger object to log messages with different levels.
## [Logging]

Please see the [Logging](logging.md) documentation for more details.
A logger object to log messages with different levels.

Please see the [Logging] documentation for more details.

The interactions between the base concepts of the model is visible in the below diagram:

```mermaid
---
title: Koheesio Class Diagram
---
classDiagram
Step .. Task
Step .. Transformation
4 changes: 2 additions & 2 deletions docs/reference/concepts/context.md
@@ -12,14 +12,14 @@ complex configurations. It also provides serialization and deserialization capab
and load configurations in JSON, YAML, or TOML formats.

Whether you're setting up the environment for a Task or Step, or managing variables shared across multiple tasks,
`Context` provides a robust and efficient solution.
`Context` provides a robust and efficient solution.

This document will guide you through its key features and show you how to leverage its capabilities in your Koheesio
applications.

## API Reference

See [API Reference](../../koheesio/context.html) for a detailed description of the `Context` class and its methods.
See [API Reference](../../api_reference/context.md) for a detailed description of the `Context` class and its methods.

## Key Features

File renamed without changes.
File renamed without changes.
8 changes: 0 additions & 8 deletions docs/reference/concepts/tasks.md

This file was deleted.

@@ -12,7 +12,7 @@ the `df` property of the `Reader`.

## API Reference

See [API Reference](../../koheesio/steps/readers) for a detailed description of the `Reader` class and its methods.
See [API Reference](../../api_reference/spark/readers/index.md) for a detailed description of the `Reader` class and its methods.

## Key Features of a Reader

@@ -15,8 +15,8 @@ pipeline. This can help avoid errors and make your code easier to understand and

## API Reference

See [API Reference](../../koheesio/steps/transformations) for a detailed description of the `Transformation` classes and
their methods.
See [API Reference](../../api_reference/spark/transformations/index.md) for a detailed description of the
`Transformation` classes and their methods.

## Types of Transformations

File renamed without changes.
112 changes: 82 additions & 30 deletions docs/tutorials/getting-started.md
@@ -6,36 +6,53 @@

## Installation

### Poetry

If you're using Poetry, add the following entry to the `pyproject.toml` file:

```toml title="pyproject.toml"
[[tool.poetry.source]]
name = "nike"
url = "https://artifactory.nike.com/artifactory/api/pypi/python-virtual/simple"
secondary = true
```

```bash
poetry add koheesio
```

### pip

If you're using pip, run the following command to install Koheesio:

Requires [pip](https://pip.pypa.io/en/stable/).

```bash
pip install koheesio --extra-index-url https://artifactory.nike.com/artifactory/api/pypi/python-virtual/simple
```
<details>
<summary>hatch / hatchling</summary>

If you're using hatch (or hatchling), simply add `koheesio` to the `dependencies` or section in your
`pyproject.toml` file:

```toml title="pyproject.toml"
dependencies = [
"koheesio",
]
```
</details>

<details>
<summary>poetry</summary>

If you're using Poetry, add the following entry to the `pyproject.toml` file:

```toml title="pyproject.toml"
[[tool.poetry.source]]
name = "nike"
url = "https://artifactory.nike.com/artifactory/api/pypi/python-virtual/simple"
secondary = true
```

```bash
poetry add koheesio
```
</details>

<details>
<summary>pip</summary>

If you're using pip, run the following command to install Koheesio:

Requires [pip](https://pip.pypa.io/en/stable/).

```bash
pip install koheesio
```
</details>

## Basic Usage

Once you've installed Koheesio, you can start using it in your Python scripts. Here's a basic example:

```python
```python title="my_first_step.py"
from koheesio import Step

# Define a step
@@ -50,17 +67,52 @@ step = MyStep()
step.execute()
```

### Advanced Usage
For more advanced usage, you can check out the examples in the `__notebooks__` directory of this repository. These examples show how to use Koheesio's features in more detail.
## Advanced Usage

```python title="my_first_etl.py"
from pyspark.sql.functions import lit
from pyspark.sql import DataFrame, SparkSession

# Step 1: import Koheesio dependencies
from koheesio.context import Context
from koheesio.spark.readers.dummy import DummyReader
from koheesio.spark.transformations.camel_to_snake import CamelToSnakeTransformation
from koheesio.spark.writers.dummy import DummyWriter
from koheesio.spark.etl_task import EtlTask

# Step 2: Set up a SparkSession
spark = SparkSession.builder.getOrCreate()

# Step 3: Configure your Context
context = Context({
    "source": DummyReader(),
    "transformations": [CamelToSnakeTransformation()],
    "target": DummyWriter(),
    "my_favorite_movie": "inception",
})

# Step 4: Create a Task
class MyFavoriteMovieTask(EtlTask):
    my_favorite_movie: str

    def transform(self, df: DataFrame = None) -> DataFrame:
        df = df.withColumn("MyFavoriteMovie", lit(self.my_favorite_movie))
        return super().transform(df)

# Step 5: Run your Task
task = MyFavoriteMovieTask(**context)
task.run()
```

### Contributing
If you want to contribute to Koheesio, check out the `CONTRIBUTING.md` file in this repository. It contains guidelines for contributing, including how to submit issues and pull requests.
If you want to contribute to Koheesio, check out the `CONTRIBUTING.md` file in this repository. It contains guidelines
for contributing, including how to submit issues and pull requests.

### Testing
To run the tests for Koheesio, use the following command:

```bash
make test
make dev-test
```

This will run all the tests in the `test` directory.
This will run all the tests in the `tests` directory.
8 changes: 0 additions & 8 deletions docs/tutorials/how-to.md

This file was deleted.
