Read the documentation: gloe.ideos.com.br

Source code: github.com/ideos/gloe


Your Code as a Flow

Gloe (pronounced /ɡloʊ/, like "glow") is a general-purpose library designed to guide developers in expressing their code as a flow.

Why follow this approach? Because it keeps your Python code easy to maintain, document, and test. Gloe guides you to write code in the form of small, safely connected units, rather than relying on scattered functions and classes with no clear relationship.

What is a flow? Formally, a flow is defined as a DAG (Directed Acyclic Graph) with one source and one sink, meaning it has a beginning and an end. In practice, it is a sequence of steps that transform data from one form to another.
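To make the formal definition concrete, here is a minimal sketch in plain Python (it does not use Gloe). The step names and the dictionary representation are assumptions chosen only for illustration; the functions check the three properties named above: no cycles, exactly one source, and exactly one sink.

```python
# Hypothetical flow: each key is a step, each value lists the steps it feeds.
flow = {
    "parse": ["load"],
    "load": ["clean", "enrich"],
    "clean": ["merge"],
    "enrich": ["merge"],
    "merge": [],
}


def sources_and_sinks(graph):
    """Return (sources, sinks): steps with no incoming / no outgoing edges."""
    has_incoming = {target for targets in graph.values() for target in targets}
    sources = [node for node in graph if node not in has_incoming]
    sinks = [node for node, targets in graph.items() if not targets]
    return sources, sinks


def is_acyclic(graph):
    """Kahn's algorithm: acyclic if every step can be removed in topological order."""
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            indegree[target] += 1
    ready = [node for node, degree in indegree.items() if degree == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for target in graph[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    return removed == len(graph)


def is_flow(graph):
    """A flow, as defined above: a DAG with exactly one source and one sink."""
    sources, sinks = sources_and_sinks(graph)
    return is_acyclic(graph) and len(sources) == 1 and len(sinks) == 1


print(is_flow(flow))  # True
```

The sketch assumes every step appears as a key in the dictionary, including the sink, which maps to an empty list.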

Where can I use it? Anywhere! Gloe is particularly useful for data science and machine learning pipelines, as well as for servers, scripts, or any area where there is a gap between code and logical business understanding.

Give me examples!

Key Features

1. Gloe emerged in the context of a large application with extremely complex business logic. After transitioning the code to Gloe, maintenance efficiency improved by 200%, leading the entire team to adopt it widely.

2. This feature is under development.

Installing

Requirements:

  • Python (>= 3.9)
  • typing-extensions (>= 4.7)

You can install Gloe using pip or conda:

# PyPI
pip install gloe
# or conda
conda install -c conda-forge gloe

Example

Consider the following flow. It is part of an e-commerce server: it starts from an HTTP request and ends with a list of recommended products.

get_recommendations = (
    extract_request_id >>
    get_user_by_id >> (
        get_last_seen_items,
        get_last_ordered_items,
        get_user_network_data,
    ) >>
    get_recommended_items
)

Steps of this flow:

  1. The user ID is extracted from the request.
  2. The corresponding user data is retrieved using this ID.
  3. Three pieces of information about the user are gathered: the last seen items, the last ordered items, and the user's network data (such as personal information and relationships).
  4. These three pieces of information are used to generate a list of recommended items for this specific user.

Can we agree that it is easy to understand just by reading the code?

Flow Steps: Transformers

Each step of a flow is called a transformer and creating one is as easy as:

from gloe import transformer

@transformer
def extract_request_id(req: Request) -> int:
    # your logic goes here
    ...

You can connect many transformers using the right shift operator (>>), as in the example above. When the argument of >> is a tuple, you are creating branches using the default parallel gateway.

Learn more about creating transformers, pipelines, and gateways.
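For intuition only, the sketch below shows one way chaining with >> and branching with a tuple could be modeled in plain Python. It is not Gloe's implementation: the Step class and the four example steps are hypothetical, and the branches here simply run one after another.

```python
class Step:
    """Toy stand-in for a transformer: wraps a one-argument function."""

    def __init__(self, func):
        self.func = func

    def __call__(self, data):
        return self.func(data)

    def __rshift__(self, other):
        if isinstance(other, tuple):
            branches = other

            def fan_out(data):
                # Run this step once, then hand its output to every branch.
                result = self(data)
                return tuple(branch(result) for branch in branches)

            return Step(fan_out)
        # Plain chaining: the output of this step is the input of the next.
        return Step(lambda data: other(self(data)))


@Step
def parse(text):
    return int(text)


@Step
def double(number):
    return number * 2


@Step
def square(number):
    return number * number


@Step
def total(pair):
    return sum(pair)


pipeline = parse >> (double, square) >> total
print(pipeline("3"))  # 15, that is (3 * 2) + (3 * 3)
```

Note how the step after the tuple receives all branch results together, as a single tuple.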

Maintenance

When your manager asks you to return the items grouped by department instead of in a flat list, the refactoring process is straightforward:

get_user_recommendations = (
    extract_request_id >>
    get_user_by_id >> (
        get_last_seen_items,
        get_last_ordered_items,
        get_user_network_data,
    ) >>
    get_recommended_items >>
    group_by_department
)

See full transformers definition.
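The body of group_by_department is not shown in this README. Purely as an illustration, and assuming each item is a dict with a hypothetical "department" key, its logic could be as small as the plain function below. In a Gloe project it would additionally be decorated with @transformer.

```python
from collections import defaultdict


def group_by_department(items):
    """Group a flat list of items by department.

    Assumption: each item is a dict with a "department" key. The real
    item type used by the project may differ.
    """
    grouped = defaultdict(list)
    for item in items:
        grouped[item["department"]].append(item)
    return dict(grouped)


# Hypothetical sample data, only to show the shape of the result.
sample_items = [
    {"name": "keyboard", "department": "electronics"},
    {"name": "novel", "department": "books"},
    {"name": "mouse", "department": "electronics"},
]
print(group_by_department(sample_items))
```

Because the new step only consumes the output of get_recommended_items, none of the earlier steps need to change.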

Type checking

If, by some chance, you connect two transformers with incompatible types, the IDE along with Mypy will warn you about the malformed flow.

For example, suppose you implemented the extract_request_id transformer returning a string instead of an integer ID:

@transformer
def extract_request_id(request: Request) -> str:
    ...

But the get_user_by_id transformer expects an int as input:

@transformer
def get_user_by_id(user_id: int) -> User:
    ...

The result will be something like this:

[Image: malformed pipeline example]
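To see why a static checker can catch this kind of mistake, consider the following sketch. It is not Gloe's code: Step is a hypothetical generic class whose >> operator only accepts a next step whose input type matches the current output type. The example functions are assumptions as well.

```python
from typing import Callable, Generic, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")
Next = TypeVar("Next")


class Step(Generic[In, Out]):
    """Toy typed step: In is the input type, Out is the output type."""

    def __init__(self, func: Callable[[In], Out]) -> None:
        self.func = func

    def __call__(self, data: In) -> Out:
        return self.func(data)

    def __rshift__(self, other: "Step[Out, Next]") -> "Step[In, Next]":
        # The annotation requires other's input type to equal this step's output type.
        return Step(lambda data: other(self(data)))


def read_id_as_int(raw: str) -> int:
    return int(raw)


def read_id_as_str(raw: str) -> str:
    return raw.strip()


def load_user(user_id: int) -> dict:
    return {"id": user_id}


# Step[str, int] >> Step[int, dict]: the types line up.
ok = Step(read_id_as_int) >> Step(load_user)

# Step[str, str] >> Step[int, dict]: a static checker such as Mypy is
# expected to flag this line, because str does not match int.
# bad = Step(read_id_as_str) >> Step(load_user)
```

The mismatch is reported by the type checker before the program runs; at runtime Python itself does not enforce these annotations.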

Simple Usage

Assuming all the types are compatible, this pipeline can be invoked from a server, for example:

@users_router.get('/:user_id/recommended')
def recommended_items_route(req: Request):
    return get_user_recommendations(req)

Quick Documentation

When you need to document it somewhere, you can just plot it.

[Image: graph for send_promotion]

Integration

Suppose you don't need to extract the user ID within the flow because your web framework, such as FastAPI, already does it. In that case, you can remove the extract_request_id transformer. Since get_user_by_id takes an integer as input, the configuration would be:

get_user_recommendations = (
    get_user_by_id >> (
        get_last_seen_items,
        get_last_ordered_items,
        get_user_network_data,
    ) >>
    get_recommended_items >>
    group_by_department
)


@users_router.get('/{user_id}/recommended')
def recommended_items_route(user_id: int):
    return get_user_recommendations(user_id)

We hope the example above illustrates how easily you can identify maintenance points and gain confidence that the rest of the code will continue to work properly, as long as the transformers' interfaces remain satisfied.

Motivation

Software development has many patterns and good practices related to the code itself, such as how to document, test, and structure it, and which programming paradigm to use. However, beyond all that, we believe that the key to a successful software project is good communication between everyone involved in the development. Of course, this communication is not restricted to meetings or text messages; it is also present in documentation artifacts and in well-written code.

When developers write code, they are telling a story to the next person who will read or refactor it. Depending on the code's quality, this story could be quite confusing, with no clear roles for the characters and a messy plot (sometimes with an undesired twist). The next person to maintain the software may take a long time to understand the narrative and make it clear, or they will simply give up, leaving it as is.

How Can Gloe Help Me?

Gloe makes this story coherent, logically organized, and easy to follow. This is achieved by dividing the code into concise steps, each with an unambiguous responsibility and an explicit interface. Gloe then allows you to connect these steps, clarifying their relationships and making it easier to identify the changes needed during refactoring. Thus, you can quickly understand the entire story being told and enhance it. Inspired by concepts such as natural transformations and the Free Monad (present in Scala and Haskell), Gloe implements this approach using functional programming and strong typing concepts.

Gloe is not a workflow orchestrator

Currently, unlike platforms such as Airflow that include scheduler backends for task orchestration, Gloe's primary purpose is to aid in development. The graph structure aims to make the code flatter and hence more readable. However, it is important to note that Gloe does not offer functionality for executing tasks in a dedicated environment, nor does it directly contribute to execution speed or scalability improvements.

License

Apache License Version 2.0