calliope-project · brynpickering · Nov 13, 2024 · Oct 18, 2024 · Oct 18, 2024 · Oct 22, 2024
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,7 +2,9 @@
 
 ### User-facing changes
 
-|new| `where(array, where_array)` math helper function to apply a where array _inside_ an expression, to enable extending component dimensions on-the-fly, and applying filtering to different components within the expression (#604, #679).
+|changed| Helper functions are now documented on their own page within the "Defining your own math" section of the documentation (#698).
+
+|new| `where(array, condition)` math helper function to apply a where array _inside_ an expression, to enable extending component dimensions on-the-fly, and applying filtering to different components within the expression (#604, #679).
 
 |new| Data tables can inherit options from `templates`, like `techs` and `nodes` (#676).
 

diff --git a/docs/reference/api/helper_functions.md b/docs/reference/api/helper_functions.md
@@ -4,3 +4,6 @@ search:
 ---
 
 ::: calliope.backend.helper_functions
+    options:
+      docstring_options:
+        ignore_init_summary: true
diff --git a/docs/user_defined_math/helper_functions.md b/docs/user_defined_math/helper_functions.md
@@ -0,0 +1,121 @@
+
+# Helper functions
+
+For [`where` strings](syntax.md#where-strings) and [`expression` strings](syntax.md#expression-strings), many helper functions are available to allow for more complex operations to be undertaken within the string.
+Their functionality is detailed in the [helper function API page](../reference/api/helper_functions.md).
+Here, we give a brief summary.
+Some of these helper functions require a good understanding of their functionality to apply, so make sure you are comfortable with them before using them.
+
+## inheritance
+
+Using `inheritance(...)` in a `where` string allows you to grab a subset of technologies / nodes that all share the same [`template`](../creating/templates.md) in the technology's / node's `template` key.
+If a `template` also inherits from another `template` (chained inheritance), you will get all `techs`/`nodes` that are children along that inheritance chain.
+
+So, for the definition:
+
+```yaml
+templates:
+  techgroup1:
+    template: techgroup2
+    flow_cap_max: 10
+  techgroup2:
+    base_tech: supply
+techs:
+  tech1:
+    template: techgroup1
+  tech2:
+    template: techgroup2
+```
+
+`inheritance(techgroup1)` will give the `[tech1]` subset and `inheritance(techgroup2)` will give the `[tech1, tech2]` subset.
+
+## any
+
+Parameters are indexed over multiple dimensions.
+Using `any(..., over=...)` in a `where` string allows you to check if there is at least one non-NaN value in a given dimension (akin to [xarray.DataArray.any][]).
+So, `any(cost, over=[nodes, techs])` will check, for each item in the `costs` dimension (the other dimension that the `cost` decision variable is indexed over), if there is at least one non-NaN tech+node value.
+
+## defined
+
+Similar to [any](#any), using `defined(..., within=...)` in a `where` string allows you to check for non-NaN values along dimensions.
+In the case of `defined`, you can check if e.g., certain technologies have been defined within the nodes or certain carriers are defined within a group of techs or nodes.
+
+So, for the definition:
+
+```yaml
+techs:
+  tech1:
+    base_tech: conversion
+    carrier_in: electricity
+    carrier_out: heat
+  tech2:
+    base_tech: conversion
+    carrier_in: [coal, biofuel]
+    carrier_out: electricity
+nodes:
+  node1:
+    techs: {tech1}
+  node2:
+    techs: {tech1, tech2}
+```
+
+`defined(carriers=electricity, within=techs)` would yield a list of `[True, True]` as both technologies define electricity.
+
+`defined(techs=[tech1, tech2], within=nodes)` would yield a list of `[True, True]` as both nodes define _at least one_ of `tech1` or `tech2`.
+
+`defined(techs=[tech1, tech2], within=nodes, how=all)` would yield a list of `[False, True]` as only `node2` defines _both_ `tech1` and `tech2`.
+
+## sum
+
+Using `sum(..., over=...)` in an expression allows you to sum over one or more dimensions of your component array (be it a parameter, decision variable, or global expression).
+
+## select_from_lookup_arrays
+
+Some of our arrays in [`model.inputs`][calliope.Model.inputs] are not data arrays, but "lookup" arrays.
+These arrays are used to map the array's index items to other index items.
+For instance when using [time clustering](../advanced/time.md#time-clustering), the `lookup_cluster_last_timestep` array is used to get the timestep resolution and the stored energy for the last timestep in each cluster.
+Using `select_from_lookup_arrays(..., dim_name=lookup_array)` allows you to apply this lookup array to your data array.
+
+## get_val_at_index
+
+If you want to access an integer index in your dimension, use `get_val_at_index(dim_name=integer_index)`.
+For example, `get_val_at_index(timesteps=0)` will get the first timestep in your timeseries, `get_val_at_index(timesteps=-1)` will get the final timestep.
+This is mostly used when conditionally applying a different expression in the first / final timestep of the timeseries.
+
+It can be used in the `where` string (e.g., `timesteps=get_val_at_index(timesteps=0)` to mask all other timesteps) and in the `expression` string (via [slices](syntax.md#slices), e.g., `storage[timesteps=$first_timestep]`, with the `first_timestep` slice expression being `get_val_at_index(timesteps=0)`).
+
+## roll
+
+We do not use for-loops in our math.
+This can be difficult to get your head around initially, but it means that defining expressions of the form `var[t] == var[t-1] + param[t]` requires shifting all the data in your component array by N places.
+Using `roll(..., dimension_name=N)` allows you to do this.
+For example, `roll(storage, timesteps=1)` will shift all the storage decision variable objects by one timestep in the array.
+Then, `storage == roll(storage, timesteps=1) + 1` is equivalent to applying `storage[t] == storage[t - 1] + 1` in a for-loop.
+
+## default_if_empty
+
+We work with quite sparse arrays in our models.
+So, although your arrays are indexed over e.g., `nodes`, `techs` and `carriers`, a decision variable or parameter might only have one or two values in the array, with the rest being NaN.
+This can play havoc with defining math, with `nan` values making their way into your optimisation problem and then killing the solver or the solver interface.
+Using `default_if_empty(..., default=...)` in your `expression` string allows you to put a placeholder value in, which will be used if the math expression unavoidably _needs_ a value.
+Usually you shouldn't need to use this, as your `where` string will mask those NaN values.
+But if you're having trouble setting up your math, it is a useful function for getting it over the line.
+
+!!! note
+    Our internally defined parameters, listed in the `Parameters` section of our [pre-defined base math documentation][base-math], all have default values which propagate to the math.
+    You only need to use `default_if_empty` for decision variables and global expressions, and for user-defined parameters.
+
+## where
+
+[Where strings](syntax.md#where-strings) only allow you to apply conditions across the whole expression.
+Sometimes, it's necessary to apply specific conditions to different components _within_ the expression.
+Using the `where(<math_component>, <condition>)` helper function enables this,
+where `<math_component>` is a reference to a parameter, variable, or global expression and `<condition>` is a reference to an array in your model inputs that contains only `True`/`1` and `False`/`0`/`NaN` values.
+`<condition>` will then be applied to `<math_component>`, keeping only the values in `<math_component>` where `<condition>` is `True`/`1`.
+
+This helper function can also be used to _extend_ the dimensions of a `<math_component>`.
+If the `<condition>` has any dimensions not present in `<math_component>`, `<math_component>` will be [broadcast](https://tutorial.xarray.dev/fundamentals/02.3_aligning_data_objects.html#broadcasting-adjusting-arrays-to-the-same-shape) to include those dimensions.
+
+!!! note
+    `Where` gets referred to a lot in Calliope math.
+    It always means the same thing: applying [xarray.DataArray.where][].
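The `where` and `roll` sections of the new page describe both helpers in terms of array operations, so their effect can be pictured with plain xarray. The following is only a minimal sketch under stated assumptions: the array names and values (`flow_cap`, `condition`, `storage`) are hypothetical, and the code mimics the documented behaviour rather than reproducing Calliope's implementation.

```python
import numpy as np
import xarray as xr

# Hypothetical math component indexed over `techs` only.
flow_cap = xr.DataArray(
    [10.0, 20.0], dims=("techs",), coords={"techs": ["tech1", "tech2"]}
)

# Hypothetical boolean condition with an extra `nodes` dimension.
condition = xr.DataArray(
    [[True, False], [True, True]],
    dims=("nodes", "techs"),
    coords={"nodes": ["node1", "node2"], "techs": ["tech1", "tech2"]},
)

# Like `where(flow_cap, condition)`: keep values where the condition holds
# (NaN elsewhere) and broadcast `flow_cap` over the new `nodes` dimension.
masked = flow_cap.where(condition)

# Hypothetical timeseries to picture `roll(storage, timesteps=1)`.
storage = xr.DataArray(
    [1.0, 2.0, 3.0, 4.0], dims=("timesteps",), coords={"timesteps": [0, 1, 2, 3]}
)

# Shift the data by one place while the coordinates stay put, so position t
# holds the value from t-1 and the first position wraps around to the last value.
rolled = storage.roll(timesteps=1, roll_coords=False)
```

Note the wrap-around in `rolled`: the first timestep receives the final value, which is one reason the docs pair `roll` with `get_val_at_index` to treat the first timestep separately.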
diff --git a/docs/user_defined_math/syntax.md b/docs/user_defined_math/syntax.md
@@ -37,7 +37,7 @@ When checking the existence of an input parameter it is possible to first sum it
         - If you want to apply a constraint across all `nodes` and `techs`, but only for node+tech combinations where the `flow_out_eff` parameter has been defined, you would include `flow_out_eff`.
         - If you want to apply a constraint over `techs` and `timesteps`, but only for combinations where the `source_use_max` parameter has at least one `node` with a value defined, you would include `any(resource, over=nodes)`.  (1)
 
-    1.  `any` is a [helper function](#helper-functions); read more below!
+    1.  `any` is a [helper function](helper_functions.md#any); follow the link to read more!
 
 1. Checking the value of a configuration option or an input parameter.
 Checks can use any of the operators: `>`, `<`, `=`, `<=`, `>=`.
@@ -50,15 +50,15 @@ Configuration options are any that are defined in `config.build`, where you can
         - If you want to apply a constraint only for the first timestep in your timeseries, you would include `timesteps=get_val_at_index(dim=timesteps, idx=0)`. (1)
         - If you want to apply a constraint only for the last timestep in your timeseries, you would include `timesteps=get_val_at_index(dim=timesteps, idx=-1)`.
 
-    1.  `get_val_at_index` is a [helper function](#helper-functions); read more below!
+    1.  `get_val_at_index` is a [helper function](helper_functions.md#get_val_at_index); follow the link to read more!
 
 1. Checking the `base_tech` of a technology (`storage`, `supply`, etc.) or its inheritance chain (if using `templates` and the `template` parameter).
 
     ??? example "Examples"
 
         - If you want to create a decision variable across only `storage` technologies, you would include `base_tech=storage`.
         - If you want to apply a constraint across only your own `rooftop_supply` technologies (e.g., you have defined `rooftop_supply` in `templates` and your technologies `pv` and `solar_thermal` define `#!yaml template: rooftop_supply`), you would include `inheritance(rooftop_supply)`.
-        Note that `base_tech=...` is a simple check for the given value of `base_tech`, while `inheritance()` is a helper function ([see below](#helper-functions)) which can deal with finding techs/nodes using the same template, e.g. `pv` might inherit the `rooftop_supply` template which in turn might inherit the template `electricity_supply`.
+        Note that `base_tech=...` is a simple check for the given value of `base_tech`, while `inheritance()` is a helper function ([see the helper functions page](helper_functions.md#inheritance)) which can deal with finding techs/nodes using the same template, e.g. `pv` might inherit the `rooftop_supply` template which in turn might inherit the template `electricity_supply`.
 
 1. Subsetting a set.
 The sets available to subset are always [`nodes`, `techs`, `carriers`] + any additional sets defined by you in [`foreach`](#foreach-lists).
@@ -67,7 +67,7 @@ The sets available to subset are always [`nodes`, `techs`, `carriers`] + any add
 
         - If you want to filter `nodes` where any of a set of `techs` are defined: `defined(techs=[tech1, tech2], within=nodes, how=any)` (1).
 
-    1. `defined` is a [helper function](#helper-functions); read more below!
+    1. `defined` is a [helper function](helper_functions.md#defined); follow the link to read more!
 
 To combine statements you can use the operators `and`/`or`.
 You can also use the `not` operator to negate any of the statements.
@@ -109,127 +109,6 @@ Behind the scenes, we will make sure that every relevant element of the defined
 Slicing math components involves appending the component with square brackets that contain the slices, e.g. `flow_out[carriers=electricity, nodes=[A, B]]` will slice the `flow_out` decision variable to focus on `electricity` in its `carriers` dimension and only has two nodes (`A` and `B`) on its `nodes` dimension.
 To find out what dimensions you can slice a component on, see your input data (`model.inputs`) for parameters and the definition for decision variables in your math dictionary.
 
-## Helper functions
-
-For [`where` strings](#where-strings) and [`expression` strings](#where-strings), there are many helper functions available to use, to allow for more complex operations to be undertaken.
-Their functionality is detailed in the [helper function API page](../reference/api/helper_functions.md).
-Here, we give a brief summary.
-Some of these helper functions require a good understanding of their functionality to apply, so make sure you are comfortable with them before using them.
-
-### inheritance
-
-using `inheritance(...)` in a `where` string allows you to grab a subset of technologies / nodes that all share the same [`template`](../creating/templates.md) in the technology's / node's `template` key.
-If a `template` also inherits from another `template` (chained inheritance), you will get all `techs`/`nodes` that are children along that inheritance chain.
-
-So, for the definition:
-
-```yaml
-templates:
-  techgroup1:
-    template: techgroup2
-    flow_cap_max: 10
-  techgroup2:
-    base_tech: supply
-techs:
-  tech1:
-    template: techgroup1
-  tech2:
-    template: techgroup2
-```
-
-`inheritance(techgroup1)` will give the `[tech1]` subset and `inheritance(techgroup2)` will give the `[tech1, tech2]` subset.
-
-### any
-
-Parameters are indexed over multiple dimensions.
-Using `any(..., over=...)` in a `where` string allows you to check if there is at least one non-NaN value in a given dimension (akin to [xarray.DataArray.any][]).
-So, `any(cost, over=[nodes, techs])` will check if there is at least one non-NaN tech+node value in the `costs` dimension (the other dimension that the `cost` decision variable is indexed over).
-
-### defined
-
-Similar to [any](#any), using `defined(..., within=...)` in a `where` string allows you to check for non-NaN values along dimensions.
-In the case of `defined`, you can check if e.g., certain technologies have been defined within the nodes or certain carriers are defined within a group of techs or nodes.
-
-So, for the definition:
-
-```yaml
-techs:
-  tech1:
-    base_tech: conversion
-    carrier_in: electricity
-    carrier_out: heat
-  tech2:
-    base_tech: conversion
-    carrier_in: [coal, biofuel]
-    carrier_out: electricity
-nodes:
-  node1:
-    techs: {tech1}
-  node2:
-    techs: {tech1, tech2}
-```
-
-`defined(carriers=electricity, within=techs)` would yield a list of `[True, True]` as both technologies define electricity.
-
-`defined(techs=[tech1, tech2], within=nodes)` would yield a list of `[True, True]` as both nodes define _at least one_ of `tech1` or `tech2`.
-
-`defined(techs=[tech1, tech2], within=nodes, how=all)` would yield a list of `[False, True]` as only `node2` defines _both_ `tech1` and `tech2`.
-
-### sum
-
-Using `sum(..., over=)` in an expression allows you to sum over one or more dimension of your component array (be it a parameter, decision variable, or global expression).
-
-### select_from_lookup_arrays
-
-Some of our arrays in [`model.inputs`][calliope.Model.inputs] are not data arrays, but "lookup" arrays.
-These arrays are used to map the array's index items to other index items.
-For instance when using [time clustering](../advanced/time.md#time-clustering), the `lookup_cluster_last_timestep` array is used to get the timestep resolution and the stored energy for the last timestep in each cluster.
-Using `select_from_lookup_arrays(..., dim_name=lookup_array)` allows you to apply this lookup array to your data array.
-
-### get_val_at_index
-
-If you want to access an integer index in your dimension, use `get_val_at_index(dim_name=integer_index)`.
-For example, `get_val_at_index(timesteps=0)` will get the first timestep in your timeseries, `get_val_at_index(timesteps=-1)` will get the final timestep.
-This is mostly used when conditionally applying a different expression in the first / final timestep of the timeseries.
-
-It can be used in the `where` string (e.g., `timesteps=get_val_at_index(timesteps=0)` to mask all other timesteps) and the `expression string` (via [slices](#slices) - `storage[timesteps=$first_timestep]` and `first_timestep` expression being `get_val_at_index(timesteps=0)`).
-
-### roll
-
-We do not use for-loops in our math.
-This can be difficult to get your head around initially, but it means that to define expressions of the form `var[t] == var[t-1] + param[t]` requires shifting all the data in your component array by N places.
-Using `roll(..., dimension_name=N)` allows you to do this.
-For example, `roll(storage, timesteps=1)` will shift all the storage decision variable objects by one timestep in the array.
-Then, `storage == roll(storage, timesteps=1) + 1` is equivalent to applying `storage[t] == storage[t - 1] + 1` in a for-loop.
-
-### default_if_empty
-
-We work with quite sparse arrays in our models.
-So, although your arrays are indexed over e.g., `nodes`, `techs` and `carriers`, a decision variable or parameter might only have one or two values in the array, with the rest being NaN.
-This can play havoc with defining math, with `nan` values making their way into your optimisation problem and then killing the solver or the solver interface.
-Using `default_if_empty(..., default=...)` in your `expression` string allows you to put a placeholder value in, which will be used if the math expression unavoidably _needs_ a value.
-Usually you shouldn't need to use this, as your `where` string will mask those NaN values.
-But if you're having trouble setting up your math, it is a useful function to getting it over the line.
-
-!!! note
-    Our internally defined parameters, listed in the `Parameters` section of our [pre-defined base math documentation][base-math] all have default values which propagate to the math.
-    You only need to use `default_if_empty` for decision variables and global expressions, and for user-defined parameters.
-
-### where
-
-[Where strings](#where-strings) only allow you to apply conditions across the whole expression equations.
-Sometimes, it's necessary to apply specific conditions to different components _within_ the expression.
-Using `where(<math_component>, <boolean_array>)` helper function enables this,
-where `<math_component>` is a reference to a parameter, variable, or global expression and `<boolean_array>` is a reference to an array in your model inputs that contains only `True`/`1` and `False`/`0`/`NaN` values.
-`<boolean_array>` will then be applied to `<math_component>`, keeping only the values in `<math_component>` where `<boolean_array>` is `True`/`1`.
-
-This helper function can also be used to _extend_ the dimensions of a `<math_component>`.
-If the ``<boolean_array>`` has any dimensions not present in `<math_component>`, `<math_component>` will be [broadcast](https://tutorial.xarray.dev/fundamentals/02.3_aligning_data_objects.html#broadcasting-adjusting-arrays-to-the-same-shape) to include those dimensions.
-
-!!! note
-    `Where` gets referred to a lot in Calliope math.
-    It always means the same thing: applying [xarray.DataArray.where][].
-
 ## equations
 
 Equations are combinations of [expression strings](#expression-strings) and [where strings](#where-strings).
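
The `where` string examples touched in the `syntax.md` hunks above rely on `any(..., over=nodes)` and `get_val_at_index`. As a rough picture of what these checks amount to, here is a minimal xarray sketch; the parameter values, node and tech names, and timestep labels are all hypothetical, and this is an analogy to the documented behaviour, not Calliope's own code.

```python
import numpy as np
import xarray as xr

# Hypothetical parameter indexed over nodes and techs; NaN marks "not defined".
source_use_max = xr.DataArray(
    [[np.nan, 5.0], [np.nan, np.nan]],
    dims=("nodes", "techs"),
    coords={"nodes": ["A", "B"], "techs": ["pv", "wind"]},
)

# Like `any(source_use_max, over=nodes)`: True for each tech that has at least
# one non-NaN value along the `nodes` dimension.
has_value = source_use_max.notnull().any("nodes")

# Hypothetical timesteps, to picture selecting a value by integer position.
timesteps = xr.DataArray(
    ["2000-01-01 00:00", "2000-01-01 01:00", "2000-01-01 02:00"],
    dims=("timesteps",),
)
first = timesteps.isel(timesteps=0).item()   # first timestep label
last = timesteps.isel(timesteps=-1).item()   # final timestep label
```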

diff --git a/mkdocs.yml b/mkdocs.yml
@@ -117,6 +117,7 @@ nav:
     - user_defined_math/index.md
     - user_defined_math/components.md
     - user_defined_math/syntax.md
+    - user_defined_math/helper_functions.md
     - user_defined_math/customise.md
     - Example additional math gallery:
       - user_defined_math/examples/index.md