Array formulae in the Calculation Engine #2787

Draft
wants to merge 145 commits into
base: master
8156ed9
First steps to handling array formulae with the Xlsx Reader and Writer
Jan 30, 2022
39a6c29
Set formula attributes datatype in Cell
Jan 30, 2022
73e7ee0
Write correct cell area ref when saving array formulae in cells for X…
Jan 30, 2022
6a51ad4
Basic Read/Write test for an array function; verify that formula attr…
Jan 30, 2022
1d941b4
Initial work on setting an array formula through code, and populating…
Jan 31, 2022
69436d3
phpcs fixes
Jan 31, 2022
5d258f1
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Jan 31, 2022
d2435cd
Ensure that a basic metadata file containing a cell metadata definiti…
Feb 1, 2022
9f3db99
Ensure that our array result dimensions match the specified array for…
Feb 2, 2022
1fa690c
Provide a method to easily access the array formula range for a cell …
Feb 2, 2022
beb1e8f
Update array formula ranges when inserting/deleting rows/columns
Feb 2, 2022
9bfc4be
Stubs for MS pseudo-functions used to handle the Spillage (`ANCHORARR…
Feb 2, 2022
ceb1c04
Initial work implementing the SINGLE() and ANCHORARRAY() pseudo-funct…
Feb 2, 2022
54d49ed
Unit tests for pseudo-functions
Feb 3, 2022
12ddc9e
General fixes for array functions with partial-range
Feb 3, 2022
7d84faf
Updates to function lists
Feb 3, 2022
abf4f9c
regenerate phpstan baseline (it is smaller, honest)
Feb 3, 2022
c84e334
Update documentation with details of array formula handling, and the …
Feb 3, 2022
66e63eb
Modify ABS() function to handle arrays, with appropriate unit tests
Feb 3, 2022
3cec90d
Minor refactoring, and additional unit tests (including exceptions) f…
Feb 4, 2022
c1125ba
Clean up some phpstan issues
Feb 4, 2022
2dc6aa1
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Feb 4, 2022
3ed98ab
Resolve phpstan issues
Feb 4, 2022
6bd181b
Eliminate spurious var_dump
Feb 4, 2022
0520466
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Feb 9, 2022
3357af3
Minor refactoring of Cell calculation logic for array response
Feb 9, 2022
c05dd63
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Feb 13, 2022
c7cbdf6
Reset phpstan baseline
Feb 13, 2022
719df3c
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Feb 19, 2022
31cf8fb
Fix merge conflicts
Feb 19, 2022
8e8e8f4
Apply float precision for unit tests
Feb 19, 2022
76fbf38
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Feb 22, 2022
568921c
Start work linking cells in spillage areas to the cell containing the…
Feb 23, 2022
c3652f3
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Feb 23, 2022
7d5dc72
Maintain reference for cells inside a spillage range to prevent their…
Feb 23, 2022
b929dd1
Ensure that the `fromArray()` and `toArray()` functions retain the cu…
Feb 24, 2022
3806981
Allow recalc of all values for a spillage range
Feb 24, 2022
a67cffc
minor tweaks
Feb 24, 2022
200713a
Initial work on ODS reader to support array formulae
Feb 25, 2022
a83e3c6
Initial work on Gnumeric reader to support array formulae
Feb 26, 2022
dc718dd
Rename xlfn functions for Ods
Feb 26, 2022
45a770a
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Feb 26, 2022
5672bd8
Resolve issues introduced during resolution of merge conflicts from m…
Feb 26, 2022
e30e7cb
More unit tests
Feb 26, 2022
7dfa06f
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Feb 26, 2022
1574f48
Resolve `translateSeparator()` method to handle separators (row and c…
Feb 27, 2022
9413f00
Some Refactoring of the Ods Reader, moving all formula and address tr…
Feb 27, 2022
e1f278b
Unit tests for reading/writing array formulae from Ods
Feb 27, 2022
4d0f426
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Feb 27, 2022
cff8303
Re-merge from baseline
Feb 27, 2022
55be3b4
Minor typehint fixes
Feb 27, 2022
f64e2c2
Prep-work for handling matrix arithmetic in the calc engine, with som…
Feb 27, 2022
2c95f3a
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 4, 2022
c0f5a40
Suppress tests that we know will fail (for the moment)
Mar 4, 2022
fcc55a7
Fix for calculating array operations in the calculation engine (with …
Mar 4, 2022
e4b7d73
Unit tests for reading array formulae in Ods files
Mar 4, 2022
3dbc7dd
Handle aray formulae in the Gnumeric Reader (with unit tests)
Mar 4, 2022
b67841a
Additional Gnumeric Reader array formula unit tests
Mar 4, 2022
d6b6a11
Additional array formula tests for Ods Reader
Mar 5, 2022
85130dc
Precision in unit test float value
Mar 5, 2022
4a56261
Reverse `$asArray` and `$resetLog` arguments for calcuation methods; …
Mar 5, 2022
5346776
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 5, 2022
9c99f2f
Fix arguments for calculation call in `anchorarray()` pseudo-functin
Mar 5, 2022
ec6113e
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 6, 2022
6b0d724
Additional unit tests for array operations (in preparation for a swit…
Mar 7, 2022
96453e0
Fix to identify spillage cells from their arrayFormulaRange value
Mar 9, 2022
8ed2bab
Modifications to FORMULATEXT() to return correct values for cells in …
Mar 9, 2022
b1eb0e2
More unit tests for MMULT() function, to ensure correct calculation c…
Mar 9, 2022
5f79b74
Revert "Modifications to FORMULATEXT() to return correct values for c…
Mar 9, 2022
5ea78ae
Modifications to FORMULATEXT() to return correct values for cells in …
Mar 9, 2022
be57361
Additional unit tests for array/scalar operations
Mar 10, 2022
53aab72
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 10, 2022
96ac716
Fix sizing adjustments for reflective matrices (vectors increase to m…
Mar 10, 2022
3f69661
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 11, 2022
27fc7d5
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 13, 2022
ec3281b
Resolve phpstan
Mar 13, 2022
1ea119c
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 13, 2022
925c730
Some more unit tests for array functions and spillage; and updates to…
Mar 14, 2022
7ff8e8b
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 17, 2022
222b79b
Ensure that master can still merge in cleanly... knowing that if I do…
Mar 17, 2022
8f8257d
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 18, 2022
2acb3fe
Update branch and some additional unit tests
Mar 18, 2022
dc255fb
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 19, 2022
ca940c6
Re-baseline
Mar 19, 2022
2b3addc
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 19, 2022
fe1e0d2
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 24, 2022
6602dd6
More unit tests
Mar 24, 2022
adb0c13
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Mar 24, 2022
a3130ef
Flag functions as spillage functions. We may be able to use this to d…
Mar 25, 2022
3bb6b42
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Apr 12, 2022
1014a03
Resolve merge conflicts
Apr 12, 2022
c0c79c7
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Apr 13, 2022
e41c428
Initial creation of the version 2.0 Development branch
Apr 24, 2022
703604c
Remove Reader/Writer deprecations
Apr 25, 2022
0998e72
Eliminate underscore prefix in method names... no longer needed as an…
Apr 26, 2022
53a6ab9
Merge branch 'master' into 2.0-Development
Apr 27, 2022
02abb41
Merge branch 'master' into 2.0-Development
Apr 28, 2022
d613de1
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
Apr 30, 2022
f918847
Merge branch 'master' into 2.0-Development
May 7, 2022
acfd752
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
May 7, 2022
297817f
Merge branch 'master' into 2.0-Development
May 10, 2022
304e2b8
Merge branch 'master' into 2.0-Development
May 10, 2022
e51dabe
Merge branch 'master' into 2.0-Development
May 11, 2022
78ea02a
Update phpstan baseline
May 11, 2022
f5b4308
Merge branch 'master' into CalculationEngine-Array-Formulae-Initial-Work
May 11, 2022
e1e5888
PHP deprecation resolution
May 11, 2022
33d3442
Merge branch '2.0-Development' into CalculationEngine-Array-Formulae-…
May 13, 2022
feb7695
Merge resolutions
May 13, 2022
89b70a0
Merge branch 'master' into 2.0-Development
May 28, 2022
b1ca6ee
Merge branch '2.0-Development' into CalculationEngine-Array-Formulae-…
May 28, 2022
b1d1ce7
Re-baseline
May 28, 2022
05a1252
Merge branch 'master' into 2.0-Development
Jun 18, 2022
ba15a68
Merge from master, and rebase phpstan baseline
Jun 18, 2022
a8fc5bc
Merge branch '2.0-Development' into CalculationEngine-Array-Formulae-…
Jun 18, 2022
a40c708
Merge from 2.0 development, and rebase phpstan baseline
Jun 18, 2022
0a34ce8
Merge branch 'master' into 2.0-Development
Jul 2, 2022
42dde14
Merge branch 'CalculationEngine-Array-Formulae-Initial-Work' into 2.0…
Jul 2, 2022
fd4e256
Rationalise the worksheet getHighestRowAndColumn() and getHighestData…
MarkBaker Jul 5, 2022
ccd9aba
Modify toArray() method to use highest data row/column rather than hi…
Jul 5, 2022
ae746d0
Merge branch 'master' into 2.0-Development
MarkBaker Jul 9, 2022
3e49312
Reset phpstan baseline after merge from 1.x
MarkBaker Jul 9, 2022
5de3c4e
Merge remote-tracking branch 'origin/2.0-Development' into 2.0-Develo…
MarkBaker Jul 9, 2022
08849a7
Changes required after merge from 1.x master, and reset phpstan baseline
MarkBaker Jul 9, 2022
95cf51b
Modify the general settings ExcelCalendar to be simply a default, Cre…
MarkBaker Jul 10, 2022
1fda83d
Remember to re-baseline phpstan before final commit and push because …
MarkBaker Jul 18, 2022
fe4808b
Merge branch 'master' into 2.0-Development
MarkBaker Aug 3, 2022
76314dd
merge from master and re-baseline phpstan
MarkBaker Aug 3, 2022
b30f364
Renaming methods for Excel functions to provide more consistent case,…
MarkBaker Aug 5, 2022
3964087
Test for calendar when reading Excel Files.
MarkBaker Jul 13, 2022
765037e
Ensure that correct calendar (read from the spreadsheet when loaded) …
MarkBaker Aug 17, 2022
6cdfd3c
Merge branch '2.0-Development' into 2.x-Calendar-Changes
MarkBaker Aug 17, 2022
aba94a1
Fix merge from 2.0 for the Spreadsheet Copy (I really don't like this…
MarkBaker Aug 17, 2022
a002288
Merge pull request #2937 from PHPOffice/2.x-Calendar-Changes
MarkBaker Aug 17, 2022
a26a58d
Merge branch 'master' into 2.0-Development
MarkBaker Aug 17, 2022
a7ac633
Adjustment to error check in TEXTFROMARRAY() function
MarkBaker Aug 17, 2022
0665a87
Re-baseline phpstan
MarkBaker Aug 18, 2022
125f5b1
Merge pull request #3012 from PHPOffice/2.0-CalcEngine-Function-Renames
MarkBaker Aug 18, 2022
d06d1cc
Merge remote-tracking branch 'origin/2.0-Development' into 2.0-Develo…
MarkBaker Aug 18, 2022
42cf5ce
Re-baseline phpstan
MarkBaker Aug 18, 2022
541bf0f
Merge branch 'master' into 2.0-Development
MarkBaker Sep 25, 2022
ff655bd
Resolve merge conflicts for master -> 2.0-dev
MarkBaker Sep 25, 2022
d05e27b
Merge branch 'master' into 2.0-Development
MarkBaker Sep 26, 2022
40b6fc7
Excel Functions implementation method renaming
MarkBaker Sep 26, 2022
4341300
Stricter type-hinting for IReader
MarkBaker Sep 30, 2022
ecc18d2
Binary-value options for Reader using flag settings
MarkBaker Oct 2, 2022
6 changes: 3 additions & 3 deletions docs/topics/accessing-cells.md
@@ -131,10 +131,10 @@ Formats handled by the advanced value binder include:
- TRUE or FALSE (dependent on locale settings) are converted to booleans.
- Numeric strings identified as scientific (exponential) format are
converted to numbers.
- Fractions and vulgar fractions are converted to numbers, and
- Fractions and "vulgar" fractions are converted to numbers, and
an appropriate number format mask applied.
- Percentages are converted
to numbers, divided by 100, and an appropriate number format mask
- Percentages are converted to numbers, divided by 100, and an
appropriate number format mask
applied.
- Dates and times are converted to Excel timestamp values
(numbers), and an appropriate number format mask applied.
36 changes: 33 additions & 3 deletions docs/topics/calculation-engine.md
@@ -10,6 +10,8 @@ formula calculation capabilities. A cell can be of a value type
which can be evaluated). For example, the formula `=SUM(A1:A10)`
evaluates to the sum of values in A1, A2, ..., A10.

Calling `getValue()` on a cell that contains a formula will return the formula itself.

To calculate a formula, you can call the cell containing the formula’s
method `getCalculatedValue()`, for example:

@@ -22,6 +24,30 @@ with PhpSpreadsheet, it evaluates to the value "64":

![09-command-line-calculation.png](./images/09-command-line-calculation.png)

Calling `getCalculatedValue()` on a cell that doesn't contain a formula will simply return the value of that cell; but if the cell does contain a formula, then PhpSpreadsheet will evaluate that formula to calculate the result.
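
The difference between the two calls can be sketched as follows. This is a minimal illustration, assuming PhpSpreadsheet is installed through Composer; the cell addresses and values are made up for the example.

```php
<?php

require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\Spreadsheet;

$spreadsheet = new Spreadsheet();
$sheet = $spreadsheet->getActiveSheet();

// Plain values in A1:A3 (illustrative data), and a formula in A4 that sums them.
$sheet->fromArray([[1], [2], [3]], null, 'A1');
$sheet->setCellValue('A4', '=SUM(A1:A3)');

// getValue() returns what is stored in the cell: for A4, the formula text.
$formula = $sheet->getCell('A4')->getValue();

// getCalculatedValue() asks the calculation engine to evaluate the formula.
$result = $sheet->getCell('A4')->getCalculatedValue();

// For a cell that holds no formula, getCalculatedValue() returns the stored value.
$plain = $sheet->getCell('A1')->getCalculatedValue();

var_dump($formula, $result, $plain);
```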

There are a few useful methods to help identify whether a cell contains a formula or a simple value; and if a formula, to provide further information about it:

```php
$spreadsheet->getActiveSheet()->getCell('E11')->isFormula();
```
will return a boolean true/false, telling you whether that cell contains a formula or not, so you can determine if a call to `getCalculatedValue()` will need to perform an evaluation.

A formula can be either a simple formula, or an array formula; and another method will identify which it is:
```php
$spreadsheet->getActiveSheet()->getCell('E11')->isArrayFormula();
```
Finally, an array formula might result in a single cell result, or a result that can spill over into a range of cells; so for array formulae the following method also exists:
```php
$spreadsheet->getActiveSheet()->getCell('E11')->arrayFormulaRange();
```
which returns a string containing a cell reference (e.g. `E11`) or a cell range reference (e.g. `E11:G13`).
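
Taken together, the three methods could be combined along these lines. This is only a sketch: `describeFormulaCell()` is a hypothetical helper written for this example, and `isArrayFormula()` and `arrayFormulaRange()` are used as described in the text above, so they are assumed to exist only on this draft branch and may be missing from a released version.

```php
<?php

require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\Cell\Cell;

/**
 * Hypothetical helper: summarise what kind of content a cell holds.
 * Relies on isArrayFormula() and arrayFormulaRange() behaving as the
 * documentation in this draft describes (an assumption, not verified).
 */
function describeFormulaCell(Cell $cell): string
{
    if (!$cell->isFormula()) {
        // No evaluation is needed for a simple value.
        return 'value';
    }

    if (!$cell->isArrayFormula()) {
        return 'formula';
    }

    // Either a single cell reference (E11) or a range reference (E11:G13).
    return 'array formula over ' . $cell->arrayFormulaRange();
}
```

A call such as `describeFormulaCell($spreadsheet->getActiveSheet()->getCell('E11'))` would then return one of the three descriptions.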


For more details on working with array formulae, see the [recipes documentation](./recipes.md/#array-formulae).

### Adjustments to formulae when Inserting/Deleting Columns/Rows

When writing a formula to a cell, formulae should always be set as they would appear in an English version of Microsoft Office Excel, and PhpSpreadsheet handles all formulae internally in this format. This means that the following rules hold:

- Decimal separator is `.` (period)
@@ -91,6 +117,11 @@ formula calculation is subject to PHP's language characteristics.
Not all functions are supported, for a comprehensive list, read the
[function list by name](../references/function-list-by-name.md).

#### Array arguments for Function Calls in Formulae

While most of the Excel function implementations now support array arguments, there are a few that should accept arrays as arguments but don't do so.
In these cases, the result may be a single value rather than an array; or it may be a `#VALUE!` error.

#### Operator precedence

In Excel `+` wins over `&`, just like `*` wins over `+` in ordinary
@@ -112,7 +143,7 @@ content.

- [Reference for this behaviour in PHP](https://php.net/manual/en/language.types.string.php#language.types.string.conversion)

#### Formulas don’t seem to be calculated in Excel2003 using compatibility pack?
#### Formulae don’t seem to be calculated in Excel2003 using compatibility pack?

This is normal behaviour of the compatibility pack, Xlsx displays this
correctly. Use `\PhpOffice\PhpSpreadsheet\Writer\Xls` if you really need
@@ -161,8 +192,7 @@ number of seconds from the PHP/Unix base date. The PHP/Unix base date
(0) is 00:00 UTC on 1st January 1970. This value can be positive or
negative: so a value of -3600 would be 23:00 hrs on 31st December 1969;
while a value of +3600 would be 01:00 hrs on 1st January 1970. This
gives PHP a date range of between 14th December 1901 and 19th January
2038.
gives 32-bit PHP a date range of between 14th December 1901 and 19th January 2038.
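
The arithmetic behind these figures can be checked with plain PHP. The sketch below assumes it is run on a 64-bit PHP build, so that the signed 32-bit limits can be written as ordinary integer literals; the variable names are illustrative. Note that in UTC the lower limit falls late in the evening of 13th December 1901, so 14th December is the first complete day in range.

```php
<?php

// Signed 32-bit integer limits, written as literals (assumes 64-bit PHP).
$min32 = -2147483648;
$max32 = 2147483647;

// One hour either side of the PHP/Unix base date.
$beforeEpoch = gmdate('Y-m-d H:i:s', -3600); // 1969-12-31 23:00:00
$afterEpoch = gmdate('Y-m-d H:i:s', 3600);   // 1970-01-01 01:00:00

// Earliest and latest moments a signed 32-bit timestamp can represent.
$lower = gmdate('Y-m-d H:i:s', $min32); // 1901-12-13 20:45:52
$upper = gmdate('Y-m-d H:i:s', $max32); // 2038-01-19 03:14:07

echo $beforeEpoch, PHP_EOL, $afterEpoch, PHP_EOL, $lower, PHP_EOL, $upper, PHP_EOL;
```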

#### PHP `DateTime` Objects

69 changes: 68 additions & 1 deletion docs/topics/reading-files.md
@@ -168,6 +168,68 @@ Once you have created a reader object for the workbook that you want to
load, you have the opportunity to set additional options before
executing the `load()` method.

All of these options can be set by calling the appropriate methods against the Reader (as described below), but some options (those with only two possible values) can also be set through flags, either by calling the Reader's `setFlags()` method, or passing the flags as an argument in the call to `load()`.
Those options that can be set through flags are:

Option | Flag | Default
-------------------|------------------------------|---
Ignore Empty Cells | IReader::IGNORE_EMPTY_CELLS | Load empty cells
Read Data Only | IReader::READ_DATA_ONLY | Read data, structure and style
Include Charts | IReader::LOAD_WITH_CHARTS | Don't read charts

Several flags can be combined in a single call:
```php
use PhpOffice\PhpSpreadsheet\Reader\IReader;

$inputFileType = 'Xlsx';
$inputFileName = './sampleData/example1.xlsx';

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Set additional flags before the call to load() */
$reader->setFlags(IReader::IGNORE_EMPTY_CELLS | IReader::LOAD_WITH_CHARTS);
/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```
or
```php
use PhpOffice\PhpSpreadsheet\Reader\IReader;

$inputFileType = 'Xlsx';
$inputFileName = './sampleData/example1.xlsx';

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Set additional flags in the call to load() */
$spreadsheet = $reader->load($inputFileName, IReader::IGNORE_EMPTY_CELLS | IReader::LOAD_WITH_CHARTS);
```

### Ignoring Empty Cells

Many Excel files have empty rows or columns at the end of a worksheet, which can't easily be seen when looking at the file in Excel (try using Ctrl-End to see the last cell in a worksheet).
By default, PhpSpreadsheet will load these cells, because they are valid Excel values; but you may find that an apparently small spreadsheet requires a lot of memory for all those empty cells.
If you are running into memory issues with seemingly small files, you can tell PhpSpreadsheet not to load those empty cells using the `setReadEmptyCells()` method.

```php
$inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls';

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Advise the Reader that we only want to load cells that contain actual content **/
$reader->setReadEmptyCells(false);
/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```

Note that cells containing formulae will still be loaded, even if that formula evaluates to a NULL or an empty string.
Similarly, Conditional Styling might also hide the value of a cell; but cells that contain Conditional Styling or Data Validation will always be loaded regardless of their value.

This option is available for the following formats:

Reader | Y/N |Reader | Y/N |Reader | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx | YES | Xls | YES | Xml | NO |
Ods | NO | SYLK | NO | Gnumeric | NO |
CSV | NO | HTML | NO

This option is also available through flags.

### Reading Only Data from a Spreadsheet File

If you're only interested in the cell values in a workbook, but don't
@@ -210,6 +272,8 @@ Xlsx | YES | Xls | YES | Xml | YES |
Ods | YES | SYLK | NO | Gnumeric | YES |
CSV | NO | HTML | NO

This option is also available through flags.

### Reading Only Named WorkSheets from a File

If your workbook contains a number of worksheets, but you are only
@@ -642,7 +706,7 @@ Xlsx | NO | Xls | NO | Xml | NO |
Ods | NO | SYLK | NO | Gnumeric | NO |
CSV | YES | HTML | NO

### A Brief Word about the Advanced Value Binder
## A Brief Word about the Advanced Value Binder

When loading data from a file that contains no formatting information,
such as a CSV file, then data is read either as strings or numbers
@@ -694,6 +758,9 @@ Xlsx | NO | Xls | NO | Xml | NO
Ods | NO | SYLK | NO | Gnumeric | NO
CSV | YES | HTML | YES

Note that you can also use the Binder to determine how PhpSpreadsheet identifies datatypes for values when you set a cell value without explicitly setting a datatype.
Value Binders can also be used to set formatting for a cell appropriate to the value.
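
As a rough illustration of both points, the sketch below registers the Advanced Value Binder and then sets a cell value without giving a datatype. It assumes PhpSpreadsheet is installed through Composer; the cell address and the `'10%'` input are invented for the example, and the exact format mask applied is left to the binder rather than asserted here.

```php
<?php

require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder;
use PhpOffice\PhpSpreadsheet\Cell\Cell;
use PhpOffice\PhpSpreadsheet\Spreadsheet;

// Register the binder; it is consulted for values set after this call.
Cell::setValueBinder(new AdvancedValueBinder());

$spreadsheet = new Spreadsheet();
$sheet = $spreadsheet->getActiveSheet();

// No datatype is passed, so the binder decides how to store the value.
$sheet->setCellValue('A1', '10%');

$cell = $sheet->getCell('A1');
$value = $cell->getValue();       // stored as a number (the percentage divided by 100)
$dataType = $cell->getDataType(); // datatype chosen by the binder
$format = $cell->getStyle()->getNumberFormat()->getFormatCode(); // mask set by the binder

var_dump($value, $dataType, $format);
```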

## Error Handling

Of course, you should always apply some error handling to your scripts