Commit a47c799

adding docs for math epsilon and group by
1 parent a36282a commit a47c799

5 files changed

+74
-2
lines changed

docs/basics.rst

+44
Original file line numberDiff line numberDiff line change
@@ -143,5 +143,49 @@ Object attribute added:
143143
You just need to set view='tree' to get it in tree form.
144144

145145

146+
.. _group_by_label:
147+
148+
Group By
149+
--------
150+
151+
group_by can be used when dealing with a list of dictionaries to group them by the value of the key defined in group_by. The common use case is reading data from a flat CSV where the primary key is one of the columns. We want to use the primary key to identify the rows instead of the CSV row number.
152+
153+
Example:
154+
>>> from deepdiff import DeepDiff
155+
>>> t1 = [
156+
... {'id': 'AA', 'name': 'Joe', 'last_name': 'Nobody'},
157+
... {'id': 'BB', 'name': 'James', 'last_name': 'Blue'},
158+
... {'id': 'CC', 'name': 'Mike', 'last_name': 'Apple'},
159+
... ]
160+
>>>
161+
>>> t2 = [
162+
... {'id': 'AA', 'name': 'Joe', 'last_name': 'Nobody'},
163+
... {'id': 'BB', 'name': 'James', 'last_name': 'Brown'},
164+
... {'id': 'CC', 'name': 'Mike', 'last_name': 'Apple'},
165+
... ]
166+
>>>
167+
>>> DeepDiff(t1, t2)
168+
{'values_changed': {"root[1]['last_name']": {'new_value': 'Brown', 'old_value': 'Blue'}}}
169+
170+
171+
Now we use group_by='id':
172+
>>> DeepDiff(t1, t2, group_by='id')
173+
{'values_changed': {"root['BB']['last_name']": {'new_value': 'Brown', 'old_value': 'Blue'}}}
174+
175+
.. note::
176+
group_by actually changes the structure of t1 and t2. You can see this by using the tree view:
177+
178+
>>> diff = DeepDiff(t1, t2, group_by='id', view='tree')
179+
>>> diff
180+
{'values_changed': [<root['BB']['last_name'] t1:'Blue', t2:'Brown'>]}
181+
>>> diff['values_changed'][0]
182+
<root['BB']['last_name'] t1:'Blue', t2:'Brown'>
183+
>>> diff['values_changed'][0].up
184+
<root['BB'] t1:{'name': 'Ja...}, t2:{'name': 'Ja...}>
185+
>>> diff['values_changed'][0].up.up
186+
<root t1:{'AA': {'nam...}, t2:{'AA': {'nam...}>
187+
>>> diff['values_changed'][0].up.up.t1
188+
{'AA': {'name': 'Joe', 'last_name': 'Nobody'}, 'BB': {'name': 'James', 'last_name': 'Blue'}, 'CC': {'name': 'Mike', 'last_name': 'Apple'}}
189+
146190

147191
Back to :doc:`/index`
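
The restructuring described in the Group By note can be sketched in plain Python. This is not DeepDiff's implementation: `group_rows_by_key` is a hypothetical helper and the CSV text is made up to match the rows in the docs example. It only illustrates how a list of rows read from a flat CSV could become a dictionary keyed by the primary key, which is the shape the tree view shows for `t1` at the root.

```python
import csv
import io


def group_rows_by_key(rows, key):
    """Hypothetical helper: turn a list of dicts into a dict keyed by row[key].

    The key column is removed from each row, matching the structure shown
    by the tree view in the docs example. If two rows share the same key,
    this sketch simply keeps the last one.
    """
    grouped = {}
    for row in rows:
        row = dict(row)  # copy so the caller's rows are not mutated
        grouped[row.pop(key)] = row
    return grouped


# Made-up CSV content mirroring t1 from the docs example.
CSV_TEXT = """id,name,last_name
AA,Joe,Nobody
BB,James,Blue
CC,Mike,Apple
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
t1_grouped = group_rows_by_key(rows, "id")
print(t1_grouped["BB"])  # {'name': 'James', 'last_name': 'Blue'}
```

With rows keyed this way, a change is reported under the primary key (`root['BB']`) rather than under the row position (`root[1]`), so reordering the CSV would not change the reported path.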

docs/conf.py

+2
Original file line numberDiff line numberDiff line change
@@ -140,6 +140,8 @@
140140
'github_count': True,
141141
'font_family': 'Open Sans',
142142
'canonical_url': 'https://zepworks.com/deepdiff/current/',
143+
'page_width': '1024px',
144+
'body_max_width': '1024px',
143145
}
144146

145147
# Add any paths that contain custom themes here, relative to this directory.

docs/diff_doc.rst

+4-1
Original file line numberDiff line numberDiff line change
@@ -53,7 +53,7 @@ get_deep_distance: Boolean, default = False
5353
:ref:`get_deep_distance_label` will get you the deep distance between objects. The distance is a number between 0 and 1 where zero means there is no diff between the 2 objects and 1 means they are very different. Note that this number should only be used to compare the similarity of 2 objects and nothing more. The algorithm for calculating this number may or may not change in the future releases of DeepDiff.
5454

5555
group_by: String, default=None
56-
:ref:`group_by` can be used when dealing with a list of dictionaries to group them by the value of the key defined in group_by. The common use case is reading data from a flat CSV where the primary key is one of the columns. We want to use the primary key to identify the rows instead of the CSV row number.
56+
:ref:`group_by_label` can be used when dealing with a list of dictionaries to group them by the value of the key defined in group_by. The common use case is reading data from a flat CSV where the primary key is one of the columns. We want to use the primary key to identify the rows instead of the CSV row number.
5757

5858
hasher: default = DeepHash.murmur3_128bit
5959
Hash function to be used. If you don't want Murmur3, you can use Python's built-in hash function
@@ -105,6 +105,9 @@ max_passes: Integer, default = 10000000
105105
max_diffs: Integer, default = None
106106
:ref:`max_diffs_label` defines the maximum number of diffs to run on objects to pinpoint what exactly is different. This is only used when ignore_order=True.
107107

108+
math_epsilon: Decimal, default = None
109+
:ref:`math_epsilon_label` uses Python's built-in math.isclose(). It defines a tolerance value that is passed to math.isclose(). Numbers within the tolerance are not reported as different; numbers outside the tolerance are.
110+
108111
number_format_notation : string, default="f"
109112
:ref:`number_format_notation_label` is what defines the meaning of significant digits. The default value of "f" means the digits AFTER the decimal point. "f" stands for fixed point. The other option is "e" which stands for exponent notation or scientific notation.
110113

docs/numbers.rst

+23
Original file line numberDiff line numberDiff line change
@@ -119,6 +119,29 @@ ignore_nan_inequality: Boolean, default = False
119119
>>> DeepDiff(float('nan'), float('nan'), ignore_nan_inequality=True)
120120
{}
121121

122+
.. _math_epsilon_label:
123+
124+
Math Epsilon
125+
------------
126+
127+
math_epsilon: Decimal, default = None
128+
math_epsilon uses Python's built-in math.isclose(). It defines a tolerance value that is passed to math.isclose(). Numbers within the tolerance are not reported as different; numbers outside the tolerance are.
129+
130+
For example, with some sensor data, derived and computed values only need to lie within a certain range. It does not matter if they are off by, e.g., 1e-5.
131+
132+
To check for that, Python's math module provides the isclose() function. It evaluates whether two numbers are close to each other with reference to an epsilon (abs_tol). This is superior to comparing formatted numbers, as it evaluates the mathematical representation and not the string representation.
133+
134+
Example:
135+
>>> from decimal import Decimal
136+
>>> d1 = {"a": Decimal("7.175")}
137+
>>> d2 = {"a": Decimal("7.174")}
138+
>>> DeepDiff(d1, d2, math_epsilon=0.01)
139+
{}
140+
141+
.. note::
142+
math_epsilon cannot currently handle the hashing of values, which is done when :ref:`ignore_order_label` is True.
143+
144+
122145
Performance Improvement of Numbers diffing
123146
------------------------------------------
124147
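
The tolerance check described in the Math Epsilon section can be sketched with the standard library alone. This is a minimal illustration, not DeepDiff's code: `differs` is a hypothetical helper, and it assumes the epsilon is passed as `abs_tol`, as the added docs state. The last two lines show why a numeric tolerance can be preferable to comparing formatted strings.

```python
import math
from decimal import Decimal


def differs(a, b, epsilon):
    """Hypothetical helper: True when a and b are farther apart than epsilon.

    Values are converted to float first; math.isclose() also applies its
    default relative tolerance (rel_tol=1e-09) alongside abs_tol.
    """
    return not math.isclose(float(a), float(b), abs_tol=epsilon)


# Same values as the docs example: within 0.01 of each other.
print(differs(Decimal("7.175"), Decimal("7.174"), 0.01))    # False
# A tighter tolerance reports them as different.
print(differs(Decimal("7.175"), Decimal("7.174"), 0.0001))  # True

# Two close values can still format to different strings...
print(format(0.1049, ".2f"), format(0.1051, ".2f"))  # 0.10 0.11
# ...while a numeric tolerance treats them as equal.
print(differs(0.1049, 0.1051, 0.001))                # False
```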

tests/test_diff_math.py

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
from decimal import Decimal
2-
from deepdiff.diff import DeepDiff
2+
from deepdiff import DeepDiff
33

44

55
class TestDiffMath:
