Writing and reopening introduces bad values #5739


Closed
dougrichardson opened this issue Aug 26, 2021 · 2 comments

Comments

@dougrichardson

What happened: When I open two particular netCDF files in xarray, concatenate them, write the resulting Dataset to disk and reopen it, bad/unexpected values are introduced. This happens very rarely (at least that I have noticed) and I have not spotted a pattern that would indicate when to expect this behaviour.

What you expected to happen: The values should not change through the writing and reading process.

Minimal Complete Verifiable Example:

Download 2t_era5_moda_sfc_20190201-20190228.nc and 2t_era5_moda_sfc_20190501-20190531.nc from https://github.com/dougrichardson/issues/tree/main/xarray_write (each file is ~1.3MB).

import numpy as np
import xarray as xr

feb = xr.open_dataset('./2t_era5_moda_sfc_20190201-20190228.nc')
feb = feb.sel(latitude=slice(21,19), longitude=slice(79,80))

may = xr.open_dataset('./2t_era5_moda_sfc_20190501-20190531.nc')
may = may.sel(latitude=slice(21,19), longitude=slice(79,80))

ds = xr.concat([feb, may], dim='time')

# The bad values are introduced for may. This is what the file should look like:
ds.t2m.sel(time='2019-05-01').plot()

# Write to file, reopen and plot again
ds.to_netcdf('./test.nc', mode='w')
ds.close()

ds2 = xr.open_dataset('./test.nc')
ds2.t2m.sel(time='2019-05-01').plot()
ds2.close()

# We can also compare values using numpy.isclose:
np.isclose(may.t2m.values, ds2.t2m.sel(time='2019-05').values)
array([[[False, False, False, False, False],
        [False, False, False, False, False],
        [False, False, False, False,  True],
        [False, False, False, False, False],
        [False, False, False, False, False],
        [False, False, False, False, False],
        [False, False, False, False, False],
        [ True,  True, False, False, False],
        [False,  True,  True,  True, False]]])

Anything else we need to know?: Bad data is generated only in one time slice of ds, i.e. ds.sel(time='2019-05-01'). However, when I replace feb with various other netCDF files, the problem does not occur, so the issue seems to be specific to these two files. I can provide a third netCDF file to demonstrate the absence of the problem there, if that would be useful.

This appears to be related to the encoding: if I specify the data type when writing to file, the problem is fixed. However, as pointed out in #4826, this can introduce other problems. The netCDF files are climate data with add_offset and scale_factor attributes.
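
For reference, a sketch of that workaround, using the ds from the example above (the output path is hypothetical). Forcing a floating-point on-disk dtype sidesteps the limits of the packed int16 encoding:

# Workaround sketch: write the variable as float32 instead of packed int16.
# This avoids the bad values, though as noted above it can introduce the
# problems discussed in #4826.
ds.to_netcdf('./test_float32.nc', mode='w',
             encoding={'t2m': {'dtype': 'float32'}})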

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:33)
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-305.7.1.el8.nci.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.19.0
pandas: 1.2.4
numpy: 1.20.3
scipy: 1.6.3
netCDF4: 1.5.6
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.8.3
cftime: 1.5.0
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.2.4
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.05.1
distributed: 2021.05.1
matplotlib: 3.4.2
cartopy: 0.19.0.post1
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.1.2
conda: 4.10.1
pytest: None
IPython: 7.24.0
sphinx: None

@kmuehlbauer
Contributor

@dougrichardson Sorry for the delay. If you are still interested in the source of this issue here is what I found:

The root cause is different scale_factor and add_offset in the source files.

When concatenating, only the .encoding of the first dataset survives. This leads to a wrongly encoded file for the May dates. But why is that?

The issue is with the packed dtype ("int16") and the particular values of scale_factor/add_offset.
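
A quick way to see this, assuming the feb, may and ds objects from the example above (inspection only, nothing is written):

# The packing parameters live in the per-variable .encoding dict.
# After xr.concat, ds.t2m.encoding carries feb's scale_factor/add_offset,
# which are then reused when the May values are packed on write.
for name, da in [('feb', feb.t2m), ('may', may.t2m), ('concat', ds.t2m)]:
    enc = da.encoding
    print(name, enc.get('dtype'), enc.get('scale_factor'), enc.get('add_offset'))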

For feb the dynamic range is (228.96394336525748, 309.9690856933594) K, whereas for may it is (205.7644192729947, 311.7797088623047) K.

Since the concatenated dataset is packed with feb's parameters, all values above 309.969 K overflow the packed int16 range and are folded back to the lower end (just above 229 K).
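
The folding can be reproduced with plain numpy. This is only a sketch of CF-style packing (packed = round((x - add_offset) / scale_factor) stored as int16); the scale_factor/add_offset below are illustrative values chosen to match feb's roughly 229-310 K range, not the ones read from the file:

import numpy as np

scale_factor = (309.969 - 228.964) / (2**16 - 2)   # spans feb's range over int16
add_offset = (309.969 + 228.964) / 2

def roundtrip(x):
    # Pack the CF way, letting the cast to int16 wrap silently past 32767,
    # then unpack again.
    packed = np.round((np.asarray(x) - add_offset) / scale_factor).astype(np.int64)
    packed = packed.astype(np.int16)
    return packed * scale_factor + add_offset

print(roundtrip(300.0))   # inside feb's range  -> round-trips to ~300 K
print(roundtrip(311.78))  # above feb's range   -> wraps, comes back near 231 K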

To circumvent that you have at least two options (see the sketch after this list):

  • change the scale_factor and add_offset values in the variable's .encoding before writing, so that they cover your whole dynamic range
  • drop scale_factor/add_offset (and other CF-related attributes) from .encoding to write floating-point values
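
A rough sketch of both options, assuming the ds and variable name t2m from the example above; the output paths are hypothetical and the new packing values are placeholders you would compute from your own data (in practice leave headroom for a fill value if one is used):

# Option 1: widen the packing so it covers the combined Feb + May range.
enc = ds.t2m.encoding
vmin = float(ds.t2m.min())
vmax = float(ds.t2m.max())
enc['scale_factor'] = (vmax - vmin) / (2**16 - 2)
enc['add_offset'] = (vmax + vmin) / 2
ds.to_netcdf('./test_repacked.nc', mode='w')

# Option 2: drop the CF packing attributes and write plain floating point.
for key in ('scale_factor', 'add_offset', 'dtype'):
    ds.t2m.encoding.pop(key, None)
ds.to_netcdf('./test_float.nc', mode='w')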

It might be nice to have checks for this in the encoding step, to prevent writing erroneous values. So this is not really a bug, but it would be less impactful if encoding were dropped on operations (see the discussion in #6323).

@kmuehlbauer
Contributor

I think the root cause and solutions were described in the above comment, so I'm closing this. Please reopen, if needed.
