Skip to content

Commit

Permalink
Improve performance of the nc_reclaim_data and nc_copy_data functions.
Browse files Browse the repository at this point in the history
re: Issue Unidata#2685
re: PR Unidata#2179

As noted in PR Unidata#2179,
the old code did not allow for reclaiming instances of types,
nor for properly copying them. That PR provided new functions
capable of reclaiming/copying instances of arbitrary types.

However, as noted by Issue Unidata#2685, using these
most general functions resulted in a significant performance
degradation, even for common cases.

This PR attempts to mitigate the cost of using the general
reclaim/copy functions in two ways.

First, the previous functions operating at the top level by
using ncid and typeid arguments. These functions were augmented
with equivalent versions that used the netcdf-c library internal
data structures to allow direct access to needed information.
These new functions are used internally to the library.

The second mitigation involves optimizing the internal functions
by providing early tests for common cases. This avoids
unnecessary recursive function calls.

The overall result is a significant improvement in speed by a
factor of roughly twenty -- your mileage may vary. These
optimized functions are still not as fast as the original (more
limited) functions, but they are getting close. Additional optimizations are
possible. But the cost is a significant "uglification" of the
code that I deemed a step too far, at least for now.

## Misc. Changes
1. Added a test case to check the proper reclamation/copy of complex types.
2. Found and fixed some places where nc_reclaim/copy should have been used.
3. Replaced, in the netcdf-c library, (almost all) occurrences of nc_reclaim_copy with calls to NC_reclaim/copy. This plus the optimizations is the primary speed-up mechanism.
4. In DAP4, the metadata is held in a substrate in-memory file; this required some changes so that the reclaim/copy code accessed that substrate dispatcher rather than the DAP4 dispatcher.
5. Re-factored and isolated the code that computes if a type is (transitively) variable-sized or not.
6. Clean up the reclamation code in ncgen; adding the use of nc_reclaim exposed some memory problems.
  • Loading branch information
DennisHeimbigner committed May 20, 2023
1 parent bfb8a31 commit fb40a72
Show file tree
Hide file tree
Showing 49 changed files with 1,702 additions and 815 deletions.
1 change: 1 addition & 0 deletions RELEASE_NOTES.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ This file contains a high-level description of this package's evolution. Release
## 4.9.3 - TBD


* Improve performance and testing of the new nc_reclaim/copy functions. See [Github #????](https://github.com/Unidata/netcdf-c/pull/????).
* Improve S3 documentation, testing, and support See [Github #2686](https://github.com/Unidata/netcdf-c/pull/2686).
* Remove obsolete code. See [Github #2680](https://github.com/Unidata/netcdf-c/pull/2680).
* [Bug Fix] Add a crude test to see if an NCZarr path looks like a valid NCZarr/Zarr file. See [Github #2658](https://github.com/Unidata/netcdf-c/pull/2658).
Expand Down
20 changes: 14 additions & 6 deletions configure.ac
Original file line number Diff line number Diff line change
Expand Up @@ -1488,10 +1488,14 @@ AC_CHECK_SIZEOF(ssize_t)
$SLEEPCMD
AC_CHECK_SIZEOF([void*])

if test "x$enable_hdf5" = xyes || test "x$enable_dap" = xyes ; then
AC_SEARCH_LIBS([deflate], [zlibwapi zlibstat zlib zlib1 z], [], [
AC_MSG_ERROR([Can't find or link to the z library. Turn off netCDF-4 and \
DAP clients with --disable-hdf5 --disable-dap, or see config.log for errors.])])
# Check for deflate library
AC_SEARCH_LIBS([deflate], [zlibwapi zlibstat zlib zlib1 z], [have_deflate=yes], [have_deflate=no])
AC_MSG_CHECKING([whether deflate library is available])
AC_MSG_RESULT([${have_deflate}])

if test "x$have_deflate" = xno ; then
AC_MSG_ERROR([Can't find or link to the z library. Turn off netCDF-4 and \
DAP and NCZarr clients with --disable-hdf5 --disable-dap --disable-nczarr, or see config.log for errors.])
fi

# We need the math library
Expand Down Expand Up @@ -1857,6 +1861,7 @@ AM_CONDITIONAL(ENABLE_S3_TESTPUB, [test "x$with_s3_testing" != xno]) # all => pu
AM_CONDITIONAL(ENABLE_S3_TESTALL, [test "x$with_s3_testing" = xyes])
AM_CONDITIONAL(ENABLE_NCZARR_ZIP, [test "x$enable_nczarr_zip" = xyes])
AM_CONDITIONAL(HAS_MULTIFILTERS, [test "x$has_multifilters" = xyes])
AM_CONDITIONAL(HAVE_DEFLATE, [test "x$have_deflate" = xyes])
AM_CONDITIONAL(HAVE_SZ, [test "x$have_sz" = xyes])
AM_CONDITIONAL(HAVE_H5Z_SZIP, [test "x$enable_hdf5_szip" = xyes])
AM_CONDITIONAL(HAVE_BLOSC, [test "x$have_blosc" = xyes])
Expand Down Expand Up @@ -1971,6 +1976,7 @@ AC_SUBST(DO_NCZARR_ZIP_TESTS,[$enable_nczarr_zip])
AC_SUBST(HAS_QUANTIZE,[$enable_quantize])
AC_SUBST(HAS_LOGGING,[$enable_logging])
AC_SUBST(DO_FILTER_TESTS,[$enable_filter_testing])
AC_SUBST(HAS_DEFLATE,[$have_deflate])
AC_SUBST(HAS_SZLIB,[$have_sz])
AC_SUBST(HAS_SZLIB_WRITE, [$have_sz])
AC_SUBST(HAS_ZSTD,[$have_zstd])
Expand All @@ -1987,8 +1993,10 @@ AC_SUBST(WHICH_S3_SDK,[none])
fi

# Always available
std_filters="deflate bz2"

std_filters="bz2"
if test "x$have_deflate" = xyes ; then
std_filters="${std_filters} deflate"
fi
if test "x$have_sz" = xyes ; then
std_filters="${std_filters} szip"
fi
Expand Down
2 changes: 1 addition & 1 deletion docs/cloud.md
Original file line number Diff line number Diff line change
Expand Up @@ -258,7 +258,7 @@ This is an experimental SDK provided internally in the netcdf-c library.

* It is written in C
* It provides the minimum functionality necessary to read/write/search an Amazon S3 bucket.
* It was constructed by heavily modifying the HDF5 *H5FDros3* Virtual File Driver and combining it with crypto code wrappers provided by libcurl. The resulting file was then modified to fit into the netcdf coding style.
* It was constructed by heavily modifying the HDF5 *H5Fs3commons.c* file and combining it with crypto code wrappers provided by libcurl. The resulting file was then modified to fit into the netcdf coding style.
* The resulting code is rather ugly, but appears to work under at least Linux and under Windows (using Visual C++).

### Dependencies
Expand Down
129 changes: 72 additions & 57 deletions docs/internal.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,10 +4,14 @@ Notes On the Internals of the NetCDF-C Library

# Notes On the Internals of the NetCDF-C Library {#intern_head}

\tableofcontents

This document attempts to record important information about
the internal architecture and operation of the netcdf-c library.
It covers the following issues.

* [Including C++ Code in the netcdf-c Library](#intern_c++)
* [Managing instances of variable-length data types](#intern_vlens)
* [Inferring File Types](#intern_infer)
* [Adding a Standard Filter](#intern_filters)

# 1. Including C++ Code in the netcdf-c Library {#intern_c++}

Expand Down Expand Up @@ -41,46 +45,71 @@ AM_CXXFLAGS = -std=c++11
````
It is possible that other values (e.g. *-std=c++14*) may also work.

# 2. Managing instances of complex data types
# 2. Managing instances of variable-length data types {#intern_vlens}

For a long time, there have been known problems with the
management of complex types containing VLENs. This also
involves the string type because it is stored as a VLEN of
chars.
involves the string type because it is stored as a VLEN of chars.

The term complex type refers to any type that directly or
recursively references a VLEN type. So an array of VLENS, a
compound with a VLEN field, and so on.
The term "variable-length" refers to any
type that directly or recursively references a VLEN type. So an
array of VLENS, a compound with a VLEN field, and so on.

In order to properly handle instances of these complex types, it
In order to properly handle instances of these variable-length types, it
is necessary to have function that can recursively walk
instances of such types to perform various actions on them. The
term "deep" is also used to mean recursive.

Two deep walking operations are provided by the netcdf-c library
to aid in managing instances of complex structures.
- free'ing an instance of the complex type
- copying an instance of the complex type.
Two primary deep walking operations are provided by the netcdf-c library
to aid in managing instances of variable-length types.
- free'ing an instance of the type
- copying an instance of the type.

Note that the term "vector" is used in the following text to
mean a contiguous (in memory) sequence of instances of some
type. Given an array with, say, dimensions 2 X 3 X 4, this will
be stored in memory as a vector of length 2\*3\*4=24 instances.

## Special case reclamation functions

Previously The netcdf-c library provided functions to reclaim an
instance of a subset of the possible variable-length types. It
provided no specialized support for copying instances of
variable-length types.

These specialized functions are still available and can be used
when their pre-conditions are met. They have the advantage of
being faster than the general purpose functions described
later in this document.

Previously The netcdf-c library only did shallow free and shallow copy of
complex types. This meant that only the top level was properly
free'd or copied, but deep internal blocks in the instance were
not touched. This led to a host of memory leaks and failures
when the deep data was effectively shared between the netcdf-c library
internally and the user's data.
These functions, and their pre and post conditions, are as follows.
* *int nc_free_string(size_t len, char\* data[])*
<u>Action</u>: reclaim a vector of variable length strings.
<u>Pre-condition(s)</u>: the data argument is a vector containing *len* pointers to variable length, nul-terminated strings.
<u>Post-condition(s)</u>: only the strings in the vector are reclaimed -- the vector itself is not reclaimed.

Note that the term "vector" is used to mean a contiguous (in
memory) sequence of instances of some type. Given an array with,
say, dimensions 2 X 3 X 4, this will be stored in memory as a
vector of length 2\*3\*4=24 instances.
* int nc_free_vlens(size_t len, nc_vlen_t vlens[])
<u>Action</u>: reclaim a vector of VLEN instances.
<u>Pre-condition(s)</u>: the data argument is a vector containing *len* pointers to variable length, nul-terminated strings.
<u>Post-condition(s)</u>:
* only the data pointed to by the VLEN instances (i.e. *nc_vlen_t.p* in the vector are reclaimed -- the vector itself is not reclaimed.
* the base type of the VLEN must be a fixed-size type -- this means atomic types in the range NC_CHAR thru NC_UINT64, and compound types where the basetype of each field is itself fixed-size.

The use cases are primarily these.
* *int nc_free_vlen(nc_vlen_t \*vl)*
<u>Action</u>: this is equivalent to calling *nc_free_vlens(1,&vl)*.

## nc\_get\_vars
If the pre and post conditions are not met, then using these
functions can lead to a host of memory leaks and failures
because the deep data variable-length data is in effect
shared between the netcdf-c library internally and the user's data.

## Typical use cases

### *nc\_get\_vars*
Suppose one is reading a vector of instances using nc\_get\_vars
(or nc\_get\_vara or nc\_get\_var, etc.). These functions will
return the vector in the top-level memory provided. All
interior blocks (form nested VLEN or strings) will have been
interior blocks (from nested VLEN or strings) will have been
dynamically allocated. Note that computing the size of the vector
may be tricky because the strides must be taken into account.

Expand All @@ -90,14 +119,7 @@ memory leak occurs. So, the recursive reclaim function is used
to walk the returned instance vector and do a deep reclaim of
the data.

Currently functions are defined in netcdf.h that are supposed to
handle this: nc\_free\_vlen(), nc\_free\_vlens(), and
nc\_free\_string(). Unfortunately, these functions only do a
shallow free, so deeply nested instances are not properly
handled by them. They are marked in the description as
deprecated in favor of the newer recursive function.

## nc\_put\_vars
### *nc\_put\_vars*

Suppose one is writing a vector of instances using nc\_put\_vars
(or nc\_put\_vara or nc\_put\_var, etc.). These functions will
Expand All @@ -113,7 +135,10 @@ So again, the recursive reclaim function can be used
to walk the returned instance vector and do a deep reclaim of
the data.

## nc\_put\_att
WARNING: If the data passed into these functions contains statically
allocated data, then using any of the reclaim functions will fail.

### *nc\_put\_att*
Suppose one is writing a vector of instances as the data of an attribute
using, say, nc\_put\_att.

Expand All @@ -122,26 +147,18 @@ so that changes/reclamation of the input data will not affect
the attribute. Note that this copying behavior is different from
writing to a variable, where the data is written immediately.

Again, the code inside the netcdf library used to use only shallow copying
rather than deep copy. As a result, one saw effects such as described
in Github Issue https://github.com/Unidata/netcdf-c/issues/2143.

Also, after defining the attribute, it may be necessary for the user
After defining the attribute, it may be necessary for the user
to free the data that was provided as input to nc\_put\_att() as in the
nc\_put\_xxx functions (previously described).

## nc\_get\_att
### *nc\_get\_att*
Suppose one is reading a vector of instances as the data of an attribute
using, say, nc\_get\_att.

Internally, the existing attribute data must be copied and returned
to the caller, and the caller is responsible for reclaiming
the returned data.

Again, the code inside the netcdf library used to only do shallow copying
rather than deep copy. So this could lead to memory leaks and errors
because the deep data was shared between the library and the user.

## New Instance Walking API

Proper recursive functions were added to the netcdf-c library to
Expand All @@ -156,27 +173,25 @@ int nc_reclaim_data_all(int ncid, nc_type xtypeid, void* memory, size_t count);
int nc_copy_data(int ncid, nc_type xtypeid, const void* memory, size_t count, void* copy);
int nc_copy_data_all(int ncid, nc_type xtypeid, const void* memory, size_t count, void** copyp);
````
There are two variants. The first two, nc\_reclaim\_data() and
There are two variants. The two functions, nc\_reclaim\_data() and
nc\_copy\_data(), assume the top-level vector is managed by the
caller. For reclaim, this is so the user can use, for example, a
statically allocated vector. For copy, it assumes the user
provides the space into which the copy is stored.

The second two, nc\_reclaim\_data\_all() and
nc\_copy\_data\_all(), allows the functions to manage the
The other two, nc\_reclaim\_data\_all() and
nc\_copy\_data\_all(), allow the called functions to manage the
top-level. So for nc\_reclaim\_data\_all, the top level is
assumed to be dynamically allocated and will be free'd by
nc\_reclaim\_data\_all(). The nc\_copy\_data\_all() function
will allocate the top level and return a pointer to it to the
user. The user can later pass that pointer to
nc\_reclaim\_data\_all() to reclaim the instance(s).

# Internal Changes
## Internal Changes
The netcdf-c library internals are changed to use the proper reclaim
and copy functions. This also allows some simplification of the code
since the stdata and vldata fields of NC\_ATT\_INFO are no longer needed.
Currently this is commented out using the SEPDATA \#define macro.
When the bugs are found and fixed, all this code will be removed.

## Optimizations

Expand All @@ -202,7 +217,7 @@ The classification of types can be made at the time the type is defined
or is read in from some existing file. The reclaim and copy functions
use this information to speed up the handling of fixed size types.

# Warnings
## Warnings

1. The new API functions require that the type information be
accessible. This means that you cannot use these functions
Expand All @@ -228,7 +243,7 @@ use this information to speed up the handling of fixed size types.
}
````

# 3. Inferring File Types
# 3. Inferring File Types {#intern_infer}

As described in the companion document -- docs/dispatch.md --
when nc\_create() or nc\_open() is called, it must figure out what
Expand Down Expand Up @@ -478,7 +493,7 @@ So for example, if DAP2 is the model, then all netcdf-4 mode flags
and some netcdf-3 flags are removed from the set of mode flags
because DAP2 provides only a standard netcdf-classic format.

# 4. Adding a Standard Filter
# 4. Adding a Standard Filter {#intern_filters}

The standard filter system extends the netcdf-c library API to
support a fixed set of "standard" filters. This is similar to the
Expand Down Expand Up @@ -534,7 +549,7 @@ and CMake (CMakeLists.txt)
Configure.ac must have a block that similar to this that locates
the implementing library.
````
# See if we have libzstd
\# See if we have libzstd
AC_CHECK_LIB([zstd],[ZSTD_compress],[have_zstd=yes],[have_zstd=no])
if test "x$have_zstd" = "xyes" ; then
AC_SEARCH_LIBS([ZSTD_compress],[zstd zstd.dll cygzstd.dll], [], [])
Expand All @@ -559,7 +574,7 @@ endif
````

````
# Need our version of szip if libsz available and we are not using HDF5
\# Need our version of szip if libsz available and we are not using HDF5
if HAVE_SZ
noinst_LTLIBRARIES += libh5szip.la
libh5szip_la_SOURCES = H5Zszip.c H5Zszip.h
Expand Down Expand Up @@ -637,4 +652,4 @@ done:
*Author*: Dennis Heimbigner<br>
*Email*: dmh at ucar dot edu<br>
*Initial Version*: 12/22/2021<br>
*Last Revised*: 01/25/2022
*Last Revised*: 5/16/2023
60 changes: 60 additions & 0 deletions include/nc.h
Original file line number Diff line number Diff line change
Expand Up @@ -77,4 +77,64 @@ extern int iterate_NCList(int i,NC**); /* Walk from 0 ...; ERANGE return => stop
extern void free_NC(NC*);
extern int new_NC(const struct NC_Dispatch*, const char*, int, NC**);

/* Defined in dinstance_intern.c */

/**************************************************/
/**
Following are the internal implementation signatures upon which
are built the API functions: nc_reclaim_data, nc_reclaim_data_all,
nc_copy_data, and nc_copy_data_all.
These functions access internal data structures instead of using e.g. ncid.
This is to improve performance.
*/

/*
Reclaim a vector of instances of arbitrary type.
This recursively walks the top-level instances to reclaim any
nested data such as vlen or strings or such.
Assumes it is passed a pointer to count instances of xtype.
Reclaims any nested data.
WARNING: nc_reclaim_data does not reclaim the top-level
memory because we do not know how it was allocated. However
nc_reclaim_data_all does reclaim top-level memory assuming that it
was allocated using malloc().
WARNING: all data blocks below the top-level (e.g. string
instances) will be reclaimed, so do not call if there is any
static data in the instance.
Should work for any netcdf format.
*/

EXTERNL int NC_reclaim_data(NC* nc, nc_type xtypeid, void* memory, size_t count);
EXTERNL int NC_reclaim_data_all(NC* nc, nc_type xtypeid, void* memory, size_t count);

/**
Copy vector of arbitrary type instances. This recursively walks
the top-level instances to copy any nested data such as vlen or
strings or such.
Assumes it is passed a pointer to count instances of xtype.
WARNING: nc_copy_data does not copy the top-level memory, but
assumes a block of proper size was passed in. However
nc_copy_data_all does allocate top-level memory copy.
Should work for any netcdf format.
*/

EXTERNL int NC_copy_data(NC* nc, nc_type xtypeid, const void* memory, size_t count, void* copy);
EXTERNL int NC_copy_data_all(NC* nc, nc_type xtypeid, const void* memory, size_t count, void** copyp);

/* Macros to map NC_FORMAT_XX to metadata structure (e.g. NC_FILE_INFO_T) */
/* Fast test for what file structure is used */
#define NC3INFOFLAGS ((1<<NC_FORMATX_NC3)|(1<<NC_FORMATX_PNETCDF)|(1<<NC_FORMATX_DAP2))
#define FILEINFOFLAGS ((1<<NC_FORMATX_NC_HDF5)|(1<<NC_FORMATX_NC_HDF4)|(1<<NC_FORMATX_DAP4)|(1<<NC_FORMATX_UDF1)|(1<<NC_FORMATX_UDF0)|(1<<NC_FORMATX_NCZARR))

#define USENC3INFO(nc) ((1<<(nc->dispatch->model)) & NC3INFOFLAGS)
#define USEFILEINFO(nc) ((1<<(nc->dispatch->model)) & FILEINFOFLAGS)
#define USED2INFO(nc) ((1<<(nc->dispatch->model)) & (1<<NC_FORMATX_DAP2))
#define USED4INFO(nc) ((1<<(nc->dispatch->model)) & (1<<NC_FORMATX_DAP4))

#endif /* _NC_H_ */
Loading

0 comments on commit fb40a72

Please sign in to comment.