IMPORTANT: This release of ESMF contains a known issue with OpenMP thread counts reset to 1 and incorrect pinning of OpenMP threads, leading to reduced application performance in threaded regions. See Known Bugs below for more details. ESMF release 8.1.1 addresses this issue.
Overview
During the 8.1.0 development cycle, a number of exciting new features were added to the ESMF library, ease of use was improved in several areas, and performance was optimized in critical parts of the library. Highlights of the 8.1.0 release are listed below. A detailed list of release notes is also provided further down.
The integration of the MOAB mesh library, developed by the U.S. Department of Energy, into ESMF, has reached a milestone that supports application-level testing. All users of ESMF are encouraged to begin testing their applications with the MOAB option turned ON. Preliminary results indicate that MOAB provides improved performance and scaling, and reduced memory footprint. Applications seeking to go to very high resolution grids should benefit from this new capability. The MOAB mesh feature is still OFF by default in this release, but can easily be turned ON from the application level, any time during the run. The available regridding features supported by the MOAB based implementation are clearly listed in the detailed release notes.
Another area of the library that received significant attention is the key-value storage. The new ESMF_Info class, based on a modern C++ JSON implementation, replaces ESMF_Attribute. Most of the legacy ESMF_Attribute API is preserved for backward compatibility, but users are encouraged to migrate their code to using the ESMF_Info API. As a consequence of the new key-value implementation, the NUOPC initialization time has been reduced. This is most pronounced in applications with large Field and PET counts.
Major ease of use improvements went into the NUOPC Layer. One is the introduction of semantic specialization labels into the NUOPC API. The new approach no longer uses Initialize Phase Definition (IPD) versions or the IPDvXXpY nomenclature when registering methods in the SetServices() method. This leads to clearer, more concise NUOPC “cap” implementations. Another improvement is the seamless integration of the NUOPC profiling options into the ESMF profiling infrastructure. Simply by setting the Profiling attribute on NUOPC Driver, Model, Mediator, or Connector components, it is now possible for the user to generate a detailed NUOPC-level performance profile.
A number of new features were added in the area of resource control, handling of threaded components, and shared memory access. The NUOPC layer now supports resource control and handling of threaded components. This mechanism supports hybrid MPI+OpenMP components with different threading levels, allowing each component to fully utilize HPC resources independently. Coupling between threaded components is supported automatically via the standard NUOPC Connectors. Further, data can now be shared by reference between components that run on the same compute nodes even if running with different threading levels. Both Field-level sharing and Array-level sharing are supported through the ESMF API.
New features were also added to the ESMF regridding implementation. It now supports the “nearest destination”, and “creep nearest destination” extrapolation methods. The “creep fill” extrapolation method, introduced in 8.0.0, is now available through ESMPy.
Finally 8.1.0 includes additions to improve the overall user experience with ESMF. In the area of regridding the user has now the option to specify the ‘checkFlag’ argument to ESMF_FieldRegridStore(). This turns on more comprehensive grid/mesh error checking at the cost of performance. The option should therefore be used during application development and debugging, and not during production runs. Finally, for users faced with debugging failing ESMF applications, a new section was added to the User’s Guide entitled "Debugging of ESMF User Applications" that provides some hints on how to interpret stack traces and errors that appear in the ESMF log files.
Release Notes
- This release is backward compatible with the last major release update, ESMF 8.0.0, for all the interfaces that are marked as backward compatible in the Reference Manual. There were API changes to a few unmarked methods that may require minor modifications to user code that uses these methods. The entire list of API changes is summarized in a table showing interface changes since ESMF_8_0_0, including the rationale and impact for each change.
- Some bit-for-bit changes are expected for this release compared to release ESMF 8.0.0 and patch release ESMF 8.0.1. We observe the following impact with Intel compilers using “
-O2 -fp-model precise
”:- Fixed a problem that could result in erroneously unmapped destinations when going from a very fine source grid to a coarse destination grid (e.g. 1/20 degree to 10x15 degree). Expected bit-for-bit changes:
- ESMF_REGRIDMETHOD_CONSERVE_2ND: roundoff level changes in weight values because of a change in the order of calculation.
- All regrid methods: Missing weights being added for very fine source grid to coarse destination grid regridding cases as this fix comes into play.
- Fixed a problem where using the bilinear method between two identical grids doesn't result in an identity matrix for the regridding weights. It also improves the efficiency of the code when using bilinear or patch between identical grids. Expected bit-for-bit changes:
- ESMF_REGRIDMETHOD_BILINEAR: small changes in regridding weights when a destination point exactly matches a source point.
- ESMF_REGRIDMETHOD_PATCH: small changes in regridding weights when a destination point exactly matches a source point.
- Fixed a problem where a set of points with latitudes set at exactly -90.0 are not all mapped to the same point. Expected bit-for-bit changes:
- All regrid methods: Small weight changes when a point in the grid lies at exactly -90.0 latitude.
- Optimized the creep fill so the memory doesn't increase as quickly for large numbers of creep levels. Expected bit-for-bit changes:
- ESMF_EXTRAPMETHOD_CREEP: Small weight changes for the extrapolated destination locations.
- Fixed an issue where in some cases creep fill weights can trigger an assertion in the code that redistributes weights to their final decomposition. Expected bit-for-bit changes:
- ESMF_EXTRAPMETHOD_CREEP: Small weight changes for the extrapolated destination locations.
- Fixed a problem that could result in erroneously unmapped destinations when going from a very fine source grid to a coarse destination grid (e.g. 1/20 degree to 10x15 degree). Expected bit-for-bit changes:
- Tables summarizing the ESMF regridding status have been updated. These include supported grids and capabilities of the offline and integrated regridding, and numerical results of some specific test cases.
- The ESMF_Info class was introduced as a replacement for ESMF_Attribute. ESMF_Info is based on a modern C++ JSON implementation to provide efficient key-value pair storage. Most of the legacy ESMF_Attribute API is preserved for backward compatibility.
- ESMF is in the process of upgrading the internal mesh representation to use the MOAB mesh library developed by the U.S. Department of Energy. In this release, ESMF capabilities using MOAB have been significantly optimized and expanded, allowing for application-level testing of ESMF with MOAB as the underlying mesh representation. MOAB is built into the ESMF library by default, but its use must be enabled at run-time by calling ESMF_MeshSetMOAB(). When MOAB is activated, the following new capabilities are supported in this release:
- The Mesh creation, conservative regridding, and bilinear regridding algorithms when MOAB is active have been optimized to reduce their memory use and expand the size of Meshes they can be used on.
- Grids can now be explicitly converted to a Mesh when MOAB is active, using the ESMF_MeshCreate() method.
- Grids can be used to do first-order conservative regridding using MOAB.
- Grids can be used for bilinear regridding on cell centers or corners using MOAB.
- The NUOPC Layer contained in this release has been improved in the following specific technical areas:
- The NUOPC API has been simplified through the introduction of semantic specialization labels. The new approach leads to clearer and more concise NUOPC “cap” implementations that do not require specifying an Initialize Phase Definition (IPD) version or using the IPDvXXpY nomenclature when registering methods in the SetServices() method. Existing caps do not have to be re-written or updated, although updating to the new semantic specialization labels is recommended for existing and new NUOPC caps. The older IPD version based approach is supported for backward compatibility.
- The NUOPC layer now provides features for resource control and handling of threaded components. This mechanism supports mixing of hybrid MPI+OpenMP components with different threading levels and mixing with standard MPI components on the same processing elements (PEs), i.e. cores. It allows each component to fully utilize HPC resources independently. Coupling between threaded components is supported automatically via the standard NUOPC Connectors.
- The external NUOPC interface that supports interaction between an entire NUOPC application and a layer outside of NUOPC (e.g. a Data Assimilation system) was further refined. The associated prototype code (ExternalDriverAPIProto) has been updated to reflect the current status.
- The NUOPC Profiling attribute, available in the Driver Metadata, Model Metadata, Mediator Metadata, and Connector Metadata, has been re-implemented. The NUOPC layer profiling features now integrate with the ESMF profiling infrastructure.
- The NUOPC transfer protocol for geometry objects (Grid, Mesh, LocStream) has been made more efficient. Geometries used for multiple Fields are only transferred once, reducing the initialization overhead associated with the transfer.
- Several optimizations were implemented in the NUOPC layer to reduce overhead. All applications using NUOPC benefit from these optimizations without requiring code changes.
- Added the
creep_nrst_d
value to theextrapMethod
NUOPC connection options. This is equivalent to the ESMF_EXTRAPMETHOD_CREEP_NRST_D option in ESMF_FieldRegridStore() discussed below.
- Added the extrapolation option: CREEP_FILL to ESMPy. This option fills unmapped destination points by repeatedly moving data from mapped locations to neighboring unmapped locations.
- The implementation of the ESMF_StateReconcile() method was redesigned to improve performance, and scalability. Most users do not interact with this method directly, however, the NUOPC initialization time has been reduced as a consequence, which is most pronounced in applications with large Field and PET counts.
- Added a new extrapolation method called "nearest destination" to the regrid weight generation system. This capability fills destination points that were not filled by an initial regridding by using the nearest regridded destination point. The nearest destination method is accessible by specifying the extrapMethod=ESMF_EXTRAPMETHOD_NEAREST_D option in any of the ESMF_*RegridStore() methods or the --extrap_method nearestd option when using the ESMF_RegridWeightGen application).
- Added a new extrapolation method called "creep nearest destination" to the regrid weight generation system. This capability fills destination points that were not filled by an initial regridding by first applying a creep fill extrapolation and then filling the remaining unmapped destination points using nearest destination extrapolation. The creep nearest destination method is accessible by specifying the extrapMethod=ESMF_EXTRAPMETHOD_CREEP_NRST_D option in any of the ESMF_*RegridStore() methods or the --extrap_method creepnrstd option when using the ESMF_RegridWeightGen application).
- The creep fill extrapolation has been optimized so that it uses less memory per level extrapolated. This allows the creep fill method to extrapolate several times further into unmapped parts of the destination Field.
- Added the optional checkFlag argument to ESMF_FieldRegridStore(). The intention behind this flag is to allow the user to turn on more expensive error checking that may not be appropriate for an operational run. Initially this flag turns on a check for grid self-intersection during conservative regridding.
- The ESMF_MeshGet() call has been expanded to allow the user to query a full set of information for most Meshes. The exception is 2D Meshes with cells of more than four sides for which the element information (e.g. element connectivity) is not yet available.
- A new shared memory feature was introduced that allows sharing of decomposition elements (DEs) between PETs that execute on the same single system image (SSI), i.e. node. This feature provides an efficient way to access data by reference between components that are running on the same PEs (cores), but with different threading levels. Both Field-level DE sharing and Array-level DE sharing are supported.
- Key-value pair storage was added to the ESMF_Mesh and ESMF_LocStream classes through the overloaded ESMF_InfoGetFromHost() method.
- A section discussing “Debugging of ESMF User Applications” has been added to the User’s Guide. This section is designed to help users interpret error traces and efficiently locate issues in failing ESMF applications.
Known Bugs
-
FieldBundles don't currently enforce that every contained Field is built on the same Grid, Mesh, LocStream, or XGrid object, although the documentation says that this should be so.
-
The packed FieldBundle implementation uses a concatenated string to create a base object. When this string has more than 255 characters, e.g. a large number of Fields with long individual names is packed, the base object is not created correctly resulting in incorrect behavior at the FieldBundle level.
-
When the ESMF regrid weight generation methods and applications are used with nearest destination to source interpolation method, the unmapped destination point detection does not work. Even if the option is set to return an error for unmapped destination points (the default) no error will be returned.
-
The ESMF regrid weight generation methods and applications do not currently work for source Fields created on Grids which contain a DE of width less than 2 elements. For conservative regridding the destination Field also has this restriction.
-
The ESMF regrid weight generation methods and applications do not currently work on Fields created on Grids with arbitrary distribution.
-
When using the ESMF_RegridWeightGen application to generate conservative weights for a Mesh with > 16 million cells, the weight file produced has some of its factors scrambled in a subtle way. This can lead to higher than usual conservation error (>1.0E-7).
-
When pole extrapolation is used during regridding operations and a quadrilateral cell that degenerates into a triangle is in the top or bottom row of the grid, the error "Condition {etopo->num_nodes == 4} failed..." is incorrectly returned.
-
The ESMF_GridCreate() interface that allows the user to create a copy of an existing Grid with a new distribution will give incorrect results when used on a Grid with 3 or more dimensions and whose coordinate arrays are less than the full dimension of the Grid (i.e. it contains factorized coordinates).
-
The ESMF_XGrid construction can lead to degenerate cells for cases where the source and destination grids have edges that are almost the same. Often these cells don't produce weights and are benign, but when weights are produced can lead to low accuracy results when transferring data to/from the XGrid.
-
When any of the Redist() methods are used inside a VMEpoch, the execution crashes with an MPI error.
-
The OpenMP thread count is being reset to 1 within all ESMF components. This affects user code that leverages OpenMP threading inside of components, and uses the OMP_NUM_THREADS environment variable to set the desired number of OpenMP threads. As a consequence the expected speed up from OpenMP threading in user code will not be present.
-
The PETs of all ESMF components, and any potentially created OpenMP threads under such PETs, are pinned to the PE on the respective shared memory node, corresponding to the PET number. As a consequence, even if a user overcomes the OpenMP thread count reset to 1 bug, e.g. by using omp_set_num_threads() API directly, the performance of OpenMP threaded user code is far below that of the expected speed up.
Platform-specific bugs:
-
The GNU and Intel compilers require GCC>=4.8 for C++11 support (Intel uses the GCC headers). By default ESMF now uses the C++11 standard and cannot be downgraded. If you run into build issues due to the C++11 dependency, you must make sure a GCC>=4.8 is loaded.
-
For GNU compilers GCC>=10.x, the default Fortran argument mismatch checking has become stricter. This results in build failures in some of the code that comes with ESMF. Setting environment variable ESMF_F90COMPILEOPTS="-fallow-argument-mismatch -fallow-invalid-boz", during the ESMF build, can be used as a work around for this issue.
-
On Darwin, with the GNU gfortran+gcc combination, when building MPICH3 from source, it is important to specify the "--enable-two-level-namespace" configure option. By default, i.e. without this option, on Darwin, the produced MPICH compiler wrappers include a linker flag (-flat_namespace) that causes issues with C++ exception handling. Building and linking ESMF applications with MPICH compiler wrappers that specify this linker option leads to “mysterious” application aborts during execution.
-
On Darwin, with the Intel Fortran compiler, command line arguments cannot be accessed from ESMF applications when linked against the shared library version of libesmf. There is no issue when linked against the static libesmf.a version. Setting environment variable ESMF_SHARED_LIB_BUILD=OFF, during the ESMF build, can be used as a work around for this issue.
-
Currently the ESMPy interface to retrieve regridding weights from Python is only supported under the GNU compiler. On all other compilers the method will flag an error.
-
There is an issue with intercepting the MPI calls for profiling on some of the supported platforms. This results in a single FAIL reported for ESMF_TraceMPIUTest.F90. The affected platforms are:
- Catania: Darwin+GNU+MPICH3
- Gaea: Unicos+GNU+cray-mpich
-
There is an issue with loading the
libesmftrace_preload.so
library on some of the supported platforms. This results in a reported CRASH for ESMF_TraceIOUTest.F90 and ESMF_TraceMPIUTest.F90. The affected platforms are:- Discover: Linux+GNU+intelmpi
- Gaea: Unicos+Intel+cray-mpich
- Gaea: Unicos+Intel+mpiuni
- Orion: Linux+GNU+mpiuni