esio  0.1.9
 All Classes Files Functions Variables Typedefs Enumerations Enumerator Macros Pages
Basic concepts

This page first covers a sample ESIO use case and then provides an introduction to and details about these ESIO concepts:

  1. Handles
  2. Files
  3. Attributes
  4. Lines
  5. Planes
  6. Fields
  7. Layouts

Sample usage

ESIO usage usually involves some variation on the following steps:

  1. Initialize an opaque esio_handle for a particular MPI communicator.
  2. Create or open a data file using the handle.
  3. Write or read some attributes using the handle.
  4. Establish the parallel decomposition used for distributed line, plane, or field operations using the handle.
  5. Write or read one or more distributed lines, planes, or fields of data using the handle.
  6. Close the data file using the handle.
  7. Finalize the esio_handle.

A complete, C-based MPI program using ESIO to write some (trivial) data might look like:

1 #include <stdlib.h>
2 #include <mpi.h>
3 #include <esio/esio.h>
4 
5 int main(int argc, char *argv[])
6 {
7  MPI_Init(&argc, &argv);
8  atexit((void (*) ()) MPI_Finalize);
9  int world_size, world_rank;
10  MPI_Comm_size(MPI_COMM_WORLD, &world_size);
11  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
12 
13  esio_handle h = esio_handle_initialize(MPI_COMM_WORLD);
14  esio_file_create(h, "data.h5", 1 /* overwrite */);
15 
16  esio_string_set(h, "/", "program", argv[0]);
17 
18  int version = 1;
19  esio_attribute_write_int(h, "/", "version", &version);
20 
21  double example[2] = { 2.0 * world_rank, 2.0 * world_rank + 1 };
22  esio_line_establish(h, 2*world_size, 2*world_rank, 2);
23  esio_line_write_double(h, "example", example, 1, NULL);
24 
25  esio_file_close(h);
27 
28  return 0;
29 }

ESIO's C API is documented within the esio.h reference material.

A complete, Fortran-based MPI program using ESIO to perform the same task would look like:

1 program main
2 
3  use esio
4 
5  implicit none
6 
7  include 'mpif.h' !! Prefer 'use mpi' above 'implicit none' if possible
8 
9  integer :: ierr, world_size, world_rank, version
10  type(esio_handle) :: h
11  character(len = 255):: str
12  double precision :: example(2)
13 
14  call mpi_init(ierr)
15  call mpi_comm_size(mpi_comm_world, world_size, ierr)
16  call mpi_comm_rank(mpi_comm_world, world_rank, ierr)
17 
18  call esio_handle_initialize(h, mpi_comm_world)
19  call esio_file_create(h, "data.h5", .true.)
20 
21  call get_command_argument(0, str)
22  call esio_string_set(h, "/", "program", trim(str))
23 
24  version = 1
25  call esio_attribute_write_integer(h, "/", "version", version)
26 
27  example = [2d0 * world_rank, 2d0 * world_rank + 1]
28  call esio_line_establish(h, 2*world_size, 2*world_rank + 1, 2)
29  call esio_line_write_double(h, "example", example, 1)
30 
31  call esio_file_close(h)
32  call esio_handle_finalize(h)
33 
34  call mpi_finalize(ierr)
35 
36 end program main

ESIO's Fortran API is documented, relative to the C API, within the esio.F90 reference material.

Note that the C sample used the zero-indexed expression 2*world_rank within the call to esio_line_establish while the Fortran sample used the one-indexed expression 2*world_rank + 1. ESIO's C API respects "C-isms" like row-major ordering, zero-based offsets, and calling integers ints. ESIO's Fortran API respects "Fortran-isms" like column-major ordering, one-based offsets, and calling integers integers.

Handles

ESIO maintains state within opaque esio_handles. A user creates an esio_handle by invoking esio_handle_initialize() with an MPI communicator. All subsequent calls to the ESIO API using that particular handle must be made collectively across the MPI communicator supplied at the handle's creation time. When finished using a handle, a user calls esio_handle_finalize().

Multiple esio_handle may be created against the same or different MPI communicators. For example, one handle may created against MPI_COMM_WORLD and another created against MPI_COMM_SELF. ESIO calls using the first handle must be made collectively across all MPI ranks. In contrast, ESIO calls using the second handle may be made on only a single rank (that is, collectively across MPI_COMM_SELF).

To the extent allowed by the underlying MPI stack and the parallel HDF5 implementation, ESIO is thread safe provided that a single esio_handle is not used concurrently by more than one thread.

Files

ESIO data files are, for all intents and purposes, simply HDF5 files usable by any application conforming to the The HDF5 Group's specifications.

At any time, a esio_handle may have a data file associated with it. A file must be associated with a handle before any attributes or data may be read or written. Existing files may be opened using esio_file_open(). New files may be created using esio_file_create(). When finished using a file, a user should call esio_file_close().

Information written to a file is always buffered and should not be assumed to be on disk while a file is open. Buffers are flushed when a file is closed. Buffers may explicitly be flushed using esio_file_flush().

Attributes

ESIO can read or write either numeric or string attributes. Attribute access must be done using contiguous memory and each MPI rank must write or read the same data. Attributes are often useful for tracking simulation metadata. Examples include a program version identifier, a Reynolds number, or an array containing collocation point values needed on every MPI rank. Attributes may be attached to the entire file (by specifying a location of "/") or to an existing line, plane, or field (by specifying the object's name).

Signed integers, single-, and double-precision floating point scalar-valued attributes are supported via methods like esio_attribute_write_double() and esio_attribute_read_double(). Vector-valued attributes are manipulated using methods like esio_attribute_writev_double() and esio_attribute_readv_double(). When reading a numeric attribute, the user is responsible for having allocated a buffer sufficiently large to hold the information. The user may query the number of components in existing vector-valued data using esio_attribute_sizev(). Notice that methods handling vector-valued numeric data uniformly contain a v.

Arbitrary length strings can be written and read using esio_string_set() and esio_string_get(). Note that, unlike numeric attribute's read methods, esio_string_get() allocates space for variable length data. It is the user's responsibility to free the storage allocated by esio_string_get().

Lines

ESIO can read or write distributed one-dimensional, numeric data (termed "lines"). Though reading or writing a numeric field must be done collectively, different ranks can (and often do) manipulate disjoint data within a line.

The parallel decomposition for line operations is established by having all ranks collectively invoke esio_line_establish(). Each MPI rank must supply

  1. the global number of data elements within the line (aglobal),
  2. the global starting offset to be handled by this rank (astart), and
  3. the number of data elements to be handled by this rank (alocal).

The global number of data elements specified must be identical across all ranks. Global offsets start at zero within the C API. The behavior of a write operation is undefined when astart and alocal do not specify disjoint data. Once established, this parallel decomposition is in effect for all line operations until esio_line_establish() is called again.

In addition to a type and a memory buffer, when reading or writing a line each MPI rank must supply the stride between adjacent data elements in the local memory buffer (astride). Strides, which must be nonnegative, are specified in terms of the scalar size. For example, when writing double precision data strides are expressed in units of sizeof(double). Specifying a zero astride is equivalent to specifying that the data is stored contiguously in memory. Data is always stored contiguously in the file.

Example methods for scalar-valued lines are esio_line_write_float() and esio_line_read_float(). Information about the size of a line within a data file may be obtained using esio_line_size().

Vector-valued lines may also be utilized. These may be useful for storing complex numbers (as a two-component vector) or other "interleaved" data. Vector components must be adjacent in memory. Example methods for vector-valued lines are esio_line_writev_float() and esio_line_readv_float(). Information about the size of a line within a data file, including the number of vector components it contains, may be obtained using esio_line_sizev().

Strides between vector-valued line data are expressed in terms of the size of a single component. For example, strides for a five-component vector in double precision are expressed in terms of sizeof(double). This allows the same astride value provided to ESIO to be used in other pointer arithmetic contexts. Unfortunately, due to a limitation in HDF5, astride must be an even multiple of the number of components in use.

Planes

ESIO can read or write distributed two-dimensional, numeric data (termed "planes"). Such operations can be thought of as a tensor product of two line operations in all senses. In addition to a line's aglobal, astart, etc., a plane contains bglobal, bstart, etc. The parallel decomposition used for planes is controlled via esio_plane_establish(). The "A" direction is the faster index in files and appears after the "B" direction in all C API calls. Typically the "A" direction should be the faster direction in memory as well.

Sample plane routines are esio_plane_write_float(), esio_plane_writev_float(), esio_plane_read_float(), and esio_plane_readv_float(). Existing plane data may be queried using esio_plane_size() or esio_plane_sizev().

Fields

ESIO can read or write distributed three-dimensional, numeric data (termed "fields"). Field methods add cglobal, cstart, etc. arguments beyond those found for a plane. The parallel decomposition used for fields is controlled via esio_field_establish(). The "A" direction is the fastest index in files and appears rightmost in all C API calls. Typically the "A" direction should be the fastest direction in memory as well.

Sample field routines are esio_field_write_int(), esio_field_read_int(), esio_field_writev_int(), and esio_field_readv_int(). Existing field data may be queried using esio_field_size() or esio_field_sizev().

Layouts

To provide flexibility and aid IO performance tuning for particular HPC platforms, fields may be laid out in HDF5 files in different manners (termed "layouts"). Fields are stored with sufficient metadata for ESIO to be able to determine which layout was used to write data to file. Further, operations on any existing field will always use the layout with which a field was initially created.

The total number of field layouts available may be queried using esio_field_layout_count(). The default layout associated with a particular esio_handle can be retrieved using esio_field_layout_get() and specified using esio_field_layout_set(). The default layout is used whenever esio writes a new field to file. Note that any existing layout is preserved when a field is overwritten.

Two layouts are available in the current ESIO release. The first (and default) layout 0 provides maximum interoperability with other HDF5-based applications. The second, layout 1, has provided better parallel IO throughput on some systems. Layout 1 is readable by other HDF5-based applications but may require additional logic to extract the full three dimensional data in, for example, visualization applications. Future layout numbers will be monotonically increasing and stable across ESIO versions.

Users are advised to choose a non-default layout only after having benchmarked the available layouts and determined that performance gains offset usability and data longevity. At such a time, applications can be retrofit to use a higher performance layout simply by adding an esio_field_layout_set() call immediately after esio_handle_initialize(). All legacy fields will continue to work alongside any new fields.


Generated on Thu Jan 16 2014 10:16:33 for esio by  doxygen 1.8.2