Are there plans to add facilities for data compression to netCDF?

We have no plans to add data compression to netCDF (although we do plan to eventually add a form of transparent data packing on write and unpacking on read whenever the reserved attributes _Nbits, _Scale, and _Offset are defined).

Hyperslab access and direct access to individual array values conflict with most simple compression schemes. With netCDF, the elements of an array variable can be filled in any order or as cross-sections in any direction. NetCDF permits writing elements in one order and reading them later in different orders.

Some compression methods require that all the data to be compressed are known before starting the compression. Techniques like run-length encoding or anything that depends on exploiting similarities in nearby values can't be used if nearby values aren't all known at the time some of the data are to be written.

An alternative that can be implemented above the netCDF library is to adopt a convention for compressed data that uses a "compression" attribute to encode the method of compression, e.g.

x:compression = "rle" ;
for run-length encoding of the data in a variable x. When you write the data, compress them into a flat array of bytes and write all the bytes at once. Note that it would be difficult to define the size of such a variable in advance, since its compressed size depends on its values. You would also have to give up hyperslab access for such variables; instead, read the whole compressed array in at once and uncompress it before using it.
