NetCDF User's Guide - Missing Values

Missing Values

What happens when you try to read a value that was never written in an open netCDF file? You might expect that this should always be an error, and that you should get an error message or an error status returned. You do get an error if you try to read data from a netCDF file that is not open for reading, if the variable ID is invalid for the specified netCDF file, or if the specified hyperslab is not properly within the range defined by the dimension sizes of the specified variable. Otherwise, reading a value that was not written returns a special fill value used to fill in any missing values when a netCDF variable is first written.

You may also ignore fill values and use the entire range of a netCDF data type, but in this case you should make sure you write all data values before reading them. If you know you will be writing all the data before reading it, you can turn off the prefilling of variables with fill values by calling ncsetfill with the NC_NOFILL fill mode before writing. This may provide a significant performance gain for netCDF writes.

There are several reasons for using a fill value instead of an error return for missing data. First, the interface for hyperslab access would necessarily be more complex and slower if information had to be returned about whether each value read had been written. Since data may have been written in a different order from that in which it is later read, it is possible that only a few values in a block of retrieved values were never written. Second, it is usually preferable to delay the detection of missing values until there is a need for the values, since they may not be used in subsequent computations. Finally, the use of missing values is a common way to represent data points outside the boundaries of irregular regions of data enclosed by a hyperslab, making it possible to handle such data in a simpler way than would be possible with a more compact representation that represented the boundary explicitly.
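
The second reason can be illustrated with a short C sketch. A block of values is taken exactly as it was read, and fill values are detected only at the point where the values are used. The helper name mean_ignoring_fill and the fill value supplied by the caller are assumptions made for illustration; neither is part of the netCDF interface.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the netCDF library): average the values
 * in a block that was read from a variable, skipping every element equal to
 * the fill value. Returns the number of elements that held real data.
 */
size_t mean_ignoring_fill(const float *values, size_t n, float fill,
                          double *mean_out)
{
    size_t count = 0;
    double sum = 0.0;

    assert(values != NULL && mean_out != NULL);

    for (size_t i = 0; i < n; i++) {
        if (values[i] == fill) {
            continue;           /* never written: detected only here */
        }
        sum += values[i];
        count++;
    }
    /* With no written values the mean is undefined; report 0.0. */
    *mean_out = (count > 0) ? sum / (double)count : 0.0;
    return count;
}
```

A caller that never needs the mean never pays for the check, which is the point of deferring detection. The exact comparison with == is only safe when the fill value passed in has the same type and bit pattern as the one stored in the file.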

The default fill values for each type are defined in the include file `netcdf.h' (or `netcdf.inc' for FORTRAN). It is usually better to use your own fill value instead, by defining the attribute _FillValue for a variable before writing it. A disadvantage of the default fill values for floating-point and double-precision types is that they may be defined differently for different platforms, and may be difficult to compare with other values, since they are defined to be right at the edge of the valid floating-point number ranges for each machine.

Fill values are used for filling in missing data whenever a value is put beyond the end of data that has already been written. A default fill value has no other special meaning, so if you define your own fill value for a variable, the default fill value can be stored in that variable as valid data.
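
The effect of writing beyond the end of existing data can be modeled with a toy in-memory structure. This is only a sketch of the observable behavior, not of how the library stores data; the names record_var, put_record, MAX_RECORDS, and MY_FILL are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RECORDS 16          /* capacity of this toy record variable */
#define MY_FILL (-9999.0f)      /* hypothetical user-chosen fill value */

/* Toy in-memory stand-in for a variable with one growable dimension. */
struct record_var {
    float data[MAX_RECORDS];
    size_t num_written;         /* records 0 .. num_written-1 exist */
};

/*
 * Store `value` at record `index`. If the index lies beyond the current
 * end of data, the skipped records are filled with MY_FILL first.
 */
void put_record(struct record_var *var, size_t index, float value)
{
    assert(var != NULL && index < MAX_RECORDS);

    for (size_t i = var->num_written; i < index; i++) {
        var->data[i] = MY_FILL;     /* fill the gap */
    }
    var->data[index] = value;
    if (index >= var->num_written) {
        var->num_written = index + 1;
    }
}
```

Starting from an empty variable, writing record 0 and then record 3 leaves records 1 and 2 holding MY_FILL, so a later read of those records returns the fill value instead of failing.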

Currently the only difference between the netCDF byte and character types is that the two types have different default fill values. The fill value for bytes is on the edge of the range, representing the largest negative value for signed bytes. The fill value for characters, however, is the zero byte, a more useful value for detecting the end of C character strings.
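
The usefulness of a zero-byte fill for characters can be seen in a small C sketch. A fixed-length character buffer that has been prefilled with zero bytes and then only partly overwritten is already a valid C string. The names write_name, NAME_LEN, and CHAR_FILL are assumptions for illustration, and the buffer stands in for a character variable; no netCDF calls are involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 8
#define CHAR_FILL '\0'   /* assumed here: zero byte as the character fill */

/*
 * Hypothetical helper: prefill a fixed-length character buffer with the
 * character fill value, then copy in at most NAME_LEN - 1 bytes of text so
 * that at least one fill byte remains to terminate the string.
 */
void write_name(char buf[NAME_LEN], const char *text)
{
    assert(buf != NULL && text != NULL);

    memset(buf, CHAR_FILL, NAME_LEN);
    size_t len = strlen(text);
    if (len > NAME_LEN - 1) {
        len = NAME_LEN - 1;
    }
    memcpy(buf, text, len);
}
```

After write_name(buf, "abc"), functions such as strlen and strcmp work on buf directly, because the unwritten tail of the buffer consists of fill bytes that double as string terminators.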

Sometimes there is a need for more than one value to represent different kinds of missing data. In this case, you should use one or more other variable attributes for the different kinds of missing data. For example, it might be appropriate to use _FillValue to mean that data that was expected never appeared, and missing_value to mark points where the creator of the data intends data to be missing, as around an irregular region represented by a rectangular grid.
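
An application that reads such a variable might separate the two kinds of missing data as in the following C sketch. It assumes the values of the _FillValue and missing_value attributes have already been read into a small structure; the type and function names (value_kind, missing_info, classify) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum value_kind {
    VALUE_PRESENT,        /* ordinary data */
    VALUE_NEVER_WRITTEN,  /* equals the variable's _FillValue */
    VALUE_MASKED          /* equals the variable's missing_value */
};

/* Attribute values as they might be read from one variable (illustrative). */
struct missing_info {
    float fill_value;     /* value of the _FillValue attribute */
    float missing_value;  /* value of the missing_value attribute */
};

/* Decide which kind of value was read, checking _FillValue first. */
enum value_kind classify(float value, const struct missing_info *info)
{
    assert(info != NULL);

    if (value == info->fill_value) {
        return VALUE_NEVER_WRITTEN;
    }
    if (value == info->missing_value) {
        return VALUE_MASKED;
    }
    return VALUE_PRESENT;
}
```

For this to work the two attribute values must differ from each other and from every valid data value, which is a constraint on whoever creates the data, not something the sketch can enforce.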
