(3)
This storage method has the virtue of simplicity, and it conveniently mimics the memory organization of arrays in the C programming language. However, it has two major drawbacks. First, it assumes that all of the data is stored in a contiguous block. This reduces flexibility and makes it extremely difficult to implement data compression, for example, because compression may remove the predictable relationship between the positions of adjacent data values. Second, this storage method inherently favors input and output that read entire sections along the fastest-varying dimension. In the above example, if your program wants to read a slice of data that varies in two of the dimensions but is held fixed in the third, it must read through most of the file to collect the required data.
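The cost of such a slice can be illustrated with a short sketch. It assumes a hypothetical 100 x 100 x 100 array stored contiguously in C order, with the slice fixed along the fastest-varying dimension; the sizes and the helper name `linear_offset` are illustrative only and are not taken from any particular file format.

```python
# Illustrative sizes only: a 100 x 100 x 100 array in C (row-major) order.
SHAPE = (100, 100, 100)  # (slowest, middle, fastest-varying)


def linear_offset(index, shape):
    """Return the element offset of `index` in a contiguous C-order array."""
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        offset = offset * n + i
    return offset


# A slice at a fixed position k along the fastest-varying dimension.
k = 5
offsets = [linear_offset((i, j, k), SHAPE)
           for i in range(SHAPE[0])
           for j in range(SHAPE[1])]

# Distance in the file between the first and last wanted values.
span = offsets[-1] - offsets[0] + 1
total = SHAPE[0] * SHAPE[1] * SHAPE[2]
print(f"wanted {len(offsets)} values spread over {span} of {total} elements")
```

Under these assumptions the wanted values sit 100 elements apart, so a reader either seeks once per value or scans almost the entire data area.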
In a file with "blocking" enabled, data is stored in a more complex fashion. A block with the same dimensionality as the overall file is defined, and each block contains a fixed-size chunk of the overall data. For example, the overall dimensions and the block size might be chosen such that the file's data is grouped into 1000 blocks, each holding 1000 values. The "first" block then contains all data points whose index along each dimension falls within the first block-sized range. Within a block, the storage layout is the same as for the contiguous case described above.
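The mapping from an array index to a block can be sketched as follows. The sizes used here (a 100 x 100 x 100 array with 10 x 10 x 10 blocks) are an assumption, chosen only because they yield 1000 blocks of 1000 values each. Numbering the blocks themselves in C order is also an assumption, and `locate` is a hypothetical helper rather than part of any real library.

```python
SHAPE = (100, 100, 100)   # assumed overall dimensions
BLOCK = (10, 10, 10)      # assumed block dimensions


def c_order_offset(index, shape):
    """Offset of `index` in a contiguous C-order array of the given shape."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def locate(index, shape=SHAPE, block=BLOCK):
    """Map an array index to (block number, offset within that block)."""
    # Number of blocks along each dimension, rounded up for partial blocks.
    blocks_per_dim = tuple(-(-n // b) for n, b in zip(shape, block))
    block_coord = tuple(i // b for i, b in zip(index, block))
    inner_coord = tuple(i % b for i, b in zip(index, block))
    return (c_order_offset(block_coord, blocks_per_dim),
            c_order_offset(inner_coord, block))


print(locate((0, 0, 0)))     # first value of the first block
print(locate((9, 9, 9)))     # last value of the first block
print(locate((10, 0, 0)))    # first value of a different block
```

With this layout, a slice that is fixed along one dimension touches only the blocks that intersect it, rather than nearly the whole file.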
To complicate matters, the file may contain a translation table that records the physical storage location of each block, so blocks need not be stored in any particular order.
The advantage of a block-structured file is that, by freeing the system from a strictly ordered storage arrangement, it allows individual blocks to vary in location and size. If a compression algorithm is applied, some blocks will prove highly compressible and thus will be quite small, whereas relatively incompressible blocks will remain relatively large. In the case of a sparse array, it may be possible to avoid allocating space for those blocks which consist of only the default value.
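A toy model may make the role of the translation table more concrete. The sketch below is not the layout of any real file format: the class `BlockStore`, the one-byte values, the zero default value, and the use of zlib are all assumptions made for illustration.

```python
import os
import zlib

BLOCK_BYTES = 1000        # assumed: 1000 one-byte values per block
DEFAULT = b"\x00"         # assumed default (fill) value


class BlockStore:
    """Toy block store: compressed blocks appended to one byte buffer."""

    def __init__(self):
        self.data = bytearray()   # stands in for the file's data area
        self.table = {}           # block number -> (position, stored size)

    def write_block(self, number, raw):
        if len(raw) != BLOCK_BYTES:
            raise ValueError("block has the wrong size")
        if raw == DEFAULT * BLOCK_BYTES:
            # Sparse case: an all-default block gets no storage at all.
            self.table.pop(number, None)
            return
        packed = zlib.compress(raw)
        # Blocks land wherever there is room; only the table knows where.
        self.table[number] = (len(self.data), len(packed))
        self.data += packed

    def read_block(self, number):
        if number not in self.table:
            return DEFAULT * BLOCK_BYTES   # unallocated block reads as default
        pos, size = self.table[number]
        return zlib.decompress(bytes(self.data[pos:pos + size]))


store = BlockStore()
store.write_block(7, b"ab" * 500)               # highly repetitive data
store.write_block(2, os.urandom(BLOCK_BYTES))   # essentially incompressible
store.write_block(5, DEFAULT * BLOCK_BYTES)     # all default: not stored

for number, (pos, size) in sorted(store.table.items()):
    print(f"block {number}: position {pos}, {size} bytes stored")
```

Block 7 is written first and so precedes block 2 in the buffer, the two stored sizes differ widely, and block 5 occupies no space. This simplified version never reclaims the space of a rewritten block, which a real implementation would need to handle.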