A pipelining system is, at its most basic, an environment in which large amounts of data are pushed through a series of processing stages linked together by data dependencies. Ideally the whole process runs in a parallel processing environment, though that is by no means necessary.
The design of most pipelines usually proceeds through the following steps:
If the data already exists, this stage tends to be fairly trivial. The decision that has to be made is whether to use the entirety of the data or just a subset thereof. If only a subset is used, it usually makes sense to create a new directory tree for the source data and to copy or link the original data into it. The other option is to create a script or file which holds the names of all of the files that one wants to use and to pass it as input to the pipeline.
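The subset approach above can be sketched as a small staging script. This is a minimal illustration, not part of any particular pipeline; the function name `stage_subset` and the manifest filename `input_files.txt` are hypothetical choices:

```python
import os
from pathlib import Path

def stage_subset(source_dir, dest_dir, keep):
    """Link a chosen subset of source files into a fresh directory tree
    and write a manifest listing the staged files for the pipeline to read."""
    src = Path(source_dir)
    dst = Path(dest_dir)
    dst.mkdir(parents=True, exist_ok=True)
    staged = []
    for name in keep:
        link = dst / name
        if not link.exists():
            # Symlink rather than copy, so the staged tree stays cheap
            # and the original data remains the single source of truth.
            os.symlink((src / name).resolve(), link)
        staged.append(str(link))
    # The manifest doubles as the file-of-filenames input to the pipeline.
    manifest = dst / "input_files.txt"
    manifest.write_text("\n".join(staged) + "\n")
    return manifest
```

Whether one stages a linked directory tree or only writes the manifest is largely a matter of taste; the manifest alone is enough if the pipeline can read arbitrary paths.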
If the data is being acquired during the lifetime of the pipeline, one might also have to consider converting the scans from the native format of the scanner to the MINC file format. This step is usually not part of the main pipeline, but is instead run separately as the data arrives. It makes sense to have a preexisting directory structure into which the data can be imported.
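A separate import script of this kind might look roughly like the following sketch. The function name `import_scan` and the `subject/native` layout are hypothetical, and the `convert` hook stands in for whatever scanner-to-MINC conversion step is used at a given site; it is kept as a caller-supplied callable precisely because that step is site-specific:

```python
import shutil
from pathlib import Path

def import_scan(scan_file, archive_root, subject_id, convert=None):
    """File an arriving scan into a preexisting per-subject directory tree,
    optionally converting it to MINC via a caller-supplied function."""
    dest = Path(archive_root) / subject_id / "native"
    dest.mkdir(parents=True, exist_ok=True)
    staged = dest / Path(scan_file).name
    # Copy (with metadata) rather than move, so the raw data from the
    # scanner is never the only remaining copy while importing.
    shutil.copy2(scan_file, staged)
    if convert is not None:
        # e.g. a wrapper that invokes the local converter and
        # returns the path of the resulting MINC file.
        return convert(staged)
    return staged
```

Run as data arrives, this keeps the import step decoupled from the main pipeline, which then only ever sees the populated archive tree.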