Interface DataFilter

    • Method Detail

      • filter

        DataSource filter(DataSource original)
                   throws IOException
        Filter the data source.

        Filtering is often based on the name suffix. For example, a gzip-compressed file has an original name of the form base.ext.gz, whereas the corresponding uncompressed file has the filtered name base.ext.

        A filter must never open the DataSource by itself, regardless of whether it returns the original instance or a filtered instance. The rationale is that the upper layer decides whether to open the returned value, and a DataSource can be opened only once; this is the core principle of the lazy opening provided by DataSource.
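
        The lazy-opening rule above can be illustrated with a minimal sketch. It does not use the real DataSource API: Source and StreamOpener are hypothetical stand-ins defined only for this example, and the ".gz" handling is just one possible suffix-based filter. The point it shows is that the filter only captures the original opener and invokes it later, when the upper layer opens the filtered instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class LazyGzipFilterSketch {

    /** Hypothetical stand-in for a lazily invoked stream supplier. */
    interface StreamOpener {
        InputStream openOnce() throws IOException;
    }

    /** Hypothetical stand-in for DataSource: a name plus a lazy opener. */
    static final class Source {
        final String name;
        final StreamOpener opener;

        Source(String name, StreamOpener opener) {
            this.name = name;
            this.opener = opener;
        }
    }

    /** Returns a filtered source for names ending in ".gz", or the original otherwise. */
    static Source filter(Source original) {
        final String suffix = ".gz";
        if (!original.name.endsWith(suffix)) {
            return original; // this filter does not apply
        }
        String filteredName =
            original.name.substring(0, original.name.length() - suffix.length());
        // The original opener is only captured here. It is invoked later,
        // if and when the upper layer opens the filtered source.
        return new Source(filteredName,
                          () -> new GZIPInputStream(original.opener.openOnce()));
    }

    /** Builds gzip-compressed bytes for the demonstration. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(raw);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        AtomicInteger openCount = new AtomicInteger();
        byte[] compressed = gzip("hello".getBytes(StandardCharsets.UTF_8));
        Source original = new Source("base.ext.gz", () -> {
            openCount.incrementAndGet(); // counts how often the raw data is opened
            return new ByteArrayInputStream(compressed);
        });

        Source filtered = filter(original);
        System.out.println(filtered.name + " opened " + openCount.get() + " time(s) so far");

        try (InputStream in = filtered.opener.openOnce()) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

        In this sketch the open counter is still zero after filter returns, and it becomes one only when the caller opens the filtered source.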

        Beware that the data providers manager will attempt to pile all filters in a stack as long as their implementation of this method returns a value different from the original parameter. This implies that the filter must perform some checks to see whether it must be applied or not. If, for example, a deciphering filter needs to be applied once to all data, then the filter should check for a suffix in the name and create a new filtered DataSource instance only if the suffix is present, removing the suffix from the name of the filtered instance. Failing to do so, and simply creating a filtered instance with one deciphering layer without changing the name, would result in an infinite stack of deciphering filters being built until a stack overflow or memory exhaustion occurs.
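
        The stacking behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the actual manager implementation: Source, applyAll, the ".enc" suffix and the maxLayers safety cap are all hypothetical names introduced for the example. The cap exists only so that the faulty filter can be demonstrated without actually exhausting memory.

```java
import java.util.List;
import java.util.function.UnaryOperator;

public class FilterStackSketch {

    /** Hypothetical stand-in for DataSource: a name and a count of filter layers. */
    static final class Source {
        final String name;
        final int layers;

        Source(String name, int layers) {
            this.name = name;
            this.layers = layers;
        }
    }

    /** Well-behaved filter: applies only to names ending in ".enc" and strips the suffix. */
    static Source guardedDecipher(Source original) {
        final String suffix = ".enc"; // hypothetical suffix marking enciphered data
        if (!original.name.endsWith(suffix)) {
            return original; // nothing to do, so the stacking loop can stop
        }
        String stripped =
            original.name.substring(0, original.name.length() - suffix.length());
        return new Source(stripped, original.layers + 1);
    }

    /** Faulty filter: always adds a layer and leaves the name unchanged. */
    static Source unguardedDecipher(Source original) {
        return new Source(original.name, original.layers + 1);
    }

    /**
     * Stacks filters until a full pass returns the same instance.
     * The maxLayers cap is an assumption of this sketch, added for safety.
     */
    static Source applyAll(Source original, List<UnaryOperator<Source>> filters, int maxLayers) {
        Source top = original;
        boolean changed = true;
        while (changed && top.layers < maxLayers) {
            changed = false;
            for (UnaryOperator<Source> f : filters) {
                Source filtered = f.apply(top);
                if (filtered != top) { // a different instance means the filter applied
                    top = filtered;
                    changed = true;
                    break;             // restart the pass on the new top of the stack
                }
            }
        }
        return top;
    }

    public static void main(String[] args) {
        UnaryOperator<Source> guarded = FilterStackSketch::guardedDecipher;
        UnaryOperator<Source> unguarded = FilterStackSketch::unguardedDecipher;
        Source data = new Source("data.txt.enc", 0);

        Source ok = applyAll(data, List.of(guarded), 1000);
        System.out.println(ok.name + " with " + ok.layers + " layer(s)");

        Source bad = applyAll(data, List.of(unguarded), 1000);
        System.out.println(bad.name + " with " + bad.layers + " layer(s)");
    }
}
```

        With the guarded filter the loop stops after a single layer because the suffix is gone; with the unguarded filter only the artificial cap ends the loop.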

        Parameters:
        original - original data source
        Returns:
        filtered data source, or original if this filter does not apply to this data source
        Throws:
        IOException - if filtered stream cannot be created