Compression Specification Format
We created a Compression Specification Format, which can be easily read and utilized by various applications written in different programming languages. An example of this format would be:
>>> lossy,zfp,rate,4.0
The currently implemented compressors include Blosc for lossless compression, and ZFP, SZ and SZ3 for lossy compression. In order to use lossless compression, one can simply write:
>>> lossless
This will use the default backend lz4 with compression level 9. It is also possible to select a different compression level (1 to 9) or backend:
blosclz
lz4
lz4hc
snappy
zlib
zstd
One example with a different backend and compression level could be:
>>> lossless,snappy,7
For lossy compression, it is mandatory to include the compressor, the mode and the parameter. At the moment the lossy supported compressors are: ZFP, SZ and SZ3.
- The compressors have different methods and different names for them:
- -ZFP:
accuracy: absolute threshold mode
rate: number of bits-per-value
precision: keep a certain bits of precision
- -SZ:
abs: absolute threshold mode
rel: relative threshold mode
pw_rel: point-wise relative threshold mode
- -SZ3:
abs: absolute threshold mode
rel: relative threshold mode
norm2: using norm2.
psnr: using the peak signal to noise ratio mode.
Few examples of lossy compression specifications:
lossy,sz,abs,0.01
lossy,zfp,rate,4.0
lossy,sz3,psnr,40
lossy,sz,rel,1e-3
lossy,zfp,precision,10
lossy,sz,pw_rel,0.05
lossy,zfp,accuracy,0.01
There are also few features that target datasets with multiple variables. One can write a different specification for different variables by using a list of space separated specifications:
>>> var1:lossy,zfp,rate,4.0 var2:lossy,sz,abs,0.1
It is possible too to specify the default value for the variables that are not explicitly mentioned:
>>> var1:lossy,zfp,rate,4.0 default:lossy,sz,abs,0.1
In case a specification doesn’t have a variable name, it will be considered the default. i.e:
>>> var1:lossy,zfp,rate,4.0 lossy,sz,abs,0.1 -> var1:lossy,zfp,rate,4.0 default:lossy,sz,abs,0.1
If no default value is provided, lossless compression will be applied:
>>> var1:lossy,zfp,rate,4.0 -> var1:lossy,zfp,rate,4.0 default:lossless
Coordinates are treated separately, by default are compressed using lossless, although it is possible to change that:
>>> coordinates:lossy,zfp,rate,6