With lowpass removed, it works:
Code:
time sox -S -V3 ZOOM0001_LR.wav -t dsf -b 1 test.dsf rate -v 2822400 sdm -f sdm-4
sox:      SoX v14.4.2
sox INFO formats: detected file format type `wav'

Input File     : 'ZOOM0001_LR.wav'
Channels       : 2
Sample Rate    : 44100
Precision      : 16-bit
Duration       : 00:00:47.19 = 2081230 samples = 3539.51 CDDA sectors
File Size      : 8.33M
Bit Rate       : 1.41M
Sample Encoding: 16-bit Signed Integer PCM
Endian Type    : little
Reverse Nibbles: no
Reverse Bits   : no

sox INFO sox: Overwriting `test.dsf'

Output File    : 'test.dsf'
Channels       : 2
Sample Rate    : 2.8224e+06
Precision      : 1-bit
Duration       : 00:00:47.19 = 133198720 samples ~ 3539.51 CDDA sectors
Sample Encoding: 1-bit Direct Stream Digital
Endian Type    : little
Reverse Nibbles: no
Reverse Bits   : no
Comment        : 'Processed by SoX'

sox INFO sox: effects chain: input        44100Hz  2 channels
sox INFO sox: effects chain: rate       2.8224e+06Hz  2 channels
sox INFO sox: effects chain: sdm        2.8224e+06Hz  2 channels
sox INFO sox: effects chain: output     2.8224e+06Hz  2 channels
In:100%  00:00:47.19 [00:00:00.00] Out:133M  [!=====|=====!] Hd:0.0 Clip:0
Done.

real    0m14.819s
user    0m14.620s
sys     0m0.164s
...and (with these settings, admittedly not the most "aggressive" ones) it is easily able to do the conversion in real time even on my old Core 2 Duo (it took just under 15 s to convert a stream of 47 s and change).
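
For anyone who wants to check the numbers, here is a quick sketch of the arithmetic. All figures are copied from the SoX log and the `time` output above; the variable names are just mine. It should confirm that 2.8224 MHz is 64 times 44.1 kHz, that the output sample count in the log matches, and that the conversion ran at roughly three times real time.

```python
# Figures copied from the SoX log and the `time` output in this post.
input_rate_hz = 44_100        # "Sample Rate" of ZOOM0001_LR.wav
output_rate_hz = 2_822_400    # target rate given to `rate -v`
input_samples = 2_081_230     # samples reported for the input file
wall_clock_s = 14.819         # "real" time reported by `time`

# Oversampling ratio between the PCM input and the 1-bit output.
oversampling = output_rate_hz // input_rate_hz

# Output sample count implied by that ratio.
output_samples = input_samples * oversampling

# Audio duration, and how many times faster than real time the run was.
duration_s = input_samples / input_rate_hz
realtime_factor = duration_s / wall_clock_s

print(f"oversampling ratio: {oversampling}x")
print(f"output samples:     {output_samples}")
print(f"audio duration:     {duration_s:.2f} s")
print(f"real-time factor:   {realtime_factor:.2f}x")
```

Keep in mind that this is wall-clock time for one run on one machine, so the real-time factor is only indicative; a heavier filter choice would lower it.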