I notice that, when I use FFMPEG to export a mod to a wave in 24 bits or higher, FFMPEG decides to use extensible encoding instead of regular PCM encoding and uses "Lavf58.76.100", whatever that is. I used exiftool to inspect the exported wav, and this is the result:
---- File ----
File Type : WAV
File Type Extension : wav
MIME Type : audio/x-wav
---- RIFF ----
Encoding : Extensible
Num Channels : 2
Sample Rate : 48000
Avg Bytes Per Sec : 288000
Bits Per Sample : 24
Software : Lavf58.76.100
The problem is that many programs don't understand this extensible WAV format. Is there any way to tell FFmpeg to use regular PCM instead? I notice that other programs, such as the BASS library, can export to a 24-bit WAV using regular PCM encoding.
This is the command I am using:
ffmpeg -y -loglevel error -f libopenmpt -i c:\temp\sometrack.IT -map_metadata -1 -c:a pcm_s24le c:\temp\sometrack.wav
Edit
While I was writing this question, I was a bit frustrated and lacked knowledge. Now it's all perfectly clear: WAVE files with a bit depth higher than 16 bits should have the WAVE_FORMAT_EXTENSIBLE tag (and it makes sense, see the answer). The other programs I was using to render mod files to WAVE, besides FFmpeg, were not honoring this rule. Thank you, Tom Yan, for clarifying.
1 Answer
(Not all of my comments were accurate / correct, so refer to the following if you are interested.)
Theoretically speaking, PCMWAVEFORMAT can be used for audio with a bit depth higher than 16. Nothing in the header structure prevents it from supporting such audio.
However, there are apparently a few reasons that ffmpeg does not write a WAVE header in such format for that kind of audio.
For one, the format has been superseded by WAVEFORMATEX, and, according to the documentation:
...
wBitsPerSample
... If wFormatTag is WAVE_FORMAT_PCM, then wBitsPerSample should be equal to
8 or 16. ...
...
In addition to the above requirement, there is no wFormatTag value other than WAVE_FORMAT_EXTENSIBLE that is defined for PCM audio with bit depth higher than 16.
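To see which of the two tags a given file actually carries, you can read the "fmt " chunk yourself. The following is only a minimal sketch in C, assuming the start of the file has already been loaded into memory; wav_fmt_info, rd_le16, rd_le32 and wav_read_fmt are names made up for this illustration and are not part of ffmpeg or any Windows header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WAVE_FORMAT_PCM        0x0001
#define WAVE_FORMAT_EXTENSIBLE 0xFFFE

/* Hypothetical helper type: the few "fmt " fields of interest here. */
struct wav_fmt_info {
    uint16_t format_tag;      /* wFormatTag */
    uint16_t channels;        /* nChannels */
    uint32_t sample_rate;     /* nSamplesPerSec */
    uint16_t bits_per_sample; /* wBitsPerSample */
};

static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills *out if a "fmt " chunk is found, -1 otherwise. */
static int wav_read_fmt(const uint8_t *buf, size_t len, struct wav_fmt_info *out)
{
    size_t pos = 12; /* first chunk starts after "RIFF" <size> "WAVE" */

    assert(buf != NULL && out != NULL);
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;

    while (len - pos >= 8) {
        uint32_t size = rd_le32(buf + pos + 4);
        const uint8_t *body = buf + pos + 8;
        size_t avail = len - pos - 8;

        if (memcmp(buf + pos, "fmt ", 4) == 0) {
            if (size < 16 || avail < 16)
                return -1;
            out->format_tag      = rd_le16(body);
            out->channels        = rd_le16(body + 2);
            out->sample_rate     = rd_le32(body + 4);
            out->bits_per_sample = rd_le16(body + 14);
            return 0;
        }
        if (size > avail)
            return -1;
        pos += 8 + size;
        if ((size & 1) && pos < len)
            pos++; /* chunks are padded to an even length */
    }
    return -1;
}
```

A file written by ffmpeg with pcm_s24le should report format_tag == WAVE_FORMAT_EXTENSIBLE, while a "regular" 24-bit file of the kind described in the question would report WAVE_FORMAT_PCM with bits_per_sample == 24.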
Whether the "extensions" defined in WAVEFORMATEXTENSIBLE can be left out when wFormatTag is WAVE_FORMAT_EXTENSIBLE is not exactly clear. In WAVEFORMATEX, it is stated that:
...
wFormatTag
... When this structure is included in a WAVEFORMATEXTENSIBLE structure,
this value must be WAVE_FORMAT_EXTENSIBLE. ...
...
With such a statement, probably no one would / should assume that it is allowed anyway.
If you read WAVEFORMATEX and WAVEFORMATEXTENSIBLE carefully, you'll notice that the real reason the latter exists (in terms of bit depth) is that it lets wBitsPerSample in the former hold a container size that is a multiple of 8, while the "real" sample size is stored in one of the extensions defined in the latter (wValidBitsPerSample). For example, 24 and 20 respectively for some (nasty) 20-bit PCM stream.
For the record though, as far as I can see, the wav muxer of ffmpeg does NOT (at least not properly) support the quirky case just mentioned. (I think both fields would be written with the value 0, if such a stream is not rejected.)
In case you really need ffmpeg to write the header in the format of PCMWAVEFORMAT for some 24-bit audio, you can consider building it with the following patch:
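To make the 24/20 example concrete, here is a rough sketch in C of how the 40-byte WAVEFORMATEXTENSIBLE body of a "fmt " chunk could be laid out for 20-bit samples stored in 24-bit containers. This is not what ffmpeg's muxer does; put_fmt_extensible and the wr_le* helpers are hypothetical names for this illustration, and the field offsets follow the documented structure layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static void wr_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static void wr_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)(v >> 24);
}

/* KSDATAFORMAT_SUBTYPE_PCM, {00000001-0000-0010-8000-00AA00389B71},
 * in the byte order in which it is stored in the file. */
static const uint8_t subtype_pcm[16] = {
    0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00,
    0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71
};

/* Hypothetical helper: writes a 40-byte extensible "fmt " body.
 * Returns 0 on success, -1 if the arguments do not make sense. */
static int put_fmt_extensible(uint8_t out[40], unsigned channels, uint32_t rate,
                              unsigned container_bits, unsigned valid_bits,
                              uint32_t channel_mask)
{
    unsigned block_align;

    assert(out != NULL);
    if (channels == 0 || channels > 0xffff ||
        container_bits == 0 || container_bits > 0xffff ||
        container_bits % 8 != 0 ||
        valid_bits == 0 || valid_bits > container_bits)
        return -1;

    block_align = channels * container_bits / 8;
    if (block_align > 0xffff)
        return -1;

    wr_le16(out + 0,  0xFFFE);                   /* wFormatTag = WAVE_FORMAT_EXTENSIBLE */
    wr_le16(out + 2,  (uint16_t)channels);       /* nChannels */
    wr_le32(out + 4,  rate);                     /* nSamplesPerSec */
    wr_le32(out + 8,  rate * block_align);       /* nAvgBytesPerSec (no overflow check) */
    wr_le16(out + 12, (uint16_t)block_align);    /* nBlockAlign */
    wr_le16(out + 14, (uint16_t)container_bits); /* wBitsPerSample: container size */
    wr_le16(out + 16, 22);                       /* cbSize: bytes of extension that follow */
    wr_le16(out + 18, (uint16_t)valid_bits);     /* wValidBitsPerSample: "real" size */
    wr_le32(out + 20, channel_mask);             /* dwChannelMask */
    memcpy(out + 24, subtype_pcm, sizeof subtype_pcm); /* SubFormat */
    return 0;
}
```

For the stream in the example, put_fmt_extensible(buf, 2, 48000, 24, 20, 0x3) would put 24 in wBitsPerSample and 20 in wValidBitsPerSample, with 0x3 meaning front left plus front right.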
diff --git a/libavformat/riff.h b/libavformat/riff.h
index 85d6786663..5794857f53 100644
--- a/libavformat/riff.h
+++ b/libavformat/riff.h
@@ -57,6 +57,11 @@ void ff_put_bmp_header(AVIOContext *pb, AVCodecParameters *par, int for_asf, int
  */
 #define FF_PUT_WAV_HEADER_SKIP_CHANNELMASK 0x00000002
 
+/**
+ * Tell ff_put_wav_header() not to write WAVEFORMATEXTENSIBLE extensions if possible.
+ */
+#define FF_PUT_WAV_HEADER_FORCE_PCMWAVEFORMAT 0x00000004
+
 /**
  * Write WAVEFORMAT header structure.
  *
diff --git a/libavformat/riffenc.c b/libavformat/riffenc.c
index ffccfa3d48..4dc8ca6e0f 100644
--- a/libavformat/riffenc.c
+++ b/libavformat/riffenc.c
@@ -80,9 +80,9 @@ int ff_put_wav_header(AVFormatContext *s, AVIOContext *pb,
     waveformatextensible = (par->channels > 2 && par->channel_layout) ||
                            par->channels == 1 && par->channel_layout && par->channel_layout != AV_CH_LAYOUT_MONO ||
                            par->channels == 2 && par->channel_layout && par->channel_layout != AV_CH_LAYOUT_STEREO ||
-                           par->sample_rate > 48000 ||
                            par->codec_id == AV_CODEC_ID_EAC3 ||
-                           av_get_bits_per_sample(par->codec_id) > 16;
+                           ((par->sample_rate > 48000 || av_get_bits_per_sample(par->codec_id) > 16) &&
+                            !(flags & FF_PUT_WAV_HEADER_FORCE_PCMWAVEFORMAT));
 
     if (waveformatextensible)
         avio_wl16(pb, 0xfffe);
diff --git a/libavformat/wavenc.c b/libavformat/wavenc.c
index 2317700be1..bd41d6eeb3 100644
--- a/libavformat/wavenc.c
+++ b/libavformat/wavenc.c
@@ -83,6 +83,7 @@ typedef struct WAVMuxContext {
     int peak_block_pos;
     int peak_ppv;
     int peak_bps;
+    int extensible;
 } WAVMuxContext;
 
 #if CONFIG_WAV_MUXER
@@ -324,9 +325,10 @@ static int wav_write_header(AVFormatContext *s)
     }
 
     if (wav->write_peak != PEAK_ONLY) {
+        int flags = !wav->extensible ? FF_PUT_WAV_HEADER_FORCE_PCMWAVEFORMAT : 0;
         /* format header */
         fmt = ff_start_tag(pb, "fmt ");
-        if (ff_put_wav_header(s, pb, s->streams[0]->codecpar, 0) < 0) {
+        if (ff_put_wav_header(s, pb, s->streams[0]->codecpar, flags) < 0) {
             av_log(s, AV_LOG_ERROR, "Codec %s not supported in WAVE format\n",
                    avcodec_get_name(s->streams[0]->codecpar->codec_id));
             return AVERROR(ENOSYS);
@@ -494,6 +496,7 @@ static const AVOption options[] = {
     { "peak_block_size", "Number of audio samples used to generate each peak frame.", OFFSET(peak_block_size), AV_OPT_TYPE_INT, { .i64 = 256 }, 0, 65536, ENC },
     { "peak_format", "The format of the peak envelope data (1: uint8, 2: uint16).", OFFSET(peak_format), AV_OPT_TYPE_INT, { .i64 = PEAK_FORMAT_UINT16 }, PEAK_FORMAT_UINT8, PEAK_FORMAT_UINT16, ENC },
     { "peak_ppv", "Number of peak points per peak value (1 or 2).", OFFSET(peak_ppv), AV_OPT_TYPE_INT, { .i64 = 2 }, 1, 2, ENC },
+    { "extensible", "Write WAVEFORMATEXTENSIBLE extensions.", OFFSET(extensible), AV_OPT_TYPE_BOOL, { .i64 = 1 }, 0, 1, ENC },
     { NULL },
 };
 
Then, by adding -extensible 0 before the output file path / name, you should be able to get what you called a "regular" 24-bit WAVE file.
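Note what the new flag does and does not override. As a simplified standalone restatement of the patched condition (a sketch only: stream_desc and wants_extensible are made-up names, and the three channel-layout tests are folded into one boolean that the caller is assumed to compute):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the fields the patched test looks at. */
struct stream_desc {
    bool layout_needs_ext; /* a channel layout is set and is not plain mono/stereo */
    bool is_eac3;          /* codec is E-AC-3 */
    int  sample_rate;
    int  bits_per_sample;
};

/* Mirrors the shape of the patched expression: layout and E-AC-3 still force
 * the extensible header; only the sample-rate and bit-depth triggers can be
 * suppressed by the force flag. */
static bool wants_extensible(const struct stream_desc *st, bool force_pcmwaveformat)
{
    assert(st != NULL);
    return st->layout_needs_ext ||
           st->is_eac3 ||
           ((st->sample_rate > 48000 || st->bits_per_sample > 16) &&
            !force_pcmwaveformat);
}
```

So with -extensible 0, a 24-bit stereo stream would get the plain header, but a stream with a non-default channel layout would still be written as WAVE_FORMAT_EXTENSIBLE.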