Flac-in-mp4 draft v0.0.2.

Draft updates in response to feedback from David Evans.

Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
Ralph Giles 2016-10-04 08:49:09 -07:00 committed by Erik de Castro Lopo
parent 52df9c6117
commit 2fd3c22fec

@ -1,12 +1,12 @@
Encapsulation of FLAC in ISO Base Media File Format Encapsulation of FLAC in ISO Base Media File Format
Version 0.0.1 (draft) Version 0.0.2 (draft)
Table of Contents Table of Contents
1 Scope 1 Scope
2 Supproting Normative References 2 Supporting Normative References
3 Terms and Definitions 3 Terms and Definitions
4 Design Rules of Encapsulation 4 Design Rules of Encapsulation
4.1 File Type Indentification 4.1 File Type Identification
4.2 Overview of Track Structure 4.2 Overview of Track Structure
4.3 Definition of FLAC sample 4.3 Definition of FLAC sample
4.3.1 Sample entry format 4.3.1 Sample entry format
@ -46,13 +46,13 @@ Table of Contents
https://xiph.org/flac/format.html https://xiph.org/flac/format.html
Definition of the FLAC Audio Codec stream format Definition of the FLAC Audio Codec stream format
[4] FLAC-in-Ogg mapping specification [4] FLAC-in-Ogg mapping specification
https://xiph.org/flac/ogg_mapping.html https://xiph.org/flac/ogg_mapping.html
Ogg Encapsulation for the FLAC Audio Codec Ogg Encapsulation for the FLAC Audio Codec
[5] Matroska specification [5] Matroska specification
@ -75,45 +75,46 @@ Table of Contents
4 Design Rules of Encapsulation 4 Design Rules of Encapsulation
4.1 File Type Indentification 4.1 File Type Identification
This specification does not define any brand to declare files This specification does not define any brand to declare files
are conformant to this specification. Files conformant to which conform to this specification. Files which conform to
this specification shall contain at least one brand which this specification shall contain at least one brand which
supports the requirements and the requirements described in supports the requirements and the requirements described in
this clause without contradiction in the compatible brands this clause without contradiction in the compatible brands
list of the File Type Box. The minimal support of the list of the File Type Box. The minimal support of the
encapsulation of FLAC bitstreams in ISO Base Media file format encapsulation of FLAC bitstreams in ISO Base Media file format
requires the 'isom' brand. requires the 'isom' brand.
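
The brand requirement in 4.1 can be illustrated with a small check. This is a minimal sketch, not part of the specification: it assumes the caller has already located the File Type Box and stripped its 8-byte size/type header, and the function names (compatible_brands, declares_isom) are hypothetical.

```python
import struct


def compatible_brands(ftyp_payload: bytes) -> list:
    """Return the compatible_brands list from a File Type Box payload.

    The payload is assumed to be the box content after the 8-byte
    size/type header: major_brand (4 bytes), minor_version (4 bytes),
    then zero or more 4-byte compatible brands.
    """
    if len(ftyp_payload) < 8 or len(ftyp_payload) % 4 != 0:
        raise ValueError("malformed ftyp payload")
    brands = ftyp_payload[8:]
    return [brands[i:i + 4].decode("ascii") for i in range(0, len(brands), 4)]


def declares_isom(ftyp_payload: bytes) -> bool:
    """True if 'isom' appears in the compatible brands list."""
    return "isom" in compatible_brands(ftyp_payload)


# Illustrative payload: major brand 'isom', minor version 0,
# compatible brands 'isom' and 'mp42' (values chosen for the example).
example_ftyp = b"isom" + struct.pack(">I", 0) + b"isom" + b"mp42"
```

Only the compatible brands list is inspected here, since that is where the draft says the supporting brand shall appear.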
4.2 Overview of Track Structure 4.2 Overview of Track Structure
FLAC coded audio shall be encapsulated into the ISO Base FLAC coded audio shall be encapsulated into the ISO Base
Media File Format as media data within an audio track. Media File Format as media data within an audio track.
+ The handler_type field in the Handler Reference Box + The handler_type field in the Handler Reference Box
shall be set to 'soun'. shall be set to 'soun'.
+ The Media Information Box shall contain the Sound Media + The Media Information Box shall contain the Sound Media
Header Box. Header Box.
+ The codingname of the sample entry is 'fLaC'. + The codingname of the sample entry is 'fLaC'.
This specification does not define any encapsulation This specification does not define any encapsulation
using MP4AudioSampleEntry with objectTypeIndication using MP4AudioSampleEntry with objectTypeIndication
specified by the MPEG-4 Registration Authority specified by the MPEG-4 Registration Authority
(http://www.mp4ra.org/). See section 'Sample entry (http://www.mp4ra.org/). See section 'Sample entry
format' for the definition of the the sample entry. format' for the definition of the sample entry.
+ The 'dfLa' box is added to the sample entry to convey + The 'dfLa' box is added to the sample entry to convey
initializing information for the decoder. initializing information for the decoder.
See section 'FLAC Specific Box' for the definition of See section 'FLAC Specific Box' for the definition of
the box contents. the box contents.
+ A FLAC sample is exactly one FLAC packet. See section + A FLAC sample is exactly one FLAC frame as described
'Sample format' for details of the packet contents. in the format specification[3]. See section
'Sample format' for details of the frame contents.
+ Every FLAC sample is a sync sample. No pre-roll or + Every FLAC sample is a sync sample. No pre-roll or
lapping is required. See section 'Random Access' for lapping is required. See section 'Random Access' for
further details. further details.
@ -122,11 +123,11 @@ Table of Contents
4.3.1 Sample entry format 4.3.1 Sample entry format
For any track containing one or more FLAC bitstreams, a For any track containing one or more FLAC bitstreams, a
sample entry describing the corresponding FLAC bitstream sample entry describing the corresponding FLAC bitstream
shall be present inside the Sample Table Box. This version shall be present inside the Sample Table Box. This version
of the specification defines only one sample entry format of the specification defines only one sample entry format
named FLACSampleEntry whose codingname is 'fLaC'. This named FLACSampleEntry whose codingname is 'fLaC'. This
sample entry includes exactly one FLAC Specific Box sample entry includes exactly one FLAC Specific Box
defined in section 'FLAC specific box' as a mandatory box defined in section 'FLAC specific box' as a mandatory box
and indicates that FLAC samples described by this sample and indicates that FLAC samples described by this sample
@ -139,34 +140,37 @@ Table of Contents
stored in big-endian format, though for purposes of the stored in big-endian format, though for purposes of the
ISO BMFF container, FLAC native metadata and data blocks ISO BMFF container, FLAC native metadata and data blocks
are treated as unstructured octet streams. are treated as unstructured octet streams.
class FLACSampleEntry() extends AudioSampleEntry ('fLaC'){ class FLACSampleEntry() extends AudioSampleEntry ('fLaC'){
FLACSpecificBox(); FLACSpecificBox();
} }
The fields of the AudioSampleEntry portion shall be set as
follows:
+ channelcount: + channelcount:
The channelcount field shall be set equal to the The channelcount field shall be set equal to the
channel count specified by the FLAC bitstream's native channel count specified by the FLAC bitstream's native
METADATA_BLOCK_STREAMINFO header as described in [3]. METADATA_BLOCK_STREAMINFO header as described in [3].
Note that the FLAC FRAME_HEADER structure that begins Note that the FLAC FRAME_HEADER structure that begins
each FLAC sample redundantly encodes channel number; each FLAC sample redundantly encodes channel number;
the number of channels declared in each FRAME_HEADER the number of channels declared in each FRAME_HEADER
MUST match the number of channels declared here and in MUST match the number of channels declared here and in
the METADATA_BLOCK_STREAMINFO header. the METADATA_BLOCK_STREAMINFO header.
+ samplesize: + samplesize:
The samplesize field shall be set equal to the bits The samplesize field shall be set equal to the bits
per sample specified by the FLAC bitstream's native per sample specified by the FLAC bitstream's native
METADATA_BLOCK_STREAMINFO header as described in [3]. METADATA_BLOCK_STREAMINFO header as described in [3].
Note that the FLAC FRAME_HEADER structure that begins Note that the FLAC FRAME_HEADER structure that begins
each FLAC sample redundantly encodes the number of each FLAC sample redundantly encodes the number of
bits per sample; the bits per sample declared in each bits per sample; the bits per sample declared in each
FRAME_HEADER MUST match the samplesize declared here FRAME_HEADER MUST match the samplesize declared here
and the bits per sample field declared in the and the bits per sample field declared in the
METADATA_BLOCK_STREAMINFO header. METADATA_BLOCK_STREAMINFO header.
+ samplerate: + samplerate:
The samplerate field shall be set equal to the sample The samplerate field shall be set equal to the sample
@ -178,14 +182,16 @@ Table of Contents
declared in each FRAME_HEADER MUST match the sample declared in each FRAME_HEADER MUST match the sample
rate declared here and in the rate declared here and in the
METADATA_BLOCK_STREAMINFO header. METADATA_BLOCK_STREAMINFO header.
Finally, the FLACSpecificBox carries codec headers:
+ FLACSpecificBox + FLACSpecificBox
This box contains initializing information for the This box contains initializing information for the
decoder as defined in section 'FLAC specific box' decoder as defined in section 'FLAC specific box'
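
The consistency rules above (channelcount, samplesize and samplerate must agree with METADATA_BLOCK_STREAMINFO) can be sketched as follows. The bit layout is taken from the FLAC format description [3]; the function names and the dictionary keys are hypothetical, and the sketch compares plain integer values, ignoring how samplerate is actually coded inside AudioSampleEntry.

```python
import struct


def parse_streaminfo(data: bytes) -> dict:
    """Decode the fields of a 34-byte METADATA_BLOCK_STREAMINFO body
    that the sample entry has to mirror."""
    if len(data) != 34:
        raise ValueError("STREAMINFO body must be 34 bytes")
    # Bytes 10..17 pack: sample rate (20 bits), channels - 1 (3 bits),
    # bits per sample - 1 (5 bits), total samples (36 bits).
    (packed,) = struct.unpack(">Q", data[10:18])
    return {
        "samplerate": packed >> 44,
        "channelcount": ((packed >> 41) & 0x7) + 1,
        "samplesize": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }


def mismatched_fields(streaminfo: dict, sample_entry: dict) -> list:
    """Return the names of sample entry fields that disagree with STREAMINFO."""
    names = ("channelcount", "samplesize", "samplerate")
    return [n for n in names if streaminfo[n] != sample_entry[n]]


# Illustrative STREAMINFO: 48000 Hz, 2 channels, 16 bits, 34560 samples.
_packed = (48000 << 44) | (1 << 41) | (15 << 36) | 34560
example_streaminfo = (
    struct.pack(">HH", 1920, 1920)   # min/max block size
    + bytes(6)                       # min/max frame size (unknown = 0)
    + struct.pack(">Q", _packed)
    + bytes(16)                      # MD5 signature (zeroed for the example)
)
```

A muxer could run a check like this before writing the FLACSampleEntry; an empty result means the three fields agree.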
4.3.2 FLAC Specific Box 4.3.2 FLAC Specific Box
Exactly one FLAC Specific Box shall be present in each Exactly one FLAC Specific Box shall be present in each
FLACSampleEntry. The FLAC Specific Box contains the FLACSampleEntry. The FLAC Specific Box contains the
Version field and this specification defines version 0 of Version field and this specification defines version 0 of
@ -202,128 +208,120 @@ Table of Contents
as follows. as follows.
aligned(8) class FLACMetadataBlock { aligned(8) class FLACMetadataBlock {
unsigned int(1) LastMetadataBlockFlag; unsigned int(1) LastMetadataBlockFlag;
unsigned int(7) BlockType; unsigned int(7) BlockType;
unsigned int(24) Length; unsigned int(24) Length;
unsigned int(8) MetadataBlockData[BlockLength]; unsigned int(8) BlockData[Length];
} }
aligned(8) class FLACSpecificBox aligned(8) class FLACSpecificBox
(unsigned int32 MetadataBlocks) extends Box('dfLa'){ extends FullBox('dfLa', version=0, 0){
unsigned int(8) Version; for (i=0; ; i++) { // to end of box
for(i=0; i <= MetadataBlocks; i++){ FLACMetadataBlock();
FLACMetadataBlock(); }
}
} }
+ MetadataBlocks:
The number of FLAC[3] native metadata blocks to
follow. This value must be at least 1 as a native
METADATA_BLOCK_STREAMINFO structure is required to
decode FLAC audio data.
This value is not coded as the end of the
FLACMetadataBlock list can be intuited from the
LastMetadataBlockFlag (see below).
+ Version: + Version:
The Version field shall be set to 0. The Version field shall be set to 0.
In the future versions of this specification, this In the future versions of this specification, this
field may be set to other values. And without support field may be set to other values. And without support
of those values, the reader shall not read the fields of those values, the reader shall not read the fields
after this within the FLACSpecificBox. after this within the FLACSpecificBox.
The Version field is followed by a sequence of FLAC[3] + Flags:
native-metadata block structures that fill the remainder
of the box length.
+ LastMetadataBlockFlag: The Flags field shall be set to 0.
The LastMetadataBlockFlag field maps semantically to After the FullBox header, the box contains a sequence of
the FLAC[3] native MEATADATA_BLOCK_HEADER FLAC[3] native-metadata block structures that fill the
Last-metadata-block flag as defined in the FLAC[3] remainder of the box.
file specification.
The LastMetadataBlockFlag is set to 1 if this
MetadataBlock is the last metadata block in the
FLACSpecificBox. It is set to 0 otherwise.
+ BlockType:
The BlockType field maps semantically to the FLAC[3] Each FLACMetadataBlock structure consists of three fields
native MEATADATA_BLOCK_HEADER BLOCK_TYPE field as filling a total of four bytes that form a FLAC[3] native
defined in the FLAC[3] file specification. METADATA_BLOCK_HEADER, followed by raw octet bytes that
comprise the FLAC[3] native METADATA_BLOCK_DATA.
The BlockType is set to a valid FLAC[3] BLOCK_TYPE + LastMetadataBlockFlag:
value that identifies the type of this native metadata
block. The BlockType of the first FLACMetadataBlock
must be set to 0, signifying this is a FLAC[3] native
METADATA_BLOCK_STREAMINFO block.
+ Length:
The Length field maps semantically to the FLAC[3] The LastMetadataBlockFlag field maps semantically to
native MEATADATA_BLOCK_HEADER Length field as the FLAC[3] native METADATA_BLOCK_HEADER
defined in the FLAC[3] file specification. Last-metadata-block flag as defined in the FLAC[3]
file specification.
The LastMetadataBlockFlag is set to 1 if this
MetadataBlock is the last metadata block in the
FLACSpecificBox. It is set to 0 otherwise.
+ BlockType:
The length field specifies the number of bytes of The BlockType field maps semantically to the FLAC[3]
MetadataBlockData to follow. native MEATADATA_BLOCK_HEADER BLOCK_TYPE field as
defined in the FLAC[3] file specification.
+ MetadataBlockData The BlockType is set to a valid FLAC[3] BLOCK_TYPE
value that identifies the type of this native metadata
block. The BlockType of the first FLACMetadataBlock
must be set to 0, signifying this is a FLAC[3] native
METADATA_BLOCK_STREAMINFO block.
+ Length:
The MetadataBlockData field maps semantically to the The Length field maps semantically to the FLAC[3]
FLAC[3] native MEATADATA_BLOCK_HEADER native MEATADATA_BLOCK_HEADER Length field as
METADATA_BLOCKDATA as defined in the FLAC[3] file defined in the FLAC[3] file specification.
specification.
The FLACMetadataBlock structure consists of three fields The length field specifies the number of bytes of
filling a total of four bytes that form a FLAC[3] native MetadataBlockData to follow.
METADATA_BLOCK_HEADER, followed by raw octet bytes that
comprise the FLAC[3] native METADATA_BLOCK_DATA. Taken
together, the bytes of the FLACMetadataBlock form a
complete FLAC[3] native METADATA_BLOCK structure.
Note that a minimum of a single FLACMetadataBlock, + BlockData
consisting of a FLAC[3] native METADATA_BLOCK_STREAMINFO
structure, is required. Should the FLACSpecificBox
contain more than a single FLACMetadataBlock structure,
the FLACMetadataBlock contianing the FLAC[3] native
METADATA_BLOCK_STREAMINFO must occur first in the list.
Other containers that package FLAC audio streams, such as The BlockData field maps semantically to the FLAC[3]
Ogg[4] and Matroska[5], wrap FLAC[3] native metadata without native MEATADATA_BLOCK_HEADER METADATA_BLOCK_DATA as
modification similar to this specification. When defined in the FLAC[3] file specification.
repackaging or remuxing FLAC[3] streams from another
format that contains FLAC[3] native metadata into an ISO Taken together, the bytes of the FLACMetadataBlock form a
BMFF file, the complete FLAC[3] native metadata should be complete FLAC[3] native METADATA_BLOCK structure.
preserved in the ISO BMFF stream as described above. It
is also allowed to parse this native metadata and include Note that a minimum of a single FLACMetadataBlock,
contextually redundant ISO BMFF-native repackagings and/or consisting of a FLAC[3] native METADATA_BLOCK_STREAMINFO
reparsings of FLAC[3] native metadata, so long as the structure, is required. Should the FLACSpecificBox
native metadata is also preserved. contain more than a single FLACMetadataBlock structure,
the FLACMetadataBlock containing the FLAC[3] native
METADATA_BLOCK_STREAMINFO must occur first in the list.
Other containers that package FLAC audio streams, such as
Ogg[4] and Matroska[5], wrap FLAC[3] native metadata without
modification similar to this specification. When
repackaging or remuxing FLAC[3] streams from another
format that contains FLAC[3] native metadata into an ISO
BMFF file, the complete FLAC[3] native metadata should be
preserved in the ISO BMFF stream as described above. It
is also allowed to parse this native metadata and include
contextually redundant ISO BMFF-native repackagings and/or
reparsings of FLAC[3] native metadata, so long as the
native metadata is also preserved.
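
The version 0.0.2 layout of the FLAC Specific Box (a FullBox header followed by native metadata blocks to the end of the box) can be sketched as a round trip. This is an illustration under stated assumptions, not a reference implementation: the payload excludes the 8-byte size/type box header, and the function names are hypothetical.

```python
import struct


def build_dfla_payload(blocks) -> bytes:
    """Serialize (block_type, block_data) pairs into a 'dfLa' payload."""
    out = bytearray(struct.pack(">I", 0))  # version = 0 (8 bits), flags = 0 (24 bits)
    for i, (block_type, data) in enumerate(blocks):
        last = 0x80 if i == len(blocks) - 1 else 0x00
        out.append(last | (block_type & 0x7F))  # LastMetadataBlockFlag + BlockType
        out += len(data).to_bytes(3, "big")     # Length (24 bits)
        out += data                             # BlockData
    return bytes(out)


def parse_dfla_payload(payload: bytes) -> list:
    """Return a list of (last_flag, block_type, block_data) tuples."""
    if len(payload) < 4:
        raise ValueError("payload too short for a FullBox header")
    if payload[0] != 0 or payload[1:4] != b"\x00\x00\x00":
        raise ValueError("unsupported version or non-zero flags")
    blocks, pos = [], 4
    while pos < len(payload):  # blocks run to the end of the box
        if len(payload) - pos < 4:
            raise ValueError("truncated metadata block header")
        last = bool(payload[pos] & 0x80)
        block_type = payload[pos] & 0x7F
        length = int.from_bytes(payload[pos + 1:pos + 4], "big")
        pos += 4
        if pos + length > len(payload):
            raise ValueError("metadata block overruns the box")
        blocks.append((last, block_type, payload[pos:pos + length]))
        pos += length
    if not blocks or blocks[0][1] != 0:
        raise ValueError("first block must be STREAMINFO (BlockType 0)")
    if [b[0] for b in blocks] != [False] * (len(blocks) - 1) + [True]:
        raise ValueError("LastMetadataBlockFlag must be set on the final block only")
    return blocks


# A lone 34-byte STREAMINFO block (zero-filled placeholder data).
example_payload = build_dfla_payload([(0, bytes(34))])
```

With a single 34-byte STREAMINFO block the payload is 4 + 4 + 34 = 42 bytes, so the whole box including its 8-byte header comes to 50 bytes, which agrees with the size shown for 'dfLa' in the encapsulation example.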
4.3.3 Sample format 4.3.3 Sample format
A FLAC sample is exactly one FLAC audio FRAME packet (as A FLAC sample is exactly one FLAC audio FRAME (as defined
defined in the FLAC[3] file specification) belonging to a in the FLAC[3] file specification) belonging to a FLAC
FLAC bitstreams. The FLAC sample data begins with a bitstreams. The FLAC sample data begins with a complete
complete FLAC FRAME_HEADER, followed by one FLAC SUBFRAME FLAC FRAME_HEADER, followed by one FLAC SUBFRAME per
per channel, any necessary bit padding, and ends with the channel, any necessary bit padding, and ends with the
usual FLAC FRAME_FOOTER. usual FLAC FRAME_FOOTER.
Note that the FLAC native FRAME_HEADER structure that Note that the FLAC native FRAME_HEADER structure that
begins each FLAC sample redundantly encodes channel count, begins each FLAC sample redundantly encodes channel count,
sample rate, and sample size. The values of these fields sample rate, and sample size. The values of these fields
must agree both with the values declared in the FLAC must agree both with the values declared in the FLAC
METADATA_BLOCK_STREAMINFO structure as well as the METADATA_BLOCK_STREAMINFO structure as well as the
FLACDSampleEntry box. FLACSampleEntry box.
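
The redundancy described in 4.3.3 can be inspected with a short decoder for the fixed part of the FRAME_HEADER. This is a minimal sketch based on the bit layout in the FLAC format description [3]. The function name and the returned keys are hypothetical, and code values that the sketch does not recognise are simply rejected.

```python
# Sample size codes from the FRAME_HEADER; code 0b000 means
# "take the value from STREAMINFO" and is reported as None.
SAMPLE_SIZE_BITS = {0b001: 8, 0b010: 12, 0b100: 16, 0b101: 20, 0b110: 24}


def frame_header_summary(sample: bytes) -> dict:
    """Decode channel count and sample size from the start of a FLAC sample."""
    if len(sample) < 4:
        raise ValueError("sample too short for a FRAME_HEADER")
    sync = (sample[0] << 6) | (sample[1] >> 2)  # 14-bit sync code
    if sync != 0b11111111111110:
        raise ValueError("missing FLAC frame sync code")
    channel_code = sample[3] >> 4
    size_code = (sample[3] >> 1) & 0x7
    if channel_code < 8:
        channels = channel_code + 1   # independent channels
    elif channel_code <= 10:
        channels = 2                  # left/side, right/side, mid/side stereo
    else:
        raise ValueError("reserved channel assignment")
    if size_code == 0:
        samplesize = None             # defer to STREAMINFO
    elif size_code in SAMPLE_SIZE_BITS:
        samplesize = SAMPLE_SIZE_BITS[size_code]
    else:
        raise ValueError("sample size code not handled by this sketch")
    return {"channelcount": channels, "samplesize": samplesize}


# Illustrative first four bytes of a frame: stereo, 16 bits per sample.
example_header = bytes([0xFF, 0xF8, 0xC9, 0x18])
```

The decoded values can then be compared against the channelcount and samplesize fields of the sample entry. The sample rate code is left out of the sketch because some of its values refer to extra bytes at the end of the header.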
4.3.4 Duration of a FLAC sample 4.3.4 Duration of a FLAC sample
The duration of any given FLAC sample is determined by The duration of any given FLAC sample is determined by
dividing the decoded block size of a FLAC frame, as dividing the decoded block size of a FLAC frame, as
encoded in the FLAC FRAME's FRAME_HEADER structure, by the encoded in the FLAC FRAME's FRAME_HEADER structure, by the
value of the timescale field in the Media Header Box. value of the timescale field in the Media Header Box.
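
The division described above can be worked through with the numbers from the encapsulation example (sample_delta = 1920, samplerate = 48000). The sketch assumes the Media Header Box timescale equals the sample rate, which this excerpt does not state; the function name is hypothetical.

```python
from fractions import Fraction


def sample_duration_seconds(block_size: int, timescale: int) -> Fraction:
    """Duration of one FLAC sample: decoded block size over the timescale."""
    if timescale <= 0:
        raise ValueError("timescale must be positive")
    return Fraction(block_size, timescale)


# Assumption: timescale == 48000, matching the example's sample rate.
example_duration = sample_duration_seconds(1920, 48000)
```

Under that assumption each sample lasts 1920/48000 = 1/25 of a second, and a Fraction keeps the result exact when durations are summed over many samples.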
@ -337,11 +335,11 @@ Table of Contents
specification. specification.
4.3.6 Random Access 4.3.6 Random Access
This subclause describes the nature of the random access of FLAC sample. This subclause describes the nature of the random access of FLAC sample.
4.3.6.1 Random Access Point 4.3.6.1 Random Access Point
All FLAC samples can be independently decoded All FLAC samples can be independently decoded
i.e. every FLAC sample is a sync sample. The Sync i.e. every FLAC sample is a sync sample. The Sync
Sample Box shall not be present as long as there are Sample Box shall not be present as long as there are
@ -352,7 +350,7 @@ Table of Contents
4.4 Basic Structure (informative) 4.4 Basic Structure (informative)
4.4.1 Initial Movie 4.4.1 Initial Movie
This subclause shows a basic structure of the Movie Box as follows: This subclause shows a basic structure of the Movie Box as follows:
+----+----+----+----+----+----+----+----+------------------------------+ +----+----+----+----+----+----+----+----+------------------------------+
@ -420,7 +418,7 @@ Table of Contents
in ISO/IEC 14496-12 [1]. The additional boxes and the in ISO/IEC 14496-12 [1]. The additional boxes and the
additional requirements, restrictions and recommendations additional requirements, restrictions and recommendations
to the other boxes are described in this specification. to the other boxes are described in this specification.
4.5 Example of Encapsulation (informative) 4.5 Example of Encapsulation (informative)
[File] [File]
size = 17790 size = 17790
@ -581,15 +579,16 @@ Table of Contents
samplerate = 48000.000000 samplerate = 48000.000000
[dfLa: FLAC Specific Box] [dfLa: FLAC Specific Box]
position = 484 position = 484
size = 48 size = 50
Version = 0 version = 0
MetadataBlocks = 1 flags = 0x000000
LastMetadataBlockFlag = 1 [FLACMetaDataBlock]
BlockType = 0 LastMetadataBlockFlag = 1
Length = 34 BlockType = 0
MetadataBlockData[34]; Length = 34
BlockData[34];
[stts: Decoding Time to Sample Box] [stts: Decoding Time to Sample Box]
position = 490 position = 492
size = 24 size = 24
version = 0 version = 0
flags = 0x000000 flags = 0x000000
@ -598,7 +597,7 @@ Table of Contents
sample_count = 18 sample_count = 18
sample_delta = 1920 sample_delta = 1920
[stsc: Sample To Chunk Box] [stsc: Sample To Chunk Box]
position = 514 position = 516
size = 40 size = 40
version = 0 version = 0
flags = 0x000000 flags = 0x000000
@ -612,7 +611,7 @@ Table of Contents
samples_per_chunk = 5 samples_per_chunk = 5
sample_description_index = 1 sample_description_index = 1
[stsz: Sample Size Box] [stsz: Sample Size Box]
position = 554 position = 556
size = 92 size = 92
version = 0 version = 0
flags = 0x000000 flags = 0x000000
@ -637,7 +636,7 @@ Table of Contents
entry_size[16] = 962 entry_size[16] = 962
entry_size[17] = 848 entry_size[17] = 848
[stco: Chunk Offset Box] [stco: Chunk Offset Box]
position = 646 position = 648
size = 24 size = 24
version = 0 version = 0
flags = 0x000000 flags = 0x000000
@ -645,10 +644,10 @@ Table of Contents
chunk_offset[0] = 686 chunk_offset[0] = 686
chunk_offset[1] = 12985 chunk_offset[1] = 12985
[free: Free Space Box] [free: Free Space Box]
position = 670 position = 672
size = 8 size = 8
[mdat: Media Data Box] [mdat: Media Data Box]
position = 678 position = 680
size = 17001 size = 17001
5 Acknowledgements 5 Acknowledgements