Flac-in-mp4 draft v0.0.2.
Draft updates in response to feedback from David Evans. Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
parent
52df9c6117
commit
2fd3c22fec
307
doc/isoflac.txt
@ -1,12 +1,12 @@
|
||||
Encapsulation of FLAC in ISO Base Media File Format
|
||||
Version 0.0.1 (draft)
|
||||
Version 0.0.2 (draft)
|
||||
|
||||
Table of Contents
|
||||
1 Scope
|
||||
2 Supproting Normative References
|
||||
2 Supporting Normative References
|
||||
3 Terms and Definitions
|
||||
4 Design Rules of Encapsulation
|
||||
4.1 File Type Indentification
|
||||
4.1 File Type Identification
|
||||
4.2 Overview of Track Structure
|
||||
4.3 Definition of FLAC sample
|
||||
4.3.1 Sample entry format
|
||||
@ -46,13 +46,13 @@ Table of Contents
|
||||
|
||||
https://xiph.org/flac/format.html
|
||||
|
||||
Definition of the FLAC Audio Codec stream format
|
||||
Definition of the FLAC Audio Codec stream format
|
||||
|
||||
[4] FLAC-in-Ogg mapping specification
|
||||
|
||||
https://xiph.org/flac/ogg_mapping.html
|
||||
https://xiph.org/flac/ogg_mapping.html
|
||||
|
||||
Ogg Encapsulation for the FLAC Audio Codec
|
||||
Ogg Encapsulation for the FLAC Audio Codec
|
||||
|
||||
[5] Matroska specification
|
||||
|
||||
@ -75,45 +75,46 @@ Table of Contents
|
||||
|
||||
4 Design Rules of Encapsulation
|
||||
|
||||
4.1 File Type Indentification
|
||||
4.1 File Type Identification
|
||||
|
||||
This specification does not define any brand to declare files
|
||||
are conformant to this specification. Files conformant to
|
||||
which conform to this specification. Files which conform to
|
||||
this specification shall contain at least one brand which
|
||||
supports the requirements and the requirements described in
|
||||
this clause without contradiction in the compatible brands
|
||||
list of the File Type Box. The minimal support of the
|
||||
encapsulation of FLAC bitstreams in ISO Base Media file format
|
||||
requires the 'isom' brand.
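The brand requirement above can be checked by reading the compatible brands list of the File Type Box. The following Python sketch is illustrative only: the function name compatible_brands is hypothetical, it assumes the box uses a 32-bit size field, and it follows the File Type Box layout of ISO/IEC 14496-12 [1] (size, type, major_brand, minor_version, then four-character brands to the end of the box).

```python
import struct


def compatible_brands(ftyp_box: bytes) -> list:
    """Return the compatible brands listed in a complete 'ftyp' box."""
    # size (32 bits), type, major_brand, minor_version (32 bits)
    size, box_type, _major, _minor = struct.unpack_from(">I4s4sI", ftyp_box, 0)
    if box_type != b"ftyp":
        raise ValueError("not a File Type Box")
    if size < 16 or size > len(ftyp_box) or (size - 16) % 4 != 0:
        raise ValueError("malformed File Type Box size")
    # Each compatible brand is a four-character code.
    return [ftyp_box[i:i + 4] for i in range(16, size, 4)]


# Hypothetical 24-byte box with major brand 'isom' and two compatible brands.
example_ftyp = struct.pack(">I4s4sI", 24, b"ftyp", b"isom", 0) + b"isommp42"
brands = compatible_brands(example_ftyp)
has_isom = b"isom" in brands
```

A reader would typically reject or ignore the FLAC track only if none of the listed brands is one it supports; that policy is outside the scope of this sketch.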
|
||||
|
||||
|
||||
4.2 Overview of Track Structure
|
||||
|
||||
FLAC coded audio shall be encapsulated into the ISO Base
|
||||
Media File Format as media data within an audio track.
|
||||
|
||||
|
||||
+ The handler_type field in the Handler Reference Box
|
||||
shall be set to 'soun'.
|
||||
|
||||
|
||||
+ The Media Information Box shall contain the Sound Media
|
||||
Header Box.
|
||||
|
||||
|
||||
+ The codingname of the sample entry is 'fLaC'.
|
||||
|
||||
|
||||
This specification does not define any encapsulation
|
||||
using MP4AudioSampleEntry with objectTypeIndication
|
||||
specified by the MPEG-4 Registration Authority
|
||||
(http://www.mp4ra.org/). See section 'Sample entry
|
||||
format' for the definition of the the sample entry.
|
||||
|
||||
format' for the definition of the sample entry.
|
||||
|
||||
+ The 'dfLa' box is added to the sample entry to convey
|
||||
initializing information for the decoder.
|
||||
|
||||
See section 'FLAC Specific Box' for the definition of
|
||||
See section 'FLAC Specific Box' for the definition of
|
||||
the box contents.
|
||||
|
||||
+ A FLAC sample is exactly one FLAC packet. See section
|
||||
'Sample format' for details of the packet contents.
|
||||
|
||||
|
||||
+ A FLAC sample is exactly one FLAC frame as described
|
||||
in the format specification[3]. See section
|
||||
'Sample format' for details of the frame contents.
|
||||
|
||||
+ Every FLAC sample is a sync sample. No pre-roll or
|
||||
lapping is required. See section 'Random Access' for
|
||||
further details.
|
||||
@ -122,11 +123,11 @@ Table of Contents
|
||||
|
||||
4.3.1 Sample entry format
|
||||
|
||||
For any track containing one or more FLAC bitstreams, a
|
||||
For any track containing one or more FLAC bitstreams, a
|
||||
sample entry describing the corresponding FLAC bitstream
|
||||
shall be present inside the Sample Table Box. This version
|
||||
of the specification defines only one sample entry format
|
||||
named FLACSampleEntry whose codingname is 'fLaC'. This
|
||||
named FLACSampleEntry whose codingname is 'fLaC'. This
|
||||
sample entry includes exactly one FLAC Specific Box
|
||||
defined in section 'FLAC specific box' as a mandatory box
|
||||
and indicates that FLAC samples described by this sample
|
||||
@ -139,34 +140,37 @@ Table of Contents
|
||||
stored in big-endian format, though for purposes of the
|
||||
ISO BMFF container, FLAC native metadata and data blocks
|
||||
are treated as unstructured octet streams.
|
||||
|
||||
|
||||
class FLACSampleEntry() extends AudioSampleEntry ('fLaC'){
|
||||
FLACSpecificBox();
|
||||
}
|
||||
|
||||
The fields of the AudioSampleEntry portion shall be set as
|
||||
follows:
|
||||
|
||||
+ channelcount:
|
||||
|
||||
The channelcount field shall be set equal to the
|
||||
channel count specified by the FLAC bitstream's native
|
||||
METADATA_BLOCK_STREAMINFO header as described in [3].
|
||||
Note that the FLAC FRAME_HEADER structure that begins
|
||||
each FLAC sample redundantly encodes channel number;
|
||||
the number of channels declared in each FRAME_HEADER
|
||||
MUST match the number of channels declared here and in
|
||||
the METADATA_BLOCK_STREAMINFO header.
|
||||
The channelcount field shall be set equal to the
|
||||
channel count specified by the FLAC bitstream's native
|
||||
METADATA_BLOCK_STREAMINFO header as described in [3].
|
||||
Note that the FLAC FRAME_HEADER structure that begins
|
||||
each FLAC sample redundantly encodes channel number;
|
||||
the number of channels declared in each FRAME_HEADER
|
||||
MUST match the number of channels declared here and in
|
||||
the METADATA_BLOCK_STREAMINFO header.
|
||||
|
||||
+ samplesize:
|
||||
|
||||
The samplesize field shall be set equal to the bits
|
||||
per sample specified by the FLAC bitstream's native
|
||||
METADATA_BLOCK_STREAMINFO header as described in [3].
|
||||
Note that the FLAC FRAME_HEADER structure that begins
|
||||
each FLAC sample redundantly encodes the number of
|
||||
bits per sample; the bits per sample declared in each
|
||||
FRAME_HEADER MUST match the samplesize declared here
|
||||
and the bits per sample field declared in the
|
||||
METADATA_BLOCK_STREAMINFO header.
|
||||
|
||||
The samplesize field shall be set equal to the bits
|
||||
per sample specified by the FLAC bitstream's native
|
||||
METADATA_BLOCK_STREAMINFO header as described in [3].
|
||||
Note that the FLAC FRAME_HEADER structure that begins
|
||||
each FLAC sample redundantly encodes the number of
|
||||
bits per sample; the bits per sample declared in each
|
||||
FRAME_HEADER MUST match the samplesize declared here
|
||||
and the bits per sample field declared in the
|
||||
METADATA_BLOCK_STREAMINFO header.
|
||||
|
||||
+ samplerate:
|
||||
|
||||
The samplerate field shall be set equal to the sample
|
||||
@ -178,14 +182,16 @@ Table of Contents
|
||||
declared in each FRAME_HEADER MUST match the sample
|
||||
rate declared here and in the
|
||||
METADATA_BLOCK_STREAMINFO header.
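The channelcount, samplesize and samplerate values described above all mirror fields packed into the 34-byte STREAMINFO block data. A minimal Python sketch of extracting them is given here; the function name parse_streaminfo and the dictionary keys are assumptions made for illustration, and the bit layout is the one given in the FLAC format definition [3].

```python
def parse_streaminfo(data: bytes) -> dict:
    """Extract the STREAMINFO fields that the sample entry must mirror."""
    if len(data) != 34:
        raise ValueError("STREAMINFO block data must be exactly 34 bytes")
    # Bytes 10..17 hold, big-endian: sample rate (20 bits),
    # channels - 1 (3 bits), bits per sample - 1 (5 bits),
    # total samples (36 bits).
    packed = int.from_bytes(data[10:18], "big")
    return {
        "samplerate": packed >> 44,
        "channelcount": ((packed >> 41) & 0x7) + 1,
        "samplesize": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }


# Hypothetical stream: 48 kHz, 2 channels, 16 bits, 34560 samples.
packed = (48000 << 44) | ((2 - 1) << 41) | ((16 - 1) << 36) | 34560
example_streaminfo = (
    (4096).to_bytes(2, "big") * 2   # minimum and maximum block size
    + bytes(6)                      # minimum and maximum frame size (0 = unknown)
    + packed.to_bytes(8, "big")
    + bytes(16)                     # MD5 signature (unset in this example)
)
info = parse_streaminfo(example_streaminfo)
```

A writer could compare the returned values against the AudioSampleEntry fields before emitting the sample entry.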
|
||||
|
||||
|
||||
Finally, the FLACSpecificBox carries codec headers:
|
||||
|
||||
+ FLACSpecificBox
|
||||
|
||||
|
||||
This box contains initializing information for the
|
||||
decoder as defined in section 'FLAC specific box'
|
||||
|
||||
4.3.2 FLAC Specific Box
|
||||
|
||||
|
||||
Exactly one FLAC Specific Box shall be present in each
|
||||
FLACSampleEntry. The FLAC Specific Box contains the
|
||||
Version field and this specification defines version 0 of
|
||||
@ -202,128 +208,120 @@ Table of Contents
|
||||
as follows.
|
||||
|
||||
aligned(8) class FLACMetadataBlock {
|
||||
unsigned int(1) LastMetadataBlockFlag;
|
||||
unsigned int(7) BlockType;
|
||||
unsigned int(24) Length;
|
||||
unsigned int(8) MetadataBlockData[BlockLength];
|
||||
unsigned int(1) LastMetadataBlockFlag;
|
||||
unsigned int(7) BlockType;
|
||||
unsigned int(24) Length;
|
||||
unsigned int(8) BlockData[Length];
|
||||
}
|
||||
|
||||
aligned(8) class FLACSpecificBox
|
||||
(unsigned int32 MetadataBlocks) extends Box('dfLa'){
|
||||
unsigned int(8) Version;
|
||||
for(i=0; i <= MetadataBlocks; i++){
|
||||
FLACMetadataBlock();
|
||||
}
|
||||
aligned(8) class FLACSpecificBox
|
||||
extends FullBox('dfLa', version=0, 0){
|
||||
for (i=0; ; i++) { // to end of box
|
||||
FLACMetadataBlock();
|
||||
}
|
||||
}
|
||||
|
||||
+ MetadataBlocks:
|
||||
|
||||
The number of FLAC[3] native metadata blocks to
|
||||
follow. This value must be at least 1 as a native
|
||||
METADATA_BLOCK_STREAMINFO structure is required to
|
||||
decode FLAC audio data.
|
||||
|
||||
This value is not coded as the end of the
|
||||
FLACMetadataBlock list can be intuited from the
|
||||
LastMetadataBlockFlag (see below).
|
||||
|
||||
+ Version:
|
||||
|
||||
The Version field shall be set to 0.
|
||||
The Version field shall be set to 0.
|
||||
|
||||
In the future versions of this specification, this
|
||||
In the future versions of this specification, this
|
||||
field may be set to other values. And without support
|
||||
of those values, the reader shall not read the fields
|
||||
after this within the FLACSpecificBox.
|
||||
|
||||
The Version field is followed by a sequence of FLAC[3]
|
||||
native-metadata block structures that fill the remainder
|
||||
of the box length.
|
||||
+ Flags:
|
||||
|
||||
+ LastMetadataBlockFlag:
|
||||
The Flags field shall be set to 0.
|
||||
|
||||
The LastMetadataBlockFlag field maps semantically to
|
||||
the FLAC[3] native MEATADATA_BLOCK_HEADER
|
||||
Last-metadata-block flag as defined in the FLAC[3]
|
||||
file specification.
|
||||
|
||||
The LastMetadataBlockFlag is set to 1 if this
|
||||
MetadataBlock is the last metadata block in the
|
||||
FLACSpecificBox. It is set to 0 otherwise.
|
||||
|
||||
+ BlockType:
|
||||
After the FullBox header, the box contains a sequence of
|
||||
FLAC[3] native-metadata block structures that fill the
|
||||
remainder of the box.
|
||||
|
||||
The BlockType field maps semantically to the FLAC[3]
|
||||
native MEATADATA_BLOCK_HEADER BLOCK_TYPE field as
|
||||
defined in the FLAC[3] file specification.
|
||||
Each FLACMetadataBlock structure consists of three fields
|
||||
filling a total of four bytes that form a FLAC[3] native
|
||||
METADATA_BLOCK_HEADER, followed by raw octet bytes that
|
||||
comprise the FLAC[3] native METADATA_BLOCK_DATA.
|
||||
|
||||
The BlockType is set to a valid FLAC[3] BLOCK_TYPE
|
||||
value that identifies the type of this native metadata
|
||||
block. The BlockType of the first FLACMetadataBlock
|
||||
must be set to 0, signifying this is a FLAC[3] native
|
||||
METADATA_BLOCK_STREAMINFO block.
|
||||
|
||||
+ Length:
|
||||
+ LastMetadataBlockFlag:
|
||||
|
||||
The Length field maps semantically to the FLAC[3]
|
||||
native MEATADATA_BLOCK_HEADER Length field as
|
||||
defined in the FLAC[3] file specification.
|
||||
The LastMetadataBlockFlag field maps semantically to
|
||||
the FLAC[3] native METADATA_BLOCK_HEADER
|
||||
Last-metadata-block flag as defined in the FLAC[3]
|
||||
file specification.
|
||||
|
||||
The LastMetadataBlockFlag is set to 1 if this
|
||||
MetadataBlock is the last metadata block in the
|
||||
FLACSpecificBox. It is set to 0 otherwise.
|
||||
|
||||
+ BlockType:
|
||||
|
||||
The length field specifies the number of bytes of
|
||||
MetadataBlockData to follow.
|
||||
The BlockType field maps semantically to the FLAC[3]
|
||||
native METADATA_BLOCK_HEADER BLOCK_TYPE field as
|
||||
defined in the FLAC[3] file specification.
|
||||
|
||||
+ MetadataBlockData
|
||||
The BlockType is set to a valid FLAC[3] BLOCK_TYPE
|
||||
value that identifies the type of this native metadata
|
||||
block. The BlockType of the first FLACMetadataBlock
|
||||
must be set to 0, signifying this is a FLAC[3] native
|
||||
METADATA_BLOCK_STREAMINFO block.
|
||||
|
||||
+ Length:
|
||||
|
||||
The MetadataBlockData field maps semantically to the
|
||||
FLAC[3] native MEATADATA_BLOCK_HEADER
|
||||
METADATA_BLOCKDATA as defined in the FLAC[3] file
|
||||
specification.
|
||||
The Length field maps semantically to the FLAC[3]
|
||||
native METADATA_BLOCK_HEADER Length field as
|
||||
defined in the FLAC[3] file specification.
|
||||
|
||||
The FLACMetadataBlock structure consists of three fields
|
||||
filling a total of four bytes that form a FLAC[3] native
|
||||
METADATA_BLOCK_HEADER, followed by raw octet bytes that
|
||||
comprise the FLAC[3] native METADATA_BLOCK_DATA. Taken
|
||||
together, the bytes of the FLACMetadataBlock form a
|
||||
complete FLAC[3] native METADATA_BLOCK structure.
|
||||
The length field specifies the number of bytes of
|
||||
MetadataBlockData to follow.
|
||||
|
||||
Note that a minimum of a single FLACMetadataBlock,
|
||||
consisting of a FLAC[3] native METADATA_BLOCK_STREAMINFO
|
||||
structure, is required. Should the FLACSpecificBox
|
||||
contain more than a single FLACMetadataBlock structure,
|
||||
the FLACMetadataBlock contianing the FLAC[3] native
|
||||
METADATA_BLOCK_STREAMINFO must occur first in the list.
|
||||
+ BlockData
|
||||
|
||||
Other containers that package FLAC audio streams, such as
|
||||
Ogg[4] and Matroska[5], wrap FLAC[3] native metadata without
|
||||
modification similar to this specification. When
|
||||
repackaging or remuxing FLAC[3] streams from another
|
||||
format that contains FLAC[3] native metadata into an ISO
|
||||
BMFF file, the complete FLAC[3] native metadata should be
|
||||
preserved in the ISO BMFF stream as described above. It
|
||||
is also allowed to parse this native metadata and include
|
||||
contextually redundant ISO BMFF-native repackagings and/or
|
||||
reparsings of FLAC[3] native metadata, so long as the
|
||||
native metadata is also preserved.
|
||||
The BlockData field maps semantically to the FLAC[3]
|
||||
native METADATA_BLOCK_HEADER METADATA_BLOCK_DATA as
|
||||
defined in the FLAC[3] file specification.
|
||||
|
||||
Taken together, the bytes of the FLACMetadataBlock form a
|
||||
complete FLAC[3] native METADATA_BLOCK structure.
|
||||
|
||||
Note that a minimum of a single FLACMetadataBlock,
|
||||
consisting of a FLAC[3] native METADATA_BLOCK_STREAMINFO
|
||||
structure, is required. Should the FLACSpecificBox
|
||||
contain more than a single FLACMetadataBlock structure,
|
||||
the FLACMetadataBlock containing the FLAC[3] native
|
||||
METADATA_BLOCK_STREAMINFO must occur first in the list.
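The FullBox form of the FLACSpecificBox and the constraints on its metadata blocks can be sketched as a small parser. This is an illustration under stated assumptions, not a reference implementation: the function name parse_dfla_payload is hypothetical, and the input is assumed to be the box body, that is, the bytes that follow the 8-byte size and type header.

```python
def parse_dfla_payload(payload: bytes):
    """Parse the body of a 'dfLa' box into (version, flags, blocks)."""
    if len(payload) < 4:
        raise ValueError("payload too short for FullBox version and flags")
    version = payload[0]
    flags = int.from_bytes(payload[1:4], "big")
    if version != 0:
        raise ValueError("unsupported FLACSpecificBox version")
    blocks = []
    pos = 4
    while pos < len(payload):  # blocks run to the end of the box
        if len(payload) - pos < 4:
            raise ValueError("truncated metadata block header")
        last_flag = payload[pos] >> 7        # LastMetadataBlockFlag (1 bit)
        block_type = payload[pos] & 0x7F     # BlockType (7 bits)
        length = int.from_bytes(payload[pos + 1:pos + 4], "big")
        data = payload[pos + 4:pos + 4 + length]
        if len(data) != length:
            raise ValueError("truncated metadata block data")
        blocks.append((last_flag, block_type, data))
        pos += 4 + length
    if not blocks or blocks[0][1] != 0:
        raise ValueError("first block must be STREAMINFO (BlockType 0)")
    if [b[0] for b in blocks] != [0] * (len(blocks) - 1) + [1]:
        raise ValueError("only the final block may set LastMetadataBlockFlag")
    return version, flags, blocks


# Body with a single STREAMINFO block of 34 zero bytes (placeholder data).
example_payload = bytes(4) + bytes([0x80, 0x00, 0x00, 34]) + bytes(34)
version, flags, blocks = parse_dfla_payload(example_payload)
```

Each returned block holds the raw BlockData bytes unmodified, which matches the requirement that native metadata is carried as an unstructured octet stream.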
|
||||
|
||||
Other containers that package FLAC audio streams, such as
|
||||
Ogg[4] and Matroska[5], wrap FLAC[3] native metadata without
|
||||
modification similar to this specification. When
|
||||
repackaging or remuxing FLAC[3] streams from another
|
||||
format that contains FLAC[3] native metadata into an ISO
|
||||
BMFF file, the complete FLAC[3] native metadata should be
|
||||
preserved in the ISO BMFF stream as described above. It
|
||||
is also allowed to parse this native metadata and include
|
||||
contextually redundant ISO BMFF-native repackagings and/or
|
||||
reparsings of FLAC[3] native metadata, so long as the
|
||||
native metadata is also preserved.
|
||||
|
||||
4.3.3 Sample format
|
||||
|
||||
A FLAC sample is exactly one FLAC audio FRAME packet (as
|
||||
defined in the FLAC[3] file specification) belonging to a
|
||||
FLAC bitstreams. The FLAC sample data begins with a
|
||||
complete FLAC FRAME_HEADER, followed by one FLAC SUBFRAME
|
||||
per channel, any necessary bit padding, and ends with the
|
||||
|
||||
A FLAC sample is exactly one FLAC audio FRAME (as defined
|
||||
in the FLAC[3] file specification) belonging to a FLAC
|
||||
bitstream. The FLAC sample data begins with a complete
|
||||
FLAC FRAME_HEADER, followed by one FLAC SUBFRAME per
|
||||
channel, any necessary bit padding, and ends with the
|
||||
usual FLAC FRAME_FOOTER.
|
||||
|
||||
Note that the FLAC native FRAME_HEADER structure that
|
||||
begins each FLAC sample redundantly encodes channel count,
|
||||
sample rate, and sample size. The values of these fields
|
||||
must agree both with the values declared in the FLAC
|
||||
METADATA_BLOCK_STREAMINFO structure as well as the
|
||||
FLACDSampleEntry box.
|
||||
Note that the FLAC native FRAME_HEADER structure that
|
||||
begins each FLAC sample redundantly encodes channel count,
|
||||
sample rate, and sample size. The values of these fields
|
||||
must agree both with the values declared in the FLAC
|
||||
METADATA_BLOCK_STREAMINFO structure as well as the
|
||||
FLACSampleEntry box.
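The redundancy noted above can be verified by decoding the first four bytes of each FLAC sample. The Python sketch below is an assumption-laden illustration: the function name frame_header_fields is hypothetical, the sync code, channel assignment and sample size codes are those of the FRAME_HEADER described in [3], and any sample size code not in the table is simply treated as unsupported.

```python
# Sample size codes mapped to bits per sample; code 0 means
# "take the value from STREAMINFO".
SAMPLE_SIZE_BITS = {0b001: 8, 0b010: 12, 0b100: 16, 0b101: 20, 0b110: 24}


def frame_header_fields(sample: bytes):
    """Return (channel count, bits per sample or None) for one FLAC sample."""
    if len(sample) < 4:
        raise ValueError("sample too short to hold a FRAME_HEADER")
    # The 14-bit sync code is 0b11111111111110.
    if ((sample[0] << 6) | (sample[1] >> 2)) != 0x3FFE:
        raise ValueError("missing FLAC frame sync code")
    assignment = sample[3] >> 4
    if assignment <= 0b0111:
        channels = assignment + 1   # independent channels
    elif assignment <= 0b1010:
        channels = 2                # left/side, right/side or mid/side stereo
    else:
        raise ValueError("reserved channel assignment")
    size_code = (sample[3] >> 1) & 0x7
    if size_code == 0:
        return channels, None       # caller falls back to STREAMINFO
    if size_code not in SAMPLE_SIZE_BITS:
        raise ValueError("unsupported sample size code in this sketch")
    return channels, SAMPLE_SIZE_BITS[size_code]


# First four bytes of a hypothetical stereo, 16-bit frame.
fields = frame_header_fields(bytes([0xFF, 0xF8, 0xC9, 0x18]))
```

A validator would compare the result with the channelcount and samplesize fields of the sample entry and report any mismatch.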
|
||||
|
||||
4.3.4 Duration of a FLAC sample
|
||||
|
||||
The duration of any given FLAC sample is determined by
|
||||
The duration of any given FLAC sample is determined by
|
||||
dividing the decoded block size of a FLAC frame, as
|
||||
encoded in the FLAC FRAME's FRAME_HEADER structure, by the
|
||||
value of the timescale field in the Media Header Box.
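As a worked illustration of the division described above, the Python sketch below uses the sample_delta of 1920 and the sample count of 18 from the informative example in this document. The timescale of 48000 is an assumption (taken to equal the sample rate shown in that example), and the function name is hypothetical.

```python
from fractions import Fraction


def sample_duration_seconds(block_size: int, timescale: int) -> Fraction:
    """Duration of one FLAC sample: decoded block size divided by timescale."""
    if timescale <= 0:
        raise ValueError("timescale must be positive")
    return Fraction(block_size, timescale)


# 1920 decoded audio samples per frame at an assumed timescale of 48000.
one_sample = sample_duration_seconds(1920, 48000)
# Eighteen such FLAC samples, as in the stts entry of the example.
whole_track = 18 * one_sample
```

Exact fractions are used so that no rounding error accumulates when durations are summed over a track.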
|
||||
@ -337,11 +335,11 @@ Table of Contents
|
||||
specification.
|
||||
|
||||
4.3.6 Random Access
|
||||
|
||||
|
||||
This subclause describes the nature of the random access of FLAC sample.
|
||||
|
||||
4.3.6.1 Random Access Point
|
||||
|
||||
|
||||
All FLAC samples can be independently decoded
|
||||
i.e. every FLAC sample is a sync sample. The Sync
|
||||
Sample Box shall not be present as long as there are
|
||||
@ -352,7 +350,7 @@ Table of Contents
|
||||
4.4 Basic Structure (informative)
|
||||
|
||||
4.4.1 Initial Movie
|
||||
|
||||
|
||||
This subclause shows a basic structure of the Movie Box as follows:
|
||||
|
||||
+----+----+----+----+----+----+----+----+------------------------------+
|
||||
@ -420,7 +418,7 @@ Table of Contents
|
||||
in ISO/IEC 14496-12 [1]. The additional boxes and the
|
||||
additional requirements, restrictions and recommendations
|
||||
to the other boxes are described in this specification.
|
||||
|
||||
|
||||
4.5 Example of Encapsulation (informative)
|
||||
[File]
|
||||
size = 17790
|
||||
@ -581,15 +579,16 @@ Table of Contents
|
||||
samplerate = 48000.000000
|
||||
[dfLa: FLAC Specific Box]
|
||||
position = 484
|
||||
size = 48
|
||||
Version = 0
|
||||
MetadataBlocks = 1
|
||||
LastMetadataBlockFlag = 1
|
||||
BlockType = 0
|
||||
Length = 34
|
||||
MetadataBlockData[34];
|
||||
size = 50
|
||||
version = 0
|
||||
flags = 0x000000
|
||||
[FLACMetaDataBlock]
|
||||
LastMetadataBlockFlag = 1
|
||||
BlockType = 0
|
||||
Length = 34
|
||||
BlockData[34];
|
||||
[stts: Decoding Time to Sample Box]
|
||||
position = 490
|
||||
position = 492
|
||||
size = 24
|
||||
version = 0
|
||||
flags = 0x000000
|
||||
@ -598,7 +597,7 @@ Table of Contents
|
||||
sample_count = 18
|
||||
sample_delta = 1920
|
||||
[stsc: Sample To Chunk Box]
|
||||
position = 514
|
||||
position = 516
|
||||
size = 40
|
||||
version = 0
|
||||
flags = 0x000000
|
||||
@ -612,7 +611,7 @@ Table of Contents
|
||||
samples_per_chunk = 5
|
||||
sample_description_index = 1
|
||||
[stsz: Sample Size Box]
|
||||
position = 554
|
||||
position = 556
|
||||
size = 92
|
||||
version = 0
|
||||
flags = 0x000000
|
||||
@ -637,7 +636,7 @@ Table of Contents
|
||||
entry_size[16] = 962
|
||||
entry_size[17] = 848
|
||||
[stco: Chunk Offset Box]
|
||||
position = 646
|
||||
position = 648
|
||||
size = 24
|
||||
version = 0
|
||||
flags = 0x000000
|
||||
@ -645,10 +644,10 @@ Table of Contents
|
||||
chunk_offset[0] = 686
|
||||
chunk_offset[1] = 12985
|
||||
[free: Free Space Box]
|
||||
position = 670
|
||||
position = 672
|
||||
size = 8
|
||||
[mdat: Media Data Box]
|
||||
position = 678
|
||||
position = 680
|
||||
size = 17001
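The size of 50 shown for the 'dfLa' box in the listing above follows from the FullBox layout. The arithmetic can be sketched in Python as follows; the constant and function names are hypothetical and chosen only for this illustration.

```python
BOX_HEADER = 8      # 32-bit size plus four-character type
FULLBOX_EXTRA = 4   # 8-bit version plus 24-bit flags
BLOCK_HEADER = 4    # 1-bit flag, 7-bit type, 24-bit length


def dfla_box_size(block_lengths) -> int:
    """Total size in bytes of a 'dfLa' box holding the given metadata blocks."""
    return BOX_HEADER + FULLBOX_EXTRA + sum(
        BLOCK_HEADER + length for length in block_lengths
    )


# A single STREAMINFO block carries 34 bytes of block data.
size = dfla_box_size([34])
```

The result is two bytes larger than the previous size of 48, which agrees with the two-byte shift in the positions of the boxes that follow (for example 490 to 492 and 514 to 516).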
|
||||
|
||||
5 Acknowledgements
|
||||
|