MP4 File Format Part 2
ISO/IEC 14496-12 defines the ISO base media file format, the box structure that MP4 files are built on.
ISO/IEC 14496-15 (Information technology - Coding of audio-visual objects - Part 15: Advanced Video Coding (AVC) file format) extends Part 12 with atom/box types specific to AVC (H.264).
Because the mdat box holds the actual frame data, I have to cover the AVC sample structure first.
ISO/IEC 14496-15 defines an AVC sample as externally framed: its size is supplied by that external framing (the file format) rather than by start codes in the stream. Inside a sample, an AVC access unit is stored as a set of NAL units, where each NAL unit has
- a length field, usually 4 bytes, giving the size of the NAL unit that follows
- followed by the NAL unit itself
With that in mind, in the screenshot the red box marks the length field (4 bytes) and the blue box marks the start of the NAL unit, in this case an H.264 non-IDR slice. See 5.2.3 of that document.
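The length-prefixed layout described above can be walked with a few lines of Python. This is a minimal sketch, not code from the standard: the function names split_avc_sample and nal_unit_type are made up for illustration, the length field is assumed to be big-endian and 4 bytes wide unless told otherwise, and the sample bytes in the usage lines are invented.

```python
def split_avc_sample(sample: bytes, length_size: int = 4) -> list:
    """Split one length-prefixed AVC sample into its NAL units.

    length_size is the width of each length field in bytes
    (assumed here; the real value comes from lengthSizeMinusOne + 1).
    """
    if length_size not in (1, 2, 4):
        raise ValueError("length_size must be 1, 2 or 4")
    nal_units = []
    pos = 0
    while pos < len(sample):
        if pos + length_size > len(sample):
            raise ValueError("truncated length field")
        # The length field is a big-endian unsigned integer.
        nal_len = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if pos + nal_len > len(sample):
            raise ValueError("NAL unit runs past the end of the sample")
        nal_units.append(sample[pos:pos + nal_len])
        pos += nal_len
    return nal_units


def nal_unit_type(nal: bytes) -> int:
    """Return nal_unit_type, the low 5 bits of the first NAL byte
    (1 = non-IDR slice, 5 = IDR slice)."""
    return nal[0] & 0x1F


# Invented sample: a 3-byte NAL unit followed by a 2-byte NAL unit.
sample = b"\x00\x00\x00\x03\x41\x9a\x00" + b"\x00\x00\x00\x02\x65\x88"
for nal in split_avc_sample(sample):
    print(len(nal), nal_unit_type(nal))
```

In a real file the length_size argument should not be hard-coded; it comes from the decoder configuration record described in the next section.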
AVC Decoder Configuration Record
H.264 requires decoder configuration data to initialize the decoder before any decoding takes place. Thus, an MP4 file must carry this record in the Movie Box.
The decoder configuration record is stored in the stsd (sample description box), inside the Visual Sample Entry. For H.264, that entry is an avc1 atom.
Before going into that, I should provide some background on the AVC decoder configuration record. Below are its fields, in the order the class structure defines them.
configurationVersion - 8-bit integer that is always 1. If a decoder sees a version it does not recognize, it should not decode the stream
AVCProfileIndication - 8-bit integer that contains the profile code as defined in ISO/IEC 14496-10
profile_compatibility - 8-bit integer that is exactly the same byte that occurs between profile_idc and level_idc in the SPS
AVCLevelIndication - 8-bit integer that contains the level code
lengthSizeMinusOne - 2-bit integer that gives the length in bytes of the NALUnitLength field in an AVC video sample, minus one (values 0, 1 and 3 mean 1, 2 and 4 bytes). It occupies the low 2 bits of its byte; the upper 6 bits are reserved
numOfSequenceParameterSets - 5-bit integer that gives the number of SPSs used as the initial set of SPSs for decoding the AVC elementary stream. It occupies the low 5 bits of its byte; the upper 3 bits are reserved
sequenceParameterSetLength - 16-bit integer that gives the length in bytes of the SPS
sequenceParameterSetNALUnit - the actual SPS. Its length is given by the preceding sequenceParameterSetLength field
numOfPictureParameterSets - 8-bit integer that gives the number of PPSs used as the initial set of PPSs for decoding the AVC elementary stream
pictureParameterSetLength - 16-bit integer that gives the length in bytes of the PPS
pictureParameterSetNALUnit - the actual PPS. Its length is given by the preceding pictureParameterSetLength field
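To make the field order concrete, here is a minimal Python sketch that reads the fields listed above from a byte string. The names parse_avc_config and take are my own, the dataclass simply mirrors the field list, and the example bytes are invented for illustration. The sketch assumes the record starts at offset 0 and ignores any bytes after the last PPS.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AVCDecoderConfigurationRecord:
    configuration_version: int
    avc_profile_indication: int
    profile_compatibility: int
    avc_level_indication: int
    length_size_minus_one: int
    sps_list: List[bytes]
    pps_list: List[bytes]


def parse_avc_config(data: bytes) -> AVCDecoderConfigurationRecord:
    pos = 0

    def take(n: int) -> bytes:
        """Consume the next n bytes, failing loudly on a short record."""
        nonlocal pos
        if pos + n > len(data):
            raise ValueError("record is truncated")
        chunk = data[pos:pos + n]
        pos += n
        return chunk

    version, profile, compat, level = take(4)
    if version != 1:
        raise ValueError(f"unsupported configurationVersion {version}")
    # Low 2 bits hold lengthSizeMinusOne; the upper 6 bits are reserved.
    length_size_minus_one = take(1)[0] & 0x03
    # Low 5 bits hold the SPS count; the upper 3 bits are reserved.
    num_sps = take(1)[0] & 0x1F
    # Each parameter set is a 16-bit big-endian length followed by the NAL unit.
    sps_list = [take(int.from_bytes(take(2), "big")) for _ in range(num_sps)]
    num_pps = take(1)[0]
    pps_list = [take(int.from_bytes(take(2), "big")) for _ in range(num_pps)]
    return AVCDecoderConfigurationRecord(
        version, profile, compat, level,
        length_size_minus_one, sps_list, pps_list,
    )


# Invented record: one 4-byte SPS and one 2-byte PPS.
example = bytes.fromhex("0142c015" "ffe1" "0004" "6742c015" "01" "0002" "68ce")
record = parse_avc_config(example)
print(record.length_size_minus_one + 1, len(record.sps_list), len(record.pps_list))
```

The value record.length_size_minus_one + 1 is the width of the length field in front of each NAL unit in the video samples.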
avc1 - Sample Description Name and Format
avc1 is an AVC visual sample entry. It contains an avcC (AVC Configuration Box) atom, which holds an AVCDecoderConfigurationRecord as stated above.
I have broken down and highlighted the values in the avcC atom
Red - the 4 bytes in red give the length of the atom, including the length and type fields themselves
Green - the 4 bytes in green are the type field
Purple - the bytes outlined in purple are the AVCDecoderConfigurationRecord. So, the length of the AVCDecoderConfigurationRecord is the length of the atom - 8
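The red/green/purple split can be expressed as a short Python sketch. The function name read_box is hypothetical, and the sketch only handles the plain 32-bit size described here; it rejects the special size values 0 and 1 rather than interpreting them. The example atom is invented and holds only the first four bytes of a record.

```python
import struct


def read_box(buf: bytes, offset: int = 0):
    """Return (type, payload) for the atom starting at offset."""
    # 4-byte big-endian size (red) followed by a 4-byte type (green).
    size, box_type = struct.unpack_from(">I4s", buf, offset)
    if size < 8:
        raise ValueError("size values below 8 are not handled in this sketch")
    if offset + size > len(buf):
        raise ValueError("atom runs past the end of the buffer")
    # The payload (purple) is everything after the 8 header bytes,
    # so its length is size - 8.
    payload = buf[offset + 8:offset + size]
    return box_type.decode("ascii"), payload


# Invented avcC atom: 8 header bytes plus a 4-byte payload.
atom = struct.pack(">I4s", 12, b"avcC") + bytes([0x01, 0x42, 0xC0, 0x15])
box_type, payload = read_box(atom)
print(box_type, len(payload))
```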
By following the AVCDecoderConfigurationRecord structure, you can parse the values easily.
0x01 - the first byte is configurationVersion = 1
0x42 - the second byte is AVCProfileIndication = 66
0xC0 - the third byte is profile_compatibility = 192
0x15 - the fourth byte is AVCLevelIndication = 21
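These four numbers map onto H.264 terms: profile code 66 is the Baseline profile, the level code is ten times the level number (so 21 means Level 2.1), and the top six bits of profile_compatibility are the constraint_set0 to constraint_set5 flags. The sketch below decodes them in Python. The function name describe_avc_header and the small profile table are my own and cover only a few common profiles; the special Level 1b encoding is not handled.

```python
# Partial table of profile_idc values; other profiles exist.
PROFILE_NAMES = {66: "Baseline", 77: "Main", 88: "Extended", 100: "High"}


def describe_avc_header(header: bytes) -> dict:
    """Decode the first four bytes of an AVCDecoderConfigurationRecord."""
    version, profile_idc, compat, level_idc = header[:4]
    return {
        "configurationVersion": version,
        "profile": PROFILE_NAMES.get(profile_idc, f"unknown ({profile_idc})"),
        # constraint_set0_flag is the most significant bit of the byte.
        "constraint_set_flags": [(compat >> (7 - i)) & 1 for i in range(6)],
        # level_idc is ten times the level number, e.g. 21 -> "2.1".
        "level": f"{level_idc // 10}.{level_idc % 10}",
    }


print(describe_avc_header(bytes([0x01, 0x42, 0xC0, 0x15])))
```

For the bytes above, constraint_set0_flag and constraint_set1_flag are both set, since 0xC0 is 1100 0000 in binary.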
Reference: ISO/IEC 14496-12 and ISO/IEC 14496-15