Estimation of MP3 Duration

To calculate the duration of an MP3 file, we first need to know how it is encoded. The duration estimate differs depending on the encoding.

CBR (Constant Bitrate) vs VBR (Variable Bitrate) Encoding

An MP3 can be encoded with either a constant bitrate (CBR) or a variable bitrate (VBR). At a comparable file size, a VBR file generally sounds better than a CBR one, since each frame can use a different bitrate depending on what the music needs, while a CBR file uses the same bitrate regardless of what the sound wave is.

How to know whether the file is CBR or VBR

An MP3 file can carry one of two types of VBR header: Xing or VBRI. The ID of a Xing header is either Xing or Info; the ID of a VBRI header is VBRI.

A file whose Xing header is marked with the ID Info is definitely encoded with CBR. Nonetheless, using Xing as the header ID of a CBR file is logically acceptable because CBR is a special case of VBR, so the Xing ID alone does not guarantee that the file is VBR.
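To make the check above concrete, here is a minimal sketch that looks for the header ID inside the first frame. The function name is hypothetical, and the byte offsets are the commonly documented positions (just after the 4-byte frame header plus the side information for Xing/Info, and 32 bytes after the frame header for VBRI); I treat them as an assumption here rather than something taken from a particular demuxer.

```python
from typing import Optional

# Assumed offsets (in bytes from the start of the first frame) at which
# a Xing/Info tag may appear: 4-byte frame header + 9, 17, or 32 bytes
# of side information, depending on MPEG version and channel mode.
XING_OFFSETS = (13, 21, 36)
# Assumed offset of the VBRI tag: 32 bytes after the 4-byte frame header.
VBRI_OFFSET = 36


def find_vbr_header_id(first_frame: bytes) -> Optional[str]:
    """Return "Xing", "Info", "VBRI", or None if no known ID is found."""
    for offset in XING_OFFSETS:
        tag = first_frame[offset:offset + 4]
        if tag in (b"Xing", b"Info"):
            return tag.decode("ascii")
    if first_frame[VBRI_OFFSET:VBRI_OFFSET + 4] == b"VBRI":
        return "VBRI"
    return None
```

With this sketch, a result of "Info" means CBR, while "Xing" or "VBRI" should be handled as VBR (keeping in mind that a CBR file may still be labelled Xing).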

How to estimate the duration

CBR (Constant Bitrate)

The calculation for a CBR MP3 is straightforward:

\[\text{Duration (seconds)} = \frac{ \text{File Size (bits)} }{ \text{Bitrate (bits/second)} }\]

File size is usually given in bytes, so the size of the file in bits is file size (bytes) * 8.
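As a quick illustration of the CBR formula, here is a small sketch. The function name is mine, and it simply treats the whole file as audio data, so any metadata stored in the file (ID3 tags, for instance) would make the result slightly too long.

```python
def cbr_duration_seconds(file_size_bytes: int, bitrate_bps: int) -> float:
    """Estimate the duration of a CBR MP3 from its size and bitrate."""
    if bitrate_bps <= 0:
        raise ValueError("bitrate must be positive")
    file_size_bits = file_size_bytes * 8  # bytes -> bits
    return file_size_bits / bitrate_bps


# A 4,000,000-byte file encoded at 128 kbps (128,000 bits/second):
print(cbr_duration_seconds(4_000_000, 128_000))  # 250.0
```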

VBR (Variable Bitrate)

The calculation for a VBR MP3 is a little more complicated:

\[\text{Duration (seconds)} = \frac{ \text{Samples Per Frame} \cdot \text{Total Frames} \text{ (samples)} }{ \text{Sample Rate (samples/second)} }\]

The duration is not accurate when the total number of frames above is itself an estimated value. If the total number of frames isn’t predefined, it can be estimated by:

\[\text{Estimated Total Frames} = \frac{ \text{File Size} }{ \text{Average Frame Size} }\]
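The two VBR formulas can be sketched as follows. The function names are hypothetical, and the default of 1152 samples per frame is my assumption for MPEG-1 Layer III (MPEG-2 and MPEG-2.5 Layer III use 576); the file size and average frame size are both assumed to be in bytes.

```python
def estimate_total_frames(file_size_bytes: int,
                          average_frame_size_bytes: float) -> float:
    """Estimate the frame count when it is not predefined in the file."""
    if average_frame_size_bytes <= 0:
        raise ValueError("average frame size must be positive")
    return file_size_bytes / average_frame_size_bytes


def vbr_duration_seconds(total_frames: float,
                         sample_rate_hz: int,
                         samples_per_frame: int = 1152) -> float:
    """Duration = samples per frame * total frames / sample rate."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return samples_per_frame * total_frames / sample_rate_hz


# A 4,000,000-byte file with an average frame size of 400 bytes,
# sampled at 44.1 kHz:
frames = estimate_total_frames(4_000_000, 400)  # 10000.0 frames
print(vbr_duration_seconds(frames, 44_100))     # roughly 261.2 seconds
```

If the frame count comes from the header instead, it can be passed to vbr_duration_seconds directly and the estimation step is skipped.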

Live stream

No matter how the MP3 is encoded, the duration is calculated from the file size. But what if the file size is unknown? Before addressing that question, we should ask what type of media has an unknown file size. The answer is a live stream.

A live stream can be closed at any time, so we don’t need to calculate its duration beforehand. We just need to make sure the playback position stays at the end of the media track and the end time keeps increasing.

(The position stays at the end of the media track during streaming.)

However, for live streams with opening remarks, we still need to estimate how long the opening talk will be, and show the playback UI as if it were a non-live stream until the introduction finishes and the streaming starts. After the opening talk finishes, the UI should behave the same as it does for a live stream.

(You can check this by opening KHNY Honey 103 directly, or by going to Shoutcast and playing KHNY Honey 103 under the Jazz genre.)

As the file size is unknown, we should use the number of frames to calculate the duration of the opening introduction. (The number of frames here is the same as the total frames mentioned in the estimation of a VBR MP3’s duration.)

To sum up this case, the end time shown on the playback UI should at first be the duration of the opening talk. After the playback position reaches the end, it should stay at the end, and the duration should keep increasing while the music is streaming.
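One way to picture this behaviour is the tiny sketch below. It is only my illustration of the rule in this section, not how any particular player implements it, and the function name is made up.

```python
def displayed_end_time(position_s: float, opening_duration_s: float) -> float:
    """End time to show in the playback UI for a live stream with an opening talk.

    While the opening talk is playing, the end time is the talk's
    estimated duration. Once the position passes it, the end time
    follows the position, so the position stays at the end of the
    track and the duration keeps increasing.
    """
    return max(position_s, opening_duration_s)


# During a 95-second opening talk the end time stays at 95 seconds:
print(displayed_end_time(10.0, 95.0))   # 95.0
# After the talk, the end time grows with the playback position:
print(displayed_end_time(120.0, 95.0))  # 120.0
```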

How to know whether the file size is unknown

Usually, the file size of a live stream will be set to -1.
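A duration estimate can guard against this case before touching the file size. The sketch below assumes the -1 convention mentioned above and uses the CBR formula for the known-size case; the names are hypothetical.

```python
from typing import Optional

# Assumption taken from the text above: -1 marks an unknown file size.
UNKNOWN_FILE_SIZE = -1


def duration_if_size_known(file_size_bytes: int,
                           bitrate_bps: int) -> Optional[float]:
    """Return the CBR duration in seconds, or None when the size is unknown."""
    if file_size_bytes == UNKNOWN_FILE_SIZE:
        # Probably a live stream: there is no duration to compute up front.
        return None
    return file_size_bytes * 8 / bitrate_bps
```

Returning None here is just one design choice; the caller can then switch to the live-stream behaviour instead of showing a bogus duration.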

Sample code

Useful References

These points summarize what I’ve learned from bug 1419736. You can see more detail there. It’s my first bug in the demuxer field, so I quickly wrote this note in case I need to recall it someday.

The following links are some useful resources I found when I tried to get into this field: