It’s a common mistake to think that CBR (Constant Bit Rate) means “every frame is allocated the same number of bits”. If that were the case, what would be the purpose of P or B frames? The whole purpose of a P or B frame is to reduce the number of bits by referencing another frame. Yet there are plenty of CBR streams with P and B frames, and you can easily see that frames have very different numbers of bits even in a CBR stream.
So, what is CBR? In MPEG-2 and H.264, CBR means that the number of bits fed to the decoder per unit of time is constant. In other words, the data transfer rate to the decoder is constant. It has nothing to do with the number of bits in individual frames.
Confused? How is it possible to allocate a different number of bits to each frame while keeping the incoming data rate constant?
Answer: you need a buffer. To understand the logic, consider a water outlet, a water tank, and a series of “picture decode guys” lined up in front of the tank.
The water (the coded MPEG-2 or H.264 stream) flows into the tank at a constant rate. The guys lined up in front of the tank each remove the water for one frame to be decoded. In most cases the removal happens at a fixed time interval.
Even though Mr. I, Mr. P, and Mr. B remove different amounts of water (= each frame needs a different number of bits), the flow from the water outlet stays constant thanks to the tank (buffer).
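The tank picture can be sketched in a few lines of Python. This is only an illustration, not the normative decoder model: the function name tank_levels, the 3 Mbit/s bitrate, the 30 fps frame rate, the starting level, and the frame sizes are all made-up assumptions.

```python
def tank_levels(frame_bits, bitrate, fps, start_level):
    """Return the tank level (in bits) right after each frame is removed."""
    inflow_per_frame = bitrate / fps  # constant amount arriving between two removals
    level = start_level
    levels = []
    for bits in frame_bits:
        level += inflow_per_frame  # water flows in at a constant rate
        level -= bits              # one "decode guy" removes his frame
        levels.append(level)
    return levels


# Hypothetical stream: 3 Mbit/s at 30 fps, so 100,000 bits arrive per frame interval.
# Frame order I B B P B B, with very different sizes that still average 100,000 bits.
frame_bits = [350_000, 50_000, 50_000, 100_000, 25_000, 25_000]
levels = tank_levels(frame_bits, bitrate=3_000_000, fps=30, start_level=500_000)
print(levels)
```

The level drops sharply when the big I frame is removed and recovers while the small B frames go by, yet the inflow never changes.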
In MPEG-2, the buffer is called the VBV buffer (Video Buffering Verifier buffer). In H.264, it is called the CPB (Coded Picture Buffer).
The water level in the tank at a given instant is called the buffer fullness and is expressed in bits. The size of the tank is called the VBV buffer size in MPEG-2 and the CPB size in H.264.
The coded stream must be constructed so that the tank (= buffer) never overflows or underflows. There are commercial and non-commercial software tools called “buffer verifiers” that check for these errors.
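As a rough idea of what such a check involves, here is a toy verifier in Python. It is a simplified sketch, not the VBV or CPB model from the standards: it assumes one frame is removed per frame interval, and the name check_buffer and all numbers are hypothetical.

```python
def check_buffer(frame_bits, bitrate, fps, buffer_size, start_level):
    """Return a list of (frame_index, problem) tuples; an empty list means no errors."""
    inflow_per_frame = bitrate / fps
    level = start_level
    problems = []
    for i, bits in enumerate(frame_bits):
        level += inflow_per_frame
        if level > buffer_size:
            # More water arrived than the tank can hold.
            problems.append((i, "overflow"))
            level = buffer_size  # clamp so the remaining frames can still be checked
        if bits > level:
            # The frame is not fully in the tank when it has to be removed.
            problems.append((i, "underflow"))
            level = 0  # simplification: empty the tank and carry on
        else:
            level -= bits
    return problems


# Hypothetical 3 Mbit/s, 30 fps stream with a 600,000-bit buffer.
ok = check_buffer([350_000, 50_000, 50_000, 100_000, 25_000, 25_000],
                  3_000_000, 30, 600_000, 500_000)
frame_too_big = check_buffer([700_000, 50_000], 3_000_000, 30, 600_000, 500_000)
frames_too_small = check_buffer([10_000, 10_000], 3_000_000, 30, 600_000, 500_000)
print(ok, frame_too_big, frames_too_small)
```

A frame that is too large drains the tank (underflow), while a run of frames that are too small lets it fill up (overflow).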
When the buffer size (a parameter of the encoded stream) is set to a large value, the encoder can vary the number of bits per frame more widely, which generally results in better video quality. However, the decoder then needs an equally large buffer, which means more expensive hardware.
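To see how frame-size variation drives the buffer size, the sketch below estimates the smallest tank that a given frame pattern could fit into. It uses the same simplified model (constant inflow, one removal per frame interval) and is not how a real encoder chooses its buffer size; the name min_buffer_size and the numbers are assumptions for illustration.

```python
def min_buffer_size(frame_bits, bitrate, fps):
    """Smallest tank (in bits) that avoids overflow and underflow in this toy model."""
    inflow_per_frame = bitrate / fps
    rel = 0.0     # tank level relative to the (unknown) starting level
    peak = 0.0    # highest relative level, reached just before a removal
    trough = 0.0  # lowest relative level, reached just after a removal
    for bits in frame_bits:
        rel += inflow_per_frame
        peak = max(peak, rel)
        rel -= bits
        trough = min(trough, rel)
    # The tank must be able to hold the whole swing between trough and peak.
    return peak - trough


# Two hypothetical 3 Mbit/s, 30 fps streams with the same average frame size.
flat = min_buffer_size([100_000] * 6, 3_000_000, 30)
varied = min_buffer_size([350_000, 50_000, 50_000, 100_000, 25_000, 25_000],
                         3_000_000, 30)
print(flat, varied)
```

With equal-sized frames a tank of one frame interval's worth of bits is enough, while the pattern with a large I frame needs a tank at least as large as that frame.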