Old 03-30-2026, 05:39 AM   #77
jbjb
Somewhat clueless
jbjb ought to be getting tired of karma fortunes by now.
 
Posts: 798
Karma: 11569273
Join Date: Nov 2008
Location: UK
Device: Kindle Oasis
Quote:
Originally Posted by ratinox
"More detail"

That's not what bit depth is.

A sample is a value at a point in time. It is a number from 0 to whatever the maximum bit size which for 16 bits is 65,535. As previously explained, 16 bits covers a dynamic range of 96dB. If the sample is within 0dB and 96dB then this number is going to be between 0 and 65,535 (64K) with 64K being the loudest possible sound.

What happens if you increase this to 24 bits, for 144dB at the top? This number is going to be between 0 and 16,777,215, right?

Here's why that video is nonsense: 65,535 is the same value whether it's represented with 16 bits or with 24 bits. If the dynamic range of the input is within that 96dB "window", which it usually will be because human hearing, then you will get the EXACT SAME SAMPLES whether you use 16 bits or 24 bits.
The use of 24 bits doesn't mean you're just encoding louder sounds - it can be used to capture the same sounds at higher resolution, leading to lower levels of quantisation noise.

Let's say you're encoding a signal which has a maximum amplitude of 1V peak-to-peak (in the range of -0.5V to 0.5V). CD 16-bit audio gives a range of values from -32768 to 32767 (8000 to 7FFF in hex). If we're talking about signals which are balanced around zero, which we usually are for audio, we'll ignore the extra negative value at -32768 and represent -0.5V with -32767 (8001 hex) and 0.5V with 32767 (7FFF hex). So we have 65535 possible sample values, and are splitting the 1V peak-to-peak into 65534 steps. This means that the maximum error (quantisation noise) on each sample will be half a step, i.e. +/-(1/(2*65534))V, or about +/-7.6uV (ignoring dithering).
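To make the 16-bit arithmetic concrete, here is a rough Python sketch. The function names (quantise, reconstruct, max_error) are made up for illustration, and it assumes an ideal round-to-nearest quantiser with no dither and the symmetric -32767..32767 code range described above.

```python
def quantise(volts, bits=16, v_peak=0.5):
    """Map a voltage in [-v_peak, +v_peak] to the nearest symmetric integer code."""
    max_code = 2 ** (bits - 1) - 1          # 32767 for 16 bits; -32768 is left unused
    code = round(volts / v_peak * max_code)
    return max(-max_code, min(max_code, code))  # clamp anything out of range


def reconstruct(code, bits=16, v_peak=0.5):
    """Convert an integer code back to the voltage it represents."""
    max_code = 2 ** (bits - 1) - 1
    return code / max_code * v_peak


def max_error(bits=16, v_peak=0.5, points=200_001):
    """Sweep the input range and return the worst round-trip error in volts."""
    worst = 0.0
    for i in range(points):
        v = -v_peak + 2 * v_peak * i / (points - 1)
        err = abs(reconstruct(quantise(v, bits, v_peak), bits, v_peak) - v)
        worst = max(worst, err)
    return worst


print(quantise(0.5), quantise(-0.5))   # the two end codes: 32767 and -32767
print(max_error(16))                   # worst-case error in volts
```

The sweep only samples the input range, so the worst error it finds should land just under the half-step figure of about 7.6uV rather than exactly on it.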

With 24-bit audio in this scenario, we'd encode -0.5V as 800001 hex and 0.5V as 7FFFFF hex, dividing the 1V into 16777214 steps and leading to an error range of about +/-0.03uV - a quantisation noise floor roughly 256 times better in terms of voltage (roughly 65536 times better in terms of power, as power goes with the square of voltage).
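The same comparison can be checked in a few lines of Python. This is only a sketch of the arithmetic in this post: half_step_volts is a made-up helper name, and it assumes the symmetric code range and the 1V peak-to-peak signal used above.

```python
import math


def half_step_volts(bits, v_pp=1.0):
    """Worst-case quantisation error (half a step) for a symmetric code range."""
    steps = 2 * (2 ** (bits - 1) - 1)   # 65534 for 16 bits, 16777214 for 24 bits
    return v_pp / (2 * steps)


e16 = half_step_volts(16)               # about 7.6e-6 V
e24 = half_step_volts(24)               # about 3.0e-8 V

voltage_ratio = e16 / e24               # how much smaller the 24-bit error is
power_ratio = voltage_ratio ** 2        # power goes with the square of voltage
improvement_db = 20 * math.log10(voltage_ratio)

print(e16, e24)
print(voltage_ratio, power_ratio, improvement_db)
```

The voltage ratio comes out very slightly above 256 because the steps are counted as 65534 and 16777214 rather than exact powers of two, and the improvement works out at roughly 48dB, which matches the gap between the 96dB and 144dB figures quoted earlier.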

So, the question isn't whether or not 24 bits can encode audio more accurately, with much better SNR and much better dynamic range (it definitely can), it's whether or not this improvement is actually detectable when using real-world amplifiers and speakers etc., and when listened to by human ears. I've yet to see any convincing tests which indicate that people can tell the difference, so 24-bit seems pointless as a distribution format.

It does still make sense to do manipulation and mixing at 24-bit, however, to stop the errors at each stage accumulating.
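As a toy illustration of that accumulation (not a model of any real mixing software), the Python sketch below pushes a sine wave through a chain of gain changes and rounds back to the working bit depth after every stage. The names requantise and rms_error_after_chain, the 0.7 gain and the number of stages are all arbitrary choices for the example, and there is no dither.

```python
import math


def requantise(x, bits):
    """Round a value in the range -1..1 (fraction of full scale) to the nearest code."""
    max_code = 2 ** (bits - 1) - 1
    return round(x * max_code) / max_code


def rms_error_after_chain(bits, stages=10, n=1000):
    """RMS error (fraction of full scale) after repeated gain-down / gain-up stages."""
    gains = [0.7, 1 / 0.7] * stages     # each pair cancels, so the ideal output is unchanged
    total = 0.0
    for i in range(n):
        ideal = 0.5 * math.sin(2 * math.pi * 0.01237 * i)
        x = requantise(ideal, bits)     # the "recording" at the working bit depth
        for g in gains:
            x = requantise(x * g, bits) # every processing stage rounds again
        total += (x - ideal) ** 2
    return math.sqrt(total / n)


err16 = rms_error_after_chain(16)
err24 = rms_error_after_chain(24)
print(err16, err24, err16 / err24)
```

Both chains pick up a little rounding error at every stage, but at 24 bits each of those errors starts out a couple of hundred times smaller, so the accumulated total stays well below what a 16-bit final output can resolve.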