pink fish media

  #1  
Old 09-02-17, 05:13 AM
Jim Audiomisc Jim Audiomisc is offline
pfm Member
 
Join Date: Mar 2016
Posts: 2,138
Noise Shaping

I've continued to think about ways to reduce the 'sea of noise bits' that tend to occupy many of the lowest bits per sample of most 'high rez' files and streams and bloat their sizes. It occurred to me to wonder if the technique known as 'Noise Shaping' might help.

If anyone is interested, they might like to have a look at

http://www.audiomisc.co.uk/MQA/intos...ngHighRez.html

where I investigate this for basic feasibility.
  #2  
Old 09-02-17, 05:37 AM
Julf Julf is offline
Evil brother of Mark V Shaney
 
Join Date: Dec 2010
Posts: 7,469
Quote:
Originally Posted by Jim Audiomisc View Post
If anyone is interested, they might like to have a look at

http://www.audiomisc.co.uk/MQA/intos...ngHighRez.html
Interesting discussion - and yes, explaining noise shaping without resorting to the proper mathematical tools is a challenge.

In any case, it really boils down to "there is no free lunch". You can trade sample rate for number of bits and vice versa (DSD is the perfect example), but the total amount of data stays the same if you want to represent the original waveform as precisely as possible. The question is where the optimal trade-off point is. Do we really need to represent more than 16 bits of amplitude, considering the limited dynamic range of any real-life recording?
  #3  
Old 09-02-17, 06:43 AM
Jim Audiomisc Jim Audiomisc is offline
pfm Member
 
Join Date: Mar 2016
Posts: 2,138
Quote:
Originally Posted by Julf View Post
Interesting discussion - and yes, explaining noise shaping without resorting to the proper mathematical tools is a challenge.

The question is where the optimal trade-off point is. Do we really need to represent more than 16 bits of amplitude, considering the limited dynamic range of any real-life recording?
FWIW I've now added a link to a demo copy of the program I wrote to generate the example results. This may help some to get the idea. But I'm a lousy programmer so apologies to good coders who have a weak stomach. :-)

For the purposes of argument I assumed that 16bit should be the target for the output LPCM. I doubt every real recording gets anywhere near needing that given how many recordings have three quarters of booger-all in terms of dynamic range. But that's another story... The advantage of 16bit is that it's a standard value. I have wondered about 384k 8bit Noise Shaped optimally. I wonder if that would make more sense than SACD/DSD as at least you *can* dither it properly without hitting its endstops. But it isn't exactly a common standard and might put people off. 8-]

Anyway, I was really just trying to examine the concept and invite people to have a think about it. More the better if it also helps people twig Noise Shaping.

I did look around for suitable filter values to get optimum shaping for audible noise reduction. But the examples I found all seem to assume you want the output at 44.1k or 48k, whereas here I want to keep the rate as per the input 'high rez' and preserve that. I found the Lipshitz paper that lists some examples, but dunno how to transform them up for the high rates.
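For anyone who wants to see the basic mechanism in a few lines, here is a minimal sketch of first-order error-feedback noise shaping. It is not the demo program linked above and does not use the Lipshitz coefficients; the function name, the 24-bit to 16-bit step of 256 and the constant test level are assumptions chosen purely for illustration, and dither is left out to keep the feedback loop visible.

```python
import math

def requantise_first_order(samples, step=256):
    """Requantise integer samples to multiples of `step` with first-order error feedback."""
    out = []
    err = 0                              # quantisation error made on the previous sample
    for x in samples:
        v = x - err                      # push the previous error back into the input
        q = math.floor(v / step + 0.5)   # round to the nearest output step
        err = q * step - v               # error made on this sample
        out.append(q)
    return out

# Hypothetical test input: a constant 24-bit level sitting between two 16-bit steps.
samples = [100] * 2560
shaped = requantise_first_order(samples)
plain = [math.floor(x / 256 + 0.5) for x in samples]   # plain rounding, no feedback

mean_shaped = sum(shaped) / len(shaped)   # stays close to 100/256
mean_plain = sum(plain) / len(plain)      # the low-level detail is simply lost
print(mean_shaped, mean_plain)
```

Because each output equals the input plus e[n] - e[n-1], the errors cancel on average and the added noise is pushed towards high frequencies, which is the whole point of shaping.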
  #4  
Old 09-02-17, 08:01 AM
Julf Julf is offline
Evil brother of Mark V Shaney
 
Join Date: Dec 2010
Posts: 7,469
Quote:
Originally Posted by Jim Audiomisc View Post
FWIW I've now added a link to a demo copy of the program I wrote to generate the example results. This may help some to get the idea. But I'm a lousy programmer so apologies to good coders who have a weak stomach. :-)
It is nowhere near as bad as some of the code I have to deal with.

Quote:
The advantage of 16bit is that it's a standard value. I have wondered about 384k 8bit Noise Shaped optimally. I wonder if that would make more sense than SACD/DSD as at least you *can* dither it properly without hitting its endstops. But it isn't exactly a common standard and might put people off.
I agree. Considering 192k/16 accomplishes the same result and takes up the same amount of data, but works with existing systems, I don't think it would catch on - even with the marketing budget of MQA.
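The "same amount of data" point is simple arithmetic. A small sketch, where the helper name and the choice of two channels are my own assumptions:

```python
def pcm_bits_per_second(sample_rate_hz, bits_per_sample, channels=2):
    """Raw (uncompressed) LPCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

rate_384_8 = pcm_bits_per_second(384_000, 8)     # 384k/8
rate_192_16 = pcm_bits_per_second(192_000, 16)   # 192k/16
rate_96_16 = pcm_bits_per_second(96_000, 16)     # 96k/16
rate_dsd64 = pcm_bits_per_second(2_822_400, 1)   # DSD64 treated as 1-bit samples

print(rate_384_8, rate_192_16, rate_96_16, rate_dsd64)
```

Halving the word length while doubling the rate leaves the raw data rate unchanged, so any saving has to come from what a lossless packer can do with the result.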
  #5  
Old 09-02-17, 12:35 PM
martin clark martin clark is offline
pinko bodger
 
Join Date: Jul 2003
Posts: 11,088
Interesting thread scope...

Quote:
Originally Posted by Jim Audiomisc View Post
For the purposes of argument I assumed that 16bit should be the target for the output LPCM. I doubt every real recording gets anywhere near needing that given how many recordings have three quarters of booger-all in terms of dynamic range. But that's another story... The advantage of 16bit is that it's a standard value.
Quite, the elephant in the 'HD' room. While there remains an advantage in capturing and editing at higher bit depth (purely to take the residue of editing DSP well below the noise floor), the reality is that studio mics and preamps struggle to achieve even 18 bits of resolution at optimal 'level': capsule size vs. HF extension vs. thermal noise vs. a few other considerations. And if it was recorded to tape... It seems a shame we can't do something constructive with the 'sea of noise bits' that tend to occupy many of the lowest bits per sample (great phrase!)

For instance - some (considerable) while ago Werner posted a link here to a recording reduced to dithered & noise-shaped at just 4-bit LPCM output; at the time many couldn't believe the result. It is certainly educational.
  #6  
Old 10-02-17, 02:12 AM
adamdea adamdea is offline
pfm Member
 
Join Date: Jul 2006
Posts: 4,966
Quote:
Originally Posted by Jim Audiomisc View Post
FWIW I've now added a link to a demo copy of the program I wrote to generate the example results. This may help some to get the idea. But I'm a lousy programmer so apologies to good coders who have a weak stomach. :-)

For the purposes of argument I assumed that 16bit should be the target for the output LPCM. I doubt every real recording gets anywhere near needing that given how many recordings have three quarters of booger-all in terms of dynamic range. But that's another story... The advantage of 16bit is that it's a standard value. I have wondered about 384k 8bit Noise Shaped optimally. I wonder if that would make more sense than SACD/DSD as at least you *can* dither it properly without hitting its endstops. But it isn't exactly a common standard and might put people off. 8-]

Anyway, I was really just trying to examine the concept and invite people to have a think about it. More the better if it also helps people twig Noise Shaping.

I did look around for suitable filter values to get optimum shaping for audible noise reduction. But the examples I found all seem to assume you want the output at 44.1k or 48k, whereas here I want to keep the rate as per the input 'high rez' and preserve that. I found the Lipshitz paper that lists some examples, but dunno how to transform them up for the high rates.
Thanks Jim , this is a really interesting article.
My feeling is that what stands in the way of people grasping this stuff is the fundamental point that once dithering is taken into account the bit depth merely defines the overall noise level.

Once that is grasped, it's fairly easy to have a rational conversation about what is required, because it's clear that 16 bits is enough for any (final format distribution) need and also that a "rectangular box" noise function is pretty wasteful even at 44.1kHz. But the digital stair step idea runs deep.
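To put rough numbers on "bit depth merely defines the overall noise level", here is a short sketch using the textbook noise powers: 1/12 LSB² for an ideal quantiser, plus 1/6 LSB² if TPDF dither spanning two quantisation steps is added. The function name is an assumption, and the figures are for flat, unshaped noise measured against a full-scale sine.

```python
import math

def full_scale_sine_snr_db(bits, tpdf_dither=True):
    """SNR of a full-scale sine against flat quantisation (plus optional TPDF dither) noise."""
    noise_power = 1 / 12 + (1 / 6 if tpdf_dither else 0.0)   # in units of LSB^2
    signal_power = (2 ** (bits - 1)) ** 2 / 2                # sine with peak 2^(bits-1) LSB
    return 10 * math.log10(signal_power / noise_power)

for bits in (8, 16, 24):
    print(bits,
          round(full_scale_sine_snr_db(bits, tpdf_dither=False), 2),
          round(full_scale_sine_snr_db(bits), 2))
```

Each extra bit lowers the floor by about 6 dB and nothing else changes. That total is spread across the whole band up to half the sample rate, which is why the shape of the noise matters as much as its overall level.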

However, back to noise shaping: one point which has been made in the past is that noise shaping might be dangerous if one is to perform DSP subsequently, e.g. for EQ or room processing or digital speaker filtering. Do you have any views on that? Also, what level of ultrasonic garbage is safe?

On the 8/384 point, whilst it might have some merit, I really struggle to see the need for anything more than 16/96. If there really were any need for more than 16/44, then I would expect that 16/96 would nail it. Those who insist on high rez seem to experience continuously improving returns as one gets beyond the possible corner case (24/96) through the hugely implausible (24/192), to the absurd (DXD) and the frankly 'get a grip on yourself, man, and get a life' (DSD 512 and beyond).

This is the hallmark of foo: there are no limits to the imagination. Sadly this leaves the problem that no one is really interested in defining a reasonable spec for something more than 16/44 which covers the possibility that it might be too tight a spec but without wasting bits unnecessarily. Unless (possibly) that's what MQA is (wrapped up in a blanket of marketing nonsense)
  #7  
Old 10-02-17, 02:25 AM
DANOFDANGER DANOFDANGER is offline
pfm Member
 
Join Date: Sep 2012
Posts: 481
Quote:
Originally Posted by adamdea View Post
This is the hallmark of foo: there are no limits to the imagination. Sadly this leaves the problem that no one is really interested in defining a reasonable spec for something more than 16/44 which covers the possibility that it might be too tight a spec but without wasting bits unnecessarily. Unless (possibly) that's what MQA is (wrapped up in a blanket of marketing nonsense)
Couldn't agree more.
  #8  
Old 10-02-17, 03:42 AM
Jim Audiomisc Jim Audiomisc is offline
pfm Member
 
Join Date: Mar 2016
Posts: 2,138
Quote:
Originally Posted by adamdea View Post
However, back to noise shaping: one point which has been made in the past is that noise shaping might be dangerous if one is to perform DSP subsequently, e.g. for EQ or room processing or digital speaker filtering. Do you have any views on that? Also, what level of ultrasonic garbage is safe?
Someone would have to give me some more details wrt the idea that shaping is 'dangerous' for later DSP before I could really comment on that. But the reality is that most decent systems from the ADC onwards now will employ dither and noise shaping anyway.

Similarly, I'm not sure what the dividing line between 'safe' and 'unsafe' might be. But consider SACD/DSD which is essentially swamped in dither and noise shaping, and has to be because 'one bit' makes that unavoidable. If people think that is 'safe' then a few LSB-worth of shaped dither for 16bit would seem many orders of magnitude 'safer' to me.

People worried by this might also like to chew on the thought that having some shaped noise at HF may actually help *linearise* the behaviour of later stages in the chain like the DAC or even power amp.

As an opinion, I think that 96k/16 decently made and shaped should be fine for an audio delivery format. However it makes sense to use a higher rate and 24 bit for source recordings at the *start* of the process, though. Just as it makes sense to ensure the peaks don't get closer than a few dB to the max.

That to me seems good practice, simply to give more 'elbow room' for the recording process and any following reprocessing into the final result sold to the end-users.

IIRC Bob Stuart published at least one paper which said much the same many years ago. The problem is that none of this is new. It is that people in general don't seem aware of it.

FWIW I don't have any real worries if people prefer to use 192k/16 or higher rates. I suspect the reality is that - once you've dodged the wasted noise bits - there isn't much up there anyway. So once FLACed it wouldn't change the file size much. And an advantage of removing the excess noise is that you've taken out the 'added fat' and people can then see more clearly how much audio was in the package because the FLACed size is a better guide to the amount of content! 8-]

That might help people to stop judging by the size of the *box*.
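A toy illustration of why the 'noise bits' bloat packed files. This is only a sketch: it uses Python's zlib as a stand-in for FLAC (which packs audio differently and rather better), and the 1 kHz tone, 96k rate and helper name are assumptions made up for the example.

```python
import math
import random
import zlib

def pack24(samples):
    """Pack signed integers as little-endian 24-bit words."""
    return b"".join((s & 0xFFFFFF).to_bytes(3, "little") for s in samples)

rng = random.Random(1)
n = 48_000   # half a second at 96k

# A tone with 16-bit resolution held in a 24-bit container (lowest 8 bits all zero).
clean = [round(20000 * math.sin(2 * math.pi * 1000 * i / 96000)) << 8 for i in range(n)]
# The same tone with the lowest 8 bits filled with random noise.
noisy = [s + rng.randrange(256) for s in clean]

size_clean = len(zlib.compress(pack24(clean), 9))
size_noisy = len(zlib.compress(pack24(noisy), 9))
print(size_clean, size_noisy)
```

Random low bits cannot be compressed away by any lossless packer, so they cost roughly a byte per sample here regardless of how little music sits above them.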

As an aside, this might also help people to realise the implications when a HFN examination of a 'high rez' download shows it is actually a high rate 24 bit version of DSD with a *lot* of HF process noise.

Last edited by Jim Audiomisc; 10-02-17 at 03:53 AM.
  #9  
Old 10-02-17, 04:04 AM
davidsrsb davidsrsb is offline
pfm Member
 
Join Date: Jan 2013
Posts: 5,673
96/8 would probably work well, while keeping file size sane. The limit case of single bit suffered from clock jitter sensitivity, so 8 bit is likely a sensible compromise that suits computer architectures, and 96k is high enough to allow plenty of room to low-pass the shaping noise out.
  #10  
Old 10-02-17, 04:59 AM
Tony L Tony L is offline
Administrator
 
Join Date: Jul 2003
Posts: 57,153
I must admit I've only skimmed Jim's article as the bits and bytes of digital audio are way beyond my current pay grade, but even so I've been around digital pro-audio pretty much from its birth so have a few views/hunches.

I agree completely that bit-depth is not the issue for domestic audio and that 16 bits used correctly is way more dynamic range than 98% of audio systems would have a hope in hell of handling, and even if they could, chances are you'd not want to be in the room without ear-protection. It equates to about 96dB, so given we can't really hear much at all below about 40dB that is a heck of a lot of usable range. A look at the real DR stats for pop and rock CDs shows just how little is often used. I really don't think below 16 bit is worth considering though. I have too many memories of crunchy 8 bit samplers etc, though I guess much of that was the low sample-rate, which I am convinced is where the issue lies.

I am absolutely convinced that the problem with standard red-book is not the bit depth or frequency range, but the impact of the filter that abruptly cuts off all the malformed digital noise above 22kHz. This is why CD players, DACs etc sound different, and why well-sorted 96kHz or above recordings just tend to sound 'more analogue'. As such, and without having the math or anything to prove it, my hunch is that 16/96 is all anyone should ever need, assuming the whole recording process is to that standard or it is a transcription of an analogue tape. It just gets that filter right up and out of the way. It likely won't do anything much to help old digital masters recorded to DAT etc, which is actually a heck of a lot of music from the mid-80s onward.

Having said all that, bog standard red book 16/44 can sound superb when really done right!
  #11  
Old 10-02-17, 06:13 AM
Jim Audiomisc Jim Audiomisc is offline
pfm Member
 
Join Date: Mar 2016
Posts: 2,138
I'd be wary of 96k/8bit for reasons akin to Tony's comments about the problems of reconstruction filters, etc, with 44.1k/48k.

To get 96k/8bit to work you'd need reasonably high-order Noise Shaping and a somewhat higher HF noise level than 96k/16bit. That would close up a lot of the safety-space between what is required and what might often be done.

Nice thing about low-order Shaping is that it isn't difficult to do reasonably well and still give a clear 'space' for the audio. Above third order can become difficult to do. And the harder something is to do well, the more scope there is for the music biz to louse it up! :-/
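As a rough feel for how much order and sample rate matter, here is a sketch that numerically integrates the noise gain of the simplest textbook shaper, (1 - z^-1)^order, over a 0-20 kHz band and compares it with flat noise. It is unweighted, ignores the audibility curves that the Lipshitz designs optimise for, and the function name and band edge are assumptions, so treat the output as indicative only.

```python
import math

def inband_noise_change_db(sample_rate_hz, order, audio_band_hz=20000.0, steps=20000):
    """In-band noise power of a (1 - z^-1)^order shaper relative to flat noise, in dB."""
    w_band = 2 * math.pi * audio_band_hz / sample_rate_hz   # band edge in rad/sample
    dw = w_band / steps
    shaped = 0.0
    for k in range(steps):
        w = (k + 0.5) * dw                                   # midpoint rule
        shaped += (2 * math.sin(w / 2)) ** (2 * order) * dw  # |1 - e^{-jw}|^(2*order)
    flat = w_band                                            # flat noise has unit density
    return 10 * math.log10(shaped / flat)

for rate in (96_000, 384_000):
    for order in (1, 2, 3):
        print(rate, order, round(inband_noise_change_db(rate, order), 1))
```

With this simple family, 96k leaves little room between 20 kHz and the region where the shaper starts amplifying noise, whereas 384k gives far more, and in both cases the in-band reduction is paid for with extra noise at HF.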
  #12  
Old 10-02-17, 07:46 AM
davidsrsb davidsrsb is offline
pfm Member
 
Join Date: Jan 2013
Posts: 5,673
Yes, you would need to add pre-emphasis and de-emphasis to stand a chance. One thing the MQA debate and analysis has shown is that you don't need to handle anything like 0dBFS above 20 kHz.
  #13  
Old 10-02-17, 10:37 AM
darrenyeats darrenyeats is offline
pfm Member
 
Join Date: Sep 2011
Posts: 4,751
I've been playing around with filters since I started doing upsampling via Squeezebox/SoX. I've used only linear phase, non-imaging filters, the only tweaking I've done is to where the passband ends. SoX always gives you an optimally smooth transition band.

ISTM that relaxing the "flat to 20kHz" thing helps a lot. Passband to 19kHz is enough - in fact, I'd be very happy if my hearing approached 19kHz! Am yet to try 18kHz.

A reconstruction filter rolling off 19-22kHz sounds better to me at the top end - for some recordings anyway - and I don't feel I'm missing anything. In fact the opposite. I don't know whether this is:
1. Simply the amps/speakers being asked to produce less energy >19kHz.
2. A gradual filter roll off. Does linear phase mean this should not matter? Don't know enough. I assume brick wall filters are done because they are easy/cheap/low processing power though.

I'm thinking there's a third possibility:
3. Mitigation of problems caused by filters used in producing music for 44/48kHz distribution (whether in ADCs or in studio). Again, I assume brick wall digital filters are done because they are easy/cheap/low processing power. But filter shape aside, it's fair to say a lot of digital filtering falls significantly short of current SOTA (which I believe is Saracon), and http://src.infinitewave.ca/ indicates the errors in most filters, whatever their general level, tend to build toward higher frequencies. So using a good up-sampling filter to undercut higher frequency problems caused by a poor down-sampling filter could improve quality on average! This would obviously depend on the recording - care taken, vintage of ADCs, resamplers etc.

The above would be in combination with non-linearity from amp/speakers, with inaudible frequencies producing distortion in the audible band.

SoX VHQ is not far behind Saracon in quality.
__________________
Check it, add to it! http://dr.loudness-war.info

Last edited by darrenyeats; 10-02-17 at 02:54 PM.
  #14  
Old 10-02-17, 06:00 PM
davidsrsb davidsrsb is offline
pfm Member
 
Join Date: Jan 2013
Posts: 5,673
Relaxing the filter 3dB point down to 19kHz simplifies the filter design.
Most CD players are made with >20kHz brick wall filters because the chip does it, and because of specmanship.
Not feeding a typical dome tweeter with >19kHz energy that we cannot hear anyway avoids breakup nasties.
  #15  
Old 11-02-17, 12:39 AM
darrenyeats darrenyeats is offline
pfm Member
 
Join Date: Sep 2011
Posts: 4,751
Re: OP, I use TPDF in SoX - see https://en.m.wikipedia.org/wiki/Dither
"If the signal being dithered is to undergo further processing, then it should be processed with a triangular-type dither that has an amplitude of two quantisation steps"

Given the upsampling, oversampling, filtering, sigma-delta etc that goes on in modern DACs, I'd go TPDF except with a NOS DAC.
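For reference, TPDF dither with an amplitude of two quantisation steps is just the sum of two independent uniform random values, each one step wide. A minimal sketch, not SoX's implementation; the function name, the step of 256 and the test level are assumptions for illustration:

```python
import math
import random

def requantise_tpdf(x, step, rng):
    """Round x to a multiple of `step` after adding TPDF dither spanning two steps."""
    dither = (rng.random() - 0.5) + (rng.random() - 0.5)   # triangular PDF, in units of step
    return math.floor(x / step + 0.5 + dither)

rng = random.Random(0)
x, step, trials = 100, 256, 20_000    # a level well below one output step

mean_dithered = sum(requantise_tpdf(x, step, rng) for _ in range(trials)) / trials
plain = math.floor(x / step + 0.5)    # undithered rounding of the same level

print(mean_dithered, plain)
```

Undithered, the sub-step level simply vanishes. With TPDF the average output tracks the input, and the error behaves like steady noise rather than signal-dependent distortion, which is why it is the safe choice ahead of further processing.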
All times are GMT -7. The time now is 08:42 PM.

