All times are UTC+02:00




Posted: Thu Jul 17, 2008 2:41 pm

Joined: Mon Jun 09, 2008 2:45 pm
Posts: 104
I have developed a preset which keeps dynamic range (per song) as closely in line with the source as reasonably possible. This keeps the "kick" of snare and bass drums intact. It also maintains a sound similar to the artist's and mastering engineer's intent, while normalizing the audio between tracks.

aacPlus Processing:
When preprocessing audio for aacPlus, especially at lower bitrates, it is important to understand how the CODEC behaves under compression and limiting environments to prevent artifacting. With aacPlus (V1 or V2), there are a few types of artifacting which can occur under certain audio conditions, primarily due to the Spectral Band Replication (SBR) algorithm which reproduces high frequencies using a noise generator.

What is SBR? Simply put, SBR takes high frequencies and translates them algorithmically onto the lower frequency range. Upon encoding, the CODEC encodes only the lower frequencies (with the higher frequency content riding "on top"). Upon decoding, the CODEC decodes the lower frequencies and simultaneously translates the high frequency content back to the high frequency range as appropriate. The high frequency content has to be reproduced with a noise generator, since the true content was never actually encoded.

So, what artifacts actually arise? First and foremost, there is a phenomenon I call percussion "slush". It occurs primarily when a loud bass tone is played and its peak coincides with a loud peak of something in the SBR range. Cymbals then stop sounding like cymbals and instead sound like sheets of water being smashed with a mallet. You can avoid this by processing the sharper peaks of high frequency audio with quick and abrupt clipping. Or, you can attenuate the audio to reduce the peaks and eliminate the phenomenon overall. As you can see with my algorithm, I am employing both techniques. The upper three bands are the ones primarily in question: the uppermost and lowermost of those bands are attenuated, and the middle one is clipped sharply as appropriate. That middle band carries a lot of the higher frequency content; the content in it tends to be momentary, so slushiness isn't as noticeable there. Furthermore, attenuating that band too drastically creates a "muffled" high frequency sound, which will fatigue listeners over time. Thus, I think clipping is the appropriate way to optimize that band for encoding, and it serves as an effective de-esser too.
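A rough sketch of the two options described above, for illustration only. The sample values and the 0.5 ceiling are made up, and this is plain per-sample math, not how Stereo Tool implements its bands.

```python
def hard_clip(samples, ceiling):
    """Clamp every sample to [-ceiling, +ceiling]; only the peaks change."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def attenuate(samples, ceiling):
    """Scale the whole block so its largest peak just reaches the ceiling."""
    peak = max(abs(s) for s in samples)
    if peak <= ceiling:
        return list(samples)
    gain = ceiling / peak
    return [s * gain for s in samples]


# Hypothetical high-band block: quiet content with one sharp cymbal-like peak.
band = [0.10, -0.12, 0.95, -0.15, 0.11, -0.09]

clipped = hard_clip(band, 0.5)   # peak is flattened, quiet samples untouched
scaled = attenuate(band, 0.5)    # peak is tamed, but everything gets quieter

print(clipped)
print(scaled)
```

Clipping leaves the quiet material at its original level, which is why it avoids the "muffled" result that heavy attenuation produces; the trade-off is the distortion added at the flattened peak.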

Secondly, there is a phenomenon I call "hi-hat inconsistency". It occurs when a hi-hat is played rapidly and repetitively: the hi-hat hits going into the encoder sound the same, but the hits coming out of the encoder each sound different, with amplitudes and phases that differ from one another. This happens when a large amount of bass is being encoded at the same time as the hi-hats. The demand on the CODEC is too great, and the percussion suffers through aacPlus's lossy algorithm. Attenuating these hi-hats as appropriate helps avoid this. Thus, pre-limiting is used to adjust the gain of the bass and treble to help unify the response between songs, and the multiband compression levels are set to coordinate with this. Centering bass is another key aspect in alleviating this issue.

Thirdly, there is a phenomenon I call "booosh woooosh fooosh chee chee chee shhhhhhhhhhhhhh". This is a phenomenon which occurs when you aggressively compress the higher frequencies and cause a great amount of intermodulation distortion. This is another issue inherent to SBR where the pre-limiter (handling high frequency content gently before multiband compression) helps drastically. If you rapidly change the volume of an audio band, you are pushing the amplitude of the audio band up and down accordingly. This practice in and of itself can cause intermodulation, as you have now created a new audio crest or trough, which can create coding nightmares with high frequency content, as these issues will arise in frequencies slightly higher than those you are manipulating. Thus, hard clipping is limited to lower frequency content (except that solitary high frequency band mentioned above).
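To make the point about rapid gain changes concrete, here is a small sketch under simplified assumptions: a single steady tone whose gain is moved up and down by a slow cosine. The block length and bin numbers are arbitrary, and a real compressor's gain curve is far less regular than this. The tiny DFT is written out by hand so that nothing beyond the standard library is needed.

```python
import cmath
import math

N = 64            # block length in samples (illustrative)
TONE_BIN = 8      # steady tone: 8 cycles per block
WOBBLE_BIN = 2    # gain moves up and down 2 times per block


def spectrum(samples):
    """Single-sided DFT magnitudes, scaled so a unit cosine (bin > 0) reads 1.0."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        mags.append(abs(acc) * 2 / n)
    return mags


def active_bins(samples, threshold=0.01):
    """Indices of the frequency bins that carry noticeable energy."""
    return [k for k, m in enumerate(spectrum(samples)) if m > threshold]


tone = [math.cos(2 * math.pi * TONE_BIN * i / N) for i in range(N)]
gain = [1 + 0.5 * math.cos(2 * math.pi * WOBBLE_BIN * i / N) for i in range(N)]
pumped = [t * g for t, g in zip(tone, gain)]

print(active_bins(tone))    # [8]
print(active_bins(pumped))  # [6, 8, 10]
```

The untouched tone occupies one bin; after the gain wobble, new components appear on either side of it. Those extra components are content the encoder now has to spend bits on.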

Finally, specifically with aacPlus V2 and "Parametric Stereo" (PS), there are unique issues which can arise from aggressive phase manipulation. Audio nullification (audio being "cancelled out") can occur from improper phase manipulation. Phase and width settings must be used with care, or you could cancel out voices in the encoding process, even though you will hear them prior to encoding! Obviously, unless you want to lose your vocal section, this is undesirable. Parametric Stereo is a "steering" algorithm: in aacPlus V2 with PS, the audio is encoded as monaural, and steering information is added to the audio stream to guide frequencies to the left and right channels as appropriate. This can double the effective coding bitrate, depending on the source audio. For example, without Parametric Stereo, aacPlus audio at 40kbps "Stereo" would be treated as two 20kbps "monaural" streams, regardless of whether the source content is monaural or stereo at that time. With Parametric Stereo, however, audio encoded at 40kbps that is monaural at that moment gets the full 40kbps of audio bandwidth. Once stereo content plays, about 3kbps is used to steer that audio left or right, as appropriate. The monaural portion of the content still gets 37kbps, resulting in overall higher audio fidelity.
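The cancellation risk and the bitrate arithmetic above can be sketched as follows. The downmix here is a naive (L + R) / 2, which is an assumption for illustration and not the actual Parametric Stereo encoder; the 3kbps steering figure is the approximate number quoted above.

```python
import math

N = 64  # samples per test block (illustrative)


def tone(phase_deg):
    """One block of a sine wave, 4 cycles long, with the given phase offset."""
    phase = math.radians(phase_deg)
    return [math.sin(2 * math.pi * 4 * i / N + phase) for i in range(N)]


def mono_downmix(left, right):
    """Naive (L + R) / 2 downmix; an assumption, not the real PS encoder."""
    return [(l + r) / 2 for l, r in zip(left, right)]


def rms(samples):
    """Root-mean-square level of a block."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


left = tone(0)
for shift in (0, 90, 180):
    level = rms(mono_downmix(left, tone(shift)))
    print(f"{shift:3d} degrees apart -> mono RMS {level:.3f}")


def core_kbps(total_kbps, steering_kbps=3):
    """Bitrate left for the mono signal once steering data is paid for."""
    return total_kbps - steering_kbps


print(40 / 2)         # kbps per channel in the plain-stereo example
print(core_kbps(40))  # kbps for the mono signal while stereo content plays
```

The further the two channels drift apart in phase, the less of the signal survives the downmix; at 180 degrees it disappears entirely, which is the "lost vocal" case.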

Also, there is "squeakiness". This is where midrange content can squeak when the phase is misaligned, such as with many lower bitrate (~128kbps) MP3s. The azimuth settings alleviate this issue almost completely.
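For readers wondering what "misaligned" means in practice, one common way to detect a small timing offset between channels is to find the shift that maximizes their cross-correlation. This is a generic sketch with a made-up test signal; it is not a description of how Stereo Tool's azimuth filter works internally.

```python
def best_lag(left, right, max_lag):
    """Return the shift (in samples) of `right` that best lines it up with `left`."""
    def score(lag):
        total = 0.0
        for i, l in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                total += l * right[j]
        return total

    return max(range(-max_lag, max_lag + 1), key=score)


# Hypothetical test signal: a short click, with the right channel 3 samples late.
left = [0.0] * 32
left[10], left[11], left[12] = 1.0, -0.6, 0.3
right = [0.0] * 3 + left[:-3]

print(best_lag(left, right, max_lag=8))  # 3
```

Once the offset is known, delaying the early channel by that many samples brings the two back into alignment.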

The overall algorithm manipulates the source audio as little as possible; the average user shouldn't notice substantial manipulation aside from normalization and an overall pleasant listening experience, especially at lower bitrates of 24kbps, 32kbps and 40kbps (bitrates frequently employed by streams across the internet, and also on XM satellite radio, which does a hardware version of what I am accomplishing here).

I have attached audio waveforms prior to processing, after processing but prior to encoding, and after encoding to show its effectiveness. Feel free to use these to help you with improving sound quality. I will update this preset accordingly as I discover new tricks to aacPlus processing.

Image

The topmost waveform is the source audio. The center waveform is the audio after processing, but prior to encoding. The bottom waveform is the audio after processing and encoding; the CODEC utilized was 32kbps aacPlus V2 with Parametric Stereo enabled.

Hvz, if you like this concept and approach, feel free to implement this as a built-in preset for aacPlus streaming!
PRESET [aacPlus CODEC Optimization]:
Code:
[Common]
Pre amplifier=2.800000191
Post amplifier=0.899999976
Extra loudness=1
Hard limit output=1
Downsample very high input sample rates to near 44.1 kHz=1
Process for low latency=0
Mode=Advanced
[Noise Gate]
Enabled=1
Difference=0
Noise level=2
[Singleband Compressor]
Enabled=0
Difference=0
Maximum volume=10
Maximum value=32767
Attack speed=0.999998987
Decay speed=0.999000013
Above Top Limiter=1
[Pre Compressor]
Enabled=1
Difference=0
Delay enabled=0
Maximum volume - Band 1=20500
Maximum volume - Band 2=17500
Attack speed - Band 1=0.000002059
Attack speed - Band 2=0.000002059
Decay speed - Band 1=0.013692533
Decay speed - Band 2=0.013600473
[Multiband Compressor]
Enabled=1
Difference=0
Delay enabled=0
Very high quality enabled=1
Maximum volume - Band -1=6950
Maximum volume - Band 0=6500
Maximum volume - Band 1=4850
Maximum volume - Band 2=4600
Maximum volume - Band 3=3650
Maximum volume - Band 4=2750
Maximum volume - Band 5=2000
Maximum volume - Band 6=2900
Maximum volume - Band 7=3950
Maximum volume - Band 8=2400
Attack speeds linked=1
Attack speed - Band -1=0.00150717
Attack speed - Band 0=0.00150717
Attack speed - Band 1=0.00150717
Attack speed - Band 2=0.00150717
Attack speed - Band 3=0.00150717
Attack speed - Band 4=0.00150717
Attack speed - Band 5=0.00150717
Attack speed - Band 6=0.00150717
Attack speed - Band 7=0.00150717
Attack speed - Band 8=0.00150717
Decay speeds linked=1
Decay speed - Band -1=0.000314613
Decay speed - Band 0=0.000314613
Decay speed - Band 1=0.000314613
Decay speed - Band 2=0.000314613
Decay speed - Band 3=0.000314613
Decay speed - Band 4=0.000314613
Decay speed - Band 5=0.000314613
Decay speed - Band 6=0.000314613
Decay speed - Band 7=0.000314613
Decay speed - Band 8=0.000314613
Above Top Limiter=1
Clipping enabled=1
Postprocessing enabled=1
Relative clip position - Band -1=1.352941036
Relative clip position - Band 0=1.352941036
Relative clip position - Band 1=1.150537729
Relative clip position - Band 2=1.150537729
Relative clip position - Band 3=1.352941036
Relative clip position - Band 4=-1
Relative clip position - Band 5=-1
Relative clip position - Band 6=-1
Relative clip position - Band 7=0.600000024
Relative clip position - Band 8=-1
Final limiter value=0.556199968
Final limiter decay speed=0
Final limiter clipping=1
Equalizer enabled=1
Equalize before multiband-compression=1
Equalizer position - Band -1=3
Equalizer position - Band 0=2.333333254
Equalizer position - Band 1=1.500000238
Equalizer position - Band 2=1.500000238
Equalizer position - Band 3=1.409638405
Equalizer position - Band 4=1
Equalizer position - Band 5=0.801801801
Equalizer position - Band 6=0.801801801
Equalizer position - Band 7=0.754385948
Equalizer position - Band 8=0.754385948
[Stereo]
Enabled=1
Delay enabled=0
Difference=0
Center bass=1
AZIMUTH limit=60.979999542
AZIMUTH change speed=0.200000003
Image phase amplifier=1.549999952
Image phase amplifier maximum angle=126
Image phase amplifier maximum separation strength=60.86000061
Image width amplifier=1.299999952
Extra phase shift=0
Mono or stereo only=-0.550000012
[Channel Delay]
Enabled=0
Left Delay=0
[Output Filter]
Enabled=1
Lowpass filter=16500
[Final Pre-Limiter]
Enabled=1
Difference=0
Pre-amp=1.381183982
Response time=0.800000012
[Final Limiter]
Enabled=1
Difference=0
Pre-amp=1.000255942
Response time=0.0125
[FM Transmitter]
Enabled=0
Pre-emphasize=0
Pre-emphasis time=50
Output is pre-emphasized=0
Stereo encoder enabled=0
RDS encoder enabled=0
RDS PS text=2s:STEREO/2s:TOOL/<1=1.5s,2..-2=2t,-1=1.5s:WWW.STEREOTOOL.COM
RDS RadioText text=60s:Stereo Tool: Professional Audio Processing - http://www.stereotool.com/30s:Stereo Tool by Hans van Zutphen, 1999-2008 - http://www.stereotool.com
RDS PTY=0
RDS PI=65535
RDS Alternative frequency 1=0
RDS Alternative frequency 2=0
RDS Alternative frequency 3=0
RDS Alternative frequency 4=0
RDS Alternative frequency 5=0
RDS Alternative frequency 6=0
RDS Alternative frequency 7=0
RDS Alternative frequency 8=0
RDS Alternative frequency 9=0
RDS Alternative frequency 10=0
RDS Alternative frequency 11=0
RDS Alternative frequency 12=0
RDS Alternative frequency 13=0
RDS Alternative frequency 14=0
RDS Alternative frequency 15=0
RDS Alternative frequency 16=0
RDS Alternative frequency 17=0
RDS Alternative frequency 18=0
RDS Alternative frequency 19=0
RDS Alternative frequency 20=0
RDS Alternative frequency 21=0
RDS Alternative frequency 22=0
RDS Alternative frequency 23=0
RDS Alternative frequency 24=0
RDS Alternative frequency 25=0
RDS TP=0
RDS TA=0
RDS Music=1
RDS Artificial Head=0
RDS Compressed=1
RDS Dynamic PTY=0
RDS RadioText Enabled=1
RDS ClockTime Enabled=1
[Direct soundcard access]
Enabled=0
Device ID=
Volume=1
Buffer size=1
Send to Winamp=Nothing
ASIO Override channel 1=4
ASIO Override channel 2=5
[Low latency output]
Enabled=0
Device ID=
Volume=1
Buffer size=0.079999998
ASIO Override channel 1=2
ASIO Override channel 2=3
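For anyone who wants to inspect a preset like this outside Stereo Tool, the format parses as plain INI. Below is a minimal sketch using Python's standard `configparser`, with a short excerpt of the values above embedded so it runs on its own. Reading `-1` as "no clipping for this band" is my guess based on the description in this post, not something confirmed by the Stereo Tool documentation.

```python
import configparser

# Short excerpt of the preset above, embedded so the sketch runs on its own.
PRESET_EXCERPT = """\
[Common]
Pre amplifier=2.800000191
Post amplifier=0.899999976
[Multiband Compressor]
Enabled=1
Relative clip position - Band 6=-1
Relative clip position - Band 7=0.600000024
Relative clip position - Band 8=-1
[Output Filter]
Enabled=1
Lowpass filter=16500
"""


def load_preset(text):
    """Parse preset text, keeping key case and splitting only on '='."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep keys such as "Lowpass filter" as written
    parser.read_string(text)
    return parser


preset = load_preset(PRESET_EXCERPT)
print(preset.getfloat("Common", "Pre amplifier"))        # 2.800000191
print(preset.getint("Output Filter", "Lowpass filter"))  # 16500

# Bands in the excerpt whose clip position is not -1 (assumed to mean "off").
prefix = "Relative clip position - Band "
clipped_bands = [
    key[len(prefix):]
    for key, value in preset["Multiband Compressor"].items()
    if key.startswith(prefix) and float(value) != -1
]
print(clipped_bands)  # ['7']
```

Restricting the delimiter to `=` matters for the full preset, because some values (the RDS text lines) contain colons.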


Posted: Mon Jul 21, 2008 8:20 am

Joined: Mon Jun 09, 2008 2:45 pm
Posts: 104
Bumping for attention to this comment...

If anyone uses this and hears something they aren't satisfied with per my description above, please let me know. I want to improve this so that it accurately reproduces (and conditions) audio as necessary for listeners. I want to know what you all hear! This is tuned based on measurement, waveform analysis, and my ear.

Enjoy!


Posted: Mon Jul 21, 2008 11:54 am
Site Admin

Joined: Mon Mar 17, 2008 1:40 am
Posts: 11459
Hi SuperH,

When I have time to play with AAC encoding etc., I'll look into this further and see if I will add it as a preset. It does sound a lot better than your previous preset, and I think, based on the type of sounds that I hear, that SBR-based encoders will indeed have less trouble with it. Right now I'm very busy with some other improvements, but I'll get back to this.


Posted: Mon Jul 21, 2008 2:37 pm

Joined: Mon Jun 09, 2008 2:45 pm
Posts: 104
Thanks! Good luck with all of your other improvements!


Posted: Thu Jul 31, 2008 6:27 pm

Joined: Thu Jul 31, 2008 6:09 pm
Posts: 3
I haven't started internet streaming yet, but I am fascinated with AACplus v2 with PS. Right now I think it's the best codec out there for these low bitrate stereo streams (24, 32, 40 kbps bandwidths). If I get my stream up and running I will definitely try your settings, Super H. I wonder whether setting the codec sampling rate to 32kHz instead of 44.1kHz (or brickwall filtering the sound output to, say, 16 or 17kHz) would also improve the sound quality.

I think a lot of stations could improve the quality of their streams (reduce the swooshing artifacts) with the use of better processing. I think the combination of Stereo Tool with settings such as you suggested helps to establish a best practice for internet streaming at low bitrates. I hope streaming stations out there that are trying to improve their sound quality read this forum.

Super H, could you post a link to your stream?


Posted: Sat Aug 02, 2008 7:05 am

Joined: Mon Jun 09, 2008 2:45 pm
Posts: 104
Quote:
I haven't started internet streaming yet, but I am fascinated with AACplus v2 with PS. Right now I think it's the best codec out there for these low bitrate stereo streams (24, 32, 40 kbps bandwidths). If I get my stream up and running I will definitely try your settings, Super H. I wonder whether setting the codec sampling rate to 32kHz instead of 44.1kHz (or brickwall filtering the sound output to, say, 16 or 17kHz) would also improve the sound quality.

I think a lot of stations could improve the quality of their streams (reduce the swooshing artifacts) with the use of better processing. I think the combination of Stereo Tool with settings such as you suggested helps to establish a best practice for internet streaming at low bitrates. I hope streaming stations out there that are trying to improve their sound quality read this forum.

Super H, could you post a link to your stream?
Sure, but I want to iron a few kinks out of the processing first. I have found that I do have to tweak this a bit to compensate for changes in the latest version of Stereo Tool. I will post slightly modified settings soon.


Posted: Wed Aug 27, 2008 5:02 am

Joined: Mon Jun 09, 2008 2:45 pm
Posts: 104
I found a lot of kinks, especially since some processing aspects of Stereo Tool have changed so much recently.

I have come up with the following preliminary preset, which provides good sound quality at low bitrates based on the behavior of aacPlus (HE-AAC).

Anyone want to give me some feedback? Even as a simple non-CODEC audio preset. It gives better normalization than the previously posted settings, while maintaining a fair amount of dynamic range, keeping sibilance low, and avoiding all the artifacts above. Plus, there is a great amount of stereo separation, and it works with aacPlus (HE-AAC) V1 and V2, with and without parametric stereo.
Code:
[Common]
Pre amplifier=3
Post amplifier=0.850000024
Extra loudness=1
Extra loudness Maximum increase per step=1.340000033
Hard limit output=1
Downsample very high input sample rates to near 44.1 kHz=1
Process for low latency=0
Mode=Advanced
[Noise Gate]
Enabled=1
Difference=0
Noise level=2
[Singleband Compressor]
Enabled=0
Difference=0
Maximum volume=10
Maximum value=32767
Attack speed=0.999998987
Decay speed=0.999000013
Above Top Limiter=1
[Pre Compressor]
Enabled=1
Difference=0
Delay enabled=0
Maximum volume - Band 1=33000
Maximum volume - Band 2=20500
Attack speed - Band 1=0.000004831
Attack speed - Band 2=0.000004831
Decay speed - Band 1=0.000382996
Decay speed - Band 2=0.000382996
[Multiband Compressor]
Enabled=1
Difference=0
Delay enabled=0
Very high quality enabled=1
Maximum volume - Band -1=6950
Maximum volume - Band 0=6500
Maximum volume - Band 1=6250
Maximum volume - Band 2=5250
Maximum volume - Band 3=4250
Maximum volume - Band 4=4250
Maximum volume - Band 5=4500
Maximum volume - Band 6=6250
Maximum volume - Band 7=7800
Maximum volume - Band 8=5250
Attack speeds linked=0
Attack speed - Band -1=0.00150717
Attack speed - Band 0=0.00150717
Attack speed - Band 1=0.00150717
Attack speed - Band 2=0.00150717
Attack speed - Band 3=0.00150717
Attack speed - Band 4=0.00150717
Attack speed - Band 5=0.00150717
Attack speed - Band 6=0.000018806
Attack speed - Band 7=0.000018806
Attack speed - Band 8=0.000018806
Decay speeds linked=0
Decay speed - Band -1=0.000314613
Decay speed - Band 0=0.000314613
Decay speed - Band 1=0.000314613
Decay speed - Band 2=0.000314613
Decay speed - Band 3=0.000314613
Decay speed - Band 4=0.000314613
Decay speed - Band 5=0.000314613
Decay speed - Band 6=0.000958171
Decay speed - Band 7=0.000958171
Decay speed - Band 8=0.000958171
Above Top Limiter=1
Clipping enabled=1
Postprocessing enabled=1
Relative clip position - Band -1=1.500000238
Relative clip position - Band 0=1.500000238
Relative clip position - Band 1=1.666666746
Relative clip position - Band 2=1.857142687
Relative clip position - Band 3=1.857142687
Relative clip position - Band 4=-1
Relative clip position - Band 5=-1
Relative clip position - Band 6=0.65289247
Relative clip position - Band 7=0.408450603
Relative clip position - Band 8=-1
Final limiter value=0.5
Final limiter decay speed=0
Final limiter clipping=1
Equalizer enabled=1
Equalize before multiband-compression=1
Equalizer position - Band -1=1.857142687
Equalizer position - Band 0=1.857142687
Equalizer position - Band 1=1.500000238
Equalizer position - Band 2=1.500000238
Equalizer position - Band 3=1.409638405
Equalizer position - Band 4=1
Equalizer position - Band 5=0.801801801
Equalizer position - Band 6=0.801801801
Equalizer position - Band 7=0.754385948
Equalizer position - Band 8=0.754385948
[Stereo]
Enabled=1
Delay enabled=0
Difference=0
Center bass=1
AZIMUTH limit=60.979999542
AZIMUTH change speed=0.200000003
Image phase amplifier=1.450000048
Image phase amplifier maximum angle=145.800003052
Image phase amplifier maximum separation strength=65.870002747
Image width amplifier=0.949999988
Extra phase shift=0
Mono or stereo only=-0.25
[Channel Delay]
Enabled=0
Left Delay=0
[Output Filter]
Enabled=1
Lowpass filter=16000
Highpass filter=20
[Final Pre-Limiter]
Enabled=1
Difference=0
Pre-amp=0.957104027
Response time=0.200000003
[Final Limiter]
Enabled=1
Difference=0
Pre-amp=1.000255942
Response time=0.0125
[FM Transmitter]
Enabled=0
Pre-emphasize=0
Pre-emphasis time=50
Output is pre-emphasized=0
Stereo encoder enabled=0
RDS encoder enabled=0
Pilot signal volume=9
RDS signal volume=4.5
RDS PS text=2s:STEREO/2s:TOOL/<1=1.5s,2..-2=2t,-1=1.5s:WWW.STEREOTOOL.COM
RDS RadioText text=60s:Stereo Tool: Professional Audio Processing - www.stereotool.com/30s:Stereo Tool by Hans van Zutphen, 1999-2008 - www.stereotool.com
RDS PTY=0
RDS PI=65535
RDS Alternative frequency 1=0
RDS Alternative frequency 2=0
RDS Alternative frequency 3=0
RDS Alternative frequency 4=0
RDS Alternative frequency 5=0
RDS Alternative frequency 6=0
RDS Alternative frequency 7=0
RDS Alternative frequency 8=0
RDS Alternative frequency 9=0
RDS Alternative frequency 10=0
RDS Alternative frequency 11=0
RDS Alternative frequency 12=0
RDS Alternative frequency 13=0
RDS Alternative frequency 14=0
RDS Alternative frequency 15=0
RDS Alternative frequency 16=0
RDS Alternative frequency 17=0
RDS Alternative frequency 18=0
RDS Alternative frequency 19=0
RDS Alternative frequency 20=0
RDS Alternative frequency 21=0
RDS Alternative frequency 22=0
RDS Alternative frequency 23=0
RDS Alternative frequency 24=0
RDS Alternative frequency 25=0
RDS TP=0
RDS TA=0
RDS Music=1
RDS Artificial Head=0
RDS Compressed=1
RDS Dynamic PTY=0
RDS RadioText Enabled=1
RDS ClockTime Enabled=1
[Direct soundcard access]
Enabled=0
Device ID=
Volume=1
Buffer size=1
Send to Winamp=Nothing
ASIO Override channel 1=4
ASIO Override channel 2=5
[Low latency output]
Enabled=0
Device ID=
Volume=1
Buffer size=0.079999998
ASIO Override channel 1=2
ASIO Override channel 2=3
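To see what changed between the July preset and this one, a small diff helper is handy. The sketch below is a hand-rolled parser (the function names are mine, purely for illustration) run on short excerpts of the two presets. Values are compared as text, so `3` and `3.0` would count as different.

```python
def parse_preset(text):
    """Parse '[Section]' and 'key=value' lines into {(section, key): value}."""
    settings = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, _, value = line.partition("=")
        settings[(section, key)] = value
    return settings


def diff_presets(old, new):
    """Return {(section, key): (old_value, new_value)} for every difference.

    A value of None means the key is absent from that preset.
    """
    changes = {}
    for k in sorted(set(old) | set(new)):
        if old.get(k) != new.get(k):
            changes[k] = (old.get(k), new.get(k))
    return changes


# Excerpts of the two presets posted in this thread.
JULY = """\
[Common]
Pre amplifier=2.800000191
Post amplifier=0.899999976
[Output Filter]
Enabled=1
Lowpass filter=16500
"""

AUGUST = """\
[Common]
Pre amplifier=3
Post amplifier=0.850000024
[Output Filter]
Enabled=1
Lowpass filter=16000
Highpass filter=20
"""

changes = diff_presets(parse_preset(JULY), parse_preset(AUGUST))
for (section, key), (before, after) in changes.items():
    print(f"[{section}] {key}: {before} -> {after}")
```

Settings that are identical in both presets (such as `Enabled=1` under `[Output Filter]`) are left out, and a setting that exists in only one preset shows `None` on the other side.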


Posted: Mon Oct 26, 2009 5:13 am

Joined: Mon Jun 09, 2008 2:45 pm
Posts: 104
hvz: I'd be interested in hearing your take on my updated preset, before I publish anything. I moved back to Windows. At least, I can play with Windows 7 for a while...

In my Mac's defense - I love it.

http://superh.dyndns.org:8000/listen

This is HE-AAC (SBR) 32kbps.

http://superh.dyndns.org:8000/high

This is MP3 128kbps.


Posted: Mon Oct 26, 2009 9:58 am
Site Admin

Joined: Mon Mar 17, 2008 1:40 am
Posts: 11459
Hi SuperH,

Welcome back!

I could only listen for a very short time (had to leave for work) but it seems to sound good. I couldn't really notice that it was only 32 kbit/s!

I'm planning to release a new version today or tomorrow; if you post your settings before I release it, I can include them in the new version.

By the way, did you use version 4.21 or the latest pre-release (see viewtopic.php?f=3&t=729&start=10 )? The new filter that I introduced there ("protect against excessive reverb") MIGHT make the Stereo Boost filter more usable for very low bitrate streams...


Posted: Mon Oct 26, 2009 2:44 pm

Joined: Mon Jun 09, 2008 2:45 pm
Posts: 104
Darn it!

I would have posted my settings, but I'm at work now. The soonest I can get them is nine hours from now.

I'll try to paste them as soon as I can. I no longer destroy dynamic range by pumping the gain down. I took a different approach to it.

Thanks!

4.21 is what I used.

