USER’S MANUAL
Section 6: AUDIO CODING REFERENCE 110
delta, between successive audio samples compared to using the individual values. Further
efficiency is had by adaptively varying the difference comparator according to the nature of the
program material. G.722 and APT-X are examples of ADPCM schemes. They achieve around a
factor of 4:1 reduction in bit rate.
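The difference-coding idea can be illustrated with a minimal sketch. It shows plain (non-adaptive) delta coding only; it is not the G.722 or APT-X algorithm, and the function names and sample values are made up for illustration.

```python
def delta_encode(samples):
    """Return the first sample followed by successive differences."""
    deltas = []
    previous = 0
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas


def delta_decode(deltas):
    """Rebuild the original samples by accumulating the differences."""
    samples = []
    current = 0
    for delta in deltas:
        current += delta
        samples.append(current)
    return samples


# Hypothetical slowly varying audio samples (illustrative values only).
samples = [1000, 1004, 1009, 1011, 1008, 1002]
deltas = delta_encode(samples)   # [1000, 4, 5, 2, -3, -6]
restored = delta_decode(deltas)  # identical to samples
```

After the first value, the differences span a much smaller range than the samples themselves, so they can be carried in fewer bits. An adaptive scheme would additionally vary the step size used for the differences as the program material changes.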
G.722 achieves additional efficiency by allocating its bits to match the patterns in the human
voice, and it’s considered adequate for news and talk programming over ISDN. But, for
high-fidelity transmission, algorithms with more power are required. These are based on
psychoacoustics, where the coding process is adapted to the way we hear sounds. There are
several algorithms available, with varying complexity and performance levels.
Some years ago, the international standards group ISO/IEC established the ISO/MPEG (Moving
Picture Experts Group) to develop a universal standard for encoding moving pictures and sound
for digital storage and transmission media. The standard was finalized in November 1992 with
three related algorithms, called Layers, defined to take advantage of psychoacoustic effects
when coding audio. Layers 1 and 2 are intended for compression factors of about 4:1 and 6:1 to
8:1 respectively, and these algorithms have become popular in satellite and hard-disk systems.
Layer-3 achieves compression up to 12.5:1 (8% of the original size), making it ideal for ISDN.
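To put the compression factors in perspective, here is a rough calculation. It assumes a CD-style source (44.1 kHz, 16-bit, stereo) and an ISDN connection using two 64 kbps B channels; neither figure is stated in this section, and the function name is illustrative.

```python
def compressed_bitrate(sample_rate_hz, bits_per_sample, channels, ratio):
    """Bit rate in bits per second after applying a compression ratio."""
    pcm_bitrate = sample_rate_hz * bits_per_sample * channels
    return pcm_bitrate / ratio


# Assumed source format: 44.1 kHz, 16-bit, stereo linear PCM.
pcm_bps = compressed_bitrate(44100, 16, 2, 1)        # 1,411,200 bps uncompressed
layer1_bps = compressed_bitrate(44100, 16, 2, 4)     # about 353 kbps at 4:1
layer3_bps = compressed_bitrate(44100, 16, 2, 12.5)  # about 113 kbps at 12.5:1

# Assumed ISDN capacity: two 64 kbps B channels.
isdn_bps = 2 * 64000
fits_isdn = layer3_bps <= isdn_bps

# A 12.5:1 ratio leaves 1 / 12.5 = 0.08, i.e. 8% of the original size.
remaining_fraction = 1 / 12.5
```

Under these assumptions, only the 12.5:1 figure brings the stream under the 128 kbps available on the two B channels.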
Basic Principles of Perceptual Coding
With perceptual coding, only information that can be perceived by the human auditory system is
retained.
Lossless (which, for audio, translates to noiseless) coding with perfect reconstruction would
be an optimum system, since no information would be lost or altered. It might seem that
lossless, redundancy-reducing methods (such as PKZIP, Stuffit, Stacker, and others used for
computer hard-disk compression) would be applicable to audio. Unfortunately, no constant
compression rate is possible due to signal-dependent variations in redundancy. There are highly
redundant signals like constant sine tones (where the only information necessary is the
frequency, phase, amplitude, and duration of the tone), while other signals, such as those which
approach broadband noise, may be completely unpredictable and contain no redundancy at all.
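The signal dependence of lossless compression is easy to demonstrate. The sketch below uses Python's standard zlib module as a stand-in for a general-purpose lossless compressor; the sample rate, tone frequency, and amplitude are arbitrary choices made for illustration.

```python
import math
import random
import struct
import zlib

SAMPLE_RATE = 8000   # assumed sample rate, chosen for convenience
NUM_SAMPLES = 8000   # one second of audio


def compressed_fraction(samples):
    """Compressed size divided by raw size for 16-bit PCM samples."""
    raw = struct.pack("<%dh" % len(samples), *samples)
    return len(zlib.compress(raw)) / len(raw)


# A constant 1 kHz sine tone: highly redundant.
tone = [
    int(round(16000 * math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE)))
    for n in range(NUM_SAMPLES)
]

# Uniform random samples approximating broadband noise: no redundancy.
rng = random.Random(0)
noise = [rng.randint(-32768, 32767) for _ in range(NUM_SAMPLES)]

tone_fraction = compressed_fraction(tone)
noise_fraction = compressed_fraction(noise)
```

The tone shrinks to a small fraction of its raw size, while the noise does not shrink at all in any useful sense. A fixed-rate channel has to be sized for the second case.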
Furthermore, looking for redundancy can take time. While a popular song might have three
choruses with identical audio data that would need to be coded only once, you’d have to store
and analyze the entire song in order to find them. Any system intended for a real-time use over
telephone channels must have a consistent output rate and be able to accommodate the worst
case, so effective audio compression is impossible with redundancy reduction alone.
Fortunately, psychoacoustics permits a clever solution! Effects called “masking” have been
discovered in the human auditory system. These masking effects (which merely prove that our
brain is also doing something similar to bit rate reduction) have been found to occur in both the
frequency and time domains and can be exploited for audio data reduction.
Most important for audio coding are the effects in the frequency domain. Research into
perception has revealed that a tone or narrow-band noise at a certain frequency inhibits the
audibility of other signals that fall below a threshold curve centered on a masking signal.
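As a toy illustration of a threshold curve centered on a masker, the sketch below uses a symmetric, straight-line falloff in dB per octave. The offset and slope constants are hypothetical and are not taken from any standard or from measured data; real masking curves are asymmetric and depend on level and frequency.

```python
import math


def masking_threshold_db(freq_hz, masker_hz, masker_db,
                         offset_db=10.0, slope_db_per_octave=25.0):
    """Toy masking threshold at freq_hz caused by a masker.

    The threshold peaks offset_db below the masker level and falls off
    linearly (in dB) with distance in octaves. Both constants are
    hypothetical values chosen only for illustration.
    """
    octaves_away = abs(math.log2(freq_hz / masker_hz))
    return masker_db - offset_db - slope_db_per_octave * octaves_away


def is_masked(freq_hz, level_db, masker_hz, masker_db):
    """True if a signal falls below the toy threshold curve."""
    return level_db < masking_threshold_db(freq_hz, masker_hz, masker_db)


# A 40 dB signal close to an 80 dB masker at 1 kHz is hidden...
near = is_masked(1100, 40, 1000, 80)
# ...but the same 40 dB signal two octaves away is not.
far = is_masked(4000, 40, 1000, 80)
```

A perceptual coder can spend fewer bits on components for which a test like this returns true, since the listener would not hear them anyway.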
The figure below shows two “threshold of audibility” curves. The lower one is the typical
frequency sensitivity of the human ear when presented with a single swept tone. When a single
constant tone is added, the threshold of audibility changes as shown in the upper curve. The