What is jitter and how does it affect audio quality?
In the audio field the term jitter designates a timing uncertainty of digital clock signals. For example, in an Analog to Digital Converter (A/D) the analog signal is sampled (measured) at regular time intervals; in the case of a CD, 44,100 times a second, or once every 22.675737 microseconds.
If these time intervals are not strictly constant then one talks of a jittery conversion clock. In practice it is of course not possible to generate exactly the same time interval between each and every sample. After all, even digital signals are analog in their properties and thus are influenced by noise, crosstalk, power supply fluctuations, temperature etc.
Hence a jittery clock introduces errors into the measurements taken by the A/D, because the measurements are taken at the wrong time. The error is larger at high audio frequencies, because high frequency signals have steeper slopes, so the same timing error produces a larger amplitude error.
A good designer takes care to keep the amount of jitter in his/her design as low as possible.
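The dependence on signal frequency can be checked numerically. The following is a minimal sketch, assuming a full-scale sine wave and Gaussian clock jitter of 1 nanosecond RMS; both the jitter figure and the function name are illustrative assumptions, not measurements of any real converter.

```python
import math
import random

SAMPLE_RATE = 44_100
PERIOD = 1.0 / SAMPLE_RATE  # about 22.68 microseconds between samples


def jitter_error_rms(tone_hz, jitter_rms_s, num_samples=20_000, seed=0):
    """RMS amplitude error when a unit sine is sampled by a jittery clock.

    Assumption: each sampling instant is displaced by independent
    Gaussian noise with standard deviation jitter_rms_s.
    """
    rng = random.Random(seed)
    total = 0.0
    for n in range(num_samples):
        ideal_t = n * PERIOD
        actual_t = ideal_t + rng.gauss(0.0, jitter_rms_s)
        wanted = math.sin(2 * math.pi * tone_hz * ideal_t)
        measured = math.sin(2 * math.pi * tone_hz * actual_t)
        total += (measured - wanted) ** 2
    return math.sqrt(total / num_samples)


error_1k = jitter_error_rms(1_000, 1e-9)
error_20k = jitter_error_rms(20_000, 1e-9)
print(f"1 kHz tone:  RMS error {error_1k:.2e} of full scale")
print(f"20 kHz tone: RMS error {error_20k:.2e} of full scale")
```

With the same clock, the 20kHz tone should show roughly twenty times the error of the 1kHz tone, since the error scales with the slope of the signal.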
What Type Of Equipment Can Be
Three types of equipment are affected: the A/D Converter as described above, the D/A Converter, where the same mechanism applies, and the Asynchronous Sample Rate Converter (ASRC). The ASRC is not usually found in Hi-Fi systems. It is used by sound engineers to change the sample rate, e.g. from 96kHz to 44.1kHz in order to put a 96kHz recording onto a 44.1kHz CD.
You may now argue that in High-End Hi-Fi there are such things as "Oversamplers" or "Upsamplers".
Yes, those are in essence sampling rate converters; however, in a well designed system these converters employ a synchronous design, where jitter does not play any role. Of course a conversion between 96kHz and 44.1kHz, as in the example above, can be done in a synchronous manner as well. In fact, an ASRC is only required where one or both of the sampling frequencies involved change over time ("varispeed" mode of digital audio recorders) or where it is impractical to synchronize the two sampling frequencies.
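A synchronous conversion is possible whenever the two rates are locked in a fixed ratio of whole numbers. As a small sketch (the helper name is an assumption made for illustration), the ratio for the 96kHz to 44.1kHz example can be reduced like this:

```python
from fractions import Fraction


def conversion_ratio(source_hz, target_hz):
    """Return target/source as a reduced fraction.

    The numerator is the interpolation factor and the denominator the
    decimation factor of an idealized synchronous converter.
    """
    return Fraction(target_hz, source_hz)


ratio = conversion_ratio(96_000, 44_100)
print(f"96 kHz to 44.1 kHz: interpolate by {ratio.numerator}, "
      f"decimate by {ratio.denominator}")
```

When the rates drift independently or vary over time, no such fixed ratio exists, which is the case where an ASRC is needed.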
So basically in Hi-Fi jitter matters wherever A/D or D/A converters are involved. CD and DVD players are by far the most numerous type of equipment employing D/A converters, alongside, of course, stand-alone D/A converters. Jitter, being an analog quantity, can creep in at various places. The D/A converter built into CD or DVD players can be "infected" by jitter through various crosstalk mechanisms, such as power supply contamination by power hungry motors (spindle / servo), microphony of the crystal generating the sampling clock, or capacitive / inductive crosstalk between clock signals.
In the stand-alone D/A converter jitter can be introduced by inferior cables between the source (e.g. CD player) and the D/A converter unit, or by the same mechanisms as described above, except for the motors of course.
In the case of a stand-alone D/A converter (such as the MEDEA), one has to take two different jitter contamination paths into account.
One is the internal path, where internal signals can affect the jitter amount of the sampling clock generator. Here, all the good old analog design principles have to be applied, such as shielding from electric or magnetic fields, good grounding, good power supply decoupling, and good signal transmission between the clock generator and the actual D/A chip.
The other path is the external signal coming from the source, to which the sampling clock has to be locked. I.e. the D/A converter has to run synchronously with the incoming digital audio signal, and thus the frequency of the internal sampling clock generator has to be controlled so that it runs at the same sampling speed as the source (CD player). This control is performed by a Phase Locked Loop (PLL), which is a control system with error feedback. Of course the PLL has to be able to follow the long term fluctuations of the source, e.g. the sampling rate of the source will alter slightly over time or over temperature; it will not be a constant 44.1kHz in the case of a CD. But the PLL should not follow the short term fluctuations (jitter). Think of the PLL as being like a very slow-reacting flywheel.
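The flywheel behaviour can be illustrated with a linearized model. This is a minimal sketch, assuming a simple first-order loop (a one-pole low-pass acting on the timing error); the loop gain, the jitter level and the slow wander are all assumed values chosen for illustration, and the model is not a description of any particular PLL circuit.

```python
import math
import random


def simulate_pll(num_samples=400_000, alpha=0.001, jitter_rms_ns=1.0,
                 wander_ns=10.0, seed=1):
    """Track an incoming clock whose timing has slow wander plus fast jitter.

    Returns (input jitter RMS, residual output error RMS) in nanoseconds.
    alpha is the assumed loop gain: small values make the loop react slowly.
    """
    rng = random.Random(seed)
    tracked = 0.0
    sum_in = 0.0
    sum_out = 0.0
    for n in range(num_samples):
        # Slow drift of the source (one full cycle over the whole run).
        wander = wander_ns * math.sin(2 * math.pi * n / num_samples)
        # Fast, random timing error on every incoming clock edge.
        jitter = rng.gauss(0.0, jitter_rms_ns)
        incoming = wander + jitter
        # Error feedback: move a small step towards the incoming timing.
        tracked += alpha * (incoming - tracked)
        sum_in += jitter ** 2
        sum_out += (tracked - wander) ** 2
    return math.sqrt(sum_in / num_samples), math.sqrt(sum_out / num_samples)


rms_in, rms_out = simulate_pll()
print(f"jitter on incoming clock: {rms_in:.3f} ns RMS")
print(f"error on recovered clock: {rms_out:.3f} ns RMS")
```

The recovered clock follows the slow wander closely while most of the fast jitter is averaged away. A smaller alpha suppresses jitter further, at the cost of following the source more slowly.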
In the MEDEA we employ a two stage PLL circuitry which very effectively suppresses jitter. A common problem with most PLLs used in audio circuitry is that they suppress jitter only at higher jitter frequencies. Low jitter frequencies (e.g. below 1kHz or so) are often only marginally suppressed. It has been shown, however, that low frequency jitter can have a large influence on the audio quality. The MEDEA suppresses even very low frequency jitter components down to the sub-Hertz range.
This means that the MEDEA is virtually immune to jitter from the audio source. For a CD player as a source this means that as long as the data is read off the CD correctly (i.e. no interpolations or mutes) you should hardly hear any difference between different makes of CD players or between different pressings of the same CD. Also "accessories" like disc damping devices or extremely expensive digital cables will not make any difference in sonic quality. Of course it is always a good idea to have a good quality cable for digital (or analog) audio transmission - but within reason.
Upsampling, Oversampling and Sampling Rate Conversion in General
In consumer audio circles the two terms oversampling and upsampling are in common use. Both terms essentially mean the same thing: a change of the sampling frequency to a higher value. Upsampling usually means a change in sampling rate performed by a dedicated algorithm [e.g. implemented on a Digital Signal Processor chip (DSP)] ahead of the final D/A conversion (the D/A chip), while oversampling means the change in sampling rate performed inside today's D/A converter chips themselves.
But let's start at the beginning. What is the sampling frequency? For any digital storage or transmission it is necessary to have time discrete samples of the signal which has to be processed. I.e. the analog signal has to be sampled at discrete time intervals and then converted to digital numbers. (Also see "Jitter Suppression and Clocking" above.) This sampling and conversion process happens in the so called Analog to Digital Converter (A/D). The inverse happens in the Digital to Analog Converter (D/A).
The sampling theorem states that in order to represent any given analog signal in the digital domain, one has to sample that signal at a rate of at least twice the highest frequency contained in the analog signal. If this rule is violated, so called aliasing components are generated, which are perceived as a very nasty kind of distortion. So if one defines the audio band of interest to lie between 0 and 20kHz, then the minimum sampling frequency for such signals must be 40kHz.
For practical reasons explained below, the sampling frequency of 44.1kHz was chosen for the CD. A sampling frequency of 44.1kHz allows signals up to 22.05kHz to be represented. The designer of the system has to take care that any frequencies above 22.05kHz are sufficiently suppressed before sampling at 44.1kHz. This suppression is done with the help of a low pass filter which cuts off the frequencies above 22.05kHz. In practice such a filter has a limited steepness, i.e. if it suppresses frequencies above 22.05kHz it also suppresses frequencies between 20kHz and 22.05kHz to some extent. So in order to have a filter which sufficiently suppresses frequencies above 22.05kHz one has to allow it to have a so called transition band between 20kHz and 22.05kHz, where it gradually builds up its suppression.
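To see what aliasing means in numbers, the sketch below computes where a tone that is too high for the sampling rate ends up. The function name and the 25kHz test tone are assumptions made for illustration.

```python
import math


def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a sampled tone appears in the band 0..fs/2."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


fs = 44_100
tone = 25_000                      # above half the sampling rate
alias = alias_frequency(tone, fs)
print(f"A {tone} Hz tone sampled at {fs} Hz shows up at {alias} Hz")

# The samples of the original tone and of its alias are indistinguishable.
max_difference = max(
    abs(math.cos(2 * math.pi * tone * n / fs)
        - math.cos(2 * math.pi * alias * n / fs))
    for n in range(100)
)
print(f"largest sample difference: {max_difference:.1e}")
```

Once sampled, the two tones produce the same numbers, so the unwanted component cannot be removed afterwards. It has to be filtered out before sampling.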
Note that so far we have talked about the so called anti-aliasing filter which filters the audio signal ahead of the A/D conversion process. For the D/A conversion, which is of more interest to the High-End Hi-Fi enthusiast, essentially the same filter is required. This is because after the D/A conversion we have a time discrete analog signal, i.e. a signal which looks like steps, having the rate of the sampling frequency.
Such a signal contains not only the original audio signal between 0 and 20kHz but also replicas of the same signal, symmetrical around multiples of the sampling frequency. This may sound complicated, but the essence is that the sampling process has created signals above 22.05kHz. These have to be suppressed, so that they do not cause intermodulation distortion in the amplifier and speakers, do not burn out tweeters and do not make the dog go mad.
Again a low pass filter, called a "reconstruction filter", is there to suppress those frequencies. The same applies to the reconstruction filter as to the anti-aliasing filter: pass-band up to 20kHz, transition-band between 20kHz and 22.05kHz, stop-band above 22.05kHz. You may think that such a filter is rather "steep", i.e. frequencies between 0 and 20kHz go through unaffected and frequencies above 22.05kHz are suppressed to maybe 1/100,000th of their initial value. You are right, such a filter is very steep and as such has some nasty side effects.
For instance it does strange things to the phase near the cutoff frequency (20kHz), and it shows ringing due to the high steepness. In the early days of digital audio these side effects were recognized as one of the main reasons why digital audio sounded bad.
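The position of these replicas is simple arithmetic. As a short sketch (the function name and the 1kHz example tone are assumptions), the replicas of a single tone lie at every multiple of the sampling frequency, minus and plus the tone frequency:

```python
def image_frequencies(tone_hz, sample_rate_hz, num_multiples):
    """Replica frequencies of a tone around multiples of the sampling rate."""
    images = []
    for k in range(1, num_multiples + 1):
        images.append(k * sample_rate_hz - tone_hz)
        images.append(k * sample_rate_hz + tone_hz)
    return images


for freq in image_frequencies(1_000, 44_100, 2):
    print(f"replica at {freq / 1000:.1f} kHz")
```

For a 20kHz tone the lowest replica sits at 24.1kHz, which is why so little room is left for the filter at a 44.1kHz sampling rate.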
So engineers looked for ways to enhance those filters. They can't be eliminated because we are talking laws of physics here. But what if we run the whole thing at higher sampling rates? Like 96kHz or so? With 96kHz we can allow frequencies up to 48kHz, so the reconstruction filter can have a transition band between 20kHz and 48kHz, a very much relaxed frequency response indeed. So let's run the whole thing at 96kHz or even higher! Well - the CD stays at 44.1kHz. So in order to give that analog lowpass filter (the reconstruction filter) a relaxed frequency response we have to change the sampling frequency before the D/A process. Here is where the upsampler comes in. It takes the 44.1kHz from the CD and upsamples it to 88.2kHz or 176.4kHz or even higher. The output of the upsampler is then fed to the D/A converter, which in turn feeds the reconstruction filter.
All modern audio D/A converter chips have such an upsampler (or oversampler) already built into the chip. One particular chip, for instance, upsamples the signal by a factor of eight, i.e. 44.1kHz ends up at 352.8kHz. Such a high sampling frequency relaxes the job of the reconstruction filter very much; it can be built as a simple 3rd order filter.
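How much the filter is relaxed can be expressed as the width of the transition band in octaves. The sketch below follows the convention used in the text, with the pass-band ending at 20kHz and the stop-band starting at half the sampling frequency; the function name is an assumption.

```python
import math


def transition_band_octaves(sample_rate_hz, passband_edge_hz=20_000.0):
    """Width of the band between the pass-band edge and half the sampling rate."""
    return math.log2((sample_rate_hz / 2) / passband_edge_hz)


for rate in (44_100, 96_000, 352_800):
    width = transition_band_octaves(rate)
    print(f"{rate / 1000:6.1f} kHz sampling: transition band {width:.2f} octaves")
```

At 44.1kHz the filter has only about a seventh of an octave in which to build up its suppression, while at 352.8kHz it has more than three octaves, which is why a gentle low order filter becomes sufficient.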
So, how come upsamplers are such a big thing in High-End Hi-Fi circles? The problem with upsamplers is that they are filters again, digital ones, but still filters. So in essence the problem of the analog reconstruction filter has been transferred to the digital domain, into the upsampler filters. The big advantage of doing it in the digital domain is that it can be done with a linear phase response, which means that there are no strange phase shifts near 20kHz, and the ringing can also be controlled to some extent. Digital filters in turn have other problems and of course leave the designer quite a few degrees of freedom. This means that the quality of digital filters can vary at least as much as the quality of analog filters can. So for a High-End Hi-Fi designer it is a question of whether the oversampling filter built into the D/A chip lives up to his/her expectations. If not, he/she can choose to design his/her own upsampler and bypass part of or the whole oversampler in the D/A chip. This gives the High-End Hi-Fi designer yet another degree of freedom to optimize the sonic quality of the product.
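To make the idea of an upsampler as a digital filter concrete, here is a minimal sketch of a factor-of-two upsampler: zeros are inserted between the input samples and the result is passed through a symmetric (and therefore linear phase) low pass filter. The Hann-windowed sinc design and the 63 tap length are assumptions chosen for brevity; real products use far more elaborate filters.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Windowed-sinc low pass; cutoff is a fraction of the output sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for i in range(num_taps):
        x = i - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * i / (num_taps - 1))  # Hann
        taps.append(ideal * window)
    return taps


def upsample_2x(samples, taps):
    """Double the sampling rate: insert zeros, then low pass filter."""
    stuffed = []
    for s in samples:
        stuffed.extend([s, 0.0])
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, h in enumerate(taps):
            if 0 <= n - k < len(stuffed):
                acc += h * stuffed[n - k]
        out.append(2.0 * acc)  # gain of 2 makes up for the inserted zeros
    return out


NUM_TAPS = 63
taps = lowpass_taps(NUM_TAPS, 0.25)      # cutoff at the old half sampling rate
tone = [math.sin(2 * math.pi * 1_000 * n / 44_100) for n in range(200)]
upsampled = upsample_2x(tone, taps)      # now at 88.2 kHz
print(f"{len(tone)} input samples -> {len(upsampled)} output samples")
```

Because the coefficients are symmetric, every frequency is delayed by the same amount (here 31 output samples), which is the linear phase property mentioned above. The choices of window, length and cutoff are exactly the degrees of freedom the designer has to decide on.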
For the MEDEA we have decided to do part of the upsampling (the most critical part in fact) in the Digital Signal Processor (DSP) chip external to the D/A chip.
Reconstruction filters have been mentioned in the "Upsampling, Oversampling and Sampling Rate Conversion in General" paragraph above. If you have read that paragraph you know what the purpose of the reconstruction filter is. The main point about this analog filter is that its frequency response should be as smooth and flat as possible in order to have a virtually linear phase response. The MEDEA employs a 3rd order filter for that purpose. A 3rd order filter is sufficient to suppress the frequencies above approximately 176.4kHz, which is the relevant limit because the D/A converter in the MEDEA runs at a sampling frequency of 352.8kHz (or 384kHz).
Analog Output Stages
The MEDEA employs class A output stages with a virtually zero ohm output impedance. Class A inherently guarantees very low distortion figures even when operating the amplifier in open loop (i.e. without feedback). The feedback is added to lower the distortion figures even further. The distortion figures of the MEDEA, which by the way are excellent, are predominantly caused by the D/A converter and not by the output stage.
A very low output impedance ensures that the performance of the combination of the MEDEA and the subsequent amplifier is not compromised by the cables between the two or by the input impedance characteristics of the amplifier. A low output impedance can make stability difficult to achieve, but we have taken care of that problem in the MEDEA.
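The effect of the output impedance can be estimated with a simple voltage divider. This is a rough sketch that assumes purely resistive impedances; the impedance values are hypothetical examples and not specifications of the MEDEA or of any amplifier.

```python
import math


def level_loss_db(output_impedance_ohm, input_impedance_ohm):
    """Level change caused by the divider formed by source and load."""
    ratio = input_impedance_ohm / (input_impedance_ohm + output_impedance_ohm)
    return 20 * math.log10(ratio)


# Hypothetical amplifier input impedance of 10 kOhm.
for z_out in (0.1, 100.0, 1_000.0):
    loss = level_loss_db(z_out, 10_000.0)
    print(f"source impedance {z_out:7.1f} Ohm: {loss:.4f} dB")
```

The lower the source impedance, the less the level depends on the load. In practice cable capacitance and a frequency dependent input impedance would make the loss vary with frequency as well, which this simple model does not cover.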
You have probably not heard the term dithering in conjunction with audio. Actually it is a term widely used in the professional audio realm but not so much in the High-End Hi-Fi market.
What is dithering? Suppose a digital recording has been made with a 24 bit A/D converter and a 24 bit recorder. Now this recording is to be transferred to a CD, which has just 16 bits per sample, as you know. What to do with the 8 surplus bits? The simplest way is to cut them off, i.e. to truncate them. This, unfortunately, generates harmonic distortion. The distortion is at a low level, but it nonetheless causes the audio to sound harsh and unpleasant. The harmonic distortion is generated because the eight bits which are cut off from the 24 bits are correlated with the audio signal; hence the resulting error is also correlated, and thus there is distortion and not just noise (noise would be uncorrelated). The dithering technique is used to de-correlate the error from the signal. This can be achieved by adding very low level noise to the original 24 bit signal before truncation. After truncation the signal does not show any distortion components but a slightly increased noise floor. This works like magic... the distortion is replaced by a small amount of noise - much more pleasant.
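The effect can be demonstrated with a very quiet tone. The sketch below reduces a 24 bit sine wave with an amplitude of about 0.4 of a 16 bit step to 16 bits, once by plain truncation and once with noise added first. The triangular (TPDF) dither, the half step rounding offset and all names are assumptions made for this illustration; the text above does not prescribe a particular kind of dither noise.

```python
import cmath
import math
import random

SHIFT = 8              # going from 24 to 16 bits drops the 8 lowest bits
LSB16 = 1 << SHIFT     # one 16 bit step, expressed in 24 bit units


def make_tone(num_samples, cycles, amplitude_24bit):
    """Sine wave as 24 bit integers with a whole number of cycles."""
    return [round(amplitude_24bit * math.sin(2 * math.pi * cycles * n / num_samples))
            for n in range(num_samples)]


def truncate(samples_24bit):
    """Simply cut off the 8 lowest bits."""
    return [s >> SHIFT for s in samples_24bit]


def dither_and_truncate(samples_24bit, seed=0):
    """Add triangular noise (plus half a step for rounding), then cut off."""
    rng = random.Random(seed)
    out = []
    for s in samples_24bit:
        tpdf = rng.randrange(LSB16) - rng.randrange(LSB16)
        out.append((s + LSB16 // 2 + tpdf) >> SHIFT)
    return out


def component_amplitude(samples, cycles):
    """Amplitude of one frequency component, in 16 bit steps."""
    total = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * cycles * n / total)
              for n, s in enumerate(samples))
    return 2 * abs(acc) / total


tone = make_tone(44_100, 1_000, 102)           # 1 kHz at 44.1 kHz, very quiet
plain = truncate(tone)
dithered = dither_and_truncate(tone)

h3_plain = component_amplitude(plain, 3_000)        # third harmonic
h3_dithered = component_amplitude(dithered, 3_000)
fund_dithered = component_amplitude(dithered, 1_000)
print(f"third harmonic, truncated: {h3_plain:.3f} steps")
print(f"third harmonic, dithered:  {h3_dithered:.3f} steps")
print(f"fundamental, dithered:     {fund_dithered:.3f} steps")
```

Plain truncation turns the quiet sine wave into something close to a square wave with a strong third harmonic. With dither the third harmonic sinks into the noise and the tone keeps its original amplitude of about 0.4 steps.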
I have given the example of a 24 bit recording which has to be truncated to 16 bits. Where is the application in High-End Hi-Fi audio? More and more signal processing is implemented in the digital domain. Think of digital equalizers, digital volume controls, upsamplers, digital pre-amplifiers, decoders for encoded signals on DVD etc. All those applications perform mathematical operations on the digital audio signal. This in turn causes the word length of the signal to increase. E.g. the input signal to an upsampler may have a word length of 16 bits (off a CD), but the output signal of the upsampler may have 24 bits or even more. This comes from the fact that the mathematical operations employed in such devices increase the word length (e.g. a multiplication of two 2 digit numbers can result in a four digit number). So after the upsampler the word length may be higher than the subsequent processor is able to accept. In this example, after the upsampler there may be a D/A converter with a 24 bit input word length capability. So if the upsampler generates a word length of more than 24 bits, it should be dithered to 24 bits for maximum signal fidelity.
I hope these excursions into the theory and practice of audio engineering have been useful for you. If you would like to dive further into these issues I recommend visiting the website of Mr. Bob Katz, a renowned Mastering Engineer and a Weiss Engineering customer. He publishes articles on Dithering and Jitter and many other topics at http://www.digido.com.
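The growth in word length is easy to verify. The following small sketch (the helper name is an assumption) counts the bits needed to hold the product of a 16 bit sample and a 16 bit filter coefficient in two's complement form:

```python
def signed_bits_needed(value):
    """Smallest two's complement word length that can hold value."""
    if value >= 0:
        return value.bit_length() + 1
    return (-value - 1).bit_length() + 1


# Decimal analogy from the text: two 2 digit numbers, four digit result.
print(99 * 99)

sample = -32_768        # most negative 16 bit value
coefficient = -32_768   # most negative 16 bit value
product = sample * coefficient
print(f"sample needs {signed_bits_needed(sample)} bits, "
      f"product needs {signed_bits_needed(product)} bits")
```

A single multiplication can therefore double the word length, and adding up many such products in a filter adds further bits on top.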