In a previous four-part series on computer audio sound quality (The Absolute Sound, issues 218-221), we examined and ranked a number of hardware and software factors that affected ripping, burning, and playback sound quality, using standard PC-based computers. These articles defined several industry-standard criteria on which subjective judgments were based. To provide a measure of the magnitude of these effects that readers could understand and reproduce in their own systems, we created a numerical scale based on subjective estimates of the fractional difference between the sound of a CD, of an SACD, or of a high-resolution download derived from the same master recording.
Part three reported that a FLAC file sounded inferior to the WAV file from which it was made, and we found to our surprise that when these FLAC files were reconverted, the resulting WAV file did not recover the full sound quality of the original. We repeated these conversion steps five times and observed a hyperbolic decline in WAV sound quality, the greatest loss occurring in the first two or three conversions.
To forestall the inevitable controversy, we repeated these experiments several times, using a single-blind protocol with various trained listeners, and consistently obtained the same results. Since FLAC compression is considered lossless, it was no surprise to find considerable objections posted on the 'bits are bits' forums when we published. The validity of our results was recently questioned again, so we reexamined our original findings, partly in an attempt to understand the cause, but also using updated computers, software, DACs, cables and other system improvements (see TAS issues 246 and 248). The results of these reinvestigations are the subject of this summary article, and the reader is referred to the full two-part article available as a free download at this link.
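For readers who want to see what the 'bits are bits' position amounts to in practice, the following is a minimal sketch, not part of our protocol, that compares the audio samples of two WAV files (for example an original and its round-tripped copy) while ignoring anything stored outside the sample data. The function names are hypothetical, and it assumes files that Python's standard `wave` module can open, which in general means uncompressed PCM.

```python
import hashlib
import wave


def pcm_digest(wav_source):
    """Return (format parameters, SHA-256 of the raw sample data) for a WAV file.

    Only the audio frames are hashed, so tags or other header chunks
    have no influence on the digest.
    """
    with wave.open(wav_source, "rb") as w:
        params = (w.getnchannels(), w.getsampwidth(),
                  w.getframerate(), w.getnframes())
        frames = w.readframes(w.getnframes())
    return params, hashlib.sha256(frames).hexdigest()


def same_audio(wav_a, wav_b):
    """True when both files carry identical format parameters and samples."""
    return pcm_digest(wav_a) == pcm_digest(wav_b)
```

A result of `True` says only that the stored samples match; it says nothing about how the two files sound on a given playback system, which is the question addressed here.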
Data And Results
The order in which this sequence was generated is indicated by dashed lines in Fig 1, but for clarity has been omitted in subsequent figures. Where applicable, in each experiment we show the height estimate on the vertical axis, plotted against the conversion number of the complete series (abbreviated WFW) on the horizontal axis. Height estimates were made (using a tape measure hung from the ceiling behind the speakers) by listening to a specific chord, repeated twice on a harp passage during a 6s segment (39-45s into track 1). This tight musical restriction was found necessary to ensure consistent height estimates over weeks and months of listening sessions. We initially repeated the experiment conducted in 2010, and despite various system upgrades made over this time, the height method obtained the same pattern of results found previously with our subjective sound quality scale, as long as the same version of J River Media Center software (JRMC v15) was employed. Also, as in our original experiments, we found little or no significant benefit from engaging the memory playback setting of this software version. The results obtained with these methods are shown in Fig 1.
Following the initial experiment shown in Fig 1, we undertook a number of control experiments to check whether these results were an artifact of our procedures. Although there were some modest variations in absolute height values under these diverse situations, we found no change in the pattern shown in the initial experiment due to:
• Multi-use versus dedicated computer/servers
• Computers or software used to generate the sequential WAV to FLAC series (dBPowerAmp
• Playback software (using JRMC v15 or VLC v2.1)
• Diverse speaker types and playback electronics
• Various listening rooms
• Different listeners (4)
We now come to one of the more important results in this investigation. We learned that one of the JRMC software changes introduced after version 15 increased the size of memory allocation during playback. We examined the effect of this change using the later version 19 on our standard sequential WAV-to-FLAC-to-WAV conversion protocol. The results of this experiment are shown in Fig 2. As we had observed earlier, using JRMC v19 (and later v20), forgoing the memory playback feature produced a similar hyperbolic pattern of decline with repeated WAV and FLAC conversions. However, with the memory playback feature engaged, we observed a very substantial [...] then the file header containing the metadata could be the only other source of the problem. We consulted with many computer and audio gurus. All agreed that there was no way the minuscule amount of information contained in the metadata could affect the sound quality. How could the small size of the header (~50,000 bytes), at a mere 0.01% of a single high resolution track (~500,000,000 bytes), have any effect on sound quality? The burden this presents to the computer “would not even be remotely noticeable” was the usual refrain. On the other hand, many manufacturers of high-performance audio servers and audio programmers disagree with this point of view. They remain convinced that small variations in computer-generated noise significantly disrupt replay sound quality.
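As a quick check on the size ratio in the question above, the short sketch below works out the header's share of the track from the two approximate byte counts quoted in the text. The function name is hypothetical and the inputs are the rounded figures, not measured file sizes.

```python
def header_share_percent(header_bytes, track_bytes):
    """Header size expressed as a percentage of the whole track size."""
    return 100.0 * header_bytes / track_bytes


# Approximate figures quoted in the text: ~50,000 and ~500,000,000 bytes.
share = header_share_percent(50_000, 500_000_000)
print(f"{share:.2f}% of the track")  # prints "0.01% of the track"
```

On these figures the header is one part in ten thousand of the track.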
Having run out of ideas about what to do next, we decided to test the 'unlikely' possibility that the metadata might somehow be involved in the loss of FLAC sound quality and the associated height reproduction change we had repeatedly observed. To remove the metadata in a Windows/PC environment we called up the file 'Properties' option and removed all the printed information and the attached metadata-associated cover art. These results are presented in Fig 3. Here, only the first and last WAV conversions with and without metadata are shown. The solid bars in this graph show that all four WAV file variants are equal in magnitude and quite similar to the values obtained in the previous experiment, as expected.
The 0 and 5 FLAC files with metadata also gave values quite similar to what we observed in Fig 2, but now we found a major and equal height increase of both 0 and 5 FLAC files from which metadata had been removed. So, what is responsible for the FLAC-induced sound quality degradation?
Obtaining these clear-cut results was quite a Eureka moment for us. Regardless of what the computer experts believed, their assumptions proved erroneous. Something in the metadata seemed to be the source of the cumulative decline in FLAC file conversions, with or without memory playback. These results show that the attribute responsible for reduced FLAC audio performance can also be transferred to WAV files, where it survives to degrade ultimate WAV quality, depending on the playback software and the available memory allocation. However, even with the benefit of expanded memory buffering, the aimed-for improvement to the FLAC replay remained incomplete and only reached 87% of that exhibited by the companion WAV file. This difference was statistically significant and indicates that there must be a second factor affecting FLAC sound quality that is unrelated to the metadata effect.
Based on these results, we attempted to pinpoint which section of the metadata might be responsible. Since the cover art file associated with the metadata is the largest contributor to the metadata header size, we began by examining the effect of deleting cover art prior to the WAV-to-FLAC-to-WAV conversion protocol. This proved fortuitous, as our first suspicion proved correct.
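To see how much of a FLAC header the cover art occupies, one can list the metadata blocks at the start of the file. The sketch below is an illustration only, not the tool we used: it relies on the published FLAC layout (a `fLaC` marker followed by blocks, each with a one-byte flag/type field and a three-byte length), and the function and variable names are hypothetical.

```python
BLOCK_NAMES = {
    0: "STREAMINFO", 1: "PADDING", 2: "APPLICATION", 3: "SEEKTABLE",
    4: "VORBIS_COMMENT", 5: "CUESHEET", 6: "PICTURE",
}


def flac_metadata_blocks(data):
    """List (block name, payload size in bytes) for each FLAC metadata block.

    `data` must hold the start of a FLAC file, at least up to the end
    of its metadata section.
    """
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing 'fLaC' marker)")
    blocks = []
    pos = 4
    last = False
    while not last:
        if pos + 4 > len(data):
            raise ValueError("truncated metadata block header")
        flag_and_type = data[pos]
        last = bool(flag_and_type & 0x80)      # top bit marks the final block
        block_type = flag_and_type & 0x7F      # low 7 bits give the block type
        length = int.from_bytes(data[pos + 1:pos + 4], "big")
        blocks.append((BLOCK_NAMES.get(block_type, f"TYPE_{block_type}"), length))
        pos += 4 + length                      # skip header plus payload
    return blocks
```

Reading a file's bytes (for instance with `open(path, "rb").read()`, where `path` is whatever file is under test) and passing them to `flac_metadata_blocks` gives one entry per block. Embedded cover art appears as `PICTURE` entries and text tags as `VORBIS_COMMENT`, so their sizes can be compared directly.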
The results of this experiment are shown in Fig 4, and several conclusions can be gleaned from these data. First and foremost, removal of metadata alone eliminates the hyperbolic decline in sound quality, not only in WAV but also in FLAC format, when the memory playback feature of the JRMC software was engaged (as was done in this experiment). This result replicates the Fig 3 findings where all metadata was removed. This occurs regardless of whether the original unconverted metadata is added back (as shown here) or not added back (data not shown) and indicates that the metadata is the major (if not the only) contributor to the degradation of height and sound quality of FLAC files (and also of WAV files when played back without JRMC memory playback).
In addition, first removing all metadata and then adding back only the metadata prior to a WAV-to-FLAC-to-WAV conversion replicates the results shown in Fig 1, with either JRMC version 15 or 19 (with memory playback defeated) (see Figs 2 and 3 in Part 2 of the full article available on the HIFICRITIC website).
[We distinguish metadata from the cover art file, which is typically downloaded along with the music file, since the cover art is usually significantly larger in size or resolution than that which is attached to the file header.]