
COMPACT VWF DISTRIBUTION FORMATS

Vector format audio

We have seen that triads of independent signals can represent the orthogonal components of the velocity of a sound wave at a point, so that the resultant velocity vector at the observation point can be reconstructed. Arrays of sensor elements combined with multiple triads can be used to capture the wavefront and hence enable its reconstruction. A microphone capturing distance and direction information in three dimensions would need a minimum of four channels [[1]]: one triad (three channels) and a reference signal sensed at the (displaced) central location. Higher order microphones would add further triads.

 

Today, audio broadcast and audio distribution media use channels that are each associated with a loudspeaker: every loudspeaker has its own reserved distribution channel. Stereo requires two channels and 5.1 surround needs six. Higher order surround systems would need more. The 10.2 format, for example, requires twelve channels, or fourteen if the option of dipole surrounds is to be catered for.

 

The alternative would be to define a flexible transmission standard that is simple and has a capped channel count, yet distributes the full vector field information and so enables the full decoding (and therefore the reconstruction) of the original source Vector Wave Front Field.

 

Vector format audio achieves this. A minimum of four channels is required, but from these four channels all formats can be extracted, including mono, stereo, 5.1 surround, 10.2, VWF and even Wave Field Synthesis (WFS) formats. It would therefore make some sense to consider this format for distribution.
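As an illustration of how several playback formats can be derived from the same four channels, the following is a minimal sketch only. It assumes a first-order layout of one omnidirectional reference (W) plus one triad (X forward, Y left, Z up), with no scaling applied to W; the text does not define the channel layout or gains, so these conventions and all function names are hypothetical.

```python
import math

def encode_point_source(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one sample of a point source into four channels (W, X, Y, Z).

    Assumed convention: W is the reference signal, X points forward,
    Y points left, Z points up; azimuth is anticlockwise from the front.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample
    x = sample * math.cos(az) * math.cos(el)
    y = sample * math.sin(az) * math.cos(el)
    z = sample * math.sin(el)
    return w, x, y, z

def virtual_cardioid(w, x, y, z, azimuth_deg, elevation_deg=0.0):
    """Feed for a loudspeaker in a given direction, formed as a virtual
    cardioid microphone pointing that way (one simple decoding choice)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    ux = math.cos(az) * math.cos(el)
    uy = math.sin(az) * math.cos(el)
    uz = math.sin(el)
    return 0.5 * (w + x * ux + y * uy + z * uz)

def decode_mono(w, x, y, z):
    # The reference channel alone is the mono signal.
    return w

def decode_stereo(w, x, y, z):
    # Two virtual cardioids pointing hard left and hard right.
    left = virtual_cardioid(w, x, y, z, 90.0)
    right = virtual_cardioid(w, x, y, z, -90.0)
    return left, right
```

With these assumptions, a source encoded at 90 degrees (hard left) decodes entirely into the left stereo feed, and a source at the front is shared equally between left and right. Feeds for larger layouts would call `virtual_cardioid` once per loudspeaker direction.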

 

Higher order formats are also possible where additional orthogonal triads are used, providing the encoding from additional “shells”. A seven-channel Vector Format for high resolution distance and direction encoding of captured sound fields is an example.

 

Implementation issues

It is the advent of digital audio capture and distribution that has made the concept of a common Vector Format possible. Point of Use Render processing capability is now available in most digital decoder platforms. The use of metadata (data about the data) would enable the flexible control of the decoding, and could even take into account each listener’s specific playback configuration and preferences.

 

The critical parameter for digital distribution of audio is the amount of data transmitted. The three audio data “consumers” are:

  • Channel count
  • Audio bandwidth (sample rate)
  • Resolution (word size)

The size of the data “pipe” required for audio is the result of choosing the values of these three parameters. Any restriction on available data throughput or storage capability must impact one or more of these parameters.

 

It would be possible to cap and even to reduce channel count by use of Vector Format distribution. This would come at a price, because additional decoding capability would be required at distribution and at the point of use, but the recovery of a channel (four main channels in place of five) and the immediate benefit of a 20% reduction in bandwidth is not to be overlooked. Further, the support of all formats can only be achieved this way and, as we will see, certain capabilities of audio delivery cannot be achieved without this approach.
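The relationship between the three parameters and the size of the data “pipe” can be checked with a short calculation. This is a sketch for uncompressed PCM only; the 48 kHz sample rate and 24-bit word size are illustrative assumptions, not values taken from the text, and the function name is hypothetical.

```python
def pcm_data_rate_bps(channels, sample_rate_hz, word_size_bits):
    """Uncompressed data rate in bits per second: the product of
    channel count, sample rate and word size."""
    return channels * sample_rate_hz * word_size_bits

# Assumed illustrative values: 48 kHz sampling, 24-bit words.
five_channels = pcm_data_rate_bps(5, 48_000, 24)  # 5_760_000 bit/s
four_channels = pcm_data_rate_bps(4, 48_000, 24)  # 4_608_000 bit/s

# Fraction of the data rate saved by recovering one of five channels.
saving = (five_channels - four_channels) / five_channels
```

Because the rate is a simple product, the saving from dropping one of five equal channels is one fifth whatever sample rate and word size are chosen.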

 

Compression strategies

An immediate issue arising from the use of this format is that of compression. The present strategy for controlling bandwidth is to compress the signals before transmission and expand or recover them at replay. Pressures of increasing channel count have forced higher compression levels to reduce file sizes. This has meant that in recent years the initial loss-less compression strategies used in distribution have also had perceptual encoding overlaid. Loss-less encoding means that the full original signal can be reconstructed from the compressed transmission or distribution. The term “perceptual coding” means that the compression takes advantage of the limitations of human hearing to remove audio data that the listener is deemed unable to discern from the loudspeaker of the reproduction channel. Most common formats, including MP3 and AC-3, do this. Some formats have maintained loss-less packing strategies. Meridian Lossless Packing (MLP) is one such format.

 

If Vector Format distribution is introduced, compression strategies will have to change. The perceptual limitations would only apply to the extracted channels, not the source Vector Format. We will have to keep more audio information during distribution and therefore tend towards loss-less packing. This will be offset by the fact that the channel count is now lower and capped.

 

There are other ways of limiting demand for data throughput.

 

Metadata is the data about the data. In previous sections it was seen that the component parts of the captured and reproduced audio could be expressed as metadata rather than transmitted in the audio signal. Some examples already identified include early reflection locations, reverberation decay time constants, activation or de-activation of the array render and special render effects.
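To make the idea concrete, the examples listed above could be carried in a small structured record alongside the audio. The sketch below is purely illustrative: every class name, field name, unit and the JSON encoding are assumptions introduced here, since the text does not define a metadata syntax.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List

@dataclass
class EarlyReflection:
    # Hypothetical fields: where the reflection appears and how it arrives.
    azimuth_deg: float
    elevation_deg: float
    delay_ms: float
    gain_db: float

@dataclass
class RenderMetadata:
    # Hypothetical fields mirroring the examples named in the text.
    array_render_enabled: bool = True
    reverb_decay_time_s: float = 1.2
    early_reflections: List[EarlyReflection] = field(default_factory=list)
    special_effects: List[str] = field(default_factory=list)

def to_wire(meta):
    """Serialise the record to compact JSON bytes (an assumed encoding)."""
    return json.dumps(asdict(meta), separators=(",", ":")).encode("utf-8")

def from_wire(payload):
    """Rebuild the record from bytes produced by to_wire."""
    data = json.loads(payload.decode("utf-8"))
    reflections = [EarlyReflection(**r) for r in data.pop("early_reflections")]
    return RenderMetadata(early_reflections=reflections, **data)
```

A record of this kind occupies a few hundred bytes, whereas transmitting the same reflections or reverberation as audio would occupy channel capacity continuously.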

 

Compatibility

The move to Vector Format distribution could be made as a next generation product set, or as a migratory approach that maintains compatibility. In its crudest form, the signals could simply be transmitted within four of the existing five surround channels, with the fifth channel left unused.

 

A more sophisticated approach could address the compatibility issue by resolving the electrical signals into defined separate axis components, such as left-right, forward-back and up-down, and transmitting these as a vector set. This would be done in such a way that a basic (stereo) decoder could present the signals directly to its channel associated loudspeakers without further decoding, while accurate higher order decoding remained possible. The spare channel could be used to transmit a difference signal to assist with this backward compatibility.
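One way such a compatible vector set might be arranged is sketched below. It is an assumption for illustration, not the scheme proposed in the text: the first two transmitted channels are a directly playable left/right pair formed from the reference signal and the left-right component, the other two carry the forward-back and up-down components unchanged, and the difference signal in the spare channel is not modelled. All names are hypothetical.

```python
def to_compatible(w, x, y, z):
    """Matrix (W, X, Y, Z) into (left, right, front_back, up_down).

    Assumed convention: W is the reference, X forward-back,
    Y left-right (positive to the left), Z up-down.
    """
    left = 0.5 * (w + y)
    right = 0.5 * (w - y)
    return left, right, x, z

def from_compatible(left, right, front_back, up_down):
    """Invert the matrix so a full decoder recovers (W, X, Y, Z)."""
    w = left + right
    y = left - right
    return w, front_back, y, up_down
```

A legacy stereo decoder would play the first two channels as they are and ignore the rest, while a full decoder applies `from_compatible` and loses nothing, because the sum and difference matrix is exactly invertible.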

 

The audio production chain of the future

FIGURE 1 The distribution chain of the future

With digital distribution systems, the need for decoding at the point of use already exists and so the issue reduces to one of fitting the additional decoding requirements into the available processing platforms. Figure 1 shows the structure of the new production chain. This structure would apply to streamed audio applications including interactive gaming as much as to traditional broadcast and point of sale media.

 


To understand what would be required for the broadcast of the revised audio formats previously described, we compare the new format to the existing 5.1 channel format. The new capability would include Vector WaveFront (VWF) reproduction, Wave Focus (WF) low frequency reproduction, early reflection render, Whiteroom reverberant field reproduction and masking, and metadata, delivered using loss-less compressed Vector Format audio.

 

Low frequencies

Mode free low frequency reproduction would use a bass manager function in the decoder to extract all low frequency content. The decoders could also optionally manage annihilation.

The separate Low Frequency Effects channel is no longer needed. The recovered “0.1” channel would then have sufficient throughput for the envisaged metadata rates. The main issue would be to ensure that legacy decoders did not try to reproduce the metadata as bass!
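The bass manager function described above could, in its simplest form, look like the following sketch. A one-pole low-pass filter stands in for whatever crossover a real decoder would use, and the 80 Hz crossover frequency and 48 kHz sample rate are assumptions chosen for illustration; neither the filter nor the names come from the text.

```python
import math

def make_bass_manager(crossover_hz=80.0, sample_rate_hz=48_000.0):
    """Return a function that splits one frame of main-channel samples
    into (high-passed mains, single summed low-frequency feed)."""
    # One-pole low-pass coefficient for the assumed crossover frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * crossover_hz / sample_rate_hz)
    state = {}  # per-channel low-pass memory, created on first use

    def process(frame):
        lows = []
        for channel, sample in enumerate(frame):
            previous = state.get(channel, 0.0)
            low = previous + alpha * (sample - previous)
            state[channel] = low
            lows.append(low)
        # Whatever is not extracted as bass stays in the main channels.
        highs = [sample - low for sample, low in zip(frame, lows)]
        return highs, sum(lows)

    return process
```

Calling `process` once per sample frame on the four distributed channels yields the main feeds with the bass removed and one combined low frequency feed, so no separately transmitted Low Frequency Effects channel is needed to drive the bass reproduction.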

 

Main channels

Four-channel Vector Format distribution would recover a complete channel. Sound fields could be recreated. Sweet spots would cease to exist. Listeners using surround formats with a sweet spot would also be transparently supported.

 

Additional facilities

  • VWF format would enable early reflection placement and recreation of bounded spaces. The four distributed channels contain all the necessary material. The metadata channel would create (say) the first ten reflection placements as required. Existing formats cannot do this.
  • Whiteroom would control envelopment effects and boundary masking. A reverberation manager would extract the sum of all four distributed channels. Metadata would turn whiteroom on and off and specify decay time constants and spectral colouring. No additional channel would be required. Present formats cannot do this.
  • The premium end user could be offered special effects and perspectives such as stage walkthrough and favourite instrument seating. This could be an additional charge feature – pay per render. Present formats cannot do this.

 

Vector Format electrical transmission should not be confused with Vector WaveFront (VWF) acoustic capture. VWF applies to the acoustic sound field. VWF information could be transmitted using Vector Format electrical signals.

 

The information contained herein is copyright to HuonLabs. No material can be reproduced in its totality or in part without the express permission of HuonLabs Pty Ltd. Any reference to this material must quote the HuonLabs source. Trade marks and Patent applications apply to most aspects of the work disclosed here. Contact HuonLabs for further details, product information or licensing enquiries.



[1] This format is not new. It was identified by Gerzon and Craven in 1975 amongst others.