From WikiFAQ

Ambisonic Surround Sound FAQ

Description
Information about Ambisonic Surround Sound.

What is surround sound?

First there was mono, with the sound emanating from a single "point". Then there was stereo, with directional information spread along a line in front of the listener. In real life, of course, sound reaches our ears from all directions. Surround sound attempts to improve the realism of the perceived sound by providing information from all directions, not just from in front of the listener.

Surround sound can place you into the middle of an orchestra and there are many pieces of music staged this way. However, in the West, since the eighteenth century most music has been staged with the orchestra in front of the audience. Here, surround sound can be used to reproduce the "acoustic" of the recording venue. This is important because stereo will always be limited to creating the illusion of musicians playing in your living room - a "they are here" illusion. Only if the listener is surrounded with sound can there be any hope of creating the illusion that you have been transported into the recording venue - a "you are there" illusion. When this illusion is successful, surround sound is as big an improvement over stereo as stereo was over mono.

What is Ambisonic Surround Sound?

Ambisonic Surround Sound is a set of techniques, developed in the 1970s, for the recording, studio processing and reproduction of the complete sound field experienced during the original performance. Ambisonic technology does this by decomposing the directionality of the sound field into spherical harmonic components, termed W, X, Y and Z. The Ambisonic approach is to use all speakers to cooperatively recreate these directional components. That is to say, speakers to the rear of the listener help localise sounds in front of the listener, and vice versa. Ambisonic technology is based on a meta-theory (a theory of theories) of sound localisation developed by the late Michael A Gerzon when he was with the Mathematical Institute, University of Oxford (see the Gerzon 1992a reference). Ambisonic decoder design aims to satisfy simultaneously and consistently as many as possible of the mechanisms used by the ear/brain to localise sounds. The Gerzon theory takes account of non-central as well as central listening positions.

In an Ambisonic decoder the spherical harmonic direction signals, W, X, Y and Z, are passed through a set of shelf filters which have different gains at low and high frequencies designed to match different ways the ear/brain localises sounds. (The different localisation mechanisms operate below and above about 700 Hz.) The speaker feeds are then derived by passing the outputs from the shelf filters through a simple amplitude matrix. An important aspect of Ambisonic decoder technology is that it is only at this final stage of processing that the number and layout of speakers is considered. The listening area for Ambisonic Surround Sound is comparable with that for conventional stereo, but larger.

What are W, X, Y and Z?

With Ambisonic technology, the directionality of the sound field is decomposed into spherical harmonic components. The zero-order component is termed W and is omnidirectional. The first-order components are figure-of-eight (lemniscate) responses which point forward, left and up. These are termed X, Y and Z, respectively. In practice, second-order and higher components are ignored. The W, X, Y and Z channels are collectively called B-Format.
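As a rough illustration of the four components, the sketch below encodes a single mono sample arriving from a given direction into W, X, Y and Z. The cosine and sine weights follow directly from figure-of-eight responses pointing forward, left and up. The 1/sqrt(2) weighting on W is an assumption (a commonly quoted B-Format convention) and is not stated in this FAQ, and the function name is hypothetical.

```python
import math

def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Return (W, X, Y, Z) for one mono sample arriving from a direction.

    Azimuth is measured anticlockwise from straight ahead (+90 is left);
    elevation is measured upward from the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                # omnidirectional (assumed -3 dB weighting)
    x = sample * math.cos(az) * math.cos(el)   # figure-of-eight pointing forward
    y = sample * math.sin(az) * math.cos(el)   # figure-of-eight pointing left
    z = sample * math.sin(el)                  # figure-of-eight pointing up
    return w, x, y, z
```

A source straight ahead puts all of its first-order energy into X, a source hard left into Y, and a source directly overhead into Z; W is the same in every case.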

The fact that the Z component can be recorded creates the opportunity for periphonic (full-sphere) reproduction. Periphony requires speakers to be placed above and below the height of the listeners' ears. Readers familiar with microphone techniques will realise that the W and Y spherical harmonic components are equivalent to the M and S components of the M-S stereo recording technique. Ambisonics is a natural extension of this recording technique to three dimensions.
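To make the M-S comparison concrete, here is a minimal sketch of the usual sum-and-difference matrix, treating W as the M (mid) signal and Y as the S (side) signal. The unit gains are an assumption; real M-S decoders often apply a width control or different scaling, and the function names are hypothetical.

```python
def ms_to_left_right(m, s):
    """Sum-and-difference matrix: mid plus side is left, mid minus side is right."""
    return m + s, m - s

def left_right_to_ms(left, right):
    """Inverse matrix: recover mid and side from a left/right pair."""
    return (left + right) / 2.0, (left - right) / 2.0
```

Because Y points left, a positive Y raises the left feed and lowers the right feed, exactly as S does in M-S.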

What were Matrix H and HJ?

Matrix H and HJ were surround sound encodings used by the British Broadcasting Corporation in the late 1970s for experimental FM radio broadcasts. Matrix H was based on the QS quadraphonic system and was modified to HJ which was based on Ambisonic principles. The system was not adopted by the BBC because some mono transmissions sounded "phasey". The other rumoured problem was that someone high-up in the BBC thought sound engineers might have to be paid more if they worked with twice as many speakers. The `H' has no meaning. The BBC called the first matrix they assessed Matrix A, and then worked up the alphabet.

What is UHJ?

The Ambisonic format recommended for recording and studio processing is called B-Format and is just the W, X, Y and Z direction signals. If only horizontal surround sound is required, then the Z signal can be omitted. However, this destroys the possibility of periphonic (full-sphere) reproduction. Established transmission media (LP, FM radio, CD) are all two-channel and, unfortunately, it is impossible to obtain reasonable surround sound using only two of the B-Format signals. To overcome this, two-channel UHJ matrix encoding was developed. Not only can two-channel UHJ be decoded back into horizontal surround sound, but also this C-Format is mono and stereo compatible. When two-channel UHJ is played in stereo, the front- and side-stage material is reproduced with sharply defined images. The rear-stage material is reproduced, but given a less focused, more "recessed" quality. This helps to provide an audible distinction between front and rear sounds when played in stereo. When two-channel UHJ is played in mono, sounds from all directions, including due back, are reproduced in the single speaker at a level within 5 dB of one another.

Two-channel UHJ was extended into a hierarchy of C-Formats for 2, 2.5, 3 and 4 transmission channels, termed BHJ, SHJ, THJ and PHJ, respectively. The extra channels are used to augment the two base channels to give improved horizontal surround sound and, for four-channel UHJ (PHJ), periphonic (full-sphere) surround sound. In practice, only two-channel UHJ (BHJ) encoded material has ever been released. For this reason UHJ has become a synonym for BHJ, and UHJ is the symbol you will see on LPs and CDs.

The advantages of UHJ over B-Format are that it is mono and stereo compatible, and allows horizontal surround sound to be transmitted using two-channel media. The disadvantages of UHJ are that both encoding and decoding require the use of 90 degree phase shifters, and that encoding into only two channels requires compromise. ("No-compromise" horizontal surround sound requires three transmission channels.) It is UHJ which has caused Ambisonics to be described as a "matrix" system, but Ambisonics is much more than UHJ.

Readers interested in seeing the set of encoding/decoding equations should consult the appendices of the Gerzon 1985 reference. UHJ is more symbolism than initialism. The `U' stands for Universal, and is taken from the UMX quadraphonic system which pioneered the technique of using supplementary channels to enhance directional resolution. The `H' represents the BBC's Matrix H and their work on mono and stereo compatible matrices. The `J' is taken from System 45 J, the name of a progenitor of the UHJ system. (The `J' was simply a code letter used to describe a possible surround sound system.)

What are BHJ, SHJ, THJ and PHJ?

BHJ is the engineering specification for encoding the W, X and Y direction signals into two channels. The two channels, called Left and Right, can then be transmitted using conventional stereo media before being decoded back into W, X and Y. The BHJ format has been designed to be mono and stereo compatible. In practice, BHJ is the only UHJ encoding that has been used for commercial record releases. For this reason UHJ has become a synonym for BHJ and UHJ is the symbol you will see on BHJ encoded LPs and CDs.

SHJ specifies how W, X and Y can be encoded into 2.5 channels, called Left, Right and T, where the T channel is of reduced bandwidth (5 kHz). The original intention was to provide the reduced bandwidth channel in broadcasting by additional modulation of the 38 kHz sub-carrier. Presumably RDS, Minicall, etc, kill this possibility. THJ specifies how W, X and Y can be encoded into three channels called Left, Right and T. This is the "no-compromise" horizontal C-Format. PHJ specifies how W, X, Y and Z can be encoded into four channels, called Left, Right, T and Q, and is the "no-compromise" periphonic C-Format. Periphonic (full-sphere) reproduction requires speakers to be placed above and below the height of the listeners' ears.

BHJ, SHJ, THJ and PHJ are all inter-compatible. That is to say, to go from one member of the set to the next you add or delete additional signals without changing those that remain. A beauty of this is that each member of the UHJ set is mono and stereo compatible. In addition, a BHJ decoder, for example, can decode SHJ, THJ and PHJ material simply by ignoring the extra T and Q channels.

It is possible, however, to use "buried data" to encode a third channel of reduced bandwidth onto a CD such that an existing CD player is unaware the channel exists. This would allow SHJ encoded CDs to be produced that are completely compatible with conventional stereo CD players. That is to say, a stereo system would produce stereo, a BHJ decoder would produce surround sound, and an SHJ decoder (fed from a CD player with special digital electronics) would produce even better surround sound. All this from the same CD! The technique is too complicated to describe here, and interested readers should consult the Gerzon and Craven 1995 reference.

Peter Knight has pointed out that the CD format specification includes a four-channel quad format that would be suitable for PHJ encoded material. The problem, of course, is that existing CD players are not quad CD "aware" and would produce a mishmash if asked to play a quad CD. He has also pointed out that quad CDs have to be spun twice as fast as stereo CDs and have only half the playtime.

Are UHJ encoded CDs available?

Yes. A discography of 160 or so UHJ encoded record releases is available on the Surround Sound Discography Home Page at http://personal.riverusers.com/~manderso/. In addition, it is available as a text file by anonymous FTP from ftp://ftp.omg.unb.ca/pub/ambisonic/. I am also happy to e-mail copies to people. The discography was originally compiled by Eero Aro and is being maintained by Mark Anderson. It is not complete and updates are requested; details on how to do this are given at the beginning of the discography. Omissions include about 450 recordings from Nimbus Records. (All recordings from Nimbus Records are UHJ encoded.)

What happens with stereo sources (conventional LPs, CDs, etc)?

Domestic Ambisonic decoders feature a Super Stereo button for decoding stereo sources. This uses the rear part of the sound field to reinforce the location of sounds in front of the listener. In addition, any ambience present in the source will be directed all around the listener, although the effectiveness of this depends greatly on the recording technique that was used. The Super Stereo mode also includes a stereo width control which allows the stereo image to be compressed to mono-like or expanded into a horseshoe around the listener.

How does Ambisonics differ from quadraphonics?

Quadraphonics was a collection of incompatible systems introduced in the 1970s. The collection included SQ, QS, UMX and CD-4. As the name suggests they assumed four channels feeding four speakers, and usually assumed that the speakers would be 90 degrees apart. (Compare this with the 60 degrees between speakers in stereo.) The systems strove for channel separation and were characterised by the ping-pung-pang-pong effects they reproduced. While these effects were extremely impressive, they were also the antithesis of fatigue-free realistic sound reproduction. In contrast, Ambisonics attempts to recreate for the listener the complete sound field of the original performance. A particular number of speakers is not assumed and the technology can use various numbers and speaker layouts. With the Ambisonic Surround Sound system all of the speakers cooperate to localise a sound in its correct position. A problem with the quadraphonic systems was that they did not work, and could never have been made to work because they were based on false premises. As explained in the Sommerwerck 1984 reference, part I, the major problem was that quadraphonics used the pair-wise mixing style. In contrast, Ambisonics does work. It is possible to use Ambisonic technology to decode many of the quadraphonic systems. A decoder to do this was manufactured by Integrex Limited and most of the design published in the Gerzon 1977 reference.

What is the pair-wise mixing style?

To produce a sound from the direction of a speaker requires only channel separation. To produce phantom sound images between speakers requires a mixing style. In stereo, the most popular mixing style is "pair-wise" mixing.

Pair-wise mixing is also called "pan-potting", "amplitude mixing" and "intensity stereophony". It mixes signals into the feeds for a pair of speakers to create the illusion that a sound is coming from a point somewhere between the speakers. During mixing, the apparent location of each sound is determined only by the relative amplitude of that sound in the two speakers. Almost all stereo recordings are mixed using the pair-wise mixing style.
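A minimal sketch of pair-wise mixing for one speaker pair is given below. The FAQ only says that the relative amplitude in the two speakers sets the apparent position. The sine/cosine "constant-power" law used here is one common choice and is an assumption, as is the function name.

```python
import math

def pan_pair(sample, position):
    """Split a mono sample between two speakers.

    position runs from 0.0 (fully in the first speaker) to 1.0 (fully in
    the second). The sine/cosine law keeps the summed power constant.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0.0 and 1.0")
    angle = position * math.pi / 2.0
    return sample * math.cos(angle), sample * math.sin(angle)
```

At position 0.5 both speakers receive the same amplitude, which is what places the phantom image midway between them when the pair is in front of the listener.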

The ear/brain localises sounds using phase differences between the ears as well as amplitude differences. (Phase is used to localise sounds with frequencies between 150 Hz and 1.5 kHz, amplitude for frequencies between 300 Hz and 5 kHz, and other cues for frequencies above 2.5 kHz. Note that the three frequency ranges overlap.) Fortunately, when a pair of speakers are in front of the listener and separated by 60 degrees or less, because each ear hears both speakers, low-frequency amplitude differences between the speakers are converted to phase differences between the ears. For most people the pair-wise mixing style works well in stereo. Unfortunately, pair-wise mixing works poorly when the speakers are to the rear of the listener and not-at-all when they are to one side. (See the Gerzon 1985 or the Fellgett 1981 references. Better still, try it yourself!) This means that any surround sound system that relies on pair-wise mixing between adjacent speakers must fail. This is as true for the 5.1 discrete channel systems of today as it was true for the quadraphonic systems of yesterday. Such absolute statements can be made because the way that the ear/brain localises sound has not changed. Ambisonics is completely unconnected with pair-wise mixing and does not suffer from its surround sound limitations.
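The conversion of amplitude differences into phase differences can be checked with a simplified phasor model. The sketch below assumes a far-field source, two ears treated as points a fixed distance apart with no head shadowing, and speakers placed symmetrically in front of the listener. The head radius, the speed of sound and the function name are illustrative assumptions and do not come from the references.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # metres per second (assumed)
HEAD_RADIUS = 0.09      # metres, half the distance between the ears (assumed)

def interaural_phase_deg(gain_left, gain_right, freq_hz, half_angle_deg=30.0):
    """Phase lead of the left ear over the right ear, in degrees.

    Each ear hears both speakers. The nearer speaker arrives slightly
    early (a phase advance), the farther one slightly late.
    """
    a = (2.0 * math.pi * freq_hz * HEAD_RADIUS
         * math.sin(math.radians(half_angle_deg)) / SPEED_OF_SOUND)
    left_ear = gain_left * cmath.exp(1j * a) + gain_right * cmath.exp(-1j * a)
    right_ear = gain_left * cmath.exp(-1j * a) + gain_right * cmath.exp(1j * a)
    return math.degrees(cmath.phase(left_ear / right_ear))
```

With equal gains the two ears receive identical signals and the phase difference is zero. Raising the left gain makes the left ear lead, even though the two speaker feeds themselves are exactly in phase. The model is only meaningful at low frequencies, where the path difference is a small fraction of a wavelength.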

With the Ambisonic mixing style, sounds can originate from any direction, either 360-degree horizontal or periphonic (full-sphere).

Can Ambisonics reproduce Dolby MP?

Dolby MP encoding, and Dolby Surround and Pro Logic decoding, are all described in the excellent article, Introduction to [Dolby] surround sound written by Bob Niland, rjn@csn.net. The article, which is available by anonymous FTP from ftp://ftp.csn.org/Laserdisc/ld03.txt, suggests that:

Dolby Motion Picture matrix encoding (Dolby MP) is an encoding system designed for motion picture sound tracks. Four channels are encoded - left, centre, right and a mono surround channel. Dolby Stereo is the two-channel result of this encoding. Dolby Surround is a decoding process designed to decode Dolby Stereo in the living room. Pro Logic is an active decoding process, also designed for the living room. Lucasfilm Home THX is an enhancement of Pro Logic. A Dolby MP encoded source is intended to be reproduced through a Dolby Surround or Pro Logic decoder. These assume rear speakers arranged only to create a reverberant sound field (typically, bipolar speakers facing the walls), and the use of a time delay to ensure that front sound effects are not localised to the rear.

Ambisonics is not limited to creating a reverberant rear sound field, and requires a different arrangement of speakers. Also, with Ambisonics all speakers cooperate to localise sounds, so the front-rear time delay is unnecessary (and would be detrimental).

Dolby Surround and Pro Logic decoders suffer from non-existent sound imaging to the rear and side of the listener. While not a serious impediment to the enjoyment of motion pictures, these limitations do make them unsuitable for music. Of course, there is nothing to prevent you reproducing Dolby MP encoded material through an Ambisonic UHJ or Super Stereo decoder. Many people prefer this, and describe the seamless coherent sound field which results as "superb". But it will not be what the sound engineer who created the recording intended you to hear.

Can Ambisonics reproduce Dolby Surround AC-3?

Ambisonics cannot contribute to the Dolby Surround AC-3 encoding or decoding processes; it can make contributions before the 5.1 discrete channels are encoded and after they are decoded.

Dolby Surround AC-3 is described in technical publications available from the Dolby Laboratories Inc WWW page at http://www.dolby.com/. These suggest that: AC-3 is a digital encoding technique that exploits "audio masking" to achieve high bit-rate reductions. AC-3 can be used to encode between 1 and 5.1 audio channels. Dolby Stereo Digital film sound format uses AC-3 to encode 5.1 audio channels onto film stock. The 5 channels, left, centre, right, right surround and left surround, are all full bandwidth. The .1 channel is a band limited (20 Hz to 120 Hz) bass effects channel. Dolby Surround AC-3, also called Dolby Surround Digital, is the consumer equivalent of Dolby Stereo Digital film sound and is also based on AC-3 coding of 5.1 channels.

Dolby Surround AC-3 does not use matrixing and the 5.1 audio channels have complete separation. Sadly, this is not sufficient for realistic surround sound reproduction, the problem being the "pair-wise" mixing style. Dolby Surround AC-3 is just a delivery mechanism and is not tied to pair-wise mixing; to date, however, all Dolby Surround AC-3 movie sound tracks have been mixed using the pair-wise mixing style. Dolby Surround AC-3 was designed to enhance the enjoyment of motion pictures. The limitations of pair-wise mixing are not a serious impediment to this, but they do make pair-wise mixed Dolby Surround AC-3 unsuitable for music.

One solution is for sound engineers to use a mixing style other than pair-wise mixing to mix the Dolby Surround AC-3 format. Happily, an alternative exists - Ambisonics.

Another, poorer, solution is for the 5.1 pair-wise mixed channels to be converted into W, X, Y and additional signals, and to then use Ambisonic technology to reproduce the sound field. This is described in the Gerzon 1992b reference.

Pair-wise mixed Dolby Surround AC-3 is only impressive; Ambisonics is accurate and can be impressive or subtle as required.

Can Ambisonics reproduce DTS Digital Surround?

Ambisonics cannot contribute to the DTS Digital Surround encoding or decoding processes; it can make contributions before the 5.1 discrete channels are encoded and after they are decoded. DTS Digital Surround is a method of compressing and encoding 5.1 discrete audio channels to make them suitable for transmission using CD, Laserdisc, DAT or DVD. More details are available on the DTS Digital Surround WWW page at http://www.dtstech.com/. The 5.1 channels have complete separation but, sadly, this is insufficient for realistic surround sound reproduction. Channel separation only permits sounds to be reproduced from the direction of a speaker. To reproduce sounds from between speakers requires a mixing style, and the most popular stereo mixing style, pair-wise mixing, simply does not work in surround sound.

DTS Digital Surround was originally designed to enhance the enjoyment of motion pictures. The limitations of pair-wise mixing are not a serious impediment to this, but they do make pair-wise mixed DTS Digital Surround unsuitable for music.

One solution is for sound engineers to use a mixing style other than pair-wise mixing to mix the 5.1 discrete channels. Happily, an alternative exists - Ambisonics.

Another, poorer, solution is for the 5.1 pair-wise mixed channels to be converted into W, X, Y and additional signals, and to then use Ambisonic technology to reproduce the sound field. This is described in the Gerzon 1992b reference. Pair-wise mixed DTS Digital Surround is only impressive; Ambisonics is accurate and can be impressive or subtle as required.

Can Ambisonics make use of DVD?

A consortium has announced a Digital Versatile Disc format which can contain between 7 and 25 times as much data as the current audio CD format. (The higher figure is for a double-sided double-layered disc.) The proposal for an audio-only version of DVD, called High-Quality Audio Disc, has been released by the Acoustic Renaissance for Audio. The proposal, which is available on the ARA WWW page at http://www.meridian-audio.com/ara/, is for:

  • Full 3-D surround sound with up to six channels as well as a separate (conventional) two-channel feed.
  • Sampling at either 48 kHz or 96 kHz.
  • Up to 24 bits of precision. (Normally 20 bits would be used with 48 kHz sampling and 16 bits with 96 kHz.)
  • The use of loss-less compression, termed `packing'.
  • A trade-off, decided upon by the record producer, between precision, frequency bandwidth, number of channels and playing time.
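The trade-off in the last point can be put into rough numbers. The sketch below computes the raw (unpacked) PCM data rate for a chosen format and compares the resulting playing time with that of a conventional CD (two channels, 44.1 kHz, 16 bits). It ignores the loss-less packing, which would lengthen the playing time by an amount the proposal summary above does not quantify. The capacity factor is the "7 to 25 times" figure quoted earlier, and the function names are hypothetical.

```python
CD_BITS_PER_SECOND = 2 * 44_100 * 16  # conventional stereo CD audio

def pcm_bits_per_second(channels, sample_rate_hz, bits_per_sample):
    """Raw PCM data rate, before any loss-less packing."""
    return channels * sample_rate_hz * bits_per_sample

def playtime_relative_to_cd(channels, sample_rate_hz, bits_per_sample,
                            capacity_factor):
    """Playing time as a multiple of a CD's, for a disc holding
    capacity_factor times as much data as a CD."""
    rate = pcm_bits_per_second(channels, sample_rate_hz, bits_per_sample)
    return capacity_factor * CD_BITS_PER_SECOND / rate
```

For example, six channels at 48 kHz and 20 bits need 5,760,000 bits per second, a little over four times the CD rate, so a disc with seven times the capacity would play for roughly 1.7 times as long as a CD before packing is taken into account.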

To carry Ambisonic Surround Sound the proposal suggests encoding the W, X, Y and Z signals onto the DVD as a set of five feeds for speakers arranged in a regular pentagon, plus an optional height channel. These either can be reproduced directly using the standard five speaker layout, or the W, X, Y and Z channels can very simply be reclaimed for processing by an Ambisonic decoder. I guess that the five speaker feeds will not include psychoacoustic shelf filtering. HQAD is the first real opportunity for periphonic (full-sphere) Ambisonic source material to be released commercially. A periphonic decoder was demonstrated as long ago as 1980, but to date domestic full-sphere surround sound reproduction has been enjoyed only by a few enthusiasts with access to Ambisonic master tapes. The advent of HQAD and its accommodation of Ambisonics will, for the first time, bring full-sphere surround sound within the reach of everybody.

The advantages of periphony over horizontal surround sound are not only the possibility of using height for special effects but also that recordings sound more lifelike and less "hi-fi". For example, the timbre of orchestral instruments has the "feathery" quality heard at live events.

Practical periphony requires a minimum of six or eight speakers, some of which must be placed above and below the height of the listeners' ears. Readers interested in seeing the various speaker layouts which are possible should consult the Gerzon 1980 reference. Of course, horizontal Ambisonic decoders will still be able to produce "no compromise" horizontal surround sound simply by ignoring the Z (height) signal.

Have Ambisonic decoders been manufactured commercially?

Yes. Minim Electronics Limited marketed a range of three decoders, the AD 7, AD 8 and AD 10, and also a printed circuit board module for enthusiasts to incorporate into their own projects.

IMF Electronics assisted in the development of Ambisonics and manufactured a decoder, the D20B. Part of the company was resurrected under the name TDL Electronics Limited, but it is unlikely that they have had any interest in Ambisonics.

If Ambisonics is so wonderful, why is it not a commercial success?

We should first note that technical excellence and commercial success do not necessarily go hand-in-hand. This is why you are all watching VHS video tapes and not Betamax. Ambisonics has suffered from the following:

  • It came to market just as quadraphonics was dying away. Manufacturers had lost a bundle on quadraphonics and were not receptive to "yet another" surround sound system.
  • It was never supported by a major record company. The record majors had all backed different quadraphonic systems.
  • The rights were held by the National Research Development Corporation, now defunct. This was a sort of venture capital company, but one owned and run by the British government. The NRDC had little commercial nous. (Yes "nous", look it up.)
  • Ambisonics is thought of as a "purist" technique and not applicable to multi-track studio recording. This fallacy is demolished by The Alan Parsons Project's Stereotomy, Arista 8384.
  • While Ambisonics can lend itself to the impressive ping-pung-pang-pong effects beloved by salespeople, it is usually used with more subtlety. This makes it difficult to sell.
  • It is British, ie, not invented in the USA or Japan.
  • It is British, ie, not well marketed.

How many speakers does Ambisonics use?

A major advantage of Ambisonic Surround Sound is that recording and studio processing are disengaged from reproduction. The former produce and operate on the W, X, Y and Z channels, but these can be reproduced through any number of speakers. The more speakers which are used the better, as this gives a larger listening area and a more stable sound localisation. Using more speakers also improves the illusion that the speakers have vanished; that is to say, the listeners hear a single seamless sound field. For horizontal surround sound a minimum of four speakers is required. Ambisonic technology places restrictions on the choice and placement of speakers. Specifically:

  • The speakers (power-amps) should have similar efficiencies (gains) and phase responses. The easiest way to achieve this is to use identical units.
  • All speakers (and power-amps) should cover the full frequency range. Unlike Dolby MP, there is no "surround channel" and it is not band limited.
  • Four speakers are placed in a rectangle, preferably with the longer side running front to back, all facing a point in the centre of your living room. A "layout" control compensates for different aspect ratios. The four speakers can be driven from either three or four power-amps.
  • Five speakers are placed in a regular pentagon, all facing a central point. They are driven from five power-amps.
  • Six speakers are placed in a regular hexagon, all facing a central point. They can be driven from between four and six power-amps.
  • A speaker (or speakers) can be moved closer to the central point, but should then be fed through a delay line (or lines).
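The delay needed for a speaker that has been moved closer can be estimated from the difference in path length. The sketch below assumes the aim is simply to make the closer speaker's sound arrive at the central point at the same time as sound from the other speakers. The speed of sound is taken as 343 metres per second and the function name is hypothetical.

```python
SPEED_OF_SOUND = 343.0  # metres per second (assumed, room temperature)

def compensating_delay_ms(reference_distance_m, speaker_distance_m):
    """Delay in milliseconds for a speaker that sits closer to the
    central point than the reference (farthest) speakers."""
    if speaker_distance_m > reference_distance_m:
        raise ValueError("a delay cannot compensate for a more distant speaker")
    return (reference_distance_m - speaker_distance_m) / SPEED_OF_SOUND * 1000.0
```

As a rule of thumb this comes to a little under 3 milliseconds for every metre by which the speaker has been moved in.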

The disadvantage of using only four speakers is that sounds with spiky waveforms (audience applause, harpsichords, oboes) tend to be drawn away from their correct location and towards a speaker. Using five or six speakers gives considerably more robust side and rear imaging. For periphonic (full-sphere) surround sound a minimum of six or eight speakers is required driven from at least four or five power-amps, respectively. Readers interested in seeing the possible speaker layouts should consult the Gerzon 1980 reference.

Auditorium decoders that can drive between 8 and 128 speakers are available (from Cepiar Limited). For domestic use the limiting factors are the cost of the necessary speakers and power-amps, and the practical problem of squeezing them into your living room. (The use of Ambisonics in auditoria is described in the Malham 1992 reference.) The speaker feeds are each a simple weighted sum of the W, X, Y and Z signals after they have passed through the shelf filters. Readers interested in seeing the equation should consult the appendices of the Gerzon 1985 reference.
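The "simple weighted sum" can be sketched for a horizontal layout of speakers in a regular polygon. The weights below are one textbook choice (a basic decode that reconstructs the pressure and velocity of the original field at the centre) and assume W carries a 1/sqrt(2) weighting. They are not taken from the Gerzon 1985 appendices, the shelf filters are omitted, and the function names are hypothetical.

```python
import math

def regular_polygon_azimuths(n_speakers):
    """Azimuths in degrees, anticlockwise from straight ahead, for speakers
    evenly spaced around the listener. Offset by half a step so that four
    speakers form a square at 45, 135, 225 and 315 degrees."""
    step = 360.0 / n_speakers
    return [step / 2.0 + i * step for i in range(n_speakers)]

def decode_horizontal(w, x, y, speaker_azimuths_deg):
    """Each speaker feed is a weighted sum of W, X and Y (Z is ignored)."""
    n = len(speaker_azimuths_deg)
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = math.radians(az_deg)
        directional = x * math.cos(az) + y * math.sin(az)
        feeds.append((math.sqrt(2.0) * w + 2.0 * directional) / n)
    return feeds
```

For a unit source straight ahead (W = 1/sqrt(2), X = 1, Y = 0) and a square layout, the two front speakers each receive about 0.604 and the two rear speakers about -0.104. The rear speakers are not silent: they carry a small anti-phase signal that helps to localise the front sound, which is the cooperative behaviour described earlier.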

With the "old-type" Ambisonic decoders described above, the apparent position of the centre-front sound image varies as the listener moves from one side of the living room to the other. This is not a problem when reproducing music, but when used with TV the on-screen sounds can become misaligned with the on-screen pictures. "New-type" Ambisonic decoders, described in the Gerzon and Barton 1992 reference, produce a stable centre-front sound image that solves this problem. These decoders can be used with irregular speaker layouts, either five speakers with one at the centre-front or six speakers with two at the centre-front.

Alternatively, it is possible to use an additional channel to stabilise the centre-front sound image, as described in the Gerzon 1992b reference. These newer types of Ambisonic decoder are not yet commercially available.


Page Statistics
  • This page was originally created by Andrea at 15:08 on Aug 15, 2005.
  • This page was last modified by Josh W at 21:22 on Apr 24, 2006.
  • The following users have made contributions: Andrea, Josh W.
  • This page was released under the terms of the: CC Attribution-NonCommercial-ShareAlike 2.5.
  • This page has been previously accessed a total of 3236 times.
 