Introduction

Agnieszka Roginska and Paul Geluso

Immersive Sound

Multi-sensory integration through sound, vision, touch, smell, and taste can create an immersive experience of a scene. Immersive sound can give the listener the experience of being there through sound. Compared to vision, sound provides a fully immersive experience because it can be perceived from all directions simultaneously. In fact, sound has the ability to ground a listener in a fixed location while other sensory information changes. Filmmakers are well aware of this effect, often using sound to establish a fixed location in a scene while the visual perspective changes frequently.

For a moment, let’s consider an immersive experience generated using sound alone. The spatial relationships between the listener, the sound source(s), and the boundaries of a room create auditory cues that convey a sense of space. A continuous yet seemingly directionless sea of sound can envelop a listener through the use of environmental elements such as reverberation. Used this way, sound can immerse the listener in a constructed soundscape of directional and non-directional sounds that surround them.

For example, New York’s Grand Central station creates a natural immersive audio environment through a combination of close directional sounds such as conversation and footsteps. These combine with more distant sound sources such as announcements, and interact with the acoustics of the space to create a reverberant and enveloping environment. The combination of sound sources and the enveloping environment creates the immersive auditory experience. A sound source can become an environmental and enveloping sound or remain a point source, depending on the focus and perception of the listener.

The Listening Experience

The natural listening environment can be defined as the acoustic space we occupy during our daily lives. A virtual auditory space is an acoustic environment created through the use of loudspeakers or headphones designed to replace or augment the natural listening environment. Using technology, sounds from our natural listening environment can be captured, processed, stored, and/or transmitted to be reproduced in a virtual auditory space. The natural listening environment and the virtual auditory space both contribute to the listener’s experience.

Natural listening is the way we hear normally, without the aid of loudspeakers or headphones. Very realistic virtual auditory spaces that approximate the natural listening environment can be created with loudspeakers and headphones through the various techniques described in this book. The goal of immersive sound may be to recreate a sound environment that is as close as possible to the real world, or it may be to create an experience that augments the real world and can only exist in the virtual space. To create a listener’s virtual experience, natural sounds can be captured or synthesized, processed, and played back using immersive sound reproduction systems.

Listener’s Perspective

In a virtual listening environment, we rely on both analytical and psychoacoustic abilities to make sense of the sounds that are being reproduced using loudspeakers or headphones. Consider a mono recording made with a single microphone played back through a single loudspeaker. In this situation, we can imagine how distant the sound sources are from the microphone and get a general impression of the recording space by analyzing the relative volume, timbre, and room reflections for each sound source captured in the recording. If a second microphone and loudspeaker are added to the system, our auditory system now has two signals to analyze, thus more physical information to process in order to extract spatial information to inform the listener’s perspective.

Figure 0.1 The listening experience.

A fixed perspective can be achieved by creating a monophonic or stereophonic sound stage, where the virtual sound stage is bounded by the speakers in the frontal plane. Surround sound either expands this to a panoramic virtual sound stage that includes the listening environment, or places the listener on the sound stage, surrounded by virtual sound sources. These systems are considered channel based and rely on a priori knowledge about the location of the loudspeakers and their relationship to the listener. Sound field and wave field systems move away from the channel-based model and use multiple loudspeakers to reproduce the physical wavefront as it would appear in a natural environment.

Binaural audio reproduction techniques take advantage of natural human spatial auditory cues to recreate a virtual auditory environment through headphones, or through loudspeakers that emulate headphone reproduction. This results in a “you are there”, first-person perspective, in contrast to the “they are here” perspective of the loudspeaker systems described above.

As our recording and playback systems become more sophisticated, the immersive listening experience is guided by reproduction systems that can better approximate the natural listening environment. This allows listeners to rely less on their a priori knowledge to make sense of what they hear, and to move toward a more natural listening experience. The result is a virtual auditory environment that is more compelling and easier to listen to.

About This Book

The intended audience for this book includes undergraduate and graduate students at higher learning institutions in the areas of music technology, recording and production, sound design, sound art, and film, as well as audiophiles, game designers, simulation and virtual reality professionals, post-production professionals, and entertainment professionals. We assume readers have a fundamental understanding of the physics of sound, digital audio, recording, and reproduction principles.

Chapters are written by experts in their respective fields of research within immersive audio. Chapter authors are researchers in academic institutions, industry research laboratories, and US government agencies. The body of the text is grounded in research and empirical work, with each chapter covering the evolution and historical perspective of the development of a reproduction technology.

The book opens with a chapter by Elizabeth Wenzel, Durand Begault, and Martine Godfroy-Cooper about the physiology, psychoacoustics, and acoustics of spatial hearing, including the perception of spatial sound, binaural cues, perceptual plasticity, distance perception, and environmental context for immersive sound. In Chapter Two, Braxton Boren takes the reader on a historical journey across immersive sound, starting with the impact of immersive sound on our prehistoric ancestors, through the use of spatial separation of instruments and choirs as a compositional tool in the 15th century, to modern-age techniques and technologies of immersive sound. In Chapter Three, Paul Geluso introduces stereo loudspeaker systems and discusses techniques of sound capture, reproduction, and methods of stereo enhancement. Chapter Four, written by Agnieszka Roginska, describes the capture, synthesis, and reproduction of binaural sound over headphones, including extended reproduction techniques and applications of binaural sound. In Chapter Five, Edgar Choueiri describes the principles of binaural audio reproduction over loudspeakers using crosstalk cancellation. In Chapter Six, Francis Rumsey discusses the evolution and principles of surround sound, with speakers located on the horizontal plane. These techniques are extended in Chapter Seven, where Sungyoung Kim describes the methods of surround sound reproduction with height speakers, including the psychoacoustics of height perception, configuration of loudspeakers, and recording techniques.

Nicolas Tsingos discusses the principles of object-based audio in Chapter Eight, where he describes audio objects, their representation, advanced metadata, audio object capture, and rendering. In Chapter Nine, Rozenn Nicol addresses the theory and practice of sound field capture and reproduction, starting from the initial development of the sound field approach to High Order Ambisonics. Wave Field Synthesis is described in Chapter Ten, where Thomas Sporer, Karlheinz Brandenburg, Sandra Brix, and Christoph Sladeczek describe the development of Wave Field Synthesis from one of the earliest examples of immersive sound through the acoustic curtain by Steinberg and Snow in 1934, through the theory and practice of WFS reproduction, and its limitations and applications using modern techniques and signal processing. The book concludes with Brett Leonard describing the applications of extended multi-channel techniques in Chapter Eleven, where he discusses broader concepts and introduces practical mixing techniques for engineers working with immersive sound.
