Chapter 6.2. Empowering Your Audio Team with a Great Engine

Mat Noguchi, Bungie

Making award-winning game audio at Bungie isn’t just about using the best technology or having the best composers (although that doesn’t hurt). The best technology will ring flat given poor audio, and the best music will sound out of place given poor technology. If you really want your game to sing, you need to put audio under the control of your audio team.

For the past nine years, with a core set of principles, a lot of code, and even more content, Bungie has empowered its audio team to make masterpieces. This gem will explore the audio engine that drives Halo, from the basic building blocks the sound designers use to the interesting ways the rest of the game interacts with audio. We will also take a peek at the post-production process to see how everything comes together.

Audio Code Building Blocks

The sound engine starts with the s_sound_source.

   enum e_sound_spatialization_mode
   {
       _sound_spatialization_mode_none,
       _sound_spatialization_mode_absolute,
       _sound_spatialization_mode_relative
   };

   struct s_sound_source
   {
       e_sound_spatialization_mode spatialization_mode;
       float scale;

       // only valid if spatialization_mode is absolute.
       point3d position;
       quaternion orientation;
       vector3d translational_velocity;
   };

This structure encompasses all the code-driven behavior of sound. You have your typical positional audio parameters, a fade on top of the default volume, some stereo parameters, and a single value called scale. What is scale?

The scale value is used to parameterize data from the game engine to the audio engine. It is normalized to lie within [0, 1], making it simple to use as an input to a function or linear range. Everything that can play a sound in our game exports at least one scale value, if not more. As a simple example, sounds generated from particle impacts receive a scale derived from the impact velocity, normalized over a range of 0.5 to 1.5 world units/second. A more complex example would be the sounds that play when a Banshee banks sharply and forms contrails at the wings. The actual scale that gets exported is shown in Figure 6.2.1 in our object editor.
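The particle-impact mapping above can be sketched as a simple normalization helper. This is an illustrative function, not Bungie's actual code; the name `normalize_scale` and passing the window endpoints as parameters are assumptions.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: map a raw game value into the normalized [0, 1]
// scale the audio engine consumes. For the particle-impact example in
// the text, the window would be 0.5 to 1.5 world units/second.
float normalize_scale(float value, float min_value, float max_value)
{
    float t = (value - min_value) / (max_value - min_value);
    return std::clamp(t, 0.0f, 1.0f); // scale always lies within [0, 1]
}
```

An impact at 1.0 world units/second would then export `normalize_scale(1.0f, 0.5f, 1.5f)`, i.e. a scale of 0.5.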

Figure 6.2.1. An object function from the Warthog for the engine sound.

This is an example of an object function; it takes various properties exported by an object and combines them into a single value that can be sent to the sound system. Incidentally, we drive our shaders in a similar way, although shaders can use more than one input.

In general, simple things such as impacts and effects export a single scale. Objects such as the Warthog and Brute can export a combination of multiple scales.
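An object function combining multiple exported scales might be sketched like this. The property names and weights are purely illustrative assumptions, not Bungie's actual Warthog function; only the idea of folding several object properties into one scale comes from the text.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical object function: fold several normalized object properties
// into the single scale an engine sound consumes. The inputs and the
// 60/40 weighting are illustrative, not the actual Warthog function.
float warthog_engine_scale(float normalized_speed, float normalized_throttle)
{
    float scale = 0.6f * normalized_speed + 0.4f * normalized_throttle;
    return std::clamp(scale, 0.0f, 1.0f); // keep the output in [0, 1]
}
```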

Parameterizing audio with a single value may seem a bit simplistic. However, as we’ll explore later, we tend to parameterize only a few properties of a sound based on scale, and in almost all cases it makes sense to parameterize multiple properties in a coupled fashion. For spatialized audio, we have a separate distance envelope that we’ll describe in the next section.

Sound Parameterization

Given that we can send the audio engine interesting data from the game, we need to author content to use this data (that is, the scale and distance). The audio designers export .AIFF files, which get converted into the native platform format (XBADPCM for Xbox and XMA2 for Xbox 360), and they attach in-game metadata through our custom game content files called tags. Sound content breaks down into one of two categories: impulse sounds and looping sounds.

Impulse Sounds

For impulse sounds, such as impacts, gunshots, and footsteps, we allow the audio designers to adjust gain and pitch with the scale shown in Figure 6.2.2.

Figure 6.2.2. Scale parameter editor.

(Side note: Having your data use units that the audio team understands goes a long way to making them feel at home with the data they have to work with!)
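The mapping in Figure 6.2.2 can be sketched as linear interpolation over designer-authored bounds. The struct and field names here are assumptions; only the idea (scale interpolates gain in dB and pitch in cents between authored min/max values) comes from the text.

```cpp
#include <cassert>

// Hypothetical layout of the per-sound scale parameters: the designer
// authors [min, max] bounds and the runtime interpolates by scale.
struct s_scale_parameters
{
    float gain_min, gain_max;   // in dB
    float pitch_min, pitch_max; // in cents
};

float interpolate(float min_value, float max_value, float scale)
{
    return min_value + (max_value - min_value) * scale;
}

float scaled_gain_db(const s_scale_parameters& p, float scale)
{
    return interpolate(p.gain_min, p.gain_max, scale);
}

float scaled_pitch_cents(const s_scale_parameters& p, float scale)
{
    return interpolate(p.pitch_min, p.pitch_max, scale);
}
```

Note the units match what the designers author: decibels for gain, cents for pitch.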

For spatialized audio, we also can specify a distance envelope, as shown in Figure 6.2.3.

Figure 6.2.3. Distance envelope editor.

From the sound source origin to the “don’t play distance,” the sound is silent. From “don’t play” to “attack distance,” the sound scales from silence to full volume. Between “attack distance” and “minimum distance,” the sound plays at full volume. And from “minimum distance” to “maximum distance,” the sound scales from full volume back to silence.
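The envelope just described can be sketched as a piecewise-linear attenuation function. The struct layout and names are assumptions; the shape follows the four distances in the text.

```cpp
#include <cassert>

// Hypothetical distance envelope: silent inside "don't play", ramping up
// to full volume at "attack", full volume through "minimum", ramping back
// down to silence at "maximum".
struct s_distance_envelope
{
    float dont_play, attack, minimum, maximum;
};

float distance_attenuation(const s_distance_envelope& e, float distance)
{
    if (distance < e.dont_play || distance > e.maximum)
        return 0.0f; // silent outside the envelope
    if (distance < e.attack) // ramp from silence up to full volume
        return (distance - e.dont_play) / (e.attack - e.dont_play);
    if (distance <= e.minimum)
        return 1.0f; // full volume
    // ramp from full volume back down to silence
    return (e.maximum - distance) / (e.maximum - e.minimum);
}
```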

The audio designers use the attack distance primarily for sound LODs. You can hear this for yourself in any Halo 3 level: A sniper rifle firing far away sounds like a muffled echo, while the sniper rifle firing up close has the crisp report of a death machine. See Figure 6.2.4.

Figure 6.2.4. Distance envelopes for the sniper rifle gunshot.

Impulse sounds can also be parameterized based on the total number of instances of that sound playing. For example, when glass breaks, it can form a few or a lot of broken glass particles. A lot of glass hitting a concrete floor sounds much different than a little; attempting to replicate that sound by playing a lot of the same glass impact sound does not work without a prohibitively large variety of sounds.

To combat this, we allow sounds to “cascade” into other sounds as the total number of sounds hits a certain threshold. For glass, the sound tag can specify a set of promotion rules (see Figure 6.2.5).

Figure 6.2.5. Broken glass particle promotion rules.

These promotion rules are defined in the order that they should play at run time. For each rule, you can specify which kind of sound to play (for example, few glass pieces, many glass pieces) as well as how many instances of that kind can play before you start the next rule. Each rule can also contain a timeout to suppress all sounds from previous rules.

Using the rules from Figure 6.2.5, if we played five glass sounds at once, we would play four instances of the breakable_glasspieces_single sounds. When the fifth sound played, we would play a breakable_glass_few sound and stop the previous four breakable_glasspieces_single sounds. If we then managed to play four more breakable_glass_few sounds in the same way (such that they were all playing at once), we would play a breakable_glass_many sound, stop the previous breakable_glass_few sounds, and then suppress any future glass sound for two seconds.
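The promotion logic can be sketched roughly as follows. This simplification ignores the timeout rule and the stopping of previous rules' sounds; all names and the exact bookkeeping are assumptions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical promotion rule, mirroring Figure 6.2.5: which kind of
// sound to play and how many instances of that kind may play before the
// next rule takes over.
struct s_promotion_rule
{
    const char* sound_kind;  // e.g. "breakable_glasspieces_single"
    int max_playing_count;   // instances allowed before promoting
};

// Pick the rule index a new sound should play from, given how many
// instances of each rule's kind are already playing. When a rule
// saturates, we promote: earlier rules' counts are cleared (the real
// engine also stops their sounds, which is not shown here).
int select_promotion_rule(const std::vector<s_promotion_rule>& rules,
                          std::vector<int>& playing_counts)
{
    for (int i = 0; i < (int)rules.size(); ++i)
    {
        if (playing_counts[i] < rules[i].max_playing_count)
        {
            ++playing_counts[i];
            return i;
        }
        for (int j = 0; j <= i; ++j)
            playing_counts[j] = 0; // promote past rule i
    }
    return (int)rules.size() - 1; // stay at the last rule
}
```

With rules of (single, 4), (few, 4), (many, 1), the first four requests play singles and the fifth promotes to a "few" sound, matching the worked example above.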

Cascading sounds allow us to have an expansive soundscape for particle impacts without playing a prohibitive number of sounds at once.

Looping Sounds

A sound that does not have a fixed lifetime (such as an engine sound, dynamic music, or ambience) is created as a looping sound. Because looping sounds are dynamic, we allow their playback to be controlled with a set of events: start, stop, enter alternate state, and exit alternate state. (More on the alternate state in a bit.) Since these events are really just state transitions, we need only two bits of state to play a looping sound: one for whether the loop should be playing and one for whether it should be in the alternate state.

For each event, as well as for the steady states of normal playing and alternate playing, the audio designers can specify a sound. In the steady state when a looping sound is playing, we simply keep playing the loop sound; it is usually authored so that it can play forever without popping. For transition events (start, stop, enter alternate, exit alternate, and stop during alternate), those sounds either can be queued up to play after the loop or can play on top of the currently playing loop (see Figure 6.2.6).

Figure 6.2.6. Looping sound state diagram.

(“Alternate” is really a way of saying “cool.” During the development of Halo 1, the audio director Marty O’Donnell asked for a way to have a cool track for music, so we added the alternate loop.)
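The two bits of looping-sound state and the four transition events can be sketched as a tiny state machine. The event names mirror the text; the function shape and names are assumptions.

```cpp
#include <cassert>

// The two bits of looping-sound state described in the text.
struct s_looping_sound_state
{
    bool playing;
    bool alternate;
};

enum e_looping_sound_event
{
    _looping_sound_event_start,
    _looping_sound_event_stop,
    _looping_sound_event_enter_alternate,
    _looping_sound_event_exit_alternate
};

// Apply an event to the state. Returns true if the state actually
// changed, i.e. a transition sound should be considered (queued after
// the loop or played on top of it).
bool apply_looping_sound_event(s_looping_sound_state& state,
                               e_looping_sound_event event)
{
    s_looping_sound_state old = state;
    switch (event)
    {
    case _looping_sound_event_start:           state.playing = true;    break;
    case _looping_sound_event_stop:            state.playing = false;   break;
    case _looping_sound_event_enter_alternate: state.alternate = true;  break;
    case _looping_sound_event_exit_alternate:  state.alternate = false; break;
    }
    return state.playing != old.playing || state.alternate != old.alternate;
}
```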

Dynamic music is implemented with just a few script commands: start, stop, and set alternate <on/off>.

Vehicle engines are implemented with looping sounds; however, in order to capture more intricacies with how engines sound under various loads (in other words, cruising at a low speed sounds much different than flooring the accelerator), we use something similar to the cascade system to select different sounds to play based on scale: the pitch range. (In fact, as an implementation note, cascades are implemented referencing pitch ranges.)

As the name implies, a pitch range specifies a certain range of pitches to play in (for example, only play this pitch range when the sound is playing from –1200 cents to 1200 cents). There are many playback parameters for that pitch range, such as distance envelopes and relative bend. Relative bend is the bend applied to the permutation playing from this pitch range based on a reference pitch. In the example in Figure 6.2.7, if we were playing the sound with a scale-based pitch of 55 cents, the idle pitch range sounds would play with an actual pitch of –110 cents (pitch – reference pitch). The “playback bend bounds” simply clamps the pitch to those bounds before calculating the actual pitch.

Figure 6.2.7. Pitch range editor.
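The relative-bend calculation can be sketched as follows. The struct layout is an assumption; the worked example in the text (a scale-based pitch of 55 cents playing at –110 cents) implies a reference pitch of 165 cents for the idle pitch range.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-pitch-range playback parameters, in cents.
struct s_pitch_range
{
    float reference_pitch;
    float bend_bound_min; // "playback bend bounds"
    float bend_bound_max;
};

// Clamp the parameterized pitch to the bend bounds, then subtract the
// reference pitch to get the bend actually applied to the permutation.
float relative_bend(const s_pitch_range& range, float pitch)
{
    float clamped = std::clamp(pitch, range.bend_bound_min,
                               range.bend_bound_max);
    return clamped - range.reference_pitch;
}
```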

This is probably more complicated than it needs to be, since we are basically parameterizing the pitch, then using that to select a pitch range, then converting that back into a relative bend to play sounds from that pitch range. But that’s more a historical artifact than anything else, and now the audio designers are used to it.

At run time, you can have multiple pitch ranges from a single loop playing at once (see Figure 6.2.8).

Figure 6.2.8. Warthog pitch ranges. Actual gain is displayed as power (gain²).

This allows for smooth cross-fading between multiple pitch ranges based on the input scale.
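One way to sketch such a cross-fade is a linear blend across the overlap between two adjacent pitch ranges. The overlap-window representation is an assumption; the text only says the fade is smooth and driven by the input scale.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical cross-fade window between two adjacent pitch ranges.
struct s_crossfade
{
    float fade_out_start; // pitch where the lower range starts fading out
    float fade_out_end;   // pitch where the lower range is fully silent
};

// Gain of the lower range; the upper range gets (1 - gain) so the two
// always sum to full amplitude across the overlap.
float lower_range_gain(const s_crossfade& fade, float pitch)
{
    float t = (pitch - fade.fade_out_start) /
              (fade.fade_out_end - fade.fade_out_start);
    return 1.0f - std::clamp(t, 0.0f, 1.0f);
}
```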

The looping sound system has been powerful enough to add novel uses of sound without additional modifications. For example, in Halo 2, we added support for continuous collisions (for example, the sound a boulder makes rolling or sliding down a hill) from Havok by generating a looping sound at run time whenever we registered an object rolling or sliding; we mapped the normal loop to rolling and the alternate loop to sliding so that a single object transitioning between rolling and sliding would have a smooth audio transition between those states.

This kind of flexibility makes it very easy for the audio designers to collaborate with other programmers without necessarily having to involve an audio programmer. If you can export a scale value, you can easily add either an impulse or a looping sound to whatever it may be.

Mixing

One powerful aspect of Bungie’s audio engine is how well it is integrated into the overall game engine; everything that should make a sound can make a sound, from weapons firing, to objects rolling and bouncing, to the various sounds in the HUD based on in-game events. One daunting aspect of the Halo audio engine is that almost everything makes a sound in some way, which means the audio designers have to make a lot of audio content.

To make it easier to manage sound across the entirety of a game, we assign every sound a sound class, essentially a label we use to define a set of default sound properties, such as distance envelope, Doppler effect multiplier, and so on. The properties in the sound class will be applied to all sounds with that sound class by default, so the audio designers only have to tweak a few sounds here and there.

Example 6.2.1. A non-exhaustive listing of sound classes

projectile_impact
projectile_detonation
projectile_flyby
projectile_detonation_lod

weapon_fire
weapon_ready
weapon_reload
weapon_empty

object_impacts
particle_impacts
weapon_fire_lod

unit_footsteps
unit_dialog
unit_animation

vehicle_collision
vehicle_engine
vehicle_animation
vehicle_engine_lod

music
ambient_nature
ambient_machinery
ambient_stationary
huge_ass

mission_dialog
cinematic_dialog
scripted_cinematic_foley

We also use sound classes to control the mix dynamically at run time. For each sound class, we store an additional attenuation to apply at run time—essentially, a sound class mix. These values can be script-driven; for example, during cinematics, we always turn down the ambience sound classes to silence with the following script call:

  (sound_class_set_gain "amb" 0 0)

We use a simple LISP-like scripting language. The sound_class script commands treat the string as a sound class substring match, so this script command would set the gain (in amplitude) for every sound class containing “amb”. If we had a sound class called “lambchop,” it would also be affected, but we don’t.
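The substring-matching behavior can be sketched as follows, assuming sound class gains live in a simple name-to-gain table (the real engine's storage is certainly different).

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch of the substring-matched sound-class gain command:
// every sound class whose name contains the given substring has its
// scripted gain (amplitude, 0 = silence) replaced.
void sound_class_set_gain(std::map<std::string, float>& class_gains,
                          const std::string& substring, float gain)
{
    for (auto& entry : class_gains)
    {
        if (entry.first.find(substring) != std::string::npos)
            entry.second = gain;
    }
}
```

Setting "amb" to 0 silences ambient_nature, ambient_machinery, and ambient_stationary in one call, which is exactly why the cinematic script above needs only one line.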

In addition to manually setting the mix, under certain gameplay conditions, we can activate a predefined sound class mix. For example, if we have Cortana saying something important to you over the radio, we’ll activate the spoken dialog mix. These mixes, which are automatically activated, fade in and out over time so that the change in volume doesn’t pop. The scripted mix and dynamic mix are cumulative; it’s simple, but that tends to match the expected behavior anyway.
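Since the scripted and dynamic mixes are cumulative, combining them for a given sound class amounts to multiplying the two attenuations. The fade parameter here is an assumption, illustrating how an activated mix could ramp in over time to avoid pops.

```cpp
#include <cassert>

// Hypothetical combination of the scripted mix and an automatically
// activated mix for one sound class. Both gains are amplitudes; the
// fade (0 = inactive, 1 = fully active) ramps the dynamic mix in.
float effective_class_gain(float scripted_gain,
                           float dynamic_mix_gain,
                           float dynamic_mix_fade)
{
    // interpolate the dynamic attenuation from 1 (no effect) toward its
    // target as the mix fades up, then apply both cumulatively
    float dynamic = 1.0f + (dynamic_mix_gain - 1.0f) * dynamic_mix_fade;
    return scripted_gain * dynamic;
}
```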

Post-Production

The bulk of audio production is spent in creating and refining audio content. This is a lot of work, but it’s fairly straightforward: Create some sound in [insert sound application here], play it in game, tweak it, repeat. However, as the project gets closer to finishing, the audio team has two major tasks left: scoring the game and finalizing the overall mix.

Scoring any particular level is a collaborative process between the composer and the level designer. The composer works with the level designer to determine music triggers based on gameplay, progression, or anything else that can be scripted. (The endgame driving sequence of Halo 3 has three triggers: one when you start driving on the collapsing ring, one after Cortana says “Charging to 50 percent!”, and one when you make the final Warthog jump into the ship.) Each trigger can specify what looping sound to play, whether to use the regular or alternate loop, and when to start and stop. The composer can then work alone to determine the appropriate music to use for the entire level. This collaborative effort allows the composer to remain in creative control of the overall score for a level while allowing the level designer to provide the necessary hooks in his script to help create a dynamic musical score.

There is also a chunk of time set aside at the end of production for the audio team to work with finalized content. At this point all the graphics, cinematics, scripts, levels, animations, and so on, are locked down; this allows the audio team to polish without needing to worry about further content changes invalidating their work. Once all the sound is finally in place, the audio team then plays through the entire game in a reference 5.1 studio to adjust the final mix and make sure everything sounds great.

Conclusion

Bungie’s audio engine isn’t just a powerful engine; it’s a powerful engine that has continued to evolve over time. Many of the concepts and features presented in this gem have been around since the first Halo game. Having a mature audio engine means that the entire audio team can iterate the process of making game audio instead of having to reinvent technology from scratch. Many of the innovations in Bungie’s audio engine have come from the audio designers, not just the programmers. In Halo 2, they came up with coupling the environment ambience loops with the state of the weather so that when a level transitioned to rain, so would the ambience. In Halo 3, they suggested the attack portion of the distance envelope to support sound LODs.

In other words, Bungie’s audio engine is not just about technology; it’s about enabling everyone who works on audio to do great things. Any programmer who wants to add sound to their feature just needs to use the s_sound_source. Any audio designer can custom tailor the playback of any sound with a huge amount of flexibility and functionality. And with our mature and proven audio engine, an audio programmer has the framework to add functionality that can be used right away, in infinite variety.

The trifecta of lots of content, a fully integrated sound engine, and an effective audio production process, combined with Bungie’s talented audio team, forms an award-winning game audio experience. The numerous accolades Bungie has received for audio for the entire Halo series show that our approach to game audio works—and works well.
