2. Dramatic Foundations: Part 1: Elements of Qualitative Structure

The purpose of this chapter and the next is to provide a framework of dramatic theory that can be applied to the task of designing human-computer experiences. They are structured around the fundamental precepts of dramatic form and structure and are based primarily on Aristotelean poetics.1 We will take up each basic idea and then adapt it to the human-computer context, arriving at what may be described as a poetics of interactive form (remember that we defined “human-computer interaction” as enabling and representing actions with human and technological participants). Once we have constructed a theoretical base, we will go on to explore its implications in some selected areas of design.

1. The term “poetics” is used to describe a body of theory that treats a poetic or aesthetic domain.

This approach necessitates that you endure some delay of gratification. You will be forced to wade through a welter of analogies, definitions, and hypotheses before a coherent picture can emerge. Hopefully, the case presented in the first chapter is sufficiently persuasive to lure you into taking the journey. By the end of the next chapter we will be able to pull the various elements together into a useful theory.

Hoary Poetics

People often find it quite peculiar that I turn to a theory that is over two thousand years old to gain insight into a very recent phenomenon. Even those who can be persuaded that artistic and literary theories may be useful in the computer domain have difficulty with what they perceive as an extremely antiquated approach. Why Aristotle? How can it be useful to us today to employ concepts that were defined in the fourth century BCE? Aren’t there more contemporary views that would be more appropriate to the task?

I want to answer the last question first. Without a doubt there are more recent theorists who have made major contributions to the body of dramatic criticism; the next few chapters will touch on the work of many of them. But none has provided a theory of the drama that is as wide-ranging, complete, and well integrated as Aristotle’s; they haven’t needed to. For most, the Poetics has been a jumping-off place—a body of ideas to tweak and elaborate on. For some, it has been something to bounce off of; many theorists (such as Bertolt Brecht, mentioned in Chapter 1) have persuasively amended Aristotle’s poetics on certain points. But none has presented a fully formulated alternative view of the nature of the drama that has achieved comparably wide acceptance.

A second reason for looking to the Poetics as opposed to more contemporary theories (such as post-structuralism) is that the Aristotelean paradigm is more appropriate to the technology to which we are trying to apply it. In order to build representations that have theatrical qualities in computer-based environments, a deep, robust, and logically coherent notion of structural elements and dynamics is required, and this is what Aristotle provides.

Aristotle (384–322 BCE) was a student and successor of the philosopher Plato. His many works included the Ethics, Rhetoric, Physics, and Metaphysics. Natural Philosophy, as it was called in his day, eventuated in what we now know as science. His work encompassed what we now call both philosophical and scientific thought, and he explored subjects from biology to logic, government to art. He was tutor to Alexander the Great, whose assumption of power in 336 BCE ushered in the Hellenistic Age.

Aristotle worked and wrote in the century after the great blossoming of Greek drama, exemplified by the works of Aeschylus (525–456 BCE), Sophocles (496–406 BCE), Euripides (484–406/7 BCE), and Aristophanes (448–380 BCE). During the brightening days of the fifth century BCE, theatre seemed to spring full-blown from the brows of these early dramatists.

Looking back on that remarkable century, Aristotle set himself the task of understanding where the various forms of poetry, including narrative, lyric, and dramatic, came from and how they work. Aristotle’s work was a response to criticisms of poetry leveled by his teacher, Plato. Plato asserted that the poetic process is fundamentally incoherent and defies explanation; Aristotle described the process of poetic composition in logical terms. Plato complained that drama and poetry did not “inculcate virtue”; Aristotle countered by describing and defending the value of the things that poetry does accomplish:

[Poetry] aims at pleasure, but at the rational pleasure which is a part of the good life; by its representation of serious action it does indeed excite emotions, but only to purge them and so to leave the spectator strengthened; since art represents universals and not particulars, it is nearer to the truth than actual events and objects are, not further from it, as Plato maintained (Kitto 1967).

Aristotle is often referred to as the progenitor of western science because of the methods of observation and inquiry that he employed, as well as his insatiable and far-ranging curiosity. A common objection to his dramatic theory is that it is too prescriptive; the Poetics is mistakenly viewed as a book of rules (this is due, in large part, to the neoclassical critics of the Renaissance, many of whom distorted Aristotle’s work to support their belief that drama should provide explicit moral instruction). The truth is that Aristotle’s goal was to observe, analyze, and report on the nature of the drama, not to generate rules for producing it. His theories may be used productively, not because they are recipes, but because they identify and elucidate drama’s formal and structural characteristics.

The Cultural Backdrop

The occasion for the great Greek tragedies of the fifth and early fourth centuries BCE was the festival of Dionysus, the Greek god of nature, fertility, and celebration. Students of popular culture may recognize Dionysus (also known as Bacchus) as the giddy wine-stained god astride the donkey in the wine-making sequence of Disney’s Fantasia. While revelry was certainly a major part of Dionysus’ gestalt, he was a somewhat more imposing figure than the Disney representation suggests. The spirit he represented was at the wellspring of life; his was the energy on which survival utterly depends.

The Festival of Dionysus was an annual event that celebrated the symbolic death and rebirth of the god and, hence, nature. Several plays were commissioned for performance at each festival as contestants for a prize for the best drama (see Figure 2.1 for a diagram of the Theatre of Dionysus in Athens). The theatrical people who were involved in the production of the plays (including actors, musicians, and costumers) maintained a strong connection to the Dionysian religion, eventually forming a guild whose head was usually a Dionysian priest.


Figure 2.1. The Theatre of Dionysus in Athens, where most of the great Greek tragedies were originally performed. The audience sat in a semicircular arrangement around and above the performance area, called the orchestra. The stage house or proskenion included elaborate facades and stage machinery (such as cranes that could lower “gods” from the “sky”).

Early Greek drama sprang from the intersection of philosophy, religion, civics, and art. The occasion was ostensibly religious, and there is reason to believe that at least some of the actors felt themselves to be “in possession of the god” as they performed in the festival that honored him. The subjects chosen by the great tragic playwrights for theatrical representation at the festival were matters of serious import, depicting the evolution of Greek philosophy through their dramatic treatment of known myths and stories such as the tragedies of Agamemnon, Orestes, and Oedipus. They communicated philosophical and religious ideas and also provided the occasion for the collective experience of emotion.

It’s important to recognize that most of the stories upon which the great tragedies were based were known to most of the audience; people were not going to the theatre to see how the plot turned out. The commissioned works were likely presented in response to the times, and their presentation formed a sort of public discourse. The Chorus in the Greek Theatre was like a mass character representing what might be cast as the citizens’ responses through dance and song. The comedies of Aristophanes were clearly built around current events and issues. Greek drama was the way that Greek culture publicly thought and felt about the most important issues of humanity, including ethics, morality, government, and religion. To call drama merely “entertainment” in this context is to miss most of the picture.2 The Greeks employed drama and theatre as tools for thought and discourse in the Polis.

2. It is interesting to note how our own popular culture reveals vestiges of these values, especially the civic, in some of our films and television shows (e.g., The Thin Red Line, All in the Family, or Angels in America). Such productions can engage the whole culture in the consideration of matters of deepest import. Unfortunately, most of our media fare trivializes or ignores such concerns, thereby diminishing us by diminishing what we think about and how we think about it.

Drama: Tragedy, Comedy, and Melodrama

Aristotle distinguished between tragedy and comedy in terms of the central emotions that they are intended to evoke. Tragedy has the power to arouse and purge pity and fear. These emotions are actually spelled out in Aristotle’s Rhetoric. Fear is based on probability—uncertainty and suspense (Rhetoric 1382a, 20–29). Pity is our response to something destructive or painful happening to someone who does not deserve it (Rhetoric 1385b, 10–22). In tragedy, the protagonist (main character) may have a tragic flaw—a characteristic that is often something admirable; for example, Hamlet’s tortured concern over his father’s death. It may also be a moral or intellectual flaw, or a mistake.3 In tragedy, the purging of these emotions—catharsis—is the emotional release that comes with the ending of the play.

3. The Greeks used the word hamartia to refer to a mistake that is an error in judgment. Literally, the word means “to fall short.” Interestingly, the same word was used in Greek versions of the Old Testament and was translated as “sin.”

Comedy deals with “the ridiculous,” which Aristotle defines as a “mistake or deformity not productive of pain or harm to others” (Poetics 1449a, 30–36). Its power is to deliver pleasure and laughter. Aristotle mentioned that both comedy and tragedy “work” because “it is natural for all to delight in works of imitation.” He continued, “The explanation is to be found in a further fact: to be learning something is the greatest of pleasures not only to the philosopher but also to the rest of mankind, however small their capacity for it; the reason of delight . . . is that one is at the same time learning—gathering the meaning of things. . . .” (Poetics 1448b, 4–24)

Both tragedy and comedy, Aristotle asserted, had their origins in improvisation; comedy began with “phallic songs” (Poetics 1449a, 10–13). These songs probably began as village revels. People came in processions through the countryside and other towns brandishing phallic icons and hurling insults, a practice called fleering (e.g., “your mother was a hamster”). Over time, these comic performances earned a place in the Festival of Dionysus. They were the ancestors of the great Greek Comedies such as those written by Aristophanes.

The Comic form throughout its evolution was disrespectful, taunting, transgressive, and funny, meaning no serious harm. The great comedies of the Greeks, as well as those of the Elizabethan and Restoration periods, were wonderfully structured works of art that utilize the same causal patterns and structural characteristics as great tragedies but for different purposes, often with social and political referents. Aristophanes was a master at this. Among his many political plays, we’re probably most familiar with Lysistrata. Performed in 411 BCE, the play protested the Peloponnesian War by depicting a political movement among women to deny sex to their husbands until they stopped fighting. When we look now at transgressive and critical games, we see the descendants of Comedy in interactive form.

Melodrama as a form was not treated by Aristotle, but later scholars (including my mentor, Professor Donald R. Glancy) describe it as a form that is “seemingly serious” but which fails to rise to the level of moral and ethical choice that is characteristic of tragedy. Its power is to arouse and purge pity and terror. Terror is understood as an emotion that is intense but transient. Most often, characters in melodrama evoke sympathy (feeling with—that’s awful for you and I’m glad it’s not happening to me), but not empathy (feeling into—that could be me).

In summary, drama is not equivalent to tragedy. It exists in several forms—tragedy, comedy, melodrama, and various mash-ups.

The Four Causes, or Why Things Are the Way They Are

In science as well as in art, the Greeks of the fifth and fourth centuries BCE were discovering and inventing a way to view a world of unprecedented scope and order through the rapidly evolving tools of philosophy. In exploring the nature of the drama and other arts, Aristotle employed the same conception of causality to which he attributed the forms of living things, and that is a good place to begin.

How does a representation of an action—a play or a human-computer activity—get to be the way it is? What defines its nature, its shape, its particulars? What forces are at work? Lest you be tempted to balk at this excursion into the theory, I want to remind you of the reason for taking it: Understanding how things work is necessary if one is to know how to make them. When a made thing is flawed or unsuccessful, it may not be due to poor craftsmanship. People have designed and built beautiful buildings that wouldn’t stand up, people have written plays with mellifluous words and solid dramatic structure that closed after one night in New Jersey, and people have designed software with lovely screens and loads of “functionality” that leave people pounding on their keyboards in frustration. The reason for failure is often a lack of understanding about how the thing works, what its nature is, and what it will try to be and do—whether you want it to or not—because of its intrinsic form.

The Four Causes in Drama

The four causes are forces that operate concurrently and interactively during the process of creation. While Aristotle also applies them to living organisms, we will restrict our discussion to the realm of made things. We will begin with definitions of the four causes and then apply them, first to drama, and then to human-computer interaction.4

4. I have employed the traditional terminology, not out of a desire to promote philosophical jargon, but because it is quite difficult to find synonyms that do these concepts justice, and also because more casual terminology can lead to confusion downstream.

Formal cause: The formal cause of a thing is the form or shape of what it’s trying to be. So for instance with architecture, the formal cause of a building is the architect’s notion of what its form will be when it’s finished. Those formal properties of “building-ness” (or “church-ness,” or “house-ness,” etc.) that are independent of any particular instance of a building (or church or house) and that define what a building is, serve as one component of the formal cause. They are filtered through the mind of the architect, where they are particularized by various design contingencies (there needs to be sunlight in the morning room, the conference room needs to accommodate a group of fifty, etc.), as well as his or her own values, tastes, and ideas.

Formal causality operates through an idea or vision of the completed whole, which will undergo change and elaboration as the process of creation unfolds; that is, there is a reciprocal relationship between the formal cause and the work in progress. The formal cause for a thing may be muddy or clear, constant or highly evolutionary, but it is always present.

Material cause: The material cause of a thing is what it’s made of. So, to pursue the architecture example, the material cause of a building includes stones or concrete or wood, glass, nails, mortar, and so on. Note that the properties of the materials influence the properties of the structure; e.g., wood is more flexible than steel, but steel is stronger.

Efficient cause: The efficient cause of a thing is the way in which it is actually made. This includes both the maker(s) and the tools. For instance, two buildings with the same architectural plan and the same materials created by different builders with different skills and tools will differ in terms of their efficient cause.

End cause: The end cause of a thing is its purpose—what it is intended to do in the world once it’s completed. In architecture, a building is intended to accommodate people, living or working or playing or performing operas or whatever, according to the kind of building it is.

Now let’s apply these four causes to the theatre:

Formal cause: The completed plot; that is, the whole action—with a beginning, middle, and end—that the playwright is trying to represent. The “whole action” subsumes notions of form and genre and the patterns that define them.

Material cause: The stuff a play is made up of—the sounds and sights of the actors as they move about on the stage. Note that the material of a play is not words, as one might think from reading a script. That’s because plays are intended to be acted out, and there’s more to enactment than words. The enactment is the performance—that which unfolds before the eyes and ears of the audience.

Efficient cause: The skills, tools, and techniques of the playwright, actors, and other artists who contribute to the finished play.

End cause: The pleasurable arousal and expression of a particular set of emotions in the audience (catharsis).
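For readers who think more easily in data structures, the two lists above can be set side by side as records with the same four fields. This is only an illustrative sketch: the `FourCauses` class and the `compare` helper are hypothetical names introduced here, and the field values paraphrase the definitions already given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourCauses:
    """One made thing described by the four causes (hypothetical record)."""
    formal: str     # the form of what it is trying to be
    material: str   # what it is made of
    efficient: str  # the makers, their skills, and their tools
    end: str        # what it is intended to do once completed


# Values paraphrase the architecture and theatre definitions in the text.
building = FourCauses(
    formal="the architect's notion of the finished form",
    material="stone, concrete, wood, glass, nails, mortar",
    efficient="the builders, their skills, and their tools",
    end="accommodating people living, working, or playing",
)

play = FourCauses(
    formal="the whole action, with a beginning, middle, and end",
    material="the enactment: sights and sounds of the actors on stage",
    efficient="skills, tools, and techniques of playwright, actors, and other artists",
    end="pleasurable arousal and expression of emotion (catharsis)",
)


def compare(a: FourCauses, b: FourCauses) -> list[tuple[str, str, str]]:
    """Pair up two descriptions cause by cause."""
    return [(name, getattr(a, name), getattr(b, name))
            for name in ("formal", "material", "efficient", "end")]


for cause, in_building, in_play in compare(building, play):
    print(f"{cause:>9}: {in_building}  |  {in_play}")
```

The record is deliberately flat. It captures the parallel between domains but not the reciprocal relationship between the formal cause and the work in progress, which a static structure cannot express.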

As mentioned in Chapter 1, “pleasurable” is a key word in understanding catharsis; emotions aroused by plays are not experienced in the same way as emotions aroused by “real” events, and even the most negative emotions can be pleasurable in a dramatic context (the success of such film genres as suspense and horror depends on this fact). Various cultures, including the ancient Greeks, have added ideas like civic discourse to the end cause.5 It is safe to say that since emotion depends upon the successful communication of content, some level of communication is implicit in the end cause. We will explore this aspect further in the discussion of causality and universality in the next chapter.

5. The theatre of Bertolt Brecht is a more modern example. Brecht held that the play was not finished until people acted upon it in their real lives.

The Four Causes in Human-Computer Interaction

How can we define these four causes for human-computer interaction? In this discussion it is difficult to avoid using computer-related terminology, which in many cases is already loaded with connotations that are not always appropriate. Among these terms are “functionality,” “program,” “application,” “representation,” and “agent.”

In computerese, “functionality” refers to the things that a program does—a spreadsheet can make calculations of certain types, for instance, and a word processor can do such things as move text around, display different fonts, and check spelling. Interface designers often describe their task as representing a program’s functionality. But this idea brings us to the tree falling in the forest again. A spreadsheet’s ability to crunch numbers in certain ways is only potential until a person gives it some numbers to crunch and tells it how to crunch them, in fine or gross detail. Thus the definition of functionality needs to be reconceived as what a person can do with a program, rather than what a program has the capacity to do. This definition lands us back in the territory of interaction with human and computer-based agents. It also contains a word we haven’t used yet: “program.”

A program is a set of instructions that defines the potential actions that make up a human-computer activity and their representations. These actions and representations may change as the result of ongoing action (for instance, as the result of capturing or inferring people’s preferences). A program also defines the environment for action and the other objects that inhabit that environment, including their representations and capabilities. Actually, the elements of action and environment and their representations are always the result of more than one program—in most computational devices, many aspects of the “interface” are embedded in the operating system and layers of intermediate software libraries. Of course, the potential of a program is also shaped by the language in which and the hardware for which it is written—what kind of computation it can perform, for instance, the qualities of its display, and its interface affordances.

In theatrical terms, a program (or a cluster of interacting programs) is analogous to a script, including its stage directions. A script is constrained by the physical realities of the kind of theatre in which it is to be performed and the capabilities of the stage machinery and actors. Program code is equivalent to the words of a script (including the theatre’s own brand of jargon; e.g., “move stage left” or “counter-cross”). In his investigations of artificial intelligence, Professor Julian Hilton adds another dimension to this analogy:

The text [of a play] therefore, is a combination of explicit and implicit notational systems which have as their initial purpose the enablement of an event in which performers and audience can share as partners. While obviously the notion of a computer was alien to Shakespeare, that of his theatre as a complex space-time machine was certainly not. . . . (Hilton 1991)

Functionality is equivalent to the script parsed, not by words but by actions. An apparent difference between programs and theatrical scripts is that programs are not intrinsically linear in form, while scripts generally are. At the highest level, this nonlinearity means that programs can cause different things to happen depending upon the actions of their interactors; that is, “authorship” is collaborative in real time (this aspect will be further explored in the discussion of plot, ahead). In summary, then, functionality consists of the actions that are performed by people and computers working in concert, and programs and interface affordances are the means for creating the potential for those actions.
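The contrast between a linear script and a nonlinear program can be made concrete with a toy example. The sketch below is an assumption-laden illustration, not a model of any real application: the `script` list, the `program` table of states, and the action names are all invented here. The point it shows is that the program defines only potential actions, and what actually happens depends on what the interactor does.

```python
# A linear script: the same actions in the same order at every performance.
script = ["enter", "speak", "exit"]

# A hypothetical program: each state maps an interactor's action to a next state.
program = {
    "start": {"open": "editing", "quit": "done"},
    "editing": {"type": "editing", "save": "saved", "quit": "done"},
    "saved": {"type": "editing", "quit": "done"},
}


def perform(script):
    """A script unfolds the same way regardless of who is watching."""
    return list(script)


def interact(program, actions, state="start"):
    """Follow the interactor's actions through the program's potential.

    Returns the list of states visited. An action that the current state
    does not afford is ignored, so the state stays the same.
    """
    visited = [state]
    for action in actions:
        if state == "done":
            break
        state = program[state].get(action, state)
        visited.append(state)
    return visited


print(perform(script))
print(interact(program, ["open", "type", "save", "quit"]))
print(interact(program, ["quit"]))
```

Two interactors given the same `program` produce different sequences, which is one small sense in which "authorship" is shared between the program's makers and the person using it.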

An “application” is generally described as a distinct program designed to deliver a particular set of functionality to interactors, as opposed to programs that are not directly accessible to people, such as those which live deep in the bowels of missile silos and operating systems. Informal taxonomies of applications exist; e.g., applications for document creation and computer-assisted design (CAD) belong to the larger class of productivity applications; drawing, painting, and music programs are often classified as “creativity” applications; and adventure, action, and strategy games are “entertainment” applications. The most important way in which applications, like plays, are individuated from one another is by the particular actions that they represent. Applications are analogous to individual plays; the larger categories are analogous to genres and forms of plays (tragic, comic, didactic, etc.). Style is a more sophisticated concept that is used in both drama and computer applications, especially games.

We have used the word “representation” throughout the first chapter to distinguish the shadowy realms of art and human-computer activity from phenomenal reality. Webster’s defines a representation as “an artistic likeness or image” (and also, incidentally, as “a dramatic production or performance”). The Greek word for artistic representation is mimesis. Both plays and human-computer activities are mimetic in nature; that is, they exhibit the characteristics of artistic representations. A mimesis is a made thing, not an accidental or arbitrary one: Using a pebble to represent a person is not mimetic; making a doll to represent him is. We often use the word “representation” followed by “of” and then the name of some object; e.g., a character is a representation of a person, or a landscape painting is a representation of a place. But in art as in human-computer interaction, the object of a mimesis (i.e., that which it is intended to represent) may be a real thing or a virtual one; that is, a thing that exists nowhere other than the imagination. A play may be a mimesis of events (literally, a series of actions) that are taken from history or that are entirely “made up.” Mimetic representations do not necessarily have real-world referents.

In computerese, two kinds of representations are acknowledged: internal and external representations. For example, a page icon may serve as the external representation of a document. Both the document and the icon have internal representations that consist of the code that defines them—how they look and behave. In keeping with the principle that “the representation is all there is,” however, an internal representation has no value by itself, just as the working script for a performance is likely never seen by an audience. As a program, an internal representation is merely the potential for what may be manifest in the external representation—that which has sensory and functional properties. As it is used in this book, the term “representation” subsumes both aspects.

We have said that human-computer interactions can be defined as representations of actions with agents of both human and computer origin. The word “agents” has a particular meaning in computerese that is a derivation of the more general sense of the word. A computer-based “agent” is defined as a bundle of functionality that performs some task for a person, either in real time or asynchronously. “Bidding agents” on eBay are an example. Agents may be represented as “beings”—that is, as characters—but they need not be. The Aristotelean definition of an agent is the root of both of these permutations: an agent is one who initiates and performs actions. So in any human-computer activity, there is at least one agent—the human who turns on the machine—and if the machine does anything after it boots, then there are at least two. This book uses the more general definition because, as I will argue later in this chapter, computer-based agency is present in all human-computer activities, whether or not it is coalesced into coherent agent-like “entities” in the representation.
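The general definition of an agent, one who initiates and performs actions, is simple enough to state as a rule over a record of what happened. The following sketch assumes a hypothetical `Action` record and an invented session log; it illustrates only the counting argument made above (one agent when the person turns on the machine, at least two once the machine does anything).

```python
from collections import namedtuple

# Hypothetical record of one action: who initiated it and what was done.
Action = namedtuple("Action", ["initiator", "deed"])


def agents(log):
    """Everyone, or everything, that initiated at least one action."""
    return {action.initiator for action in log}


# An invented session, loosely following the examples in the text.
session = [
    Action("person", "turn on the machine"),
    Action("computer", "boot and show a login screen"),
    Action("person", "place a maximum bid"),
    Action("bidding agent", "raise the bid while the person is away"),
]

print(agents(session[:1]))  # only the human has acted so far
print(agents(session[:2]))  # the machine has now done something too
print(agents(session))
```

Note that the rule makes no reference to whether an initiator is depicted as a character. Under this definition the undifferentiated "computer" counts as an agent just as the named bidding agent does.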

Given these definitions, we can now take a run at the four causes as applied to human-computer interaction:

Formal cause: The formal cause of a particular human-computer activity (that is, an extended set of interactions bound together) is the form of what it’s trying to be. Human-computer interaction generally lacks the kind of well-known formal categories that drama offers (comedy, tragedy, etc.), although game genres like “first-person shooter” or tools like a “video editor” have formal characteristics.6 What we can say, however, is that the form of human-computer activity is a representation of action with agents that may be either human, computer-based, or a combination of both. We will discover more of the characteristics of that form as we identify its structural elements and the relations among them.

6. Although application categories like “document creation” or “productivity” are sometimes invoked by designers as if they were formal criteria, I would argue that they are rather part of the end cause, since their definitions are essentially functional rather than formal. As most computer-using writers know, it is still impossible to derive the “canonical” form of a word processor from all of the instances that exist on the market; we can only speak about a word processor’s expected or necessary functionality.

Material cause: The material cause of a human-computer interaction, like a play, is the enactment—that which unfolds before a person’s senses. As plays employ the sights and sounds produced by actors moving about in scenic environments, computers may employ animation, sound and music, text characters, or tactile and kinesthetic effects (e.g., force feedback). In the discussion of structural elements ahead, we will see how these sensory materials are shaped into more sophisticated constructs.

Efficient cause: The efficient cause of human-computer interaction is the skills and tools of its maker(s). Since a given application is probably based, at least in part, on chunks of program code that have been created by other people for other purposes, the computer equivalent of a playwright is usually a group of people. Both theatre and human-computer activity design are collaborative disciplines; both depend upon a variety of artistic and technical contributions. Some of those contributions may have already been produced, as in code libraries or scenery, whose makers may never be met by the production team, but who are nonetheless time-displaced collaborators. In both domains, the quality and nature of these contributions are strongly influenced by the available tools.7 Perhaps the greatest difference between theatre and human-computer interaction is that the human interactor is also part of the efficient cause; that is, interactors are co-authors. We will return to this topic.

7. Theatrical artists increasingly rely on computer-based tools for such tasks as lighting and scene design, lighting execution, moving scenery, designing costumes, storing and simulating dance notation and period movements, and, of course, writing scripts. Theatrical folk express the same frustrations with their tools as graphic designers and other artists who are working in the computer medium itself.

End cause: The end cause of human-computer interaction is what it is intended to do in the world. Thus the end cause obviously involves functionality; word processors had better spit out documents. But experience is an equally important aspect of the end cause; that is, what a person thinks and feels about the activity is part of its reason for being the way it is. In this sense, as Michael Mateas (2004) observes, the interactor co-shapes the end cause as well in terms of the kind of experience she wants. Or, to use Norman’s famous doorknob, the end cause of the doorknob may be different for the person who opens it and the person who locks it. This aspect of the end cause, especially in “productivity” applications, seems trivial to many; it is too often handed off as an afterthought to harried interface designers who follow programmers around with virtual brooms and pails. At the very least, a person must understand the activity well enough to do something. At best, he or she is engaged, pleased, or even delighted by the experience. In this as in many other aspects of well-designed interaction, the world of computer games has been much more effective at producing pleasurable experiences. How much better it is to place the notion of pleasurable experience where it can achieve the best results—as part of the necessary nature of human-computer interaction.

The Six Elements and Causal Relations among Them

One of Aristotle’s fundamental ideas about drama (as well as other forms of literature) is that a finished play is an organic whole. He used the term “organic” to evoke an analogy with living things, insofar as a whole organism is more than the sum of its parts, all of the parts are necessary for life, and the parts have certain necessary relationships to one another. He identified six qualitative elements of drama and suggested the relationships among them in terms of formal and material causality8 (see Figure 2.2).

8. The explicit notion of the workings of formal and material causality in the hierarchy of structural elements is, although not apocryphal, certainly neo-Aristotelean (see Smiley 1971).

Figure 2.2. Six qualitative elements of structure, in drama and in human-computer interactions.

I present his model here for a couple of reasons. First, I am continually amazed by the elegance and robustness of the categories and their causal relations. Following the causal relations through as one creates or analyzes a drama seems to automagically reveal the ways in which things should work or exactly how they have gone awry. Aristotle’s model creates a disciplined way of thinking about the design of a play in both constructing and debugging activities. Because of its fundamental similarities to drama, human-computer interaction can be described with a similar model, with equal utility in both design and analysis.

Figure 2.3 lists the elements of qualitative structure in hierarchical order. Here is the trick to understanding the hierarchy: Each element is the formal cause of all those below it, and each element is the material cause of all those above it. As you move up the list of elements from the bottom, you can see how each level is a successive refinement—a shaping—of the materials offered by the previous level. The following sections expand upon the definitions of each of the elements in ascending order.

Figure 2.3. Causal relations among elements of qualitative structure.
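The rule governing the hierarchy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list ELEMENTS and the functions shapes and supplies_material_to are hypothetical names, and "action" stands for the top element, plot or the whole action.

```python
# Illustrative sketch only; the names below are assumptions, not part of the theory.
# The six elements, listed from the top of the hierarchy to the bottom.
ELEMENTS = ["action", "character", "thought", "language", "pattern", "enactment"]

def shapes(element):
    """Elements below `element`: it is the formal cause of each of them."""
    position = ELEMENTS.index(element)
    return ELEMENTS[position + 1:]

def supplies_material_to(element):
    """Elements above `element`: it is the material cause of each of them."""
    position = ELEMENTS.index(element)
    return ELEMENTS[:position]

print(shapes("thought"))                 # ['language', 'pattern', 'enactment']
print(supplies_material_to("language"))  # ['action', 'character', 'thought']
```

Reading the list from the bottom up follows the order of the sections below, each level being a refinement of the material offered by the one beneath it.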

In his essay “A Preliminary Poetics for Interactive Drama and Games,” Michael Mateas proposes two additional lines of causal relations from the player’s perspective. On the side of material causality, Mateas adds “Material for Action,” and on the formal side he adds “User Intention.” In terms of “Material for Action,” Mateas argues that affordances are necessary but not sufficient. “. . . the interface must ‘cry out’ for the action to be taken. There should be a naturalness to the afforded action that makes it the obvious thing to do” (Mateas, 2004). This, I think, is an excellent heuristic for the deployment of material causation to constrain (or nudge) interactors into directions that are more likely to yield dramatically satisfying experiences. The idea that the player’s intention serves as a force of formal causation also hits the mark. We will explore these ideas further in the section on Human-Computer Interaction as Mediated Collaboration in Chapter 4.

Enactment

Aristotle described the fundamental material element of drama as “spectacle”—all that is seen. In the Poetics, he also refers to this element as “performance,” which provides some basis for expanding the definition to include other senses as well. Some scholars place the auditory sense in the second level because of its association with music and melody; but, as I will argue in the next section, it is more likely that the notion of melody pertains to the patterning of sound rather than to the auditory channel itself.

One probably temporary difference between drama and human-computer interaction is the senses that are addressed in the enactment.9 Traditionally, plays are available only to the eyes and ears; we cannot touch, smell, or taste them. There are interesting exceptions. In the 1920s, for instance, director David Belasco experimented with using odors as part of the performance of realistic plays; it is said that he abandoned this approach when he observed that the smell of bacon frying utterly distracted the audience from the action on stage. In the mid-1960s, Morton Heilig invented a stand-alone arcade machine called Sensorama, which provided stereoscopic filmic images, kinesthetic feedback, and environmental smells; on a motorcycle ride through New York City, for instance, one could smell car exhaust and pizza.

9. Aristotle defined the enactment in terms of the audience rather than the actors. Although actors employ movement (kinesthetics) in their performance of the characters, that movement is perceived visually; the audience has no direct kinesthetic experience. Likewise, although things may move about on a computer screen, a human user may or may not be having a kinesthetic experience. In biology, the relatively recent discovery of mirror neurons in the brains of humans and some higher primates challenges this view. Science has shown that, when observing another individual doing something, “mirror neurons” in the observer’s brain respond as if the observer were taking the same action. This may go a long way toward defining at least some of the physical basis for empathy (see Keysers 2011).

In a much more serious vein, Jerzy Grotowski’s Laboratory Theatre experimented with involving the audience in the production in a variety of ways in the 1960s and 1970s. The point was not so much to expand the sensory palette of the audience, but to create “unself-conscious” participation by the audience in the form of deep emotional engagement. In his masterful book, Towards a Poor Theatre (1968), Grotowski acknowledges that he has two ensembles to direct: the actors and the spectators. In the Laboratory Theatre’s ground-breaking performance of Doctor Faustus, Grotowski had the audience seated at long banquet tables. The audience was “asked to merely to respond as people might at such a function.”

A spate of interactive plays and “mystery weekends” in the late 1980s employed the scheme of having the audience follow the actors around a space, although only as observers and not participants in the action. In one “interactive” play of the period, Tony and Tina’s Wedding, the audience was invited to follow the actors around from room to room (kinesthetic), to touch the props and sit on the furniture (tactile and kinesthetic), and to share in a wedding banquet (taste and smell). Another notable example is Chris Hardman’s Antenna Theatre, an approach where audience members move around a set prompted by taped dialogue and narration that they hear through personal headphones. These works have roots in experimental theatre work in the 1960s and 1970s by such artists as Judith Malina and Julian Beck of the Living Theatre, Robert Wilson, John Cage, and many others. Contemporary performance art shares many of the same origins. It is interesting that the development of interactive theatrical genres has been concurrent with the blossoming of computer games as a popular form of entertainment.

In fact, it is at the areas in which dramatic entertainment and human-computer activity are beginning to converge that pan-sensory representation is being most actively explored. When we examine that convergence, we can see ways in which human-computer interaction has evolved, at least in part, as drama’s attempt to increase its sensory bandwidth, creating the technological siblings of the kind of participatory theatre described in the previous sidebar.

The notion of “interactive movies” that gained popularity in the late 1980s had its roots in both cinema and computer games, and both cinema and computer games combine theatre and technology.10 In drama, the use of technology to create representations goes at least as far back as the mechane of the ancient Greeks. Cinema as a distinct form diverged from drama as the result of the impact of a new performance technology on form, structure, and style. In complementary fashion, computer games can be seen to have evolved from the impact of dramatic ideas on the technology of interactive computing, interactive affordances and graphical displays. Computer games incorporate notions of character and action, suspense and empathy, and other aspects of dramatic representation. Almost from the beginning, they have involved the visual, auditory, and kinesthetic senses (one need only watch a game player with a joystick to see the extent to which movement is involved, both as a cause and effect of the representation).

10. Earlier works, such as productions of Laterna Magika and the branching movie at the Czech pavilion at the 1967 expo in Canada, were relatively isolated. The idea of interactive movies has been rekindled and transformed into a bona fide trend by advances in multimedia technology. Likewise, there were early experiments in interactive television in the mid-1970s (such as the failed Warner QUBE system). Interactive TV had to await similar technological advances before finally becoming a 1990s buzzword.

At the blending point of cinema and computer games in the 1980s and 1990s were such forms as arcade games like Battle Tech and Pole Position, as well as sensory-rich amusement park installations like Star Tours that used motion platform technology. Such systems involved tactile and kinesthetic senses; some even investigated the inclusion of the other senses through both performance technology and direct stimulation to the nervous system.

“Virtual reality” systems, as discussed in Chapter 6, increase intensity through techniques described as sensory immersion. Visual immersion is typically delivered through a wide-angle stereoscopic display; behind the scenes, the computer is generating the scene appropriately with tracking data from the immersant’s movements and gaze. That same tracking data is used in delivering spatialized audio. Through the use of special input devices like specially instrumented gloves and suits, people can move about and interact directly with objects in a virtual world. Interestingly, the first virtual reality systems and applications were developed for nonentertainment purposes like computer-aided design, scientific visualization, and training.

The great days of arcade games tailed off when home game system technology began to include good 3D graphics and specialized controllers, such as the Nintendo Wii, released in 2006. The Kinect, a motion-sensing input device for the Xbox 360 console that also responds to spoken commands, was released in 2010. Such devices enhance kinesthesia and proprioception. They also demonstrate the functional use of gesture and speech, enhancing interaction at the level of language.

The level of enactment is composed of all of the sensory phenomena that are part of the representation. Because of the evolutionary processes described previously, it seems appropriate to say that enactment can involve all of the senses. Sensory phenomena are the basic material of both drama and human-computer interaction; they are the clay that is progressively shaped by the creator, whether playwright or designer, in collaboration with the audience or interactor.

Pattern

The perception of patterns in sensory phenomena is a source of pleasure for humans. Aristotle described the second element of drama as “melody,” a kind of pattern in the realm of sound.11 In the Poetics he says, “melody is the greatest of the pleasurable accessories of tragedy” (Poetics 1450b, 15–17). The orthodox view is that “spectacle” is the visual dimension and “melody” is the auditory one, but this view is problematic in the context of formal and material causality. If the material cause of all sounds (“music”) were things that could be perceived by the eye (“spectacle”), then things like the vibration of vocal cords and the melodies of off-stage musicians would be excluded. Contrariwise, all that is seen in a play is not shaped solely by the criterion of producing sounds or music (although this may have been more strictly true in the performance style of the ancient Greeks than it is today). The formal-material relationship doesn’t work within the context of these narrow definitions of music and spectacle.

11. This element is often translated as “music,” “melody,” or “rhythm.”

In the previous section, we have already expanded “spectacle” into all sensory elements of the enactment. The notion of “melody” as the arrangement of sounds into a pleasing pattern can be extended analogically to the arrangement of visual images, tactile or kinesthetic sensations, and probably smells and tastes as well (as a good chef can demonstrate). In fact, the idea that a pleasurable pattern can be achieved through the arrangement of visual or other sensory materials can be derived from other aspects of the Poetics, so its absence here is something of a mystery. Looking “up” the hierarchy, it could be that Aristotle did not see the visual as a potentially semiotic or linguistic medium, and hence narrowed the causal channel to lead exclusively to spoken language. Whatever the explanation, the orthodox view of Aristotle’s definitions of spectacle and melody leaves out too much material. As scholars are wont to do, I will blame the vagaries of translation, figurative language, and mutations introduced by centuries of interpretation for this apparent lapse and proceed to advocate my own view.

The element of pattern refers to patterns in the sensory phenomena of the enactment. These patterns exert a formal influence on the enactment, just as semiotic usage formally influences patterns. A key point that Aristotle made is that patterns are pleasurable to perceive in and of themselves, whether or not they are further formulated into semiotic devices or language; he spoke of them, not only as the material for language, but also as “pleasurable accessories.” Hence the use of pattern as a source of pleasure is a characteristic of dramatic representations, and one that can comfortably be extended to the realm of human-computer interaction.

Language

The element of language (usually translated as diction) in drama is defined by Aristotle as “the expression of their [the characters’] thought in words” (Poetics 1450b, 12–15). Hence the use of spoken language as a system of signs is distinguished from other theatrical signs like the use of gesture, color, scenic elements, or paralinguistic elements (patterns of inflection and other vocal qualities). In the orthodox view, “diction” refers only to words—their choice and arrangement. That definition presents some interesting problems in theatrical forms such as mime as well as in the world of human-computer interaction, where many works involve no words at all (e.g., most skill and action computer games, as well as graphical adventure games and graphical simulations). Are there elements in such non-verbal works that can be defined as language?

When a play is performed for a deaf audience and signing is used, few would disagree that those visual signs function as language. The element of language in this case is expressed in a way that takes into account the sensory modalities available to the audience.12 A designer may choose, for whatever reason, to build a human-computer system that neither senses nor responds to words, and which uses no words in the representation. Hardware configurations without keyboards, speech recognition, or text display capabilities may be unable to work with words.

12. It is interesting that American Sign Language (ASL) is in fact a “natural language” in its own right, and not a direct gestural map of English or any other spoken language. If a language can be constructed from gesture, then it follows that spoken words are not essential elements of language. My non-deaf grandson started signing at about seven months—babies can sign before they can use words effectively.

In human-computer interaction, graphical signs and symbols, nonverbal sounds, or animation sequences may be used in the place of words as the means for explicit communication between computers and people. Such nonverbal signs may be said to function as language when they are the principal medium for the expression of thought. Accordingly, the selection and arrangement of those signs may be evaluated in terms of the same criteria as Aristotle specified for diction, e.g., effective expression of thought and appropriateness to character.

Thought

The element of thought in drama may be defined as the processes leading to characters’ choices and actions—e.g., emotion, cognition, reason, and intention. Understood in this way, the element of thought “resides” within characters, although it can be described and analyzed in aggregate form (the element of thought in a given play may be described as concerned with certain specific ethical questions, for example). Although it may be explicitly expressed in the form of dialogue, thought is inferred, by both the audience and the other characters (agents), from a character’s choices and actions. In his application of a theatrical analogy to the domain of artificial intelligence, Julian Hilton (1991) puts it this way: “What the audience does is supply the inferencing engine which drives the plot, obeying Shakespeare’s injunction to eke out the imperfections of the play (its incompleteness) with its mind.”

If we extend it to include human-computer interaction, this definition of thought leads to a familiar conundrum: Can computers think? There is an easy way out of it; computer-based agents, like dramatic characters, don’t have to think, they simply have to provide a representation from which thought may be inferred.

When a folder on my Macintosh opens to divulge its contents in response to my double-click, the representation succeeds in getting me to infer that that’s exactly what happened; i.e., the “system” understood my input, inferred my purpose, and did what I wanted. Was the “system” (or the folder) “thinking” about things this way? The answer, I think, is that it doesn’t matter. The real issue is that the representation succeeded in getting me to make the right inferences about its “thoughts.” It also succeeded in representing to me that it made the right inferences about mine.

Thought is the formal cause of language; it shapes what an agent communicates through the selection and arrangement of signs, and thus also has a formal influence on pattern and enactment. Language is the material of thought in two senses. First is the perhaps overly limiting assumption that agents employ language, or the language-like manipulation of symbols, in the process of thinking. This assumption leads to the idea that characters in a play use the language of the play quite literally as the material for their thoughts.

I favor a somewhat broader interpretation of material causality; the thought of a play can appropriately deal only with what can reasonably be inferred from enactment, pattern, and language. Most of us have seen plays in which characters get ideas “out of the blue”—suddenly remembering the location of a long-lost will, for instance, or solving a mystery with a fact that has been withheld from the audience thus far. Such thoughts are unsatisfying (and mar the play) because they are not drawn from the proper material. In ancient Greek theatre, the Deus ex Machina (Latin for “god from the machine”) serves as an excellent example. A god shows up, typically lifted by a crane, to provide the solution to a seemingly unsolvable problem.

Plays, like human-computer interactions, are closed universes in the sense that they delimit the set of potential actions. As we will see in the discussion of action ahead, it is key to the success of a dramatic representation that all of the materials that are formulated into action are drawn from the circumscribed potential of the particular dramatic world. Whenever this principle is violated, the organic unity of the work is diminished, and the scheme of probability that holds the work together is disrupted.

This principle can be demonstrated to apply to the realm of human-computer interaction as well. One example is the case in which the computer (a computer-based agent) introduces new materials at the level of thought—“out of the blue.” Suppose a text messaging system is programmed to be constantly checking for spelling errors and to automatically correct them as soon as they are identified. Yes, you know this one—you want to type “hell” and the program changes it to “he’ll,” unless you know that you can disregard the program’s respectful correction by taking the additional action of deleting its suggestion before the word is completed. If the potential for this behavior is not represented adequately, it is disruptive when it occurs, and it will probably cause the person to make seriously erroneous inferences—e.g., “something is wrong with my fingers, my keyboard, or my software.” The program “knows” why it did what it did (“thought” exists) but the person doesn’t; correct inferences cannot be made.13

13. In human factors discourse, this type of failure is attributed to a failure to establish the correct conceptual model of a given system (see Rubinstein and Hersh 1984, Chapter 5). The dramatic perspective differs slightly from this view by suggesting that proper treatment of the element of thought can provide a good “conceptual model” for the entire medium. It also avoids the potential misuse of conceptual models as personal constructs that “explain” what is “behind” the representation; i.e., how the computer or program actually “works.”

Other kinds of failures in human-computer interaction can also be seen as failures on the level of thought. One of my favorite examples comes from early text adventure games. Quite often, the parser did not “know” all of the words that were used in the text representation of the story. So a person might read the sentence, “Hargax slashed the dragon with his broadsword.” The person might then type “take the broadsword,” and the “game” might respond, “I DON’T KNOW THE WORD ‘BROADSWORD’.” The inference that one would make is that the game “agent” is severely brain-damaged, since the agent that produces language and the agent that comprehends it are assumed to be one and the same. This is the inverse of the problem described in the last paragraph; rather than “knowing” more than it represented, the agent represented more than it “knew.” Both kinds of errors are attributable to a glitch in the formal-material relationship between language and thought.
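The broadsword failure is the kind of mismatch that a simple check could catch before a game ships. The following Python sketch is a hypothetical illustration, not a description of how any actual adventure game was built: the function unknown_words and the sample vocabulary are assumptions made for the example. It lists every word the story text uses that the parser would not recognize.

```python
import re

def unknown_words(story_text, parser_vocabulary):
    """Return words that appear in the story but are missing from the parser's vocabulary."""
    # Lowercase the text and pull out runs of letters (apostrophes allowed).
    words_in_story = set(re.findall(r"[a-z']+", story_text.lower()))
    return sorted(words_in_story - parser_vocabulary)

# Hypothetical data: the story "says" more than the parser "knows."
story = "Hargax slashed the dragon with his broadsword."
vocabulary = {"hargax", "slashed", "the", "dragon", "with", "his", "take"}

print(unknown_words(story, vocabulary))  # ['broadsword']
```

An empty result would mean that the agent represents no more at the level of language than it can comprehend, at least as far as vocabulary is concerned.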

Character and Agency

Aristotle maintains that the object of (i.e., what is being imitated by) a drama is action, not persons: “We maintain that Tragedy is primarily an imitation of action, and that it is mainly for the sake of the action that it imitates the personal agents” (Poetics 1450b, 1–5). In drama, character may be defined as bundles of traits, predispositions, and choices that, taken together, form coherent agents. Characters are the agents of the actions that, taken together, form the plot. This definition emphasizes the primacy of action.

In order to apply the same definition to human-computer interaction, we must first demonstrate that agents are in fact part of such representations, and second, that there are functional and structural similarities between such agents and dramatic characters.

In a purely Aristotelean sense, an agent is one who takes action. Interestingly, Aristotle admits of the possibility of a play without characters, but a play without action cannot exist (Poetics 1450a, 22–25). This suggests that agency as part of a representation need not be strictly embodied in “characters” as we normally think of them; i.e., representations of humans. Using the broadest definition, all computer programs that perform actions that are perceived by people can be said to exhibit agency in some form. The real argument is whether that agency is a “free-floating” aspect of what is going on, or whether it is captured in “characters”—coalesced notions of the sources of agency.

The answer, I believe, is that even when representations do not explicitly include such “characters,” their existence is implied. At the grossest level, people simply attribute agency to the computer itself. “I did this, and then the computer did that.” They also attribute agency to application programs, e.g., “My word processor trashed my file.” They often distinguish between the agency of system software and applications (“My new operating system crashed my app”). They attribute agency to smaller program elements and/or their representations, e.g., “The spelling checker in my word processor found an error.”

In social and legal terms, an agent is one who is empowered to act on behalf of another. In the mimetic world of human-computer interaction, this definition implies that, beyond simply performing actions, computer-based agents perform a special kind of action; namely, actions undertaken “on behalf of” people. It also therefore implies that some sort of implicit or explicit communication must occur between person and system in order for the person’s needs and goals to be inferred. I think that this definition is both too narrow and too altruistic. There may be contexts in which it is useful to create a computer-based agent whose “goals” are orthogonal or even inimical to those of human agents; for instance, in simulations of combat or other situations that involve conflicting forces. Agents may also work in an utterly self-directed manner, offering the results of their work up to people after the fact. For now, we will use the broader definition of agents to apply to human-computer interaction: “Characters” can initiate and perform actions based upon input from the program or the interactor. Like dramatic characters, they consist of bundles of traits or predispositions to act in certain ways.

Traits circumscribe the actions (or kinds of actions) that an agent has the capability to perform, thereby defining the agent’s potential. There are two kinds of traits: Internal traits determine how an agent can act, and external traits represent those internal predispositions. People take cues from the external representation of an agent to infer its internal traits. Why? Because traits function as a kind of cognitive shorthand that allows people to predict and comprehend agents’ actions. Inferred internal traits are a component of both dramatic probability (an element of plot, as described in the next chapter) and “ease of use” (especially in terms of the minimization of human errors) in human-computer systems. Part of the art of creating both dramatic characters and computer-based agents is the art of selecting and representing external traits that accurately reflect the agent’s potential for action.
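The relationship between the two kinds of traits can be pictured as a comparison of two sets. The Python sketch below is a loose illustration under stated assumptions: the Agent class, its field names, and the sample "helper" are all hypothetical. It compares what an agent can actually do with what its external representation leads people to expect.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Hypothetical agent; the field names are illustrative assumptions."""
    name: str
    internal: set = field(default_factory=set)   # actions the agent can actually perform
    suggested: set = field(default_factory=set)  # actions its external traits lead people to expect

    def overpromised(self):
        """Actions the representation suggests but the agent cannot perform."""
        return self.suggested - self.internal

    def unannounced(self):
        """Actions the agent can perform that nothing in its representation signals."""
        return self.internal - self.suggested

helper = Agent(
    name="helper",
    internal={"check spelling", "find files"},
    suggested={"check spelling", "hold a conversation"},
)

print(sorted(helper.overpromised()))  # ['hold a conversation']
print(sorted(helper.unannounced()))   # ['find files']
```

In this reading, both sets would be empty for a well-represented agent: people could predict its actions from its appearance, and nothing it did would come as an unexplained surprise.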

Aristotle outlined four criteria for dramatic characters that can also be applied to computer-based agents (Poetics 1454a, 15–40). The first criterion is that characters be “good” (sometimes translated as “virtuous”). Using the Aristotelean definition of “virtue,” a good character is one that successfully fulfills its function; that is, one that successfully formulates thought into action. A “good” character does (action) what it intends to do (thought). It also does what its creator intends it to do in the context of the whole action. Second is the criterion that characters be “appropriate” to the actions they perform; that is, that there is a good match between a character’s traits and its actions. Characters may surprise us with their actions, but we should be able to see in retrospect that the potential for those actions was present. Third is the idea that a character be “like” reality in the sense that there are causal connections between its thoughts, traits, and actions. This criterion is closely related to dramatic probability. Finally, characters should be “consistent” throughout the whole action; that is, that a character’s traits should not change arbitrarily. The mapping of these criteria to computer-based agents is quite straightforward—be they “applications,” agents in the sense of personified “helpers,” or characters in a computer game.

Finally, we need to summarize the formal and material relationships between character and the elements above and below it in the hierarchy. Formal causality suggests that it is action, and action alone, which shapes character; that is, a character’s traits are dictated by the exigencies of the plot. Including traits in the representation that are not manifest in action violates this principle. Material causality suggests that the stuff of which a character is made must be present on the level of thought and, by implication, language and enactment as well.

An old but good example is the interface agent Phil, who appears in an Apple promotional video entitled “The Knowledge Navigator” (© 1988 by Apple Computer, Inc.). In the original version, Phil was portrayed by an actor in a video format. He appeared to be human, alive, and responsive at all times. But because he behaved and spoke quite simply and performed relatively simple tasks, many viewers of the video complained that he was a stupid character. His physical traits (high-resolution, real-time human portrayal) did not match his language capabilities, his thoughts, or his actions (simple tasks performed in a rather unimaginative manner). In a later version, Phil’s representation was changed to a simple line-drawn cartoon character with very limited animation. People seemed to find the new version of Phil much more likable. The simpler character was more consistent and more appropriate to the action. Microsoft’s paper clip, by comparison, looked too stupid to do anything meaningful.

Plot: The Whole Action

Representations are normally thought of as having objects, even though those objects need not be things that can or do exist in the real world. Likewise, plays are often said to represent their characters; that is, Hamlet is a representation of the Prince of Denmark, and so on. In the Aristotelean view, the object of a dramatic representation is not character, but action; Hamlet represents the action of a man attempting to discover and punish his father’s murderer. The characters are there because they are required in order to represent the action, and not the other way around. An action is made up of incidents that are causally and structurally related to one another. The individual incidents that make up the play of Hamlet—Hamlet fights with Laertes, for instance—are only meaningful insofar as they are woven into the action of the mimetic whole. The form of a play is manifest in the pattern created by the arrangement of incidents within the whole action.

Another definitional property of plot is that the whole action must have a beginning, middle, and end. The value of beginnings and endings is most clearly demonstrated by the lack of them. The feeling produced by walking into the middle of a play or movie or being forced to leave the theatre before the end is generally unpleasant. Viewers are rarely happy when, at the end of a particularly suspenseful television program, “to be continued” appears on the screen. My favorite computer example is an error message that I sometimes encounter: “[your application] has unexpectedly quit.” “Well,” I typically reply, “the capricious little bastard!” Creating graceful beginnings and endings for human-computer activities is most often a nontrivial problem—how to introduce the premise for a game, for example, or how to end a session of video editing. Two rules of thumb for good beginnings are that the potential for action in that particular universe is effectively laid out, and that the first incidents in the action set up promising lines of probability for future actions. A good ending provides not only completion of the action being represented, but also the kind of emotional closure that is implied by the notion of catharsis, as discussed in the next chapter.

A final criterion that Aristotle applied to plot is the notion of magnitude:

. . . to be beautiful, a living creature, and every whole made up of parts, must not only present a certain order in its arrangement of parts, but also be of a certain definite magnitude. Beauty is a matter of size and order. . . . Just in the same way, then, as a beautiful whole made up of parts, or a beautiful living creature, must be of some size, but a size to be taken in by the eye, so a story or Plot must be of some length, but of a length to be taken in by the memory (Poetics 1450b, 34–40).

The action must not be so long that one forgets the beginning before one gets to the end, since one must be able to perceive it as a whole in order to fully enjoy it. This criterion is most immediately observable in computer games, which can often require a person to be hunched over a keyboard for days on end if he or she is to perceive the whole at one sitting, a feat of which only teenagers are capable. In good massively multiplayer games, design can assist the player in finding good intermediate “stopping places” where catharsis is possible, even though the potential of the game is not exhausted and the player intends to return to it. Similar errors in magnitude are likely to occur in other forms, such as virtual reality systems, in which the raw capability of a system to deliver material of seemingly infinite duration is not yet tempered by a sensitivity to the limits of human memory and attention span, or to the relationship of beauty and pleasure to duration in time-based arts.

Problems in magnitude can also plague other, more “practical,” applications. If achievable actions with distinct beginnings and ends cannot occur within the limits of memory or attention, then the activity becomes an endless chore. Contrariwise, if the granularity of actions is too small, and those actions cannot be grouped into more meaningful, coherent units, the shape of the activity is either a forgettable point or an endless line of chores. These problems are related to the shape of the action as well as its magnitude, the first subject to be treated in the next chapter.

The notion of beauty that drives Aristotle’s criterion of magnitude is the idea that made things, like plays, can be organic wholes—that the beauty of their form and structure can approach that of natural organisms in the way the parts fit perfectly together. In this context, he expresses the criterion for inclusion of any given incident in the plot or whole action:

. . . an imitation of an action must represent one action, a complete whole, with its several incidents so closely connected that the transposal or withdrawal of any one of them will disjoin and dislocate the whole. For that which makes no perceptible difference by its presence or absence is no real part of the whole (Poetics 1451a, 30–35).

If one aims to design human-computer activities that are—dare we say—beautiful, this criterion must be used in deciding, for instance, what a person should be able to do, or what a computer-based agent should be represented as doing, in the course of the action. It also implies that leaving things out can be important in achieving a graceful organic whole.

In this chapter, we have described the essential causes of human-computer activity—that is, the forces that shape it—and its qualitative elements. In the next chapter, we will consider the orchestration of action more closely, both in terms of its structure and its powers to evoke emotional and intellectual response.
