3
Studying Linguistic Productions

This chapter presents various methods for studying the linguistic productions of speakers, namely through elicitation and repetition tasks. We start by discussing the differences that exist between the ability to produce and to understand language; and we argue that it is necessary to examine these two components of the language faculty separately, in order to have an overall picture of the functioning of a certain linguistic phenomenon. We then present the fundamental methodological differences which separate the observation of linguistic productions in a corpus and the experiments aimed at eliciting such productions. In the rest of the chapter, we introduce the different methods used for generating productions in an experimental context. We start with so-called free elicitation tasks, which imply a minimum level of constraint on productions. Then, we move on to constrained elicitation tasks and finally to repetition tasks, which imply an even greater control over production. In every case, we discuss the possibilities that these tasks offer for the study of language, as well as their limitations. We arrive at the conclusion that these tasks are complementary and that the most reliable method for studying linguistic productions is to combine them.

3.1. Differences between language comprehension and language production

Mastering language involves being able to use it appropriately for communicating with others, as well as decoding and interpreting discourse (spoken and written) produced by others. These two skills, respectively, involve the ability to produce and to understand language. As we shall see in this section, these two elements of the language faculty are nonetheless partially dissociated and should be studied separately in order to obtain an overall picture of the speakers’ linguistic competence.

The dissociation between language production and comprehension abilities is particularly evident during the language acquisition period in the first years of life. Indeed, between birth and the age of 1 year, infants do not really produce language. During their very first months, babies only cry. Then, when they reach 2–4 months, children start producing vowels like “aaaa” with different intonations. It is still necessary to wait until between 6 and 9 months of age for the so-called babbling stage to begin. This period is characterized by the repetition of syllables like “da-da-da” or “goo-goo”, which reproduce certain features of their mother tongue. Finally, it is only around their first birthday that babies produce a few isolated words like “bye-bye” and “no”.

Observing babies’ productions during their first year of life could give the impression that no aspect of language is mastered. However, this is far from being true, as shown by experimental techniques which make it possible to indirectly measure language comprehension in babies. One of these techniques, non-nutritive sucking, consists of measuring differences in suction intensity and rhythm by means of a teat containing sensors. It has shown that babies are already sensitive to many aspects of language before they can speak because they react systematically to changes in stimuli, as revealed by the differences in the intensity and rhythm of their sucking. To quote only a few examples, from birth babies are able to distinguish their mother tongue from other very different languages (e.g. French and Chinese) and can perceive phonetic contrasts, even in languages that are not their own native language. Between the age of 4 and 6 months, babies recognize the differences between even very close languages (e.g. German and Dutch) and can already understand a few isolated words. When they reach 1 year of age, they are able to recognize words heard several weeks earlier, in stories, and detect violations in the word order of their mother tongue. By the time they finally start producing a few words, babies have already developed sophisticated comprehension skills in their mother tongue (for a more in-depth discussion of these early skills, see Rowland (2013)).

The dissociation between language comprehension and language production persists throughout the language acquisition period. In most cases, children understand more than they are able to produce, but the reverse asymmetry also occurs. In particular, the first productions of a word do not imply that children really understand its meaning. For example, children pointing at the dog in their home using the word “dog” give the impression that they understand the meaning of the word. However, children go through a phase known as underextension, during which they assign a linguistic label not to a category (all the dogs in the world) but to a specific referent (the dog in their house). During this period, they do not yet master the meaning of this word, even if their productions are technically correct. These examples illustrate the need to study not only children’s linguistic productions, but also their comprehension of language.

However, the dissociation between comprehension and production is not the prerogative of young children during the language acquisition period. It is also found in foreign language learners, both children and adults. For this reason, so-called receptive and productive language skills are evaluated separately in foreign language assessment tests. In the same way as young children, foreign language learners often have better receptive skills (also called passive skills) than production skills. Production is also sometimes ahead of comprehension in the interlanguage of learners (Ortega 2008).

Another example showing the need to dissociate comprehension and production is that of people suffering from language impairments. Some types of aphasia, such as Broca’s aphasia (also called expressive aphasia), primarily affect the productive aspect of language, whereas others, such as Wernicke’s aphasia (also called receptive aphasia), primarily cause problems in language comprehension. The study of people suffering from aphasia also illustrates the fact that language production, like comprehension, can be used for analyzing the different components of language. For example, the inability to carry out a very simple lexical production task, such as naming an object represented in an image, can have various causes: problems accessing conceptual information about this object (the object is no longer recognized), an inability to access the phonemes that make up its name (a problem that healthy people also encounter when they have a word on the tip of their tongue) or even a motor inability to pronounce these phonemes.

Finally, note that the difference between language production and comprehension is also present in adults who do not suffer from any language impairments and who speak in their mother tongue. There is notably a big difference between production lexicon, for example, the words used every day for speaking to somebody we know, for writing letters or for teaching a course, and comprehension lexicon, which corresponds to the number of words we can actually understand. Again, production is clearly below comprehension. For example, in his novel Madame Bovary, Flaubert used no more than 14,000 different words (or word types) even when counting the conjugated forms of verbs, plurals, etc., separately, and just over 7,500 words if we only count semantically different words (lemmatized forms). These relatively low numbers might suggest that if the lexicon of a great author is no greater than 10,000–15,000 words, then the lexicon of an average person should be much smaller. But once again, it is necessary to differentiate between the production lexicon, that is, the words that people have the opportunity to use, and those that they are able to understand. In fact, the comprehension lexicon contains at least 40,000 words for someone with a high school diploma and may amount to 60,000–80,000 words for speakers with a college education (Aitchison 2003).
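The difference between counting word types and counting lemmas can be illustrated with a minimal sketch. The sample text and the small lemma table below are invented for illustration only; a real count over a novel would rely on a full lemmatizer rather than a hand-written table.

```python
import re

# Hypothetical mini lemma table (an assumption for illustration only).
# Forms absent from the table are treated as their own lemma.
LEMMAS = {"walks": "walk", "walked": "walk", "dogs": "dog", "was": "be", "is": "be"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def count_types(text):
    """Number of distinct word forms (word types)."""
    return len(set(tokenize(text)))

def count_lemmas(text):
    """Number of distinct lemmas, using the hypothetical LEMMAS table."""
    return len({LEMMAS.get(token, token) for token in tokenize(text)})

sample = "The dog walks. The dogs walked. The dog is tired and was tired."
print(count_types(sample), count_lemmas(sample))
```

On this toy sample, inflected forms such as "walks" and "walked" count as two types but as a single lemma, which is why the lemmatized count is lower.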

All the examples discussed in this section confirm that language production and language comprehension are clearly dissociated and that these two aspects of linguistic ability should be studied separately, in order to have an overall picture of the linguistic competence of speakers. However, for several decades, only language comprehension was considered an adequate reflection of linguistic competence. This exclusion of the productive aspect as a component of the language faculty finds its origins in the works of the American linguist Noam Chomsky, and in his definition of I-language. According to Chomsky, linguistics has the task of studying the linguistic representations of speakers, which he calls the I-language or internal language (see, in particular, Smith (2004) for an introduction to Chomsky’s thought). These representations reflect what people intuitively know about their mother tongue, in other words, what they understand. In contrast, according to Chomsky, linguistic productions do not represent competence but are merely performances or implementations of the language faculty, and such performances are not always representative of competence. Indeed, a person may make mistakes when speaking, for example, by using one word in the place of another, not because he or she does not know the word’s meaning but due to tiredness, stress, etc. Therefore, according to Chomsky, studying linguistic productions offers a biased reflection of the internal language, which should be the only object of study for linguists. Recently, the study of linguistic productions has returned to the heart of linguistic research, thanks, in particular, to the development of corpus linguistics (see section 3.2). From an experimental point of view, Chomsky’s objections can be avoided by using quantitative methods, which make it possible to separate isolated occurrences, which are not representative of linguistic competence, from recurring facts. For example, if a person produces the form “he goed” 10 times in a 30-minute interview and never the correct form “he went”, it seems unlikely that these productions are random errors; rather, they reflect the fact that the person does not know the irregular form of this verb.
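The idea of separating isolated slips from recurring facts can be sketched as a simple count. The transcript, the function names and the threshold of three occurrences below are all assumptions made for illustration; they do not correspond to an established criterion.

```python
import re
from collections import Counter

def form_counts(transcript, forms):
    """Count how often each listed word form occurs in a transcript."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(tokens)
    return {form: counts[form] for form in forms}

def is_recurring(counts, error_form, target_form, min_occurrences=3):
    """Flag an error as recurring when it appears at least
    min_occurrences times and the target form never appears.
    The threshold is an arbitrary, illustrative choice."""
    return counts[error_form] >= min_occurrences and counts[target_form] == 0

# Invented transcript fragment, for illustration only.
transcript = "he goed to school and then he goed home . yesterday he goed to the park"
counts = form_counts(transcript, ["goed", "went"])
print(counts, is_recurring(counts, "goed", "went"))
```

A single occurrence of "goed" alongside several occurrences of "went" would not be flagged, which mirrors the distinction between a slip of performance and a systematic gap.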

In summary, the study of language can either relate to the aspect of production or to that of comprehension, but we should keep in mind that the results in one of these areas cannot be generalized to the other. In the following chapters, we present different types of experiments aimed at measuring language comprehension, as this can be done through many experimental paradigms. In this chapter, we focus on the production component and review the pros and cons of the different methods for studying it.

3.2. Corpora and experiments as tools for studying production

Different empirical methods can be used for studying linguistic production. A first important distinction between these methods, which we will examine in this section, is the one that separates the observation of linguistic productions in a corpus from the elicitation of productions within an experimental context.

Corpora are large collections of texts or recordings gathered in an electronic format so as to be representative of a certain type of language. For example, some corpora aim to represent a discourse genre (journalistic or literary texts, online discussions), types of speakers (adults vs. children, learners vs. native speakers) or linguistic regions (the UK, the US, Australia, etc.). Whatever the type of corpus considered, corpus linguistics aims to study natural linguistic productions from a quantitative perspective. For example, a corpus study could be used for studying the differences in pronunciation of English vowels between speakers from London and New York. Another study could compare the development of the production lexicon of neurotypical children and children with autism spectrum disorder (ASD) at the same chronological age. The common point between these studies, despite their very different themes, is that the language samples that serve as the object of study were collected in their natural context, without any intervention on the part of researchers.

When collecting data for a corpus, the primary aim is to record spontaneous interactions in the same way that they might have occurred in the absence of a recording. It is not always possible to reproduce entirely natural conditions, because the simple fact of recording the participants can cause them to unconsciously change their behavior, but the goal is to get as close as possible to a natural environment. For example, children are usually recorded at home when interacting with family members. This is a big difference from experimental contexts, in which the participants are not in their natural environment but in the laboratory, or sometimes in their classroom, in the case of children. Thus, one of the main advantages of studying corpus productions in comparison with the experimental context is their natural character, which better reflects the real skills of people than productions collected in an unfamiliar context, in the presence of strangers.

Another advantage of using corpora is that, due to their large size, they make it possible to observe a large number of occurrences of a phenomenon, produced by a large number of different people. Conversely, in an experimental context, each participant can only produce a limited number of occurrences, in order to avoid tiredness and learning effects. In addition, the number of participants in a study is often limited for practical reasons. However, a large number of occurrences can be observed in a corpus only for frequent linguistic phenomena, for example, the use of basic vocabulary or frequent verbal tenses such as the simple past or the future. For rarer linguistic phenomena, such as the use of a specialized lexicon or the use of infrequent verbal tenses, it is very likely that even a large corpus will not make it possible to find many occurrences. In an experimental context, on the other hand, it is possible to encourage participants to produce infrequent elements by constraining the production context. For example, it is possible to ask the participants to continue a sentence which can only be completed by the subjunctive form, or to name objects represented in images that correspond to rare words. The experimental method thus makes it possible to collect more occurrences of rare phenomena than the use of a corpus.

In addition to testing rare linguistic phenomena, the experimental method also has another advantage compared to the observation of natural phenomena in a corpus. If an element does not appear in a corpus, for example, if children produce no passive sentences, it is not possible to conclude that they do not know the passive form. These children may simply not have had the opportunity to produce passive forms during the recordings, although they are capable of doing so. In other words, the lack of evidence in the corpus of children producing passive sentences does not suffice to conclude that they avoid this form because they cannot master it. On the other hand, in an elicitation context where children are invited to complete the transformed sentence in (1) to keep its meaning, it would no longer be possible to avoid the passive form:

(1) The cat chases the dog.

The dog _____________ by the cat.

Thus, the experimental method makes it possible to determine whether people are capable of producing a specific linguistic form or not, whereas corpora only make it possible to observe whether a certain form is used or not, and how often it is produced. This difference may have a significant impact on the conclusions of a study.

Let us examine an example that illustrates this problem. Royle and Reising (2019) studied the ability of children with specific language impairment (SLI) and children without language impairment – matched on age or on the mean length of utterance (MLU) – to produce correct agreements between the elements within noun phrases, both in the context of natural observation recorded in a corpus, and during an elicitation task. In the elicitation task, children had to make a puzzle and name the pieces. This task was designed to elicit the production of complex noun phrases, combined with adjectives (“the little house”, “the big blue house”, etc.). The same children were recorded during natural interactions in the context of play. The results showed different errors under the two conditions. During spontaneous interactions, children with specific language impairment (SLI) essentially omitted elements of the noun phrases, such as determiners. The elicitation task, on the other hand, revealed specific difficulties in adjective agreement. A generally high level of agreement errors was also found. This difference reflects the fact that children tend not to produce adjectives or, more generally, complex noun phrases in spontaneous speech. Thus, the elicitation task made it possible to reveal linguistic difficulties in children with SLI which were not apparent during natural interactions.

Another inherent limitation of corpus linguistics is that the identification of some linguistic phenomena in a corpus requires manual processing of data, which is very time-consuming. In a corpus without linguistic annotation, only word forms can be searched automatically, and these searches must be refined to eliminate the irrelevant occurrences of homonyms. For example, let us imagine a study aiming to identify all the uses of relative clauses in a corpus. One idea might be to look for all the occurrences of relative pronouns, such as “who” or “which”. However, this search would not be enough to identify the relevant occurrences, since these words are also used as interrogative pronouns. It would therefore be necessary to sort all the search results manually and keep only the relevant occurrences. Now imagine a search aiming to determine the different ways in which requests are formulated. This time, a word search would be of little use, as there is no conventional link between the form and function of speech acts. To summarize, research on corpora may become very complex and time-consuming in cases where the phenomenon investigated is not associated with an unambiguous linguistic form that can be automatically queried. Experimental research makes it possible to circumvent these problems by formulating a task encouraging participants to specifically produce the element under study.
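The limits of a simple word search can be made concrete with a short sketch. The four-sentence corpus and the `concordance` helper below are hypothetical and serve only to illustrate the problem; they are not taken from any corpus tool.

```python
import re

def concordance(sentences, words):
    """Return (sentence_index, matched_word, sentence) for every hit
    of the listed word forms, ignoring case."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for index, sentence in enumerate(sentences):
        for match in pattern.finditer(sentence):
            hits.append((index, match.group(1).lower(), sentence))
    return hits

# Invented mini corpus: sentences 0 and 2 contain relative clauses,
# sentences 1 and 3 are questions.
corpus = [
    "The girl who lives next door has a dog.",
    "Who took my book?",
    "This is the book which I lost.",
    "Which one do you want?",
]

hits = concordance(corpus, ["who", "which"])
for index, word, sentence in hits:
    print(index, word, sentence)
```

The search returns all four sentences, although only two of them contain a relative clause. Nothing in the word form itself separates the relative from the interrogative uses, so the results still have to be sorted by hand.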

Furthermore, the meaning that a person tried to convey during the discussions recorded in a corpus may be ambiguous. For example, when a young child uses a name designating an object which is not present in the immediate context, as, for example, the word “island”, it is difficult to know whether the word is being used to convey the appropriate concept or not. This problem is all the more important since children regularly produce underextensions and overextensions of meaning, as we have already pointed out (on this, see, for example, Bloom (2000)). The intended meaning in elicitation tasks which involve image description does not pose the same ambiguity problems.

Finally, because of the ambiguity inherent in corpus data, there is the question of how many spontaneous occurrences make it possible to conclude, with absolute certainty, that an element has been acquired. After the first occurrence? After three occurrences during the same recording? Elicitation experiments allow more precise control over the production context and thus make it possible to determine, with certainty, whether a person is capable of producing a certain linguistic form repeatedly or not. On the other hand, these experiments imply additional difficulties for the participants, such as the need to understand the task and to carry it out in an unnatural production context. Due to these additional difficulties, elicitation tasks generally indicate a lower level of competence compared to the observation of productions in a corpus, and thus provide a conservative image of the linguistic level of the participants.

In sum, the study of linguistic productions can be done either through the observation of a corpus or by experimentally eliciting productions thanks to the use of specially designed tasks. Both methods have advantages and disadvantages which we have discussed in this section. We should also observe that in many cases, these two approaches can provide complementary points of view, which are very useful. For example, before deciding to create an elicitation task, the frequency of a certain linguistic phenomenon in different contexts or discourse genres can be assessed by means of a corpus.

In the rest of this chapter, we will focus on the presentation of different elicitation tasks aimed at experimentally eliciting linguistic productions. We will see that these tasks are placed on a continuum, spanning from a very low level of control, in the case of free elicitation tasks, to a higher level of control, in the case of constrained elicitation tasks, reaching a maximum level of control in repetition tasks.

3.3. Free elicitation tasks

In order to overcome some limitations associated with the natural observation of data in a corpus, and to complete their production database, researchers sometimes resort to the free elicitation technique, which consists of orienting the productions by placing the participants in a previously set context. These elicitation tasks often take the form of interactive games for obtaining certain dialogue-related elements, or the description of films, images or the retelling of memories, in order to collect monologues.

The great advantage of this technique is that it makes it possible to preserve the naturalness of corpus data to a large extent, since the participants are free to produce language samples without the intervention of the researchers in charge of data collection. Unlike the observation of corpora discussed previously, this method involves a form of experimentation in that it makes it possible to manipulate the production contexts in order to study their influence on the type of linguistic productions. This technique offers many advantages.

First of all, it makes it possible to generate linguistic elements which are rarely found in corpora and whose low frequency hinders a quantitative analysis of data. For example, by asking people to describe events taking place in a video, it is possible to test their ability to retell a series of events and to study the use of verbal tenses, for instance. Some studies have used Charlie Chaplin’s silent films as stimuli for eliciting production. Many studies (e.g. Berman and Slobin (1994)) used a story without any text captions in the form of a 24-vignette series called Frog, Where Are You? (Mayer 1969) for elicitation tasks with children and learners. This story has become a classic of elicitation studies and data exists in many different languages, available via the CHILDES online database (MacWhinney 2000). In addition to films and stories without text, another medium to encourage the production of nouns for designating specific objects is to have participants play with objects or cards containing images of such objects. For example, participants may be instructed to describe these cards with sufficient precision so as to allow someone else to identify the correct card. By using a card deck representing similar objects, for example, a red car and a blue car, it is possible to test the production of complex nominal phrases.

Another advantage of free elicitation over corpus observations is that it is easier to control the meaning the speaker intended to produce. The observation of corpora, where the context is not controlled, leaves ample room for interpretation. On the other hand, when an object is represented on a card, the target word is clearly identified and naming mistakes can also be easily identified.

At a syntactic level, free elicitation tasks do not always enable participants to produce the structures targeted in the study. In fact, there are often several ways of freely expressing the same proposition, and the avoidance strategies we mentioned in relation to corpus data may also appear in free elicitation tasks. In order to specifically test certain complex syntactic structures, which tend to be avoided in everyday spoken language, such as subordinate clauses or passive constructions, constrained elicitation tasks seem to be better suited, as we will discuss in further detail in section 3.4.

Furthermore, free elicitation tasks do not provide representative data on the actual production frequency of certain words or syntactic structures. This is why it is preferable to use them for supplementing, rather than for replacing, the analyses of spontaneous productions in a corpus, as Evers-Vermeul and Sanders (2011) did when studying the production of subjective causal relations in young Dutch children. In the literature, subjective causal relations are often separated into two subcategories. An important distinction separates the relations involving speech acts, as in (2), and so-called epistemic relations, which involve arguments and the conclusions derived from them, as in (3):

(2) Lend me your umbrella, because I lost mine.

(3) Perhaps it will rain, since everyone has taken their umbrella.

Evers-Vermeul and Sanders (2011) wanted to find the order in which young Dutch children begin to produce these two types of causal relations. The children took part in a free elicitation task in which they had either to pick a character from among many on printed cards and then convince a doll that they had made the right choice, or to give instructions to the doll so that it placed stickers in certain places on a picture. While the first task provided a context favoring the production of epistemic relations, the second provided a context encouraging the production of relations involving a speech act (giving orders to the doll). The experiment was carried out with children in two age groups, the first of about 4 and a half years old and the second of about 6 years old. The results showed that children in both age groups were able to produce both types of causal relations. In addition, contextual manipulation biased the type of production, since the children systematically produced more epistemic relations in the argumentation task and more speech act relations in the directive task.

This experiment indicates that context plays an important role in the production of causal relations and this should therefore be taken into account. On the other hand, it does not provide enough information about the age from which children are able to produce these two types of causal relations, as even the youngest children were able to produce both types. In order to answer the question of when these productions begin, it was necessary to complete the experiment with an analysis of children’s spontaneous productions in a corpus. This analysis revealed that there is a difference in the age of the first productions, since children first produce relations involving speech acts, on average, several months before producing epistemic relations. On the other hand, in the corpus analysis, the important role of context on the type of relation produced could not be established. The elicitation task therefore provided an important additional element for understanding when and how children produce different types of causal relations. This study thus illustrates the advantages of combining corpus data and a free elicitation task.

Another great advantage of free elicitation tasks is the low level of linguistic constraint they impose, which makes them easy to implement in different languages. These tasks therefore make it possible to reveal the impact of differences in encoding between languages on the way people speak about the same events. For example, von Stutterheim and Nüse (2003) compared the way in which English and German speakers retell the events of a short, seven-minute silent film, while it is being played. The study showed that English speakers divided the action of the film into many very specific events, whereas German speakers divided the sequences into fewer events, of a more global nature. Thus, for the same visual input, the division into events seems to be done differently in the two languages. In addition, many differences were also observed in the description of the same event. While in English, verbs alone were very common (he falls, he jumps, etc.), in German, the point of arrival or the direction of the action was mentioned more often (he jumps from the cliff, he falls to the ground, etc.). This elicitation task thus made it possible to show that people speaking closely related languages, such as German and English, divide the flow of visual information according to different criteria, on the basis of varying elements in the events of their story, and that they provide different time perspectives for such events. These conclusions could be drawn thanks to the possibility offered by free elicitation of collecting natural language under similar conditions of production in different languages. Thus, free elicitation is a technique particularly suitable for collecting data at the discourse level, as participants can choose how to order their stories by themselves.

To summarize, free elicitation tasks make it possible to generate the production of infrequent elements, to test fine semantic distinctions, as well as to assess the lexical or syntactic level of productivity between different groups of participants, of varying levels, ages or languages. These tasks can either involve identical production conditions, or manipulate the production context in order to study the role of the different contexts on speakers’ productions. Free elicitation tasks are also not very repetitive and can be used on several occasions without producing a training effect or tiredness. However, their low level of control does not ensure that a specific linguistic structure or vocabulary will be produced.

3.4. Constrained elicitation tasks

The constrained elicitation tasks we describe in this section are used for quantitatively studying the ability of different groups of people to produce certain linguistic elements, while systematically controlling the different factors involved in such productions. These methods involve the use of a specific protocol so as to prevent examples being given of the structure or word in question. Unlike the repetition tasks we will present in the following section, these tasks do not provide a model of the structures to be produced but work only as incentives for producing such structures.

For example, in one of the classic experiments on lexicon acquisition, Berko (1958) showed a picture of a small invented animal to children, telling them that it was a wug. She then showed them another image in which two of these animals were drawn, saying to them: “Now there is another one. There are two of them. There are two…”. The children’s task was to complete her sentence. Most children aged between 4 and 7 years gave the right answer: wugs. In a very ingenious way, this experiment made it possible to show that young English-speaking children are already able to use the morphology of their mother tongue in a productive way, with words never encountered before.

Other elicitation experiments aim to test the mental lexicon of adults. For example, these experiments can involve measuring the latency necessary for producing different words. This method has made it possible to show that morphologically constructed words do not always take longer to produce than words with the same frequency and the same number of syllables that are not morphologically constructed (Bonin 2013). These experiments show that not all morphologically constructed words are assembled on the spur of the moment, at least when they are frequent words formed with derivational suffixes.

Many other constrained elicitation experiments relate to the field of syntax, since it is at this linguistic level that such tasks become most interesting. Some experiments present a certain grammatical form as a prompt, for example a passive sentence, in order to determine whether this form will lead speakers to use passive constructions spontaneously for describing other events, more often than they would if there had been no prompt. These experiments have shown that speakers tend to reproduce recently heard syntactic forms, whether in their mother tongue or in another language. For example, Kim and McDonough (2008) asked university students with Korean as their mother tongue and different levels of English proficiency to interact in English with an English speaker. During these interactions, participants had to describe a series of cards. The English-speaking person’s cards contained 20 sentences with passive verbs and 20 sentences with active verbs. The participants’ cards contained only verbs, half of which were the same verbs as those appearing on the cards of the English-speaking participant, while the other half were different. The results indicated that learners produce more passive sentences when using a verb that has just been used in the passive voice, confirming that they are sensitive to this priming effect, as native speakers are (Branigan et al. 1995).
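
The logic of such a priming comparison can be sketched in a few lines of Python. The example below is only an illustration under our own assumptions: the condition labels (same_verb, different_verb), the function name and the toy data are invented for the purpose of the example and do not reproduce the data or the analysis of Kim and McDonough (2008). It simply computes, for each condition, the proportion of descriptions produced in the passive voice.

```python
from collections import defaultdict


def passive_rate_by_condition(trials):
    """Return the proportion of passive responses for each condition.

    `trials` is an iterable of (condition, response_voice) pairs.
    The labels used here are hypothetical, chosen for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [passives, total]
    for condition, voice in trials:
        counts[condition][1] += 1
        if voice == "passive":
            counts[condition][0] += 1
    return {cond: passives / total for cond, (passives, total) in counts.items()}


# Invented toy data, for illustration only (not results from any study).
toy_trials = [
    ("same_verb", "passive"),
    ("same_verb", "passive"),
    ("same_verb", "active"),
    ("same_verb", "active"),
    ("different_verb", "passive"),
    ("different_verb", "active"),
    ("different_verb", "active"),
    ("different_verb", "active"),
]

rates = passive_rate_by_condition(toy_trials)
print(rates)
```

An actual study would rely on many more trials per participant and on an inferential statistical model rather than on raw proportions; the sketch only shows how the dependent variable is derived from the coded productions.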

Constrained elicitation experiments are also used for testing the development of many complex syntactic structures (see McDaniel et al. (1998) for a review). For example, it is possible to test children’s ability to produce certain structures by asking them to transform a sentence into a question. In this type of experiment, however, it is important for the context to be plausible and to help children understand the need to produce the expected form. For example, in some experimental paradigms, children are asked to act as an intermediary between a person and a doll who cannot hear properly. The experimenter stands in a different place in the room, in relation to the child and the doll and asks the child to help her talk with the doll, by asking her questions. An example of a question elicitation task could be as follows:

(4) I don’t know whether she likes eating French fries. Ask her.

In order to elicit the production of negative sentences, one possibility would be to ask the children to say the opposite of what the experimenter said. Whatever the format chosen, we should ensure that the instructions provided encourage the participants to produce the expected form. This can be achieved by means of a pretest with older children or with adults. Indeed, it is possible that participants, children and adults, may prefer an alternative strategy. For example, when participants have to say the opposite of sentence (5), the answer given could imply a lexical opposite (6), rather than the expected negative sentence (7):

(5) Julian is kind.

(6) Julian is mean.

(7) Julian is not kind.

In this case, it would be wrong to conclude that children are not capable of producing negations. This problem is all the more significant since, in an elicitation task, it is absolutely necessary to avoid providing a model of the expected answer, unlike in repetition tasks, where a model is given. It is therefore not possible to provide a first example of the expected negative form. To avoid this type of problem, one solution would be to choose adjectives that have no salient lexical opposite, such as gifted, or qualifiers having no antonyms, such as American.

In addition to their usefulness for testing the mastery of syntax by children or by non-native speakers, elicitation tasks can also be used for studying discursive and sociolinguistic phenomena in native speakers. For example, Kehler and Rohde (2013) tested the links between the type of coherence relation uniting sentences, such as causality, goal or temporality, and the type of referential expression chosen for designating a discourse referent. To do this, participants had to write a continuation after prompts such as (8) and (9):

(8) Luke lent Peter a book. He _____________.

(9) Luke lent Peter a book. _______________.

This experiment enabled them to observe that the presence of a pronoun influenced the choice of coherence relation used for continuing discourse. In fact, when a pronoun was present, the majority of participants chose to continue discourse with a causal relation (“he wanted to read it”) or an elaboration relation (“he often liked the same books as him”). On the other hand, when the pronoun was not imposed, as in (9), and the participants chose to use a full noun phrase (Luke or Peter), the sentence they produced involved a different discourse relation, implying either a result (“Peter loved it”) or a goal (“Peter used it for preparing a presentation”). This elicitation task made it possible to bring to light the constraints which associate the different aspects of textual coherence (referential expressions and coherence relations).

In the field of sociolinguistics, constrained elicitation experiments are useful for determining the dissemination of linguistic traits. This method was used, in particular, by Avanzi et al. (2016) for mapping the dissemination areas of lexical and grammatical regionalisms of French spoken in Europe. Thanks to the use of an online questionnaire, data from more than 10,000 French-speaking Europeans were collected. This questionnaire contained a task that represented a form of constrained elicitation. Indeed, for every word tested, the participants read a definition of the word (10) or contextual information (11) associated with an image:

(10) What do you call this object, on which clothes are dried?

(11) In winter, in order to keep our feet warm, we put on our _________?

In the case of words for which several regional variants were documented in dictionaries, participants were asked to choose from a word list. If none of the suggested words suited them, they could check an “other” box and enter their own word. For other words with a supposedly more general distribution, participants had to indicate how often they used them on a scale from 0 (never) to 10 (very often). In the case of syntactic expressions, they had to choose, from a closed list, the expression they would use most spontaneously in the situation described. This method of eliciting production from a closed list of possibilities made it possible to show that certain regionalisms listed in dictionaries are used less and less, whereas certain words presented as regionalisms in dictionaries are so widely disseminated that the label of regionalism is no longer appropriate.
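
As an illustration, the tallying step that underlies this kind of mapping might look like the following Python sketch. It is a simplified example under our own assumptions, not the procedure used by Avanzi et al. (2016): the region and variant labels, the function name and the toy responses are all invented. For each region, the code computes the share of respondents who selected each variant from the closed list.

```python
from collections import Counter, defaultdict


def variant_shares_by_region(responses):
    """Return, for each region, the share of answers given to each variant.

    `responses` is an iterable of (region, chosen_variant) pairs.
    Region and variant labels are hypothetical placeholders.
    """
    tallies = defaultdict(Counter)
    for region, variant in responses:
        tallies[region][variant] += 1

    shares = {}
    for region, counter in tallies.items():
        total = sum(counter.values())
        shares[region] = {variant: n / total for variant, n in counter.items()}
    return shares


# Invented toy responses, for illustration only.
toy_responses = [
    ("region_A", "variant_1"),
    ("region_A", "variant_1"),
    ("region_A", "variant_1"),
    ("region_A", "other"),
    ("region_B", "variant_2"),
    ("region_B", "variant_2"),
]

shares = variant_shares_by_region(toy_responses)
print(shares)
```

Shares of this kind, computed per locality, are the sort of quantity that can then be projected onto a map to visualize the area in which a variant is used.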

In sum, in this section, we have seen that the constrained elicitation method makes it possible to obtain targeted linguistic productions, in order to answer research questions in fields as varied as lexicon, syntax, discourse and sociolinguistics. The main advantage of this method is that the experimental context attached to it ensures a sufficient number of linguistic productions for carrying out a quantitative analysis of data. Furthermore, this method makes it possible to manipulate independent variables and to avoid interference of confounding variables, which sometimes obscure corpus data. However, we should be careful to make sure that the material used for these tasks does not include other factors of complexity, apart from the one being tested (excessively long sentences, rare words, etc.). Its main drawback, which applies to all experimental methods, is the unnatural nature of production contexts, which do not always reflect what people do spontaneously. In the context of sociolinguistic research in particular, the participants might be tempted to answer by following conventions rather than in relation to their actual practices, which often escape consciousness. Furthermore, constrained elicitation requires a certain level of linguistic competence, making this method inapplicable to children younger than 3 years (Eisenbeiss 2010). In general, children and learners obtain lower linguistic development scores in constrained elicitation tests compared to spontaneous production data. It is therefore necessary to compare different contexts of production, as much as possible, by combining different methods in order that the analyses accurately reflect the real linguistic competence of speakers.

3.5. Repetition tasks

To conclude the presentation of production tasks, in this section, we introduce the method displaying the highest degree of linguistic constraint on production: the repetition task. As its name suggests, the repetition task involves asking participants to repeat either a word or a sentence after it has been presented. These tasks are based on the observation that linguistic repetition is not a simple imitation task, but requires the ability to process the stimulus. In the case of sentences, numerous studies in the field of language acquisition have shown that it is not possible for young children to properly repeat sentences which are not yet part of their grammatical system (e.g. Bloom et al. 1974), that is, sentences that they would not be able to produce by themselves. This inability is illustrated in an amusing way in the following dialogue between a father and his child, retold by Pinker (1994, p. 281):

Child: want other one spoon, Daddy.

Father: you mean you want the other spoon?

Child: yes, want other one spoon, please, Daddy.

Father: can you say “the other spoon”?

Child: other... one... spoon.

Father: say “other”.

Child: other.

Father: “spoon”.

Child: spoon.

Father: “other... spoon.”

Child: other... spoon. Now give me other one spoon.

Sentence repetition tasks make it possible to accurately test which elements are still problematic for children. For example, it is possible to test the role of children’s semantic and syntactic representations in their ability to repeat relative clauses, by modifying the lexical head of the relative clause (McDaniel et al. 1998, p. 57). In example (12), the lexical head has a precise semantic meaning, whereas this is not the case in (13). The role of syntax can be tested by alternating a sentence with a lexical head, as in (13), and a sentence without one, as in (14).

(12) Max bought the toy Paul chose.

(13) Max bought the thing Paul chose.

(14) Max bought what Paul chose.

A comparison of children’s repetition abilities makes it possible to determine whether it is the semantic or syntactic factors that appear to cause problems for young speakers, while in the phase of acquiring relative clauses.

Repetition tasks can be used for testing many aspects of syntax, such as constituency structure, as in the example above, as well as constraints associated with word order. For example, Lust and Wakayama (1989, cited by McDaniel et al. 1998) used this method with Japanese children to test the repetition of sentences with an unmarked SOV order in Japanese (15) and a right-dislocated order (16). Most mistakes made by these children, in repeating sentences with right dislocation, corresponded to an attempt to restore the canonical order of words. This experiment shows that young children already integrate the constraints of syntactic linearity of their mother tongue:

(15) Rion-to tora-ga hashiru (“Lion(s) (and) tiger(s) run”).

(16) Hashiru-yo usagi-to kame-ga (“Run, rabbit (and) tortoise”).

In some cases, it is also possible to provoke repetition of incorrect sentences, in order to determine whether children and learners are already sensitive to certain aspects of lexicon and syntax. Children and adults tend to correct mistakes when repeating a sentence. This paradigm can be used for testing irregular inflected forms (“you goed” instead of “you went”) as well as agreement mistakes (“two big horse” instead of “two big horses”). This type of paradigm has also made it possible to show that children distinguish words that are mistakenly repeated due to a fluency problem, as in (17), from words having an intentional repetition, as in (18):

(17) He is, he is very kind.

(18) He is very, very kind.

In addition to syntactic structures, repetition tasks can focus on words. An example of a widely used paradigm is the repetition of non-words, that is, words which could exist according to the morphophonological rules of a language, but which do not exist in its lexicon, such as degate or galpin in English. This task tests the ability of people to process the phonological component of words. It is often used in research on language impairments, since the inability to repeat non-words, which reflects a deficit in phonological working memory, is one of the typical linguistic markers of specific language impairment (e.g. Coady and Evans 2008).

In other experiments, children listen to sentences that are interrupted at regular intervals and are asked to repeat the last word they heard. This type of paradigm makes it possible to determine which linguistic elements young children consider to be words, without having to resort to a metalinguistic task requiring explicit reasoning about language. Through this type of task, Karmiloff-Smith and her team (Karmiloff and Karmiloff-Smith 2003) showed that by the age of 4 years, the majority of children already treat both content words and function words as words, and that this is true of almost all children by the age of 5 years. Yet, at this age, children are unable to answer explicit questions about what a word is.

We should also point out that certain word or sentence reading tasks may resemble a form of repetition of written prompts. However, reading tasks are mainly used for testing elements related to the linguistic signals produced (such as phonology or speech prosody), rather than as a way of indirectly studying the linguistic representations of speakers. These tasks are used, in particular, to accurately determine the different pronunciations of a phoneme depending on the speaker’s geographic region. This method has the advantage of making it possible to control the effects of the phonological environment on pronunciation, for example, by testing oppositions between phonemes placed at the beginning of a word, between open or closed syllables, etc. Schwab and Avanzi (2015), for example, sought to determine whether French speakers from French-speaking Switzerland and Belgium had a slower articulation speed than French speakers from France. Speech excerpts from two speech contexts were compared. The first context was a reading task, whereas the second one was a conversation excerpt. Results showed that articulation speed varied significantly from region to region (people living in French-speaking Switzerland tend to have a slower articulation speed), and that the speech context played an important role, since articulation speed was faster in conversation than in reading. This study thus provides an additional illustration of the need for combining data from elicitation tasks with natural data.

In summary, repetition tasks can be used as tools to assess the linguistic representations of speakers. They are valid at the lexical and syntactic levels. Due to the limitations of working memory, it is difficult to use this method for testing elements beyond the sentence level. This type of task also has the advantage of being applicable to children between 1 and 2 years old (McDaniel et al. 1998), since children develop imitation abilities from a very early age. On the other hand, studies have shown that learners do not correctly imitate sentences that they are able to spontaneously produce correctly (e.g. Bernstein Ratner 2000). The main difficulty in applying a repetition task is finding the right level to test the skills of a certain group of speakers. If the sentences are too simple, all of them will be repeated correctly. If the sentences are too complex, the task will no longer make it possible to draw a distinction between the different structures or words tested. As with any experimental task, use of a repetition task also requires rigorous control of the experimental material. For example, it is necessary to ensure that the different sentences are equivalent in terms of the number of syllables they contain, word frequency, etc. Finally, we should bear in mind that when participants do not repeat a sentence correctly, it is not always easy to explain such mistakes. Indeed, repetition mistakes do not necessarily imply a lack of competence, but can sometimes reflect a limitation in the ability to process information, which may lead to replacing a certain structure by a simpler one. In this case, the incorrect repetition would reflect a problem of performance more than of linguistic competence. Again, to limit this bias, it is necessary to diversify the research strategies, in order to benefit from the advantages of each of them.
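
The control of experimental material mentioned above can be partly automated. The following Python sketch shows one possible way of checking that the items of different conditions are matched on a given measure. It rests on several assumptions of our own: the item records, the measure names (syllables, log_frequency), the tolerance value and the function names are hypothetical, and the syllable counts and frequencies are supposed to have been obtained beforehand by the researcher, for instance from a lexical database.

```python
from statistics import mean


def condition_means(items, measure):
    """Return the mean value of `measure` for each experimental condition."""
    by_condition = {}
    for item in items:
        by_condition.setdefault(item["condition"], []).append(item[measure])
    return {cond: mean(values) for cond, values in by_condition.items()}


def is_matched(items, measure, tolerance):
    """True if condition means on `measure` differ by at most `tolerance`."""
    means = condition_means(items, measure)
    return max(means.values()) - min(means.values()) <= tolerance


# Invented item records, for illustration only.
toy_items = [
    {"condition": "A", "syllables": 6, "log_frequency": 3.0},
    {"condition": "A", "syllables": 8, "log_frequency": 5.0},
    {"condition": "B", "syllables": 7, "log_frequency": 1.0},
    {"condition": "B", "syllables": 7, "log_frequency": 2.0},
]

print(is_matched(toy_items, "syllables", tolerance=0.5))      # length check
print(is_matched(toy_items, "log_frequency", tolerance=0.5))  # frequency check
```

Comparing means against a fixed tolerance is a deliberately crude criterion. In practice, researchers may prefer to match items pairwise or to test the equivalence of conditions statistically.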

3.6. Conclusion

In this chapter, we started by introducing the differences between language comprehension and language production, and argued that these two components of linguistic ability should be investigated in parallel. We saw that language production skills are often more limited than comprehension skills in all groups of speakers, but that the reverse asymmetry also exists. We then focused on the important methodological difference between the observation of production in a corpus and experimentally elicited production. We saw that corpus data have the advantage of being natural and of potentially comprising very large samples of language, but that the data resulting from elicitation tasks are more suitable for studying rarer linguistic phenomena, syntactic differences or subtle differences in meaning.

Among the various tasks that can be used to experimentally provoke linguistic productions, we presented free elicitation, constrained elicitation and repetition tasks. We established that free elicitation tasks can be used in addition to corpus data to increase the number of occurrences of a certain linguistic phenomenon, as well as for testing the role context plays in production. This method is particularly suitable for testing discursive phenomena. Constrained elicitation makes it possible to test the ability to produce precise words or syntactic structures in a quantitative manner, and within a controlled context. Finally, repetition can be applied to both words and sentences, making it possible to indirectly assess the way in which speakers process and understand these elements.

3.7. Revision questions and answer key

3.7.1. Questions

  1) List three arguments that justify the need to study language production and comprehension separately.
  2) What are the main advantages of experimenting with language productions rather than observing them in a corpus?
  3) What would be the most appropriate experimental method for eliciting productions in the following research questions?
    a) What are the syntactic and semantic constraints at work in the acquisition of relative clauses?
    b) Are learners able to formulate indirect speech acts in a foreign language?
    c) Are adults able to produce different types of relative clauses?
  4) What are the common points between free elicitation and the observation of productions in a corpus?
  5) List and explain three methodological constraints related to the development of a constrained elicitation task.
  6) What are the main advantages and disadvantages of a repetition task, compared to a constrained elicitation task?

3.7.2. Answer key

  1) A first argument showing that these two components of the language faculty are dissociated and should be studied separately is that children and learners do not develop linguistic competences at the same rate in comprehension and in production. A second argument is that language impairments may affect the competences of speakers in one area while preserving the other, as in Broca’s aphasia, which essentially affects language production. Finally, a third argument comes from corpus studies and experiments showing that the size of the mental lexicon for language production and language comprehension is very different.
  2) One of the great advantages of elicitation tasks is that they make it possible to control the context in which language productions take place. As we saw in the chapter, context often has a great influence on the quantity and quality of the language produced by participants. Furthermore, elicitation tasks make it possible to collect a large number of occurrences of rare linguistic phenomena by leading people to produce the targeted elements. These phenomena cannot be analyzed on the basis of corpus data, due to the small number of occurrences found there. Finally, elicitation tasks give researchers a better grasp of what the participants intend to communicate when they produce certain words or sentences. Indeed, in these tasks, participants must produce words or sentences to describe an image or a video. It is thus possible to check that the words or sentences are used in an appropriate way. On the other hand, in a corpus, it is not always possible to determine what a speaker intended to communicate.
  3) a) For this study, a repetition task seems the most appropriate, since it would make it possible to accurately test whether very fine syntactic or semantic variations in stimuli have an impact on the way in which the participants reproduce them.
    b) This study would require the use of a free elicitation task in order to give participants enough freedom to use various structures for expressing requests. Indeed, a constrained elicitation task would bias the results, by pushing the participants to use certain formulations, which might not correspond to the way in which the requests are spontaneously produced.
    c) This study could be carried out with a constrained elicitation methodology, for example, by presenting the beginning of a sentence up to the relative pronoun (“I like the man who...” or “I like the car that...”), and then asking the participants to finish the sentence. These different prompts would make it possible to compare the way in which the participants complete relative clauses with a subject pronoun (who) and with an object pronoun (which/that). Another study, in which the relative pronoun is not included, would make it possible to determine whether the participants prefer to complete a sentence with a subject or an object relative clause.
  4) Like corpus data, free elicitation tasks have the advantage of providing relatively natural outputs, since the interference of researchers remains very low. This interference consists only of placing the participants in a certain linguistic situation. This is why these two methods are well suited to the study of spontaneous speech but more limited for eliciting repeated productions of a specific element. Rare elements often cannot be studied quantitatively using these methods.
  5) First, constrained elicitation tasks involve the need to find a context in which the targeted production is mandatory. For example, to trigger the production of a relative clause, it is not enough to start a sentence with a noun phrase and ask the participants to complete it (e.g. “the little girl...”), since other options, simpler than the relative clause, are possible and will probably be chosen (e.g. “the little girl with red hair” rather than “the little girl who has red hair”). It is also essential to check that the targeted productions are those which are produced spontaneously, by carrying out a pretest with a population other than the one that will be tested, for example, native speakers in the case of studies on learners, or adults in the case of studies on children. Second, these tasks involve checking that all the elements of the prompts have a suitable level of difficulty. This level must also be constant between the different experimental items. For example, the frequency of the words used and the length of the sentences should enable participants to understand the task and what is expected of them. Third, for these tasks to be valid, it is advisable to avoid modeling the expected answer. It is therefore not possible to give examples of the expected structure, which limits the use of this task with some populations, such as young children.
  6) The advantages are that repetition tasks make it possible to test younger children and learners at a less advanced level than constrained elicitation tasks do, because repetition tasks are simpler. In addition, they make it possible to test very fine-grained factors, such as the semantic and syntactic differences in sentences, or the alternation between phonemes. Their disadvantages are that item difficulty must be very well calibrated. If the items are too simple, the participants will easily reach a maximum score, and if they are too complex, the participants will fail for the wrong reasons. Furthermore, mistakes made during repetition tasks are not always easy to interpret. Finally, these tasks are not at all natural and do not provide information on what the participants would spontaneously produce.

3.8. Further reading

The different types of elicitation tasks are presented in detail in the book by Gass and Mackey (2007), which places them in the context of research in language acquisition and learning. Menn and Bernstein Ratner (2000) also provide very complete references on the different methods for studying the linguistic productions of different populations. Eisenbeiss (2010) provides a more concise introduction to the analysis of elicited and spontaneous productions and clearly presents their advantages and limitations. In the field of language acquisition, the work of McDaniel et al. (1998) contains several chapters dedicated to methods for testing the syntactic productions of children. These methods are presented in a very concrete way with lots of methodological advice. In the field of sociolinguistics, Schilling (2013) discusses the methodological aspects related to the creation and analysis of surveys.
