6
Practical Aspects for Designing an Experiment

In this chapter, we take you step by step through the different practical aspects of designing an experiment, as well as the resources needed for every stage. We first see how to look for scientific sources and access the bibliographic resources required for developing the research question. We then return to the conceptualization and formulation of the research question and the operational hypotheses. The different stages involved in building the experiment itself are then described one after the other. We first address the choice of experimental design and the constraints linked to the different types of design, before discussing a key aspect of experiments in linguistics: the linguistic items used in the experiment. We then describe the different stages which mark out the typical course of an experiment, and discuss the ethical principles that have to be respected while conducting experiments on human participants.

6.1. Searching scientific literature and getting access to bibliographic resources

The first crucial step in the implementation of experimental research is the definition of the research question. It is from this definition that all the subsequent stages will ensue: the choice of method, the observed indicators, the experimental design, the linguistic material employed and finally statistical analyses. This is why it is necessary to devote time and reflection to it, in order to arrive at a well formulated and clearly delimited research hypothesis, which will in turn lead to a well-designed experiment. Many problems may arise from an incomplete or ungrounded research hypothesis, and these can be avoided by careful work prior to the implementation of the experiment itself.

The first steps in research are often guided by a general problem which is somehow related to the researcher’s personal interest, for example, the perception of different accents, language acquisition in bilingual children or the connection between language and thought. Sometimes, it is also possible to formulate a specific question intuitively, based on prior knowledge or daily observations. In these different cases, before embarking on experimental research, it is necessary to perform a thorough review of existing literature, going through the studies and the accumulated knowledge in relation to the topic of interest. For researchers at the beginning of their careers, the literature review phase should make it possible to get a general idea of a specific research domain, in order to narrow the research topic down to a specific hypothesis which can be investigated experimentally.

In order to carry out a review of the scientific literature, different types of sources can be consulted. These can be monographs, that is, scientific works produced by a single researcher, textbooks, such as this one, or collaborative works, in which every chapter has been written by an expert or a group of experts on the subject. Put together, these sources provide an overview of the research problem. Scientific articles are the sources that generally describe specific experimental research in the most detail, since book chapters and monographs tend to focus on offering a general overview of a field. This is why scientific articles represent eminently useful sources for preparing an experimental study.

Nevertheless, it is advisable to start with the general works and articles, which present the different aspects of a topic and summarize the knowledge acquired so far, in order to clearly define the particular aspect that will be investigated. Thereafter, once the aspect that will be examined is set, the literature review may enter a new phase, in which we go through scientific articles more specifically related to the chosen aspect.

Relevant scientific literature can be identified through specific search engines, such as Google Scholar1, or bibliographic databases such as Web of Science2, PsycINFO3, JSTOR4, Linguistics and Language Behavior Abstracts/LLBA5, or Bibliography of Linguistic Literature Database/BLLDB6.

The first queries, performed using general keywords, often result in a considerable number of scientific articles likely to be relevant. Let us take the example of the tip-of-the-tongue (TOT) phenomenon. When typing tip-of-the-tongue into a search engine like Google Scholar, more than 560,000 entries are retrieved. It is therefore essential to quickly determine other keywords, making it possible to narrow the query to a more specific problem. For example, we could add the keyword bilingualism in order to limit our query to TOT phenomena among bilingual speakers. Despite this addition, the results are still numerous: around 20,000 in this case. It is possible to further restrict the results by carrying out an advanced search, in which various fields can be selected, such as the general subject, title, language, date of publication or even the type of publication, which can also be combined with the Boolean operators AND, OR or NOT (if we wish to exclude some keywords).

It is very useful to quickly get acquainted with the different search engines, in order to use their features effectively. Most of these allow you to look for a specific expression by enclosing it in quotation marks. In this case, rather than retrieving all the sources that contain the individual keywords of the expression, we only obtain those sources where the expression itself appears. For example, for the expression “Tip-of-the-tongue”, the query returns all the sources containing this exact expression, but leaves out the sources in which the words tip or tongue only appear in isolation. A query carried out on the titles containing the keywords “tip-of-the-tongue” AND “bilingual” in Web of Science only yields five entries and thus makes it possible to target only the most relevant sources for the subject.
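The difference between an exact-phrase query and a query on isolated keywords can be illustrated with a minimal Python sketch. It does not reproduce the behavior of any real search engine: the function `matches` and the three titles are hypothetical and were invented purely for illustration.

```python
def matches(title, all_of=(), none_of=()):
    """Return True if the title contains every phrase in `all_of` (AND)
    and none of the phrases in `none_of` (NOT). Matching ignores case."""
    text = title.lower()
    has_all = all(phrase.lower() in text for phrase in all_of)
    has_excluded = any(phrase.lower() in text for phrase in none_of)
    return has_all and not has_excluded

# Hypothetical titles, invented for illustration only.
titles = [
    "Tip-of-the-tongue states in bilingual speakers",
    "The tip of the iceberg: tongue movements in speech",
    "Tip-of-the-tongue experiences in older adults",
]

# Equivalent to the title query: "tip-of-the-tongue" AND "bilingual"
hits = [t for t in titles if matches(t, all_of=["tip-of-the-tongue", "bilingual"])]
```

Only the first title is retained: the second contains the words tip and tongue in isolation but not the full expression, and the third lacks the second keyword.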

Access to bibliographic databases and to scientific articles is often restricted to people or institutions with a paid subscription. In order to access them, it is necessary to identify ourselves as a member of a university benefiting from the required subscription. When we do not have this type of affiliation, different solutions exist to still gain access to a source. The preliminary keyword query can be performed on Google Scholar, which shows links to versions of the documents that can be accessed without a subscription, where these exist. An alternative solution is to turn to Unpaywall7, a project linking the original publications to their open access versions where these exist. By October 2019, Unpaywall had a database of more than 24 million freely accessible scientific articles, which could be browsed either through the database search or by adding an extension to the Firefox or Chrome browsers. If, despite this, the sources are still not accessible, another possibility is to consult the personal web pages of the authors, or their ResearchGate8 or Academia9 profiles, on which the articles are sometimes made available. Finally, it is also possible to contact the authors directly to request a copy of their publication. ResearchGate also allows you to request private versions of the articles via the website’s interface.

6.2. Conceptualizing and formulating the research hypothesis

A helpful literature review should not only lead to an overall vision of the problem studied, but also offer a good understanding of the methods used when investigating it. Particular attention should be devoted to the dependent and independent variables observed, as well as to the manner in which these variables have been operationalized.

Interestingly, screening the existing literature in a certain field can either simplify the problem considered or, on the contrary, make it more complex. Depending on the case, the information acquired during the documentation phase can easily be brought together in order to build a precise research problem, or can instead lead to a complete reformulation of the initial hypothesis. Generally, different sources offer various insights into the same problem, and it quickly becomes necessary to narrow the complexity inherent in a research field down to a specific aspect which seems to be preeminent and which can be studied in an experiment. It is indeed impossible to exhaustively study a whole subject, even a specific one, in a single study. However, the accumulation of studies, each focusing on a specific facet of the problem, makes it possible to construct a comprehensive view of the subject.

Once a specific problem has been identified on the basis of the literature, different scenarios may arise. First of all, it is possible that the literature review gives rise to a new idea, which has not yet been studied by empirical research. It is also possible that different studies which resorted to multiple methods ended up revealing conflicting results. In this case, the research question may aim to understand the cause of these conflicting results, for example, by suggesting studying it by means of a new method. It could also be that an explanatory variable has not yet been examined, and gives rise to a research question based on this new variable. Finally, the validity of existing research can be called into question by new knowledge. This could lead to an attempt to replicate known results in order to verify their quality.

Whatever the initial situation and the reasons for research, the next step will consist of formulating the research question, as well as the hypothesis based on the literature review. We have already described the notion of research hypothesis in Chapter 1. We stressed the fact that it has to be empirically testable. In other words, every research hypothesis has to propose a directional relationship between at least one independent variable and one dependent variable. It also has to operationalize these variables, by specifying the indicators used for measuring them.

The operationalization of the variables aims to ensure the validity and the reliability of the experiment, two concepts presented and discussed in detail in Chapter 2. At the operationalization stage, we have to maximize the chances that the chosen dependent variable will actually measure the process we want to observe and reveal the connections between independent and dependent variables. At this stage, one way to do so is to rely on measurements used in previous studies. However, there are cases where the new study seeks to call into question the results found in the literature. In this type of situation, the use of a different type of measurement is evidently necessary. The choice of the new type of measurement should nonetheless be based on accumulated knowledge from existing literature. The first possibility would be to turn to a type of measurement whose effectiveness has been proven for testing phenomena similar to the one that will be examined. For instance, if the study aims to criticize the use of acceptability judgments for evaluating the way in which different speech acts (e.g. requests, promises) are acquired, we could replace this type of measurement by a more implicit one. In order to assess the acquisition of pragmatic skills in children, we could resort to action-based tasks, which do not pose the same constraints as acceptability judgments, but whose effectiveness has already been demonstrated (Pouscoulous et al. 2007). A second possibility would be to choose a type of measurement that has not yet been applied to the phenomenon studied, but which seems appropriate in virtue of the processes it aims to shed light on.

Let us now turn more specifically to the definition of independent variables. As a reminder, independent variables correspond to the causes that will be manipulated in an experiment, in order to observe their effects on the dependent variable(s). For these variables, it is not only the type of measurement, but also the different conditions – or modalities – that need to be defined. As we have seen in Chapter 2, these conditions have to differ by the presence and the absence of the independent variable. They also have to make it possible to maximize the probability of observing the expected effect. For example, to test the hypothesis that frequent words are processed more quickly than less frequent words in a lexical decision task (and imagining that this effect has not already been widely demonstrated in the literature!), it might be necessary to build groups of words differing widely in terms of frequency. Then, if the effect is confirmed using these groups of words, it could be a good idea to refine the study by reducing the frequency difference between the groups of words, in order to offer a more accurate vision of the frequency effect.

Once the operational variables and the modalities of the independent variables have been defined, the hypothesis can be formulated clearly and precisely. At this point, it is appropriate to think about the different external variables, not examined directly in the experiment, but which could influence the independent and the dependent variables (see section 2.8). These external variables may be related to participants or to items. In order to illustrate the research hypothesis and to identify the external variables that need to be controlled, it may be useful to draw a diagram showing the dependent and independent variables, the way they are operationalized and the potential external variables (see Figure 6.1).

Figure 6.1. Diagram of the operational hypothesis and examples of external variables which could influence the dependent variable. The variables above the dependent variable are related to the experiment items, and the ones below to the participants10

On the basis of a list of external variables, it is then possible to determine those that will be considered as confounding and those that could vary randomly. As a reminder, a confounding variable is a variable whose modalities vary systematically with those of the independent variable. These variables must necessarily be controlled in order to ensure the internal validity of the experiment. If we go back to the example above, word frequency is most likely related to word length (long words are less frequent than short words), and word length is likely to influence the lexical decision time. This is why word length is probably a confounding variable. In order to control it, we have two possibilities. We can either present only same-length words throughout the experiment or balance the conditions in terms of word length. In the latter case, it would be necessary to include, in each frequency condition, the same number of words of each length (e.g. 10 two-syllable words, 10 three-syllable words and 10 four-syllable words). Likewise, other lexical characteristics could be related to frequency and may have an impact on the response time. Among other things, we could consider the word’s part of speech, or its number of phonological and orthographic neighbors. Other external variables, this time concerning the participants, could also come into play, such as their vocabulary size or their decision-making speed.
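Balancing the conditions in terms of word length can be verified before running the experiment. The following Python sketch is one possible way of doing so, under stated assumptions: the eight words, their frequency labels and the helper `length_profile` are hypothetical and were chosen for illustration, not taken from a lexical database.

```python
from collections import Counter

# Hypothetical items: (word, frequency condition, number of syllables).
# The frequency labels are illustrative assumptions, not corpus values.
items = [
    ("table", "high", 2), ("water", "high", 2),
    ("family", "high", 3), ("animal", "high", 3),
    ("goblet", "low", 2), ("trellis", "low", 2),
    ("abacus", "low", 3), ("obelisk", "low", 3),
]

def length_profile(items, condition):
    """Count how many words of each syllable length a condition contains."""
    return Counter(n_syll for _, cond, n_syll in items if cond == condition)

# The conditions are balanced if both contain the same number of words
# of each length.
balanced = length_profile(items, "high") == length_profile(items, "low")
```

The same check could be repeated for any other lexical characteristic treated as a confounding variable, such as part of speech.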

In the same way as the variables investigated in the experiment, confounding variables are sometimes identified on the basis of intuition, but most of the time by consulting the literature. Such information can be found not only in the general results of the studies, but also in the Method section of scientific articles, where an exhaustive description of the material and of the control checks carried out on the variables is provided. In addition, it can often be useful to discuss research hypotheses, their operationalization and the variables to control with other people, whether or not they are experts in the field, in order to detect potential problems before implementing the experiment.

6.3. Choosing the experimental design

Once the research hypothesis has been formulated, the experimental design can be defined. Experimental designs can be categorized following different criteria: the number of independent or dependent variables and the assignment of participants to the different conditions of the independent variable. The number of dependent variables mainly stems from the method. Offline methods generally result in only one dependent variable, such as the score on an acceptability scale, or the number of correct answers in a recall task. On the other hand, in addition to the response itself, online methods measure the response time, thus introducing two dependent variables. These dependent variables can then be treated independently from one another, or one of the two variables can be considered as independent in the analyses. For example, it would be possible to create two conditions, one for YES replies and one for NO replies, and to analyze the differences in reaction times between these conditions. We will not discuss this possibility in further detail, since the points we will develop for univariate designs (having only one dependent variable) can be easily transferred to multivariate designs, as long as each dependent variable is analyzed separately.

6.3.1. One independent variable

When a single independent variable is examined, there are two main types of experimental design, depending on whether the participants take part in one or in all of the conditions. In a between-subject design (or independent measures), every person takes part in only one condition, whereas in a within-subject design (repeated-measures design), every participant takes part in all the conditions.

The choice of the experimental design depends first on the type of independent variable examined. When the independent variable cannot be manipulated by researchers and corresponds to an inherent characteristic of the individual, such as mother tongue or intellectual abilities, we can only use a between-subject design. A person’s characteristic can correspond to only one of the modalities of the variable. When the independent variable can be manipulated by the researcher, it is possible to implement either a between-subject or a within-subject design.

Different factors come into play when choosing a between-subject or a within-subject design. The first important factor is the control of external variables. In a between-subject design, where the conditions do not contain the same individuals, it is necessary to ensure that the characteristics of the individuals that may influence the dependent variable are distributed evenly between the conditions. Imagine an experiment comparing sentence comprehension in a spoken or a written modality. An external variable could be the intellectual level of the participants, measured through their IQ score. If the participants in the spoken condition have a higher IQ than those in the written condition, then the results of the experiment could be influenced by the IQ level in addition to the modality of presentation.

In order to control external variables, one solution could be to assign participants to the conditions randomly. For an assignment to qualify as random, every person should have the same chance of being assigned to each condition and that assignment should not depend on the characteristics of other participants. In this case, the only assignment criterion should be chance, so that every time a person shows up, a coin is tossed to decide which condition they should be included in. In our example, a person with a high IQ would have the same chance as a person with a lower IQ of belonging to each condition. In real life, such a random assignment is difficult to implement, since it could lead to a highly unequal number of participants between conditions. It is therefore preferable to use block randomization, in which all the conditions appear once within a block, before moving on to the following block. Within each block, the order of the conditions is random. For example, for a design with three conditions, the first three participants could be assigned to conditions 1, then 3, then 2, the next three to conditions 2, 3, then 1, and so on. We cannot rule out the possibility that, even with such randomization, the participants of the different groups may systematically differ on some points. However, this method is effective, particularly for large samples.
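Block randomization as described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `block_randomize` and its parameters are assumptions introduced here, and the `seed` argument is only included so that a schedule can be reproduced.

```python
import random

def block_randomize(n_participants, conditions, seed=None):
    """Assign participants to conditions in successive blocks.

    Each block contains every condition exactly once, in random order,
    so group sizes never differ by more than one participant.
    """
    rng = random.Random(seed)
    assignment = []
    while len(assignment) < n_participants:
        block = list(conditions)
        rng.shuffle(block)        # random order within the block
        assignment.extend(block)
    return assignment[:n_participants]

# Nine participants, three conditions: three complete blocks.
schedule = block_randomize(9, [1, 2, 3], seed=42)
```

Participants are then assigned to the conditions in their order of arrival, following `schedule`. With a simple coin toss for each participant, by contrast, nothing would prevent one condition from receiving far more participants than another.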

Another way of controlling external variables, in the case of a between-subject design, would be to assign similar individuals to each condition. In this case, the relevant characteristics have to be defined beforehand, so that the groups of participants are equivalent. Going back to our example, it would be desirable for the two groups of participants not to differ in terms of IQ. Of course, it would be impossible to only recruit individuals with the same IQ score. A more feasible option is to make sure that the IQ scores of the two groups fall within a similar range and that, on average, individuals in the first group have a similar IQ to those in the second group. This type of assignment is only doable when the criteria that need to be taken into account are few and easily measurable. When the number of criteria increases, it becomes very difficult, not to say impossible, to recruit similar participants. For example, we can easily imagine the difficulty of finding homogeneous groups of 20-year-old bilingual French–Portuguese participants with a similar IQ. When multiple constraints must be met, it is necessary to turn to a within-subject design.
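Whether two groups are comparable on a given criterion can be inspected with simple descriptive statistics. In the Python sketch below, the IQ scores are hypothetical values invented for illustration, and `summarize` is a helper introduced here; deciding how close two groups need to be remains a judgment for the researcher.

```python
from statistics import mean

# Hypothetical IQ scores, invented for illustration only.
spoken_group = [98, 105, 110, 102, 95]
written_group = [100, 104, 108, 99, 99]

def summarize(scores):
    """Return the mean, minimum and maximum of a list of scores."""
    return mean(scores), min(scores), max(scores)

spoken_summary = summarize(spoken_group)
written_summary = summarize(written_group)

# Difference between the two group means on the control criterion.
mean_difference = abs(spoken_summary[0] - written_summary[0])
```

Here both groups have the same mean and overlapping ranges, so IQ would be unlikely to vary systematically with the presentation modality.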

In a within-subject design, participants act as their own control, since their characteristics follow them from one condition to another. In our example, IQ would no longer be a problem, since a person with a high or a low IQ would be tested in both presentation modality conditions. As a consequence, the effects of presentation modality could no longer be attributed to the characteristics of the participants. In this respect, within-subject designs resolve the difficulties associated with building even groups of participants. On the other hand, they carry a risk of spill-over effects, that is, participating in one condition may influence the responses given in a condition presented afterwards.

There are different kinds of spill-over effects. One of them, the order effect, corresponds to the fact that the order of participation in the different conditions could become an involuntary factor within the experiment. In short, depending on the order in which the conditions are presented, the results may differ. For example, this effect was shown in the case of scalar implicatures. Children derive more scalar implicatures linked to words such as some, namely some but not all, when the conditions with the words some and all are alternated, producing a contrast effect which encourages the derivation of the implicature, than in an experiment in which all the items with some are presented in a block, followed by another block containing all the items with all (Skordos and Papafragou 2012)11. Another spill-over effect is that of learning, in the specific case where participation in one condition improves performance in the other condition. In the example we are analyzing, this would mean that simply carrying out the comprehension task in one modality condition could improve the performance in the second condition. This would, of course, be the case if the sentences were the same in the two conditions. It is for this very reason that different sentences should be presented. We will return to this in the section on experimental material. This could also be the case if the participants developed specific strategies in the first condition, which could then be reused in the second condition. The effect of fatigue is yet another example of a spill-over effect, producing results opposite to those of the learning effect. As conditions go by, performance levels decrease, due to the fatigue or weariness participants start to experience.

In order to overcome these different spill-over effects, two counterbalancing solutions can be implemented. The first way of counterbalancing consists of randomly varying the order of presentation of the different items in the experiment, all conditions combined. The second way of counterbalancing is to vary the order of presentation of the items within the conditions, and to vary the order of presentation of the conditions themselves. For an independent variable with two modalities, half of the participants would first take part in condition 1, then in condition 2, whereas the other half would take part in condition 2, then in condition 1. For example, half of the participants would complete the task in the spoken condition before moving on to the written condition, and the other half would do the reverse. As the number of modalities for the independent variable increases, it very quickly becomes difficult to counterbalance the conditions fully. With three modalities, there would be six possible orders, and with four modalities, 24 possible orders. It is therefore necessary to resort to partial counterbalancing, also known as Latin square design. In this type of design, rather than presenting all the possible orders, we choose a subset of these so that each condition appears once in each possible position, as illustrated below.

Table 6.1. Combination possibilities for 4 conditions in a Latin square design

Condition 1    Condition 2    Condition 3    Condition 4
A              B              C              D
D              A              B              C
C              D              A              B
B              C              D              A
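The number of possible orders and the construction of a Latin square can be sketched in Python. The function `latin_square` is a hypothetical helper introduced here; it builds the square by shifting the list of conditions by one position on each row, which is one simple construction among others and reproduces the four rows of Table 6.1.

```python
from math import factorial

# Full counterbalancing requires every possible order of the conditions.
orders_for_three = factorial(3)   # 3 x 2 x 1
orders_for_four = factorial(4)    # 4 x 3 x 2 x 1

def latin_square(conditions):
    """Build a cyclic Latin square: each row is the previous row shifted
    one position to the right, so every condition appears exactly once
    in every row and in every column (position)."""
    n = len(conditions)
    return [[conditions[(col - row) % n] for col in range(n)]
            for row in range(n)]

square = latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
```

Each row gives the order of presentation for one group of participants, so four orders are enough to ensure that every condition occupies every position once, instead of the 24 orders required by full counterbalancing.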

We should nonetheless note that there are situations where counterbalancing conditions is not possible. This would be the case for an independent variable involving a process that is difficult to cancel once activated. For example, Gillioz et al. (2012) studied the influence of different factors on the construction of emotional inferences. One of these factors corresponded to the simulation process, by which the readers imagined being characters of a story, in order to understand it from the inside. In this experiment, the simulation was manipulated through a within-subject design, by giving no specific instructions to the participants in the first part of the experiment, before specifically asking them to follow a simulation strategy in the second part. Here, a counterbalancing of the simulation conditions was not feasible, since a simulation strategy adopted during reading is difficult to abandon at a simple request. Therefore, a decision was made to present the conditions in the same order, at the risk of unintentionally introducing order effects into the results. Another solution would have been to manipulate this variable between the participants, while being careful to build even groups, as previously explained.

In summary, it is possible to build between-subject or within-subject designs. In order to control the external variables, within-subject designs should be used instead of between-subject designs whenever this is possible.

6.3.2. Several independent variables: factorial designs

An experiment can also be used to simultaneously test the role of several independent variables, by using a factorial design. This type of design makes it possible to test the hypotheses related to each independent variable, as well as to observe the joint influence of the independent variables, for instance, to determine whether the influence of a variable depends on the modality of another variable.

The simplest factorial design contains two variables, with two modalities each. It can be presented in a simplified way as a 2x2 design. In this type of design, the modalities of the two variables are combined to produce four conditions. Imagine that you want to extend the study about the modality of presentation (spoken or written) on sentence comprehension, by adding another independent variable, such as sentence complexity. The experiment should present spoken and written sentences in order to study the first variable. In addition, complex sentences and simple sentences should be used for studying the second variable. Combining these variables would result in the four conditions described below:

Table 6.2. Combination of independent variable modalities for creating conditions

                    Simple sentences                         Complex sentences
Spoken modality     Simple sentences presented orally        Complex sentences presented orally
Written modality    Simple sentences presented in writing    Complex sentences presented in writing
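Crossing the modalities of two variables amounts to taking every possible pairing of their modalities. As a minimal sketch in Python (the variable names are chosen here for illustration), the four conditions of Table 6.2 can be generated as follows:

```python
from itertools import product

modality = ["spoken", "written"]
complexity = ["simple", "complex"]

# Every pairing of one presentation modality with one sentence type.
conditions = list(product(modality, complexity))

for mod, comp in conditions:
    print(f"{comp} sentences, {mod} modality")
```

The same call extends to variables with more modalities, or to more than two variables, by passing additional lists to `product`.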

In a factorial design, every independent variable can correspond to a between-subject or within-subject measurement. To continue with our example, one possibility would be to follow a 2x2 design with independent measurements, in which participants only take part in one condition. In this case, the two independent variables would be between-subjects. Alternatively, it would be possible to follow a 2x2 repeated-measures design, in which the participants take part in all the conditions. Here, the independent variables would both be within-subjects. Finally, it would be possible to set up a mixed 2x2 design, in which one of the variables is between-subject and the other within-subject. In this case, participants would see two of the four conditions presented above, either a single type of sentence presented in spoken and written modalities, or the two types of sentences presented in a single modality.
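The three variants of the 2x2 design differ in the number of conditions that a single participant sees. The Python sketch below makes this explicit; the function `conditions_seen` and the dictionary `factors` are hypothetical names introduced for illustration.

```python
from itertools import product

factors = {
    "modality": ["spoken", "written"],
    "complexity": ["simple", "complex"],
}

def conditions_seen(factors, between_levels):
    """List the conditions one participant takes part in.

    `between_levels` maps each between-subject variable to the single
    modality assigned to this participant; all other variables are
    treated as within-subject and contribute all of their modalities.
    """
    levels = [
        [between_levels[name]] if name in between_levels else values
        for name, values in factors.items()
    ]
    return list(product(*levels))

# Repeated measures: both variables within-subject.
within = conditions_seen(factors, {})
# Mixed design: modality between-subject, complexity within-subject.
mixed = conditions_seen(factors, {"modality": "spoken"})
# Independent measures: both variables between-subject.
between = conditions_seen(factors, {"modality": "written", "complexity": "complex"})
```

A participant thus sees four conditions in the repeated-measures design, two in the mixed design and one in the design with independent measurements.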

When using factorial designs, different effects can be observed: the main effects and the interaction effects. The main effects correspond to the general effect of an independent variable on the dependent variable. In the example above, since there are two variables, two main effects can be observed. The first main effect would correspond to the presentation modality (spoken vs. written) effect on sentence comprehension, regardless of the type of sentences involved. For example, it may be that in general, participants better understand written sentences. The second main effect would correspond to the type of sentence effect on comprehension, regardless of the presentation modality. It is probable that simple sentences are generally better understood than complex ones. Finally, the interaction effect corresponds to the effect of one variable depending on the modality of the other variable. In the case of our example, a possible interaction effect would be that simple sentences are equally well understood in the spoken and written modalities, whereas complex sentences are better understood in writing than when spoken. Thus, the presentation modality effect would depend on the complexity of the sentence, since it would only be observed in the case of complex sentences.
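The difference between main effects and an interaction effect can be made concrete with a small numerical sketch in Python. The comprehension scores below are hypothetical percentages invented to mirror the pattern described above; they are not results from any study, and a real analysis would rely on inferential statistics rather than on raw differences between means.

```python
# Hypothetical mean comprehension scores (percentage of correct answers),
# invented for illustration. Keys are (modality, sentence type).
means = {
    ("spoken", "simple"): 90, ("written", "simple"): 90,
    ("spoken", "complex"): 60, ("written", "complex"): 80,
}

def marginal(factor_index, level):
    """Average the cell means over the other variable."""
    cells = [m for key, m in means.items() if key[factor_index] == level]
    return sum(cells) / len(cells)

# Main effects: difference between the marginal means of each variable.
modality_effect = marginal(0, "written") - marginal(0, "spoken")
complexity_effect = marginal(1, "simple") - marginal(1, "complex")

# Interaction: does the modality effect change with sentence type?
effect_for_simple = means[("written", "simple")] - means[("spoken", "simple")]
effect_for_complex = means[("written", "complex")] - means[("spoken", "complex")]
interaction = effect_for_complex - effect_for_simple
```

With these values, written sentences are understood better overall (a 10-point main effect of modality) and simple sentences better than complex ones (a 20-point main effect of sentence type), but the modality effect is 0 points for simple sentences and 20 points for complex ones, which is the signature of an interaction.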

In a factorial design, the number of modalities for every variable may vary, as well as the number of independent variables examined. For example, in a 2x3 design, two variables would be manipulated, one with two modalities and the second with three modalities. A 2x2x2 design would include three variables with two modalities each. Manipulating more than three variables in the same experiment is, however, not recommended, since the effects of interactions from such designs can become very complex to interpret.
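The growth in complexity can be quantified. The number of conditions is the product of the numbers of modalities, and a full factorial design with k variables involves 2^k − 1 effects in total (main effects plus interactions of every order). The Python sketch below illustrates this; the function names are introduced here for illustration.

```python
from math import prod

def n_conditions(modalities_per_variable):
    """Number of conditions: product of the numbers of modalities."""
    return prod(modalities_per_variable)

def n_effects(n_variables):
    """Number of main effects and interactions in a full factorial
    design: one effect per non-empty subset of the variables."""
    return 2 ** n_variables - 1

design_2x3 = n_conditions([2, 3])
design_2x2x2 = n_conditions([2, 2, 2])

effects_two_variables = n_effects(2)     # 2 main effects + 1 interaction
effects_three_variables = n_effects(3)   # 3 main effects + 4 interactions
effects_four_variables = n_effects(4)    # 4 main effects + 11 interactions
```

Moving from three to four variables therefore more than doubles the number of effects to interpret, and the additional effects are mostly higher-order interactions.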

6.4. Building the experimental material

The experimental material is a key element in every experiment. In the field of linguistics, given the variety of phenomena and processes investigated, the material sometimes corresponds to words, sentences or short texts whose length may vary. In addition, when the research hypothesis concerns individual differences such as the language level, for instance, it is necessary to measure such individual characteristics using questionnaires or specific tasks. Numerous resources concerning these questionnaires or these tasks, their validations, as well as their use for different purposes, are provided in the scientific literature. For this reason, we will not develop these aspects here, but will focus instead on the essential elements to be taken into account when developing the linguistic material used in the experiment. In this section, we will first present the general characteristics of the experimental material, as well as some useful resources for creating experimental material in linguistics. We will then approach the notions of experimental items and filler items. We will finally discuss the notion of lists, which have central importance in many linguistic experiments.

6.4.1. Experimental items

The nature of the experimental material developed for an experiment depends on the independent variables formulated in the research hypothesis, as well as on the task chosen. In most of the examples discussed in the previous chapters, we have seen that the participants had to judge, recall, read or even react to linguistic stimuli belonging to one and/or the other conditions of the experiment. These linguistic stimuli, called items, are selected so as to manipulate the independent variable and to control the external variables.

Before going further, it is important to understand the concept of an item. An item is an element for which a response is recorded in an experiment. For example, in a lexical decision task, an item corresponds to a series of letters that the participants have to categorize either as a word or a non-word. In an acceptability judgment task, an item could be a sentence whose acceptability participants have to judge on a scale; in a comprehension, reading or recall task, an item may correspond to a sentence or a short passage. Experiments contain many items, in order to collect a reliable measurement of the process investigated. Testing multiple items, as well as testing multiple participants, decreases the portion of error that is attributed to the specific characteristics of the items or the participants.

Experimental items are constructed so as to vary the properties investigated, while keeping the other properties as stable as possible. For example, an experiment on scalar implicatures could contrast two conditions: on the one hand, the sentence “some kids like chocolate”, which gives rise to the implicature, and, on the other hand, the sentence “all kids like chocolate”, which represents the other condition. In the same way that the use of a repeated-measures design decreases the biases related to participants, presenting the same item in all the conditions may decrease the biases related to the items themselves. For this reason, items are developed so that they can appear in the different conditions whenever possible. Later, we will discuss how to distribute the items across the different conditions, but it is already useful to take this point into account when building the experimental material.

To illustrate these principles, let us go back to the example on the influence of presentation modality and complexity on sentence comprehension. Sentence complexity could be operationalized by the presence of a relative clause with a subject pronoun (less complex), or an object pronoun (more complex), for instance. In order to manipulate the presentation modality, half of the sentences would have to be presented orally and the other half in writing. The dependent variables of this experiment could be the number of correct answers given to verification questions, following the presentation of the sentence, as well as the response time needed to provide such answers.

Following these criteria, the experimental items of this experiment could take the form:

  (1a) The woman who follows the man carries an umbrella.
  (1b) The woman whom the man follows carries an umbrella.
  (2a) Elegantly, the courtier who adores his sweetheart picks her up to kiss her.
  (2b) Elegantly, the courtier whom his sweetheart adores, picks her up to kiss her.

The two sentences in each pair clearly differ on the role of the relative pronoun. However, the two pairs of sentences also differ in structure, which can be problematic. The second pair of sentences is more syntactically complex than the first one. It also contains less frequent vocabulary, which could entail comprehension difficulties. In this case, the items would vary in relation to other complexity aspects than those investigated (the role of the relative pronoun), which increases the possibility of external variable involvement. In order to avoid the intrusion of unwanted external variables, it is preferable to build a homogeneous set of items, by establishing criteria relating to their structure, their style, language register, etc. applicable to all items.

In experiments where the material corresponds to sentences or texts, compliance with these criteria can be checked by means of a pretest, where people are asked to give their opinion about some of the material’s features. These people can be colleagues or people with the same general characteristics as the participants in the experiment. Performing a pretest may sometimes seem superfluous depending on the criteria applied to the items, but it is important to keep in mind that the judgment of researchers, albeit informed and justified, is not always shared by others, especially by the participants. For example, imagine an experiment on the influence of the emotional connotation of a text on the type of information drawn from it. In order to carry out this study, it would be necessary to choose or build texts conveying positive or negative emotions, as well as texts conveying neutral information. The judgment of a single person would be problematic in this case, because many parameters can influence the evaluation of the emotions conveyed by a text. These parameters may differ from respondent to respondent, but they may also differ in their relative importance regarding the attribution of emotions to texts. In this case, it would be compulsory to pretest the items and to obtain the evaluations of different people, in order to make sure that the experimental material is valid.

When the experimental material is a word list, numerous databases are available for obtaining the different relevant characteristics of the words. Different researchers and research groups have drawn up lists of existing databases for preparing experimental material. For example, this is the case for websites such as The Language Goldmine, Experimental Linguistics in the Field, the Potsdam Research Institute for Multilingualism or OpenLexicon, which we encourage you to consult, in order to become familiar with the various types of available information for creating experimental material. These websites also list resources specifically tailored to the choice of suitable non-words or standardized images following different criteria.

Lexical databases list many objective word characteristics, such as the number of syllables, phonemes and orthographic neighbors, or their frequency in the language. Different frequency indicators are often accessible, one being calculated on the basis of a corpus of texts and the other on the basis of a corpus of film subtitles, which better reflect word frequency in the spoken modality. Studies have shown that the frequencies drawn from the subtitle corpus predicted word reading time better than their written frequency (New et al. 2007; Brysbaert and New 2009).

It is also possible to access more subjective data, such as the age of word acquisition, their familiarity, their emotional valence, their concrete nature, their imagery value or their subjective frequency. These different characteristics are assessed on the basis of judgments performed by native speakers. As a result, this data is only available for a limited word sample in certain languages.

In English, we can turn to databases such as CELEX (Baayen et al. 1995) or the MRC Psycholinguistic Database (Coltheart 1981). The latter gathers information drawn from different sources, in order to provide data relating to 26 different linguistic properties. Subtlex-US (Brysbaert and New 2009) and Subtlex-UK (van Heuven et al. 2014) contain frequencies based on film subtitles. Subtlex databases also exist for Dutch (Keuleers et al. 2010), Chinese (Cai and Brysbaert 2010), German (Brysbaert et al. 2011), Greek (Dimitropoulou et al. 2010), Spanish (Cuetos et al. 2011), Italian (Crepaldi et al. 2015), Portuguese (Soares et al. 2015), Polish (Mandera et al. 2014) and French, the latter being accessible on Lexique3 (New 2006).

For about 10 years, data relating to naming and lexical decision times for tens of thousands of words and non-words in different languages have been collected and made available to researchers. This data is accessible via Lexicon Projects in different languages: in English, it can be found in the English Lexicon Project (Balota et al. 2007) or in the British Lexicon Project (Keuleers et al. 2012), and in French in the French Lexicon Project (Ferrand et al. 2010). Comparable resources exist for Dutch (Brysbaert et al. 2016, Keuleers et al. 2010), Chinese (Sze et al. 2016) and Spanish (Aguasvivas et al. 2018).

Databases specifically related to children’s lexicon are also available in different languages. In addition to the Subtlex databases which also include information on TV programs for children, statistics based on text corpora for children such as the American Heritage Word-Frequency Book (Carroll et al. 1971), or corpora of interactions with children like the CHILDES database (MacWhinney 2000) are available. The ChildFreq tool (Bååth 2010) also makes it possible to search the CHILDES database in order to retrieve information relating to the interactions of American and British children.

However, it is possible that the research question involves examining a variable for which there is no published standard. It is then up to the researchers to select and validate the chosen words by implementing a pretest. For example, such a pretest was necessary for a study aimed at determining the influence of the context on the activation of color in mental representation, conducted by Connell and Lynott (2009). In this experiment, the participants read sentences describing objects whose typical color or alternative color was implied by the context. For example, one sentence described either a bear in the forest, which activated the brown color, or a bear at the North Pole, which activated the white color. Another story described either a ripe banana, which activated the yellow color, or an unripe banana, which activated the green color. In order to build this experimental material, it was first necessary to identify the objects, animals or plants which could take up different colors. A pretest was then conducted to determine the typical color of these objects and their alternative color, which were activated in most participants.

6.4.2. Filler items

Once the experimental items have been prepared, the filler items can be created so that the experiment is complete. The experiment includes this type of item for two reasons. First, they are essential for all tasks requiring a YES/NO response from the participants. In these tasks, the independent variable is generally manipulated for the items associated with the YES response. For participants to have the opportunity to answer NO to some of the elements presented to them, filler items associated with the NO response need to be added. In a lexical decision task, for example, filler items are non-words. In a verification task, they correspond to the elements which have not been presented in the text.

Secondly, filler items make it possible to conceal the real purpose of the task from participants. We have already mentioned the fact that it is necessary to test naive people, so that the results are not biased. By concealing the goals of the task, we aim to reduce the risk of participants suspecting which independent variables are being manipulated. For example, in the study by Connell and Lynott (2009) presented above, filler items were sentences describing objects which had no typical color associated with them. In the filler items, it is also possible to manipulate a variable completely different from the one actually being investigated, in order to divert the participants’ attention. For example, in the study by Zufferey et al. (2015b) on the understanding of connectives by learners, some of the filler items included obvious grammatical errors, such as an incorrect subject–verb agreement. The number of filler items should generally be equal to or greater than the number of experimental items (Havik et al. 2009; Jegerski 2014).
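
As an illustration of the first role of fillers, the sketch below assembles the trial list of a lexical decision task in which experimental items call for a YES response and fillers (non-words) call for a NO response. The stimuli are invented, the function name assemble_trials is hypothetical, and treating the guideline on the number of fillers as a strict check is an assumption made for this example.

```python
def assemble_trials(experimental_items, filler_items):
    """Combine YES items (experimental) and NO items (fillers) into one trial list."""
    # Guideline from the text, encoded here as a strict check (an assumption):
    # at least as many fillers as experimental items.
    if len(filler_items) < len(experimental_items):
        raise ValueError("fewer filler items than experimental items")
    trials = [
        {"stimulus": s, "kind": "experimental", "expected": "YES"}
        for s in experimental_items
    ]
    trials += [
        {"stimulus": s, "kind": "filler", "expected": "NO"}
        for s in filler_items
    ]
    return trials


# Invented stimuli: three words and three non-words.
words = ["table", "garden", "window"]
nonwords = ["tagle", "gorden", "wintow"]
trials = assemble_trials(words, nonwords)
print(len(trials), "trials,", sum(t["expected"] == "NO" for t in trials), "expecting NO")
```

The resulting list would still need to be presented in a random order, so that YES and NO trials are interleaved.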

6.4.3. Other aspects of the material

Experiments on language comprehension, implementing reading paradigms such as self-paced reading or eye-tracking, require participants to be presented with comprehension questions after reading some items, usually fillers. These comprehension questions aim to ensure that participants carry out the task properly by following the instructions, reading and trying to understand the sentences or short texts presented to them. Reading time or eye movement measurements obtained from participants who did not really process the text, but simply pressed the response keys only to move forward within the experiment, would be meaningless and might even mask the effect under investigation. Comprehension questions are generally simple questions, expecting a YES/NO type of response. It is important to note that the degree of simplicity of these questions may influence the natural reading process, as participants sometimes develop strategies for answering these questions (Havik et al. 2009; Jegerski 2014).

6.4.4. The concept of lists

As discussed above, we encourage the use of repeated-measures designs, since they make it possible to assign the same participants to different experimental conditions. In the same way as participants, the items selected for the experiment represent only a sample of all possible items. These items also have their own characteristics which can influence the results, depending on the words or the types of sentences chosen. In order to control this influence, within-item designs should be implemented whenever possible. In other words, the same item should appear in all conditions. However, in most experiments, it is preferable to avoid having a participant see the same item in more than one condition, so as to prevent the familiarity effect. Indeed, the answer given during the second run risks being influenced by the fact that the item has already been seen and processed. For every item to be presented in each condition and for every participant to see every item only in one condition, it is necessary to organize items as lists. Every participant should see no more than one list during the experiment.

Let us go back to our fictitious example of an experiment on how the presentation modality might impact sentence comprehension. For this experiment, we imagine that we have created 40 items (40 sentences having the same structure). In order to set up a within-subject and within-item design, all of the items should appear once in their written form and once in their spoken form. Every participant should also be exposed to items in their spoken form and others in their written form. It would therefore be necessary to set up two lists of items. The first would contain items 1–20 in the written form and items 21–40 in the spoken form, whereas the second list would contain items 1–20 in the spoken form and 21–40 in the written form.

If we add the variable related to sentence complexity to this experiment, and we want to follow a within-subject and within-item design, every item and every participant should be confronted with the four possible conditions (all variables combined). In this case, we would have to create four lists according to the following model:

Table 6.3. List possibilities for an experiment

Condition             List 1        List 2        List 3        List 4
Spoken-complex        Items 1–10    Items 31–40   Items 21–30   Items 11–20
Spoken-less complex   Items 11–20   Items 1–10    Items 31–40   Items 21–30
Written-complex       Items 21–30   Items 11–20   Items 1–10    Items 31–40
Written-less complex  Items 31–40   Items 21–30   Items 11–20   Items 1–10
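
The rotation shown in Table 6.3 follows a Latin-square logic and can be generated automatically. The sketch below is one possible way of doing so, assuming that items are numbered from 1 and that their number is a multiple of the number of conditions. The function names build_lists and list_for_participant are hypothetical.

```python
CONDITIONS = [
    "spoken-complex",
    "spoken-less complex",
    "written-complex",
    "written-less complex",
]


def build_lists(n_items, conditions):
    """Return {list number: {item number: condition}} using a Latin-square rotation."""
    n_cond = len(conditions)
    if n_items % n_cond != 0:
        raise ValueError("the number of items must be a multiple of the number of conditions")
    group_size = n_items // n_cond
    lists = {}
    for list_idx in range(n_cond):
        assignment = {}
        for item in range(1, n_items + 1):
            group = (item - 1) // group_size          # items 1-10 -> 0, 11-20 -> 1, ...
            assignment[item] = conditions[(group + list_idx) % n_cond]
        lists[list_idx + 1] = assignment
    return lists


def list_for_participant(participant_index, n_lists):
    """Cycle through the lists so that each list is seen by a similar number of participants."""
    return participant_index % n_lists + 1


lists = build_lists(40, CONDITIONS)
print(lists[2][35])                  # condition of item 35 in List 2
print(list_for_participant(5, 4))    # list seen by the sixth participant (index 5)
```

Across the four lists, every item appears once in each condition, and within a list every condition contains 10 items, so that no participant sees the same item twice.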

6.4.5. Number of items to be included in an experiment

An important question when creating experimental items concerns the number of items that must be included in the experiment. It is impossible to answer this question in a definite manner, as it depends on the effect size, the task implemented and the characteristics of the material. The experiment by Connell and Lynott (2009), for example, contained only 10 items, due to the very rare specificities of the words used, namely the fact of representing an object with a clearly defined typical/atypical color. Conversely, some studies implementing lexical decision tasks include more than a hundred, and sometimes even several hundred items (e.g. Carreiras et al. 2005; Perea et al. 2015). For a long time, the number of items was chosen on the basis of what was regularly done in a specific field. Recently, it has been suggested to target a minimum of 1,600 observations per condition when measuring reaction time in repeated-measures designs (Brysbaert and Stevens 2018). This number of observations is the product of the number of participants and the number of experimental items per condition, representing 40 participants seeing 40 items, or 20 participants seeing 80 items, for example. The criteria to be considered when choosing these numbers depend on the task: for simple tasks, processing 80–100 experimental items and 80–100 filler items poses no problem. For more complex tasks, the accumulation of items may induce fatigue effects which should preferably be avoided. In such cases, it might be better to test fewer items on more participants.
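
The arithmetic behind this recommendation can be sketched in a few lines. The function names are hypothetical, and the last example (10 items per condition) is only an illustration of what a four-list design with 40 items would imply if the suggested target were applied literally.

```python
import math


def observations_per_condition(n_participants, n_items_per_condition):
    """Observations per condition = participants x items each of them sees per condition."""
    return n_participants * n_items_per_condition


def participants_needed(n_items_per_condition, target=1600):
    """Smallest number of participants reaching the target number of observations."""
    return math.ceil(target / n_items_per_condition)


print(observations_per_condition(40, 40))   # the two examples given in the text
print(observations_per_condition(20, 80))
print(participants_needed(10))              # 40 items rotated over 4 conditions
```

The last line shows the trade-off mentioned above: when each participant contributes only 10 observations per condition, many more participants are required to reach the same total.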

6.5. Building the experiment

The need to randomize the order in which items are presented makes it difficult to collect data without using experiment presentation software or a dedicated web interface. These tools also make data collection easier, since the responses are recorded as a ready-to-use database. Some of these programs or interfaces require a license, which can be expensive for institutions or individuals; this is the case of E-Prime (Psychology Software Tools, Pittsburgh, PA) and Qualtrics (Qualtrics, Provo, UT), just to mention a few examples. The software PsychoPy (Peirce et al. 2019) and the online interface PsyToolkit (Stoet 2010, 2017) offer free and rather easy-to-use alternatives. We will not develop the characteristics of each of these interfaces in detail, but we encourage those interested to directly consult their documentation, which is available online. Many examples of experiments are also available. However, in order to be able to program an experiment, it is necessary to get a good representation of the stages involved. We will describe these steps later in this chapter.

6.5.1. Instructions

Every experiment begins by clearly explaining to the participants how the task will unfold and what is expected from them. As we will see below, a task is made up of different trials, which are repeated a certain number of times and for which the participants have to perform the same action. For example, in a lexical decision task, a trial corresponds to the categorization of a string of letters into words and non-words. The instructions have to make it clear to participants how to give their answers. It is also essential to ask participants to keep their fingers on the answer keys throughout the experiment, in order to be able to react as quickly as possible. For a lexical decision task, the instructions could be as follows:

In this experiment, you will perform a lexical decision task. This means that we will present you with strings of letters and that you will have to decide, for each of them, whether they form an existing word in English or not.

The experiment will proceed as follows. First, you will see the message “Are you ready?” on the screen. When you are ready, you can press the YES key. This will bring up a fixation point in the center of the screen, for half a second. Please fixate on this point. The fixation point will then be replaced by a string of letters. At that moment, you will have to decide as rapidly and as accurately as possible, whether this string of letters is a word that exists in English or not. If it is a word, press the YES key. If it is not a word, press the NO key. The experiment will consist of 10 training trials, then 60 trials. It should last about 15 minutes. At the end of the experiment, you will see a message indicating that the experiment is over.

Please place your forefingers on the YES and NO keys and keep them on these keys throughout the experiment. From the moment you start a trial, be sure to give your answer rapidly and accurately. You can take a break at any time when you see the message “Are you ready?”.

If you have any questions, you can ask the person in charge of the experiment now.

6.5.2. Experimental trials

After presenting the instructions, the task begins. As we saw above, a task is divided into trials, each corresponding to an item. Depending on the task, the trials vary, but some characteristics remain the same, namely how items are presented and how responses are recorded. For online tasks, it is customary to precede every trial with a message such as “Ready to continue?”, to which participants reply YES in order to start the trial. This enables participants to prepare for the task and allows them to take a break when needed during the experiment. As soon as the trial begins, the elements are presented and responses to the different elements are recorded. When the experiment requires the recording of reaction times or reading times of single words, the presentation of the item (word or phrase) is generally preceded by a fixation point at the location where the item will appear, in order to attract the attention of the participants and reduce variations in the data collected. When recording the reading time of sentences or text segments, the use of a fixation point is not compulsory. On the other hand, in eye-tracking experiments, the accuracy of the measurement is very important and every trial begins with a fixation point. Figure 6.2 illustrates the different types of trials for some of the tasks presented in previous chapters.

The construction of experimental trials also requires defining how long the items will be presented. Sometimes, the participants set their own pace, as in self-paced reading tasks, in which pressing a key determines the progression in the experiment. In other cases, for example, in experiments based on a priming effect, it is necessary to precisely define the duration allocated to the presentation of the prime, and the time lapsed until the presentation of the target.
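
The timing of a trial can be written down explicitly before programming the experiment. The sketch below describes a priming trial as a sequence of events and derives the onset of each one. The durations used (a 500 ms fixation point, a 50 ms prime and a 50 ms blank screen) are arbitrary illustrative values, and the names Event, priming_trial and onset_times are hypothetical. The interval between prime onset and target onset is commonly called the stimulus onset asynchrony (SOA).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    name: str
    duration_ms: Optional[int]   # None: stays on screen until the participant responds


def priming_trial(prime_ms, blank_ms, fixation_ms=500):
    """Sequence of events in one priming trial (durations are illustrative)."""
    return [
        Event("fixation", fixation_ms),
        Event("prime", prime_ms),
        Event("blank", blank_ms),
        Event("target", None),
    ]


def onset_times(events):
    """Onset of each event in ms from the start of the trial (events run back to back)."""
    onsets, elapsed = {}, 0
    for event in events:
        onsets[event.name] = elapsed
        if event.duration_ms is None:    # open-ended event: nothing can follow it
            break
        elapsed += event.duration_ms
    return onsets


onsets = onset_times(priming_trial(prime_ms=50, blank_ms=50))
soa = onsets["target"] - onsets["prime"]
print(onsets, "SOA:", soa, "ms")
```

In a self-paced task, by contrast, every event would be open-ended, since the participant's key press determines when the next element appears.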

In order to prevent possible spill-over effects, the presentation of the items has to be randomized. In other words, for every participant, the order for presenting the items is established randomly. It is then highly unlikely that two participants will see the items in the same order. Randomization also makes it possible to avoid some items being systematically presented at the beginning or at the end of the task, which could lead to learning or fatigue effects, or cause the processing of a certain item to be regularly influenced by the one preceding it.
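
Experiment software usually takes care of randomization, but the principle is simple to illustrate. In the minimal sketch below, each participant receives an independent random order, and seeding the generator with the participant identifier makes that order reproducible for later checks. The function name randomized_order, the identifier format and the base_seed value are assumptions made for this example.

```python
import random


def randomized_order(items, participant_id, base_seed=2024):
    """Return a participant-specific random order without modifying the original list."""
    rng = random.Random(f"{base_seed}-{participant_id}")   # one generator per participant
    order = list(items)
    rng.shuffle(order)
    return order


items = list(range(1, 41))                 # item numbers 1 to 40
order_p01 = randomized_order(items, "P01")
order_p02 = randomized_order(items, "P02")
print(order_p01[:5], order_p02[:5])
```

Calling the function again with the same participant identifier returns the same order, which is convenient when the presentation order needs to be reconstructed during data analysis.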

Figure 6.2. Illustrations of experimental trials in different tasks in experimental linguistics

Depending on the number of items and the type of experimental design, the number of trials may be very high, which could induce loss of attention or fatigue in the participants, and thus jeopardize the quality of the data collected. One solution to this problem may be to divide the trials into blocks, so that the experiment can be segmented into shorter portions. The use of blocks can also be convenient for presenting all the items in one condition, before moving on to another condition. For example, to study oral and written comprehension, it is preferable to present the sentences in only one modality before moving on to the other, since having to constantly switch from one modality to the other could pose additional difficulties to participants. When using blocks, it is advisable to try to counterbalance not only the order of presentation of the items within the blocks, but also the order of the blocks themselves across participants.
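
One simple way of counterbalancing block order is to enumerate all possible orders and to cycle through them from one participant to the next. The sketch below assumes a small number of blocks, since the number of possible orders grows factorially. The function names are hypothetical.

```python
from itertools import permutations


def block_orders(blocks):
    """All possible orders of the blocks."""
    return list(permutations(blocks))


def order_for(participant_index, blocks):
    """Cycle through the possible orders so that each is used about equally often."""
    orders = block_orders(blocks)
    return orders[participant_index % len(orders)]


blocks = ["spoken", "written"]
for participant_index in range(4):
    print(participant_index, order_for(participant_index, blocks))
```

With two blocks, participants alternate between starting with the spoken block and starting with the written block. The order of the items within each block can then be drawn separately for every participant.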

For participants to get used to the task, the experiment begins with the presentation of training items, during which the experimenter can verify the correct understanding of the instructions and re-explain them if necessary. The experiment ends when all the items have been presented. At this point, the experimenter usually thanks the participants and answers any questions they may have. It may also be useful to ask the participants for feedback on their perception of the experiment and to survey their intuitions on the nature of the question being investigated, in order to determine whether their behavior may have been influenced in a way that could affect the experiment’s quality.

6.6. Data collection

For data collection to take place, it is necessary to recruit participants. Very often, participants are university students who voluntarily take part in studies proposed in their field, or who take part in them in exchange for credits or a sum of money. Recruitment is simply done by posting ads which briefly present the research project and provide contact details for enrollment.

When the research question requires participants with a specific profile, such as a certain L2 proficiency level or certain cognitive skills, two solutions may be contemplated. The first is simply to test voluntary participants and then build groups based on their individual differences, once the data has been collected. This method, which is simple to implement, nonetheless entails various risks. First of all, the groups obtained are unlikely to be of similar size, which may cause problems when analyzing the data. It is also possible that the participants’ individual differences do not make it possible to establish clearly different groups, or that they produce groups which all have high scores, or on the contrary, very low scores regarding the characteristic of interest. For example, Gillioz et al. (2012) grouped their participants depending on their level of empathy, in order to verify the potential influence of that variable on their comprehension of emotions while reading. Since the participants were for the most part students in psychology, the groups, while differing from each other, still presented rather high scores compared to the standard in the general population. As the results of the task on emotional inferences did not show any differences between groups, it was difficult to determine whether empathy had no influence on emotion comprehension, or whether the groups tested did not make it possible to shed light on such influence. In order to avoid these problems, the second solution is to set up a preliminary selection of participants, by testing for the desired variable and then only including the people who meet the criteria of the study. For example, this can be useful for recruiting participants who share a certain linguistic profile.

For some research questions, it may also be necessary to recruit people from specific populations, such as people with autism spectrum disorder (ASD) or people with aphasia. It is therefore necessary to contact the competent institutions and associations in order to be able to gain access to these people. In general, access to such populations is rather restricted and requires considerable administrative work. Even more than with neurotypical participants, the ethical principles developed in the next section must be guaranteed in the experiments involving these populations.

The rest of data collection may take different forms depending on whether the participants are tested by meeting them in person, or remotely, by means of a questionnaire, or a task which can be completed on the Internet.

In the first case, data collection begins with the welcoming of the participants, where they are presented with the study and the task to be completed. This step has two main purposes. The first is to make participants feel at ease for the rest of the procedure. The second is to obtain their consent (see the next section on research ethics), which is a prerequisite for any empirical research with humans. Once the participants have provided their written consent, their demographic data is usually collected. In general, information such as gender, age, mother tongue and educational level is relevant for linguistic studies. The information collected at this stage will make it possible to describe the participants when the results are communicated afterwards. Depending on the research question or method, other types of information may be required. This is the case for laterality (left-handedness/right-handedness) for experiments measuring response times, because the participants must provide the YES responses with their dominant hand. During eye-tracking experiments, it is useful to check the participants’ visual acuity and to take note in case glasses or lenses are worn.

The standardization of the procedure is extremely important in order to guarantee the quality of the results. Care must be taken to present every participant with the same test conditions. When data collection is always carried out by the same person, it is very important that this person maintains the same attitude with all the participants and ensures that the instructions are identical for everyone, by being trained in advance and by making sure to use the same formulations. When several people are in charge of the data collection, every experimenter should test participants in all conditions, in order to not induce bias associated with the experimenter, or, at least, to keep such bias under control.

Collecting data in the laboratory has the advantage of being able to observe the participants, to monitor their behavior during the experiment and to interact with them in order to determine their impressions. This allows us to enrich the data collected with observations made during the completion of the task. It is also important to keep track of the participants’ involvement, by logging relevant information for every subject (such as time, the experimenter, any problems encountered during the task or any other observation that may be useful later). For example, it is useful to record the cases where participants show fatigue, or find the task difficult to perform, or even find out the variables that are being manipulated in the experiment. However, laboratory data collection has the disadvantage of being costly in terms of time, material resources and staff.

Whenever possible, one solution is to turn to the Internet, specifically to the various online data collection platforms. In this case, recruiting participants can be done in a much broader way, through social media, for example. It is also possible to ask each participant to forward the link of the experiment to one or more acquaintances, which quickly generates a snowball effect that will help when acquiring new data. Finally, there are also websites linking researchers with participants. For example, this is the case for Amazon Mechanical Turk or Prolific, the latter being specifically intended for research. In order to recruit participants, researchers indicate the number and characteristics of the people they are looking for. It is thus easier to access French-speaking women between 30 and 45 years old, for example. People recruited through these platforms are paid to participate in the experiment, which means there has to be a budget available.

Online studies also have the advantage of being able to test a higher number of participants as well as a wider variety of people. Consequently, their results are more generalizable and have a higher ecological validity than those of laboratory studies (Reips 2000). Several recent studies have shown that experiments carried out on the Internet yield results quite similar to those carried out in the laboratory (Reimers and Stewart 2007; Schubert et al. 2013; Kim et al. 2019), opening new avenues for this form of experimentation.

6.7. Ethical principles

Before finishing this chapter devoted to the practical aspects of experimentation, we will present the different facets of ethics involved in experimental research. Ethical questions arise at different stages of the research process, not only during the conceptualization phase, but also when recruiting participants and when publishing results. We will develop these various aspects below, without dwelling on the general principles of research integrity. Interested readers can turn to the codes of conduct drawn up by various research institutions, such as the European Federation of Academies of Sciences and Humanities (ALLEA).

In experimental linguistics, it is generally essential to recruit participants in order to obtain the data which will allow us to answer research questions. Most participants take part in experiments which do not involve significant risks or benefits to their health, unlike what may be the case in other disciplines, such as medicine. It is, however, necessary to respect certain ethical principles in order to ensure the respectful treatment of participants. Most importantly, this requirement implies their right to confidentiality. Data protection is a legal obligation, and researchers have to determine in advance how the data will be anonymized and then stored, who will have access to it and the way in which it will be used in publications or public presentations. Data confidentiality is essential to the trust between all those involved in the study and must be ensured throughout the research process.
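One common way of preparing data for confidential storage is to replace participants’ names with randomly assigned codes and to keep the correspondence between names and codes in a separate, access-restricted file. The Python sketch below illustrates this idea under simple assumptions: the function names `assign_codes` and `pseudonymize`, the `name` field and the code format are hypothetical. Strictly speaking, this procedure is pseudonymization rather than full anonymization, since the key still makes it possible to link the data to a person for as long as it exists.

```python
import secrets


def assign_codes(names):
    """Map each participant name to a unique random code such as 'P-3f9a1c'."""
    codes = {}
    used = set()
    for name in names:
        code = "P-" + secrets.token_hex(3)
        while code in used:  # extremely unlikely, but guarantees uniqueness
            code = "P-" + secrets.token_hex(3)
        used.add(code)
        codes[name] = code
    return codes


def pseudonymize(records, key):
    """Return copies of the records in which the name is replaced by its code."""
    cleaned = []
    for record in records:
        row = {"participant": key[record["name"]]}
        # Copy every field except the identifying one.
        row.update({k: v for k, v in record.items() if k != "name"})
        cleaned.append(row)
    return cleaned
```

Only the output of `pseudonymize` would be shared or analyzed; the key returned by `assign_codes` would be stored separately, which also makes it possible to find and delete the data of a participant who later withdraws.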

Another important element related to ethics concerns the well-being of the participants during and after the study. We must ensure that the participants do not leave the study in a worse condition than when they arrived. In the majority of linguistic studies, the only risk that participants face is getting bored during the experiment. However, in some cases, the research question may relate to a characteristic of the participants, such as intelligence, memory or specific skills, or to a condition such as ASD or dyslexia. These characteristics should then be evaluated in a neutral manner, so as to avoid judging the participants, or openly categorizing them as having a higher or lower level on these characteristics. When it is necessary to perform experiments with particular populations, such as children, illiterate people or people with ASD, every precaution must be taken to spare them unpleasant moments.

The well-being of participants can also be endangered when the research manipulates the conditions in which language is produced or understood. Going back to an example discussed in Chapter 1, it might be interesting to examine the influence of stress on articulation rate, which might require placing participants in more stressful and less stressful conditions. Such a study would therefore need to find a way to stress some of the participants, without this stress having an excessively negative impact on them. At the end of the experiment, it would then be essential to relieve the stress induced by the manipulation, either by debriefing the participants or by offering them a moment of relaxation before they go home. Another example of research which could affect the well-being of participants is that of Eilola et al. (2007), presented in Chapter 5, in which participants were presented with emotionally loaded words, including words with negative connotations and taboo words. The presence of such words can offend some sensitive people, and it is therefore necessary to warn participants that the experiment includes such material. This enables them to make an informed decision about whether or not to take part in the experiment.

Finally, it is important to ensure that participants are treated equally across conditions, when these can influence their reality in one way or another. For example, in order to study the effectiveness of a language learning method that is very likely to offer better results than other methods, it is necessary to inform the participants about this difference. Once the study is finished, it would be desirable to offer participants in the more “disadvantaged” group an opportunity to catch up.

In summary, ethical questions arise at different levels of a research project. Ethical principles are first taken into account by the researchers at the conceptualization stage, and the project is then generally submitted to an Ethics Committee, which decides whether the suggested research project respects these principles. If the planned measures are considered adequate, the Committee gives the green light and the study can be carried out.

Any scientific research complying with ethical principles must obtain the free and informed consent of participants. This implies that participants can freely decide to take part in a study, in an informed manner. In other words, participants must receive complete and honest information about the study, the task to be completed and the potential positive or negative consequences of this task. Moreover, the participants must be able to decide freely whether to participate, without any external constraint linked to gaining or losing an advantage. It is also essential to inform participants that they can end their participation at any time during the study, or even afterwards, by requesting the withdrawal of their data. In order to attest to the participants’ consent, researchers must collect their signature on a written document. This document should generally:

  • adequately present the content of the research project to participants;
  • present the task that the participants will have to complete;
  • present the risks, side effects and possible benefits associated with participation;
  • mention the total freedom to participate in the study, the possibility of withdrawing at any time and the procedure to follow in the event of withdrawal;
  • provide contact information for further details on the study.

6.8. Conclusion

In this chapter, we have reviewed various practical aspects which are useful for creating an experiment. First, we introduced the sources we can consult for formulating a research question, as well as the means of accessing these. We have seen that the research question needs to be operationalized by defining the levels of the independent variables, as well as indicating how the dependent variable will be measured. External variables that could influence the results also have to be determined at this stage in order to choose the appropriate experimental design. This experimental design may include independent (between-subject) or repeated (within-subject) measurements. We have described the advantages of repeated-measures designs, which enable a better control of external variables, as well as their limitations, which must be taken into account when building the experiment. These include spill-over effects, which can be controlled by counterbalancing the conditions and/or by randomizing the items’ order of presentation. Item lists are essential in repeated-measures designs and we have shown how to build them. We then described factorial designs, involving several independent variables, as well as the different effects (main and interaction) which can be observed in this type of design. In the second part of the chapter, we discussed the important elements that need to be respected when building experimental material and we presented resources for selecting this material. We have seen that the material is made up of experimental items and filler items, which allow the task to be carried out while concealing its objectives. We then discussed the various stages of the experiment itself, how to recruit participants and how to collect data. We concluded the chapter by describing the ethical principles inherent in research on human beings and the main elements to be observed in this context.

6.9. Revision questions and answer key

6.9.1. Questions

  1) Transform the following hypothesis into an operational hypothesis, then schematize it by including the external variables that you consider the most important, in terms of items and participants: “A person’s accent influences the credibility of what he or she says.”
  2) Choose how to counterbalance the conditions in the following situations:
    a) An experiment studying the influence of sentence complexity on the comprehension of anaphora in children.
    b) An experiment studying the influence of a concurrent working memory task (participants must remember strings of letters in parallel with the reading task) on the construction of predictive inferences while reading.
  3) An experiment aims to study the influence of grammatical gender on the representation of common nouns. Based on Boroditsky et al. (2003; see section 2.5), you ask French-speaking participants to choose associations of common nouns and first names, which can either be of the same gender or a different one.
    a) How do you choose common nouns and first names? What are the variables to be controlled?
    b) Choose the words to create a dozen pairs in French.
    c) Create lists to implement a repeated-measures design. Every item should be presented in the different conditions without the participants seeing the same item several times.
  4) Write the instructions for a self-paced reading experiment comprising 40 items. Every item corresponds to a 5-sentence short story describing a situation in everyday life, whose fourth sentence is the target sentence. Stories are sometimes followed by questions.
  5) Write the free and informed consent form for that same experiment.

6.9.2. Answer key

  1) There are different ways of approaching this question. Here, we will follow the assumption that people give less credibility to statements made by a person speaking with a foreign accent than to statements made by a person without an accent. To study this question, we suggest recording statements made by speakers with or without a foreign accent, presenting them to participants and asking them to assess the truthfulness of these statements on a scale from 1 to 10. By comparing the scores obtained in the different conditions, it would be possible to determine whether there is a connection between foreign accent and credibility. Let us imagine that we test native speakers of American English.

For this study, we can identify different variables to control. First of all, at the item level, the veracity of the information to be evaluated should be controlled. It would be appropriate to present both true and false statements, so that participants can use the full range of the credibility scale. Secondly, the structure and complexity of the items should be kept constant, so that these parameters do not influence the truthfulness judgment. Thirdly, it might be necessary to present equal numbers of statements uttered by male and by female speakers, in order to control a potential influence of the speaker’s gender on credibility. It is possible that the participants consider men more credible than women, due to the existence of certain stereotypes in society. The last very important aspect to control would be the speaker’s accent. It would be appropriate to vary the accents in order to generalize the results.

As far as participants are concerned, the main variable that could interfere with the variables being investigated in the study relates to their general attitude towards people with a strong foreign accent. This is undoubtedly influenced by their general attitude towards foreigners, and it might be useful to measure this variable in order to take it into account when analyzing the results. The attitude towards speakers with foreign accents may also be influenced by whether a participant speaks one or more languages, as bilingual people probably tend to be more tolerant of a foreign accent than monolingual people. Finally, the level of mistrust in relation to a statement certainly varies from one person to another, and this would also be a variable to be kept under control.

The best solution for this experiment would be to build a within-subject design, that is, a design where every participant sees all the conditions of the experiment, and a within-item design, in other words, a design where every statement is presented in both conditions, with and without a foreign accent, to different participants.

The operational hypothesis, as well as the variables to be controlled, can be schematized as follows:

Schematic illustration of the operational hypothesis with the variables to be controlled.
  2) a) The conditions of this study are related to the complexity of the sentences presented. This means that children see more complex sentences and less complex sentences. In order to counterbalance the conditions, the best solution would be to present the sentences of the two conditions in random order, without separating the conditions themselves. Every child could see one or more complex sentences, before seeing one or more less complex ones, and then see complex sentences again, and so on.
    b) In this experiment, the conditions relate to the manipulation of working memory load. This implies that, for half of the sentences in the experiment, participants only have to read them, whereas for the other half of the sentences, a working memory task has to be performed in parallel with the reading task. In this case, the most appropriate technique would be to separate the experiment into two blocks, depending on the working memory condition. The order of presentation of the blocks would then have to be alternated among participants. Furthermore, the different sentences should be presented in one block or the other, and the presentation order of the sentences within each block should be random.
  3) a) In order to answer this question, it is necessary to identify which variables can influence the memorization of a pair consisting of a common noun and a first name. One possible variable is whether or not the common noun represents a living being. The pair COW-AGATHE seems intuitively easier to remember than the pair SPOON-AGATHE. It would therefore be appropriate to test only pairs with inanimate objects. A second variable that can influence the memorization of pairs is the frequency or length of the different words included in the task. For French common nouns, we can consult Lexique to find out their frequency and their length. Concerning first names, national statistical institutes provide statistics on the ranking of first names over the last decades. These may help to control the extent to which participants are familiar with such names (depending on their age group).
    b) The pairs should be made up of masculine and feminine common nouns and male and female first names. For the example, we searched Lexique for common nouns with a length of three or four syllables, and a frequency between 10 and 100 occurrences per million (in books and movies). We chose the following nouns: ambulance, batterie, caméra, cigarette, pharmacie, télévision, ascenseur, canapé, escalier, hélicoptère, magazine and pantalon. In order to choose the appropriate first names, we consulted the ranking of the most common first names in France for the period 1995-2000, on the National Institute of Statistics and Economic Studies26 website. We retained the first names Manon, Camille, Pauline, Marie, Chloé, Sarah, Thomas, Clément, Maxime, Lucas, Quentin and Julien.
    c) In order for each item to be presented in each condition, it should appear once associated with a male name, and once with a female name. It is therefore necessary to create two lists, each of which presents half of the masculine and half of the feminine common nouns associated with a male first name, and the other half with a female first name. One possibility would correspond to the following lists:

List 1:

ambulance-Thomas, batterie-Maxime, caméra-Julien, cigarette-Manon, pharmacie-Camille, télévision-Chloé, ascenseur-Clément, canapé-Lucas, escalier-Quentin, hélicoptère-Pauline, magazine-Sarah, pantalon-Marie.

List 2:

ambulance-Manon, batterie-Camille, caméra-Chloé, cigarette-Thomas, pharmacie-Maxime, télévision-Julien, ascenseur-Pauline, canapé-Sarah, escalier-Marie, hélicoptère-Clément, magazine-Lucas, pantalon-Quentin.
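The construction of these two lists can also be expressed as a short program, which becomes useful when the number of items grows. The Python sketch below is one possible implementation rather than a standard procedure: the function names `build_lists` and `trials_for` are hypothetical, the name orderings are chosen so that the output matches the two lists above, and alternating the lists by participant number and shuffling the trials are assumptions in line with the counterbalancing and randomization principles discussed in this chapter.

```python
import random

# Items from the answer above. The order of the names is chosen so that
# the construction reproduces List 1 and List 2 exactly.
FEMININE_NOUNS = ["ambulance", "batterie", "caméra",
                  "cigarette", "pharmacie", "télévision"]
MASCULINE_NOUNS = ["ascenseur", "canapé", "escalier",
                   "hélicoptère", "magazine", "pantalon"]
MALE_NAMES = ["Thomas", "Maxime", "Julien", "Clément", "Lucas", "Quentin"]
FEMALE_NAMES = ["Manon", "Camille", "Chloé", "Pauline", "Sarah", "Marie"]


def build_lists():
    """Return two lists of (noun, name) pairs, counterbalanced for name gender."""
    half = len(FEMININE_NOUNS) // 2
    # Group A gets a male name in List 1, group B a female name in List 1.
    # Each group contains half of the feminine and half of the masculine nouns.
    group_a = FEMININE_NOUNS[:half] + MASCULINE_NOUNS[:half]
    group_b = FEMININE_NOUNS[half:] + MASCULINE_NOUNS[half:]
    list_1, list_2 = {}, {}
    for i, noun in enumerate(group_a):
        list_1[noun] = MALE_NAMES[i]
        list_2[noun] = FEMALE_NAMES[i]
    for i, noun in enumerate(group_b):
        list_1[noun] = FEMALE_NAMES[i]
        list_2[noun] = MALE_NAMES[i]
    order = FEMININE_NOUNS + MASCULINE_NOUNS
    return ([(noun, list_1[noun]) for noun in order],
            [(noun, list_2[noun]) for noun in order])


def trials_for(participant_number, seed=None):
    """Pick a list by alternating across participants, then shuffle its order."""
    list_1, list_2 = build_lists()
    chosen = list_1 if participant_number % 2 == 1 else list_2
    trials = list(chosen)
    random.Random(seed).shuffle(trials)
    return trials
```

With this sketch, odd-numbered participants receive List 1 and even-numbered participants receive List 2, each in a different random order, so that every noun is seen with a male name by half of the participants and with a female name by the other half.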

  4) In this experiment, you will read short stories that are five sentences long, describing situations in everyday life. The goal is to read these stories in a natural way, as you would if you were at home. Before each story, you will see the message “Ready to continue?”. When you are ready, press the YES key to bring up the first sentence of the story. Read the sentence, then press YES to go to the next sentence, and so on until the end of the story. Some stories will be followed by a simple question about the story. If such a question appears, answer it as quickly and as accurately as possible, by pressing either YES or NO. It is very important to read each story without stopping. If you want to take a break during the experiment, you can do so when you see the message “Ready to continue?”. Please keep your fingers on the YES and NO keys during the whole experiment, so that you can easily progress through the stories and answer the questions. The experiment will start with some training stories. If you have questions, you can ask the experimenter. The experiment will last between 20 and 30 minutes.
  5) In this study, we are interested in the process of reading comprehension. Please read the explanation of the experiment you are going to take part in, as well as the risks and benefits it may present, before deciding to participate.

If you agree to take part in this study, you will complete a reading task presented on a computer screen. It will take between 20 and 30 minutes.

You will not get any direct benefit from this experiment, but it will allow us to improve our knowledge about the comprehension processes at work while reading texts. As compensation, you will receive 10 Euros.

There is no direct risk associated with your participation in this experiment, except that of feeling bored. Participation involves an investment of 20–30 minutes of your time.

You are free to accept or refuse to take part in the study. You can now choose not to take part in it. If you choose to participate, you can still withdraw from the study at any time, without any need for justification. If you take part in the study and decide to withdraw from it following your participation, you can ask for your data to be deleted. In all cases, the 10 Euros compensation will be given to you.

All the data obtained during the experiment will be treated in strict confidence. You will only be identified by a randomly assigned number, and neither your name nor any means of identification will appear anywhere. No data identifying you will be used in the publications or presentations which result from this study.

At any time, you may ask questions or request further details from Ms. X, Address, Phone No.

6.10. Further reading

Gonzalez-Marquez et al. (2007a) present the structure of a scientific article, explain how to read such sources and detail the stages involved in the literature review. Abbuhl et al. (2013) develop the advantages and limitations of different experimental designs, as well as the particularities of research carried out on children. These two sources also review other general principles to take into account when developing an experimental study. Jegerski (2014) presents the construction of the materials used in self-paced reading experiments in detail, such as experimental items, filler items and comprehension questions. Kim et al. (2019) present the advantages of studies conducted in the laboratory or on the Internet, as well as the results of their study comparing these two methods, with a task involving choice reaction time. For the ethical principles associated with scientific research, there are various documents published by the national research societies. It is relevant to refer to the specific recommendations of the country where the study is conducted.

  1. http://scholar.google.com.
  2. http://apps.webofknowledge.com.
  3. http://psycnet.apa.org/search/basic.
  4. http://www.jstor.org.
  5. https://search.proquest.com/llba.
  6. http://www.blldb-online.de.
  7. https://unpaywall.org.
  8. https://www.researchgate.net.
  9. https://www.academia.edu.
  10. The idea for this diagram comes from Pascal Gygax’s Research Methodology course at the University of Fribourg (Switzerland).
  11. In this study, the order of presentation of the conditions was controlled in order to assess the spill-over effect. In this sense, it was an independent variable in the experiment. Had this not been the case, and had the items only been presented in one of the order conditions, the conclusions might not have reflected reality.
  12. http://languagegoldmine.com/.
  13. https://experimentalfieldlinguistics.wordpress.com/experimental-materials/.
  14. https://www.uni-potsdam.de/en/prim/labs-experiments/resources-software-databases/online-databases.html.
  15. https://chrplr.github.io/openlexicon/ and http://www.lexique.org/shiny/openlexicon/ for online research.
  16. http://celex.mpi.nl/.
  17. https://websites.psychology.uwa.edu.au/school/MRCDatabase/uwa_mrc.htm.
  18. http://www.lexique.org/?page_id=241.
  19. http://crr.ugent.be/archives/1423.
  20. www.lexique.org.
  21. https://elexicon.wustl.edu/index.html.
  22. http://crr.ugent.be/programs-data/lexicon-projects.
  23. http://childfreq.sumsar.net/.
  24. https://www.mturk.com.
  25. https://www.prolific.co.
  26. https://www.insee.fr/fr/statistiques/3532172.