7

Teacher perceptions of the introduction of student evaluation of teaching in Japanese tertiary education

Peter Burden

Abstract:

In these times of widespread educational change in Japan and uncertain futures for many teachers, the degree of acceptance and perceived validity of student evaluation of teaching (SET) conducted using non-empirical methods is relatively low. Teachers simply do not believe that such evaluations result in improved learning and teaching. Twenty-two English language teaching (ELT) teachers who were working either part-time, under a limited-term contract, or with tenure in Japanese universities volunteered to take part in a research project which investigated teachers’ perceptions of how they were affected by the introduction of SET surveys in their respective tertiary institutions. In a qualitative, case-study approach using in-depth interviews, participants suggested the need for more teacher involvement, and for more dialogue between teachers to discuss the results, so as to aid the reflective process, enable change and eliminate competitiveness. For SET to become an integral part of reform, there must be a dynamic relationship between individual and institutional needs.

Key words

student evaluation of teaching (SET)

case-study approach

reflective process

Introduction

The many strands of the complex web of educational reform suggest that Japan has entered an ‘epoch-making phase in the history of higher education’ (Arimoto, 1997: 206). The long economic recession known as the ‘lost decade’ (Yonezawa, 2002: 25), coupled with population decline, has encouraged governmental introspection, reconsideration of social identity, and a rush for reform initiated by the Ministry of Education (MEXT).

There has been an expansion of university reforms including the introduction of SET which, while a topic of considerable debate in many countries, has been little explored in the Japanese context. Twenty-two English language teaching (ELT) teachers who were working either part-time, under a limited term contract, or had tenure in Japanese universities, volunteered to take part in this research project, which investigated teachers’ understanding of how they were affected by the introduction of SET in their respective tertiary institutions. Little has been written questioning the introduction of evaluation in Japan, and even less research has been channelled into gaining an understanding of teachers’ perspectives.

Following a global emphasis on ‘quality’ in education (Leckey and Neill, 2001), a government shift towards deregulation to cope with global competition in the new century has led to sweeping changes in the ways universities are organised and administered. Universities, operating during a 42.3 per cent decline in the birthrate between 1992 and 2012 (Goodman, 2005), are suffering plummeting enrolments; it is estimated that approximately 40 per cent of private colleges in Japan face financial crisis (Yamada, 2001) and possible bankruptcy (Hooghart, 2006).

Yet over 72 per cent of 18-year-olds are entering post-secondary education (MEXT, 2004), so Japan has entered a ‘post-massification’ phase (Arimoto, 1997: 204), where consumerism as a form of market orientation has led to the ‘popularisation of higher education’ (MEXT, 2004). Demand for institutional accountability has led to a buyer’s market, where students are ‘courted customers’ rather than ‘supplicants for admission’ (Kitamura, 1997: 147). Thus, SET through surveys is seen as indispensable and as one vehicle through which accountability can be addressed.

Such accountability is necessary to justify the value of public investment and to uphold the quality of university education at a time when less homogeneously skilled students with diverse study backgrounds are free to enter.

Coupled with strong public criticism that the quality of education is falling (MEXT, 2004), and with perceived declines in student competence, learning ability, learning motivation and lecture-taking ability (Yamada, 2001), visible, concrete and accessible performance measurement systems offer a response to the societal clamour for results, promoting a ‘more traditional, back to basics approach with an emphasis on memorisation of information’ (Motani, 2005: 319). As part of this results-oriented milieu, publication of evaluation results has been compulsory since 1999 (MEXT, 2004). However, ensuring that evaluation feedback is collected effectively should be an important priority. While evaluation should be seen as ‘an agent of supportive program enlightenment and change’ (Norris, 2006: 578), the rhetoric of evaluation, with its numerous English terms and acronyms such as faculty development (FD) and good practice (GP) used in official university policy documents, is little understood by school administrators (Tsurata, 2003). The lack of official policy on timing or administration, and the absence of any explicit statement of a summative or formative purpose either for universities or for teachers, have further complicated the introduction of SET.

Similarly, MEXT policy gives little indication of a remedial path for teachers who receive poor evaluations, only the suggestion of extrinsic rewards such as the introduction (at some unspecified future date) of an awards system and bonuses for ‘outstanding’ teachers and, conversely, punitive or ‘appropriate’ measures for ‘incompetent’ teachers, including teaching suspensions (MEXT, 2001). If the purpose of SET is the improvement of teaching, ‘technically sound’ (Stronge, 2006: 9) evaluation requires that the basic principle of ‘utility’ be in place, so that useful, informative, timely and influential information is provided to teachers and administrators and findings are valid and reliable. Yet in many Japanese institutions, SET surveys are the sole tool used to evaluate teachers’ performance, ‘focus[sing] on the abilities of teachers’ (MEXT, 2001), and can serve as a convenient instrument for dismissal.

The introduction of student evaluation of teaching (SET) in Japanese tertiary education

Student evaluation of teaching can come in many formats and vary in the demands made of students in terms of response. Each university produces its own evaluations, which are administered across the subject range and on different campuses and sections (such as junior colleges). MEXT does not make explicit either the timing or the content of evaluation in its policy, but universities typically interpret the timing to mean either before or after summative testing in the last class of each of the two 15-week semesters.

MEXT (2004) documentation states that 100 per cent of the 99 national universities, 82 per cent of the 61 public universities, and 92 per cent of the 456 private universities had implemented self-evaluation and self-monitoring by 2001, giving an overall figure of ‘about 90 per cent’ for the use of SET. This form of evaluation often utilises Likert-type 1–5 scales (ranging from ‘very poor’ (1) to ‘very good’ (5)), and the questions are usually, but not always, coupled with final general items on ‘overall satisfaction’ with the course and the ‘effectiveness’ of the instructor.
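The sector figures above can be reconciled with the overall estimate by simple weighting. The following minimal sketch (an illustration added here, not part of MEXT’s documentation) computes the aggregate:

```python
# Weighted aggregate of the MEXT (2004) implementation figures quoted above.
sectors = {
    # sector: (number of universities, reported implementation rate)
    "national": (99, 1.00),
    "public": (61, 0.82),
    "private": (456, 0.92),
}

implemented = sum(count * rate for count, rate in sectors.values())
total = sum(count for count, _ in sectors.values())

print(f"about {implemented:.0f} of {total} universities, "
      f"i.e. {100 * implemented / total:.1f} per cent")
# -> roughly 92 per cent, consistent with MEXT's 'about 90 per cent'
```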

Many schools require students to complete closed-item questionnaires anonymously twice: once on a machine-readable card, which is used for data analysis by the administration and is the basis for summative scores, and once on a paper form. Only the latter includes an open-ended section for comments from students. For teachers to read hand-written comments, which may provide useful formative information, the administration has to return all the paper evaluation forms; according to the participants, many universities are reluctant to do this. After analysing the data, school administrators produce a set of quantitative results for each subject area to show MEXT that they are offering quality education to the students.
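The administrative pipeline described above, in which machine-read Likert scores are reduced to per-subject summary statistics, can be sketched as follows. The subject names, items and scores are hypothetical; this illustrates the kind of processing involved, not any particular university’s actual system.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical machine-read responses: (subject, questionnaire item, 1-5 score).
responses = [
    ("English I", "clarity", 4),
    ("English I", "clarity", 3),
    ("English I", "overall satisfaction", 4),
    ("Economics", "clarity", 5),
    ("Economics", "overall satisfaction", 3),
]

# Group the scores by subject and item, as the administration would.
by_subject_item = defaultdict(list)
for subject, item, score in responses:
    by_subject_item[(subject, item)].append(score)

# Produce one summative mean per subject and item.
for (subject, item), scores in sorted(by_subject_item.items()):
    print(f"{subject:<10} {item:<22} mean = {mean(scores):.2f} (n = {len(scores)})")
```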

The present study

Research focus

This study investigated the perceptions of, and reactions to, the introduction of SET among 22 university English language teaching (ELT) faculty working in five universities in western Japan. The research sought insights into:

• what teachers thought was the purpose of introducing SET

• teachers’ experiences of the timing of introducing SET, given the lack of information regarding its administration

• how much voice teachers felt they had in SET administration

• whether teachers had gained useful feedback for improvement.

The study also sought to understand teachers’ attitudes to the evaluation process. It is believed that the findings of this study will instigate further research focusing on the issues and questions which it identified.

Data collection method

Recognising that effective or good teaching is contextual, the author heard of concerns among English language teaching (ELT) colleagues when SET surveys began to be administered at the end of each semester. In particular, some students did not want to focus their study on ‘communicative’ English, and expressed this in their first evaluation. For ELT faculty, students often display what McVeigh (2002) has described as an apathetic attitude, which manifests itself in a loss of interest once they pass through the academic gates and into the English classroom. Does evaluation through a single data source – SET – represent the multidimensionality of teaching, or just a narrow dimension of ‘liking’ or ‘disliking’ English?

At the end of the Japanese academic year, an introductory email was sent to local members of a nationwide language teaching association (JALT), saying that the author sought teachers’ views of their experience of university-driven evaluation. An initial semi-structured interview of around an hour was arranged with teachers who expressed an interest and were willing to volunteer time. Verbatim transcriptions of the initial interviews were returned to each participant to encourage further reflection. Subsequent interviews were arranged, at which issues raised in the initial interviews were discussed.

The participants

To get a balanced picture of a cross-section of teachers holding different types of appointment, the following were sought for interviews:

• full-time tenured teachers

• limited- (or fixed-) term contracted teachers

• part-time local teachers.

The ages of the 22 participants ranged from early 30s to late 50s, and their teaching experience in the tertiary sector ranged from one year to close to 30 years. All the participants chose pseudonyms, which were adopted in this study. Although gender was less of a concern, a range of perspectives from both male and female teachers was sought to aid credibility (Rubin and Rubin, 2005). Perhaps reflecting the demography of the teaching profession, willing part-time local and native English-speaking teachers of both sexes were found, but not a single full-time contracted local teacher. While Japanese full-time teachers of English are tenured, most native English speakers are lower-status, limited-term, contracted teachers. The number of contracted and part-time English language teachers who participated in this study may reflect hiring trends at Japanese universities.

Data analysis

The interviews were audiotaped and the data from the initial interviews transcribed verbatim by the author. Lincoln and Guba’s ‘constant comparative method’ (1985: 341) and Rubin and Rubin’s ‘responsive interviewing’ (2005: 202) guided the analysis. Following transcription, the data were ‘unitised’ (Lincoln and Guba, 1985), meaning the text was analysed in terms of units of information that were the basis for defining categories. The ‘push forward’ technique (Kvale, 1996: 100) was used to aid data analysis, whereby the meanings of expressions used at the time of the interview were clarified to aid interpretation. ‘Evaluation’, ‘rating’, ‘assessment’ and ‘checking up’ were often used in the interviews as if they were near synonyms. For example, it was unclear what one of the participants meant by ‘checking up’, which she had used to describe how she felt when finding evaluation forms in her university mail box. She said:

S: … The first time I felt, ‘Well, this is an amazing way to check up on staff’.

P: Can you elaborate a little more on that? What do you mean by ‘check up’?

S: Oh, wow. By this I mean that the questionnaire is used by the administration to find out how well the teacher is performing in the classroom. I think that they are trying to find out if the teacher is punctual, is well organised, speaks clearly, or enthuses the students to study and do homework and keeps the students happy in class. I can see only one reason for the administration requiring this information. They want it so they can get rid of under-performing teachers. They could also use the information to promote excellent teachers. I don’t see any evidence either in the questions, or in the way the evaluations are administered, that indicates that the purpose is for a teacher’s own personal development.
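A schematic sketch of how units drawn from exchanges such as this might be grouped into categories during unitising is given below; the category labels are hypothetical, though the excerpts come from interviews quoted in this chapter.

```python
# Schematic illustration of 'unitising': each unit of information from the
# transcripts is compared with existing categories and either grouped with
# them or used to define a new category. Category labels are hypothetical.
units = [
    ("an amazing way to check up on staff", "surveillance"),
    ("get rid of under-performing teachers", "job insecurity"),
    ("the questions should come from me", "teacher voice"),
]

categories: dict[str, list[str]] = {}
for excerpt, category in units:
    categories.setdefault(category, []).append(excerpt)

for category, excerpts in categories.items():
    print(f"{category}: {excerpts}")
```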

Also slightly problematic in the early stages, as Wengraf (2001) acknowledges, were the power relations built into the interview: it was assumed that the referent (evaluation of teaching, or of teachers) was understood by participants from the outset, but there was a definite degree of unperceived discrepancy. To reduce feelings of vulnerability and inflated self-presentation in the early stages of an interview (Silverman, 2000), some of Rogan and de Kock’s (2005: 633) ‘conversation techniques’ were useful in indicating the author’s wish to learn more about the participants’ views. These included motivating the participant to provide information by emphasising the professional significance of taking part, and supporting the experience of the participants by sharing professional stories. The following section discusses the findings of this study, based on teachers’ perceptions.

Discussion of findings

What did teachers believe was the purpose of student evaluation?

In earlier studies in America, Ryan et al. (1980), Ory and Braskamp (1981), and Simpson and Siguaw (2000), for example, found that faculty were clear about how SET was used in personnel decisions. This was, however, not the case in the present study. One teacher, for example, made the point that:

It has never been made clear to me how the evaluations are used, who sees them, how the information is stored, who has access to them and for how long they are stored. The confidentiality and access issues are similarly important. It has never been made clear to me whether the focus of the evaluation is the teacher or the course.

Participants felt that, with no explanation from the administration, the ‘purpose’ was unclear, while lack of autonomy and freedom reduced classes to ‘lock-step methods’, instead of ‘enhancing student opportunities for learning’, which heightened tensions among teachers. For participants in the present study, uncertainty only added to anxiety among some teachers, who already saw themselves in vulnerable positions due to the tying of evaluation outcomes to job retention, salary and even the ability to stay in Japan. As one teacher noted:

There’s a lot of stress and you don’t know if your contract is going to be renewed. If they need to shed staff they’ll find a way to interpret these results so they can. That’s scary for everyone because there is no criteria as to how they’re used, how results are evaluated and interpreted. So for me, that’s the scariest thing.

If evaluation has an accountability purpose, then ‘the progress students make in their learning is as important to know from the perspective of accountability as the level of accomplishment reached’ (Schalock, 1998: 242), given the collective nature of university-based learning. Participants felt that student perceptions of their own learning or improvement should be measured, instead of feedback being given on observable and tangible elements. Another participant suggested that evaluation focuses on teaching processes through an isolated sample of performance rather than on the outcomes of teaching. There is a tendency to equate limited but important knowledge about one aspect of teaching with effective teaching in general. SET surveys measure only one aspect: how satisfied students are with the processes of teaching (Abrami et al., 1997).

Teachers who participated in this study generally felt that the link between teacher evaluation and actual course improvement was at best tenuous and that, with no explanation from the evaluating body, the purpose was unclear. While teachers initially suggested they did not know the purpose, they often qualified this by suggesting it was ‘assessment’ or ‘retention’, a way of ‘watching over’ teachers or of getting ‘a detailed view’ of them. Participants also voiced concerns over the quality and timing of the data, indicated an ongoing lack of clarity in Japanese universities, and felt that this form of evaluation was used for performance management purposes.

Teachers believed that ratings were primarily used for reasons removed from their teaching, and a lingering suspicion among participants was that results were just ‘stored in the office’ for some future time when a teacher became ‘politically unacceptable’. As evaluation has been introduced only recently, the evaluating body seems not to have formulated a clear, structured policy on its use, increasing teacher fears and cynicism over its purpose. This was suggested by a number of teachers who participated in this study.

Teachers’ experiences of the timing of evaluation

As Alderson (1992) noted, if evaluation is left to the end of a course, there is no opportunity to use it to inform and influence teaching, and it fails to be utilised in every aspect of the programme. The universities in this study had adopted the single semester system of around 15 weeks, where the evaluation is expected to be handed out either in the last week or towards the end of the semester. Some schools stipulated the timing, and at least one university asked the teachers to carry out the evaluation on the day when attendance was expected to be highest. In order to ensure consequential validity in summative evaluation, students need to realise that their opinions do matter. If some teachers inform students that a purpose of teaching evaluation is to determine salary, promotion, tenure, or retention issues, this will tend to produce more favourable ratings (Cashin, 1995), as students may rate in a more responsible manner as opposed to venting personal animosities. At present students are not made part of the process, and the timing of evaluation whereby teachers receive inadequate feedback, while students do not receive any information at all for their efforts, creates a situation where evaluation is reduced to a ‘consumer index rating done after the fact’ (Braskamp and Ory, 1994: 8). Student evaluation of teaching at the end of a course offers no chance for teachers to make changes while the students are still involved. One teacher observed that:

By the end of a semester it’s a chore. They’ve done so many. I’d say those students who’ve enjoyed the class write freehand comments and those that didn’t probably didn’t write anything. Or if they’re forced to choose 444 or 333. Just an average score and not go out on a limb. Lazy kids might throw the average and give a 3, but in the end that’s why everybody gets the same score for everything. Nobody is bothering to say they loved or hated the class.

Others observed their schools insisting that students fill out the same form for every instructor; students could find themselves completing the same form up to 14 times in a week. A lack of ‘benefit’ to student investment of time and thought (Dunegan and Hrivnak, 2003: 282) can lead to questionnaire inertia. The unclear purpose outlined above meant that teachers could have given out the forms at the end of the final class, which might suggest that ratings were an afterthought or something unimportant. As teachers felt pressured for class time and allowed just five minutes or so for completion, student input would have been cursory at best. Teachers also often could not explain the evaluation rationale to students, which may have led to perfunctory administration on the day.
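The ‘444 or 333’ pattern the teacher describes, where a student gives the same non-committal score to every item, is detectable in raw response data. A minimal, hypothetical sketch of such a check:

```python
# Flag 'straight-lined' forms: every item given the identical score,
# the '444 or 333' pattern symptomatic of questionnaire inertia.
forms = [
    [3, 3, 3, 3, 3],  # uniform midpoint: likely inertia
    [4, 4, 4, 4, 4],  # uniformly mild: 'not going out on a limb'
    [5, 2, 4, 1, 5],  # varied: a genuinely discriminating response
]

def is_straight_lined(scores: list[int]) -> bool:
    """True if every item on the form received the identical score."""
    return len(set(scores)) == 1

for i, scores in enumerate(forms, start=1):
    label = "straight-lined" if is_straight_lined(scores) else "varied"
    print(f"form {i}: {scores} -> {label}")
```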

Teacher voice in evaluation

Participants felt that evaluators and teachers did not have shared understandings of SET, and that teachers were excluded from the debates and critique on how teaching should be improved. One teacher noted:

If it was a fair evaluation, I would be involved in it. If it’s to influence my teaching and my syllabus design and what I do in the class then the questions should come from me. If they come from me, I’m more likely to take notice of them and make changes.

As evaluation does not come from the individual, there is little sense of responsibility for the continuing improvement and refinement of work, and a concomitant lack of autonomy in teacher performance. Participants found the link between teacher ‘evaluation’ and actual course ‘improvement’ at best tenuous; with no explanation from the administration, the ‘purpose’ becomes unclear, leading to an unclear ‘destiny’, which heightens tension while reducing autonomy and freedom (Braskamp et al., 1984). The lack of feedback data led participants to question the purpose of the evaluation. As one said: ‘The purposes of the student evaluation do not seem to be very clear. This fuels misgivings within the profession, but we have little knowledge about the destiny which shapes our ends in this case.’

Many teachers said that they had no choice in administering evaluation, were not consulted in the design of the questions, and did not understand the questions. Concern over data use implied that students were empowered to influence teachers’ careers. Teachers were wary of the ranking of teachers in league tables and felt that it emphasised ‘winning and losing’. This threat led teachers to believe they were evaluated by an inaccurate mechanism, and might at some point lead them to rationalise manipulating the data or ‘beating the system’, especially if they felt that students attached more weight to activities occurring near the time of evaluation – a form of rating error known as the ‘recency effect’ (Dickey and Pearson, 2005). The use of such negative practices, as suggested by a number of participants, was attributed to pressure from the summative use of feedback (Ryan et al., 1980; Yao et al., 2003).

Evaluation has contributed to a competitive environment in which teachers rarely gain feedback that allows them to reflect on their teaching and discuss it with their colleagues. Evaluation concentrates on outcomes and so leads to a decline in cooperative thought, and can discourage or destroy teamwork within and between departments. One teacher referred to the ‘personal growth’ she gets from talking to other teachers in her own time, and how she learns from other teachers. She did not: ‘look to evaluation for help. I see this as an administration thing that’s part of my job. I work for the university and that’s what they want me to do.’ Others talked of their ‘practical knowledge’, such as one teacher who said that teachers: ‘have to watch what other teachers do and listen to the students and if you want to know whether a teacher is effective or not you need to know a lot more than the answers to a few questions’.

Another supported this idea by suggesting that teachers need time and opportunity to ‘bat ideas back and forth’. Experience is seen as essential, especially ‘practical experience which is why input from other teachers is very important and observation is very important although I tremble at the thought of it’. Those questioning evaluation were not belligerent, but believed that the improvement implicit in evaluation should encourage dialogical relations enabling all participants, whether teachers, students, parents or administrators, to work together to understand learning and teaching (Gitlin and Smyth, 1989).

In Japan, student surveys mainly promote competition among teachers, which cuts them off from dialogue crucial to teaching and reinforces views that the administrative hierarchy knows more about the worth of teachers than teachers do themselves. The emphasis is on the ‘serfs’ (Scriven, 1981: 245) being evaluated by those in the ‘castle’ who are above such things themselves.

This lack of ownership is compounded by teachers receiving average or above-average scores: the absence, or shallowness, of feedback, coupled with homogeneous rating scores, leads many teachers to believe the scores come from questions which are poor or inappropriate to their teaching situation. This can lead to ambivalence about ‘scores’, with one participant remarking: ‘so far my scores have been fairly good so if there is any accountability judgment based on the scores, I’m not particularly worried’. Another recalled how relieved she was to get above-average scores, which for her indicated there was nothing she needed to do.
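The homogeneity described here can be made concrete: when every teacher’s mean sits in a narrow band above the scale midpoint, the ratings discriminate very little and carry almost no formative signal. A sketch with hypothetical data:

```python
from statistics import mean, stdev

# Hypothetical per-teacher mean ratings on the 1-5 scale, clustered in a
# narrow band above the midpoint, as participants reported.
teacher_means = [3.8, 3.9, 4.0, 3.7, 3.9, 4.1, 3.8]

print(f"grand mean = {mean(teacher_means):.2f} on a 1-5 scale")
print(f"spread (sd) = {stdev(teacher_means):.2f}")
# With a spread this small, 'above average' tells a teacher almost nothing
# about what, if anything, to change.
```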

Feedback for improvement

At the time of the interviews, around six weeks into the second semester, some participants still had not received feedback from the first semester, so the evaluation lacked the utility to inform practice. Teachers questioned the lack of a transparent purpose for students, who were consequently not inclined to complete evaluations in a meaningful way, while constant repetition of the same form, without any reward for students, almost guaranteed diminished input.

Teachers reported, however, that students revealed their frustration at having to learn English. Destructive criticism lacks precision and consideration and can lead to ‘reactance’ (Taut and Brauns, 2003: 252): teachers’ built-up anger, tension, resistance, avoidance and conflict. Native English-speaking teachers who were not fluent in Japanese believed that comments in English were addressed to them, while information for the ‘office’ was in Japanese, leading to a lack of feedback from potentially useful open comments. Teachers also suggested that evaluation questions were redundant or irrelevant to everyday practice, as participants could learn more from the daily personal interactions with students, which they used to improve their teaching.

An overall message coming from the interviews is that using SET as the sole criterion for evaluating teachers is flawed. As one participant said:

Students should be given every opportunity to give feedback to teachers about their teaching. If they cannot, then the teacher is missing a vital perspective on the effectiveness of lessons taught. However, this should be balanced with the views of one’s colleagues. I feel that if the survey were balanced with some form of peer review, such as classroom observation and feedback, then it would be a more valuable exercise. Evaluation only by one’s students seems a dangerous path for education and educators and worrying for the future development of Japanese education.

Teachers, and often administrators, were uncertain of the purpose of the evaluation, which was not explained. They were often just expected to administer it without any consultation or input into the questions. Even if the evaluation were intended for formative development, many teachers did not gain any new knowledge as they questioned the value of the source of information, students’ ability to evaluate, and the ability of administrators to recognise ‘good’ teaching. The lack of dialogue may militate against good relations among teachers. Participants believed evaluation was imposed on teachers to diminish their sense of responsibility and autonomy within their profession.

Implications: how can evaluation be improved?

If teachers disagree with the aims of evaluation as imposed by administrators, they are likely to ignore or frustrate any recommendations resulting from it. D’Apollonia and Abrami (1997: 1205) concluded that student ratings should be used to make only ‘crude judgments’ and must not be ‘over interpreted’, and recommended that ‘comprehensive systems of faculty evaluation be developed, of which student ratings are only one, albeit important component’. One key principle is encouraging a balanced relationship between school goals and individual teachers’ professional growth and improvement (Stronge, 2006), so that evaluation is welcomed when teachers:

(a) accept that changes may occur and are needed

(b) are willing to risk failure when attempting to make changes, in spite of the current conditions being perceived as more rewarding or predictable

(c) accept the evaluation findings as guidelines for decision-making, even if they might contradict existing values and beliefs.

So how can teachers feel good about evaluation? One way, as Taut and Brauns (2003: 250) note, is to recognise that the greater the difference between the teachers’ and the evaluating body’s concepts of social reality, the greater teachers’ resistance to the results will be. Unless teachers believe that the information they receive has value and that its source is to be respected, they may simply dismiss it (Centra, 1993). There must be ‘fitness of purpose’, in that evaluation is carried out respecting the perceptions of teachers and students while enriching and supporting principles of equity, autonomy and diversity.

Evaluation must recognise the contextualised nature of learning. Stakeholders must have an understanding of how culture, experience and receptivity influence students’ evaluation. Evaluation must, therefore, reflect the multidimensional problems and possibilities posed by individual learners (Darling-Hammond and Snyder, 2000). Improvement should be seen in terms of ‘teacher growth’, which can be ‘inhibited as a result of evaluation that is overly threatening, poorly conducted, or inadequately communicated’ (Duke and Stiggins, 1990: 119). Facilitating growth includes an acknowledgement of the credibility of the evaluator as a source of feedback, recognition of the quality of the ideas contained in the feedback, persuasiveness in the evaluator’s rationale for improvement and, most importantly, usefulness in the suggestions for improvement.

An important element of the link between knowledge and behaviour is a sense of efficacy. Efficacy requires a responsive environment that allows for the rewards of performance attainment. Individuals must value goals, and goals must challenge individuals, or task performance will be devalued (Wise et al., 1985). Increased efficacy will result from convergence between teachers and administrators in accepting the goals and means for evaluating performance. To this end, there must be higher levels of personalised interaction between teachers and the evaluation administration body, while expectancy models of motivation must be recognised if students are to participate actively in evaluation (Chen and Hoshower, 2003).

There has been little examination of why institutions are evaluating, beyond the prescription that evaluation must be carried out, and little focus on all the stakeholders. Participants in this study highlighted the need for more teacher involvement, and for more dialogue between teachers to discuss the results, to aid the reflective process for change and to remove competitiveness. For SET to become an integrated component of reform, there must be a dynamic relationship between individual and institutional needs (Stronge and Tucker, 1999). Also, to ensure consequential validity in summative evaluation, students need to realise that their opinions do matter: as noted above, when teachers inform students that a purpose of teaching evaluation is to determine salary, promotion and tenure, students tend to rate in a more responsible manner, producing more favourable ratings, because they are made part of the process. It should also be made clear how important the students’ opinions are, and how their opinions affect non-tenured teachers and elective classes. In the current climate, teachers receive inadequate feedback and students do not receive any information at all for their efforts, creating a situation where evaluation is reduced to a ‘consumer index rating done after the fact’ (Braskamp and Ory, 1994: 8), as summative, end-of-course evaluation offers no chance for teachers to make changes while the students are still involved.

Feedback should be fast, detailed and made public, while discussion between stakeholders would raise awareness of both teaching and learning styles. The belief that student ratings are the sole basis for judgments is widespread in Japan, contradicting the recommended use of multiple sources. As evaluation should encourage change in performance, peer review would enable teachers to learn from each other, while self-evaluation would encourage deeper reflection without ‘condemning’ teachers, as suggested by one participant.

Many participants felt tension as they were unable to explain adequately the evaluation rationale to their students, which then may have influenced their SET scores. Participants also suggested that mid-semester evaluation should be introduced, which assumes a formative purpose of evaluation for teaching improvement during the lifetime of the course. Teachers believed that there needs to be a change in the school ethos to a constructive climate where opinions are freely exchanged without threat or competitiveness. According to many interviewed teachers, one-shot, end-of-semester ratings devalue the process and demean the students’ input. Using the same SET for all courses ‘guarantees it will be unfair for everyone’ (Emery et al., 2003: 44) and, instead, evaluation methods should reflect different educational goals and celebrate diversity, while rejecting the view that there is only one way to teach. This is particularly true of ‘outcomes’-based evaluation which a number of participants proposed as an important way forward. In Japan, there is a ‘truncated view’ of learning (Giroux, 1987: 45). Instead of certainty and control where knowledge is consumed, asking ‘What is good learning?’ is perhaps the crucial question as it cannot be assumed that ‘good teaching’ necessarily produces ‘good learning’.

To this end, participants pointed out that SET utilising Likert scales or similar is only one way to capture learner beliefs; a fuller picture needs a wider set of evaluation procedures, drawing a distinction between prescriptive, acontextual, summative evaluation and collaborative approaches that show richness and diversity, while giving learners as well as faculty more voice. Participants further questioned whether the variables found on ratings were included because they were important in ‘effective teaching’, or simply because they happened to be observable and therefore measurable. As evaluation in Japan is not accompanied by other information that would allow users to make sound decisions, this has led to a trivialisation of teaching, as teachers are evaluated on aspects which do not relate to teaching.

At a time when less homogeneously skilled students with diverse attitudes to study are entering tertiary education in Japan, SET has been introduced as a way of gauging student views of their learning. Underpinning this is a view of students as consumers, and a belief that teaching with low market value will lose attractiveness in the marketplace. Universities in Japan need to look at more comprehensive, institution-wide evaluation, instead of evaluating one part of their activities through SET. To conclude, the punitive nature of evaluation means that only minimum standards are encouraged, while good or excellent results are seldom commented on or taken into account. This creates ratings that are not diagnostic but have a negative, remedial purpose. Administrators have not outlined how improvements should be implemented, nor addressed conceptions of excellence in teaching. To improve instruction, the evaluation device should identify particular areas of difficulty, but initiatives at best point to broad areas of concern, such as faculty/student interaction, without suggesting any cause or diagnosis of perceived weaknesses.

References

Abrami, P., d’Apollonia, S., Rosenfield, S. The Dimensionality of Student Ratings of Instruction: What We Know and What We Do Not. In: Perry R., Smart J., eds. Effective Teaching in Higher Education: Research and Practice. New York: Agathon; 1997:321–365.

Alderson, J. Guidelines for the Evaluation of Language Education. In: Alderson J., Beretta A., eds. Evaluating Second Language Education. Cambridge: Cambridge University Press; 1992:274–304.

Arimoto, A. Market and Higher Education in Japan. Higher Education Policy. 1997; 10(3):199–210.

Braskamp, L., Brandenburg, D., Ory, J. Evaluating Teaching Effectiveness: a Practical Guide. Thousand Oaks: Sage; 1984.

Braskamp, L., Ory, J. Assessing Faculty Effectiveness. San Francisco: Jossey Bass; 1994.

Cashin, W. Student Ratings of Teaching: The Data Revisited. IDEA Paper No. 32. Manhattan, KS: Kansas State University, Center for Faculty Evaluation and Development; 1995:1–9.

Centra, J. Reflective Faculty Evaluation: Enhancing Teaching and Determining Faculty Effectiveness. San Francisco: Jossey-Bass; 1993.

Chen, Y., Hoshower, L. Student Evaluation of Teaching Effectiveness: an Assessment of Student Perception and Motivation. Assessment and Evaluation in Higher Education. 2003; 28(1):71–89.

d’Apollonia, S., Abrami, P. Navigating Student Ratings of Instruction. American Psychologist. 1997; 52(11):1198–1208.

Darling-Hammond, L., Snyder, J. Authentic Assessment of Teaching in Context. Teaching and Teacher Education. 2000; 16:523–545.

Dickey, D., Pearson, C. Recency Effect in College Student Course Evaluations. Practical Assessment, Research and Evaluation. 2005; 10(6). Available from: http://pareonline.net/getvn.asp?v=10&n=6.

Duke, D., Stiggins, R. Beyond Minimal Competence: Evaluation for Professional Development. In: Millman J., Darling-Hammond L., eds. The New Handbook of Teacher Evaluation. Newbury Park: Corwin Publications; 1990:241–256.

Dunegan, K., Hrivnak, M. Characteristics of Mindless Teaching Evaluations and the Moderating Effects of Image Compatibility. Journal of Management Education. 2003; 27(3):280–303.

Emery, C., Kramer, T., Tian, R. Returning to Academic Standards: a Critique of Student Evaluations of Teaching Effectiveness. Quality Assurance in Education. 2003; 11(1):37–46.

Giroux, H. Theory and Resistance in Education: a Pedagogy for the Opposition. South Hadley, MA: Bergin and Garvey; 1987.

Gitlin, A., Smyth, J. Teacher Evaluation: Educative Alternatives. Lewes: The Falmer Press; 1989.

Hooghart, A. Educational Reform in Japan and its Influence on Teachers’ Work. International Journal of Educational Research. 2006; 45:290–301.

Kitamura, K. Policy Issues in Japanese Higher Education. Higher Education. 1997; 34:141–150.

Kvale, S. InterViews. Thousand Oaks: Sage; 1996.

Leckey, J., Neill, N. Quantifying Quality: the Importance of Student Feedback. Quality in Higher Education. 2001; 7(1):19–33.

Lincoln, Y., Guba, E. Naturalistic Inquiry. Newbury Park: Sage; 1985.

McVeigh, B. Japanese Higher Education as Myth. New York: M. E. Sharpe; 2002.

MEXT. Educational Reform Plan for the 21st Century: The Rainbow Plan. 2001. Available from: www.mext.go.jp/english/topics/21plan/010301.htm [accessed 2 March 2005].

MEXT. FY2003 White Paper on Education, Culture, Sports, Science and Technology. 2004. Available from: http://www.mext.go.jp/english/news/2004/05/04052401.htm [accessed 2 March 2005].

Motani, Y. Hopes and Challenges for Progressive Educators in Japan: Assessment of the “Progressive Turn” in the 2002 Educational Reform. Comparative Education. 2005; 41(3):309–327.

Norris, J. The Why (and How) of Assessing Student Learning Outcomes in College Foreign Language Programs. The Modern Language Journal. 2006; 90:576–583.

Ory, J., Braskamp, L. Faculty Perceptions of the Quality and Usefulness of Three Types of Faculty Information. Research in Higher Education. 1981; 15(3):271–282.

Rogan, A., de Kock, D. Chronicles from the Classroom: Making Sense of the Methodology and Methods of Narrative Analysis. Qualitative Inquiry. 2005; 11(4):628–649.

Rubin, H., Rubin, I. Qualitative Interviewing: The Art of Hearing Data. Thousand Oaks: Sage; 2005.

Ryan, J., Anderson, J., Birchler, A. Student Evaluation: the Faculty Responds. Research in Higher Education. 1980; 12(4):317–333.

Schalock, H. Student Progress in Learning: Teacher Responsibility, Accountability, and Reality. Journal of Personnel Evaluation in Education. 1998; 12(3):237–246.

Scriven, M. Summative Teacher Evaluation. In: Millman J., ed. Handbook of Teacher Evaluation. Beverly Hills: Sage; 1981:244–271.

Silverman, D. Doing Qualitative Research. Thousand Oaks: Sage; 2000.

Simpson, P., Siguaw, J. Student Evaluation of Teaching: an Exploratory Study of the Faculty Response. Journal of Marketing Education. 2000; 22(3):199–213.

Stronge, J. Teacher Evaluation and School Improvement. In: Stronge J., ed. Evaluating Teaching: a Guide to Current Thinking and Best Practice. Thousand Oaks: Corwin Press; 2006:1–23.

Stronge, J., Tucker, P. The Politics of Teacher Evaluation: a Case Study of New System Design and Implementation. Journal of Personnel Evaluation in Education. 1999; 13(4):339–359.

Taut, S., Brauns, D. Resistance to Evaluation: a Psychological Perspective. Evaluation. 2003; 9(3):247–264.

Tsurata, Y. Globalisation and Japanese Higher Education. In: Goodman R., Phillips D., eds. Can the Japanese Change their Education System?. Oxford: Symposium Books; 2003:119–151.

Wengraf, T. Qualitative Research Interviewing. Thousand Oaks: Sage; 2001.

Wise, A., Darling-Hammond, L., McLaughlin, M., Bernstein, H. Teacher Evaluation: a Study of Effective Practices. The Elementary School Journal. 1985; 86(1):61–120.

Yamada, R. University Reform in the Post-massification Era in Japan: Analysis of Government Education Policy for the 21st Century. Higher Education Policy. 2001; 14:277–291.

Yao, Y., Weissinger, E., Grady, M. Faculty Use of Student Evaluation Feedback. Practical Assessment, Research & Evaluation. 2003; 8(21). Available from: http://pareonline.net/getvn.asp?v=8&n=21.

Yonezawa, A. The New Quality Assurance System for Japanese Higher Education: Its Social Background, Tasks and Future. Research in University Evaluation. 2002; 2:23–33.
