Abstract
There is a growing consensus among educational measurement experts and psychometricians that test-taker characteristics may unduly affect performance on tests. Such effects introduce construct-irrelevant variance into the scores and thus render the test biased. Hence, it is incumbent on test developers and users alike to provide evidence that their tests are free of such bias. The present study employed generalizability theory to examine gender differential performance on a high-stakes language proficiency test, the University of Tehran English Proficiency Test. An analysis of the performance of 2,343 examinees who took the test in 2009 indicated that the relative contributions of the different facets to score variance were almost uniform across the gender groups. Further, there was no significant interaction between items and persons, indicating that the relative standings of the persons were uniform across all items. The lambda reliability coefficients were also uniformly high. All in all, the study provides evidence that the test is free of gender bias and enjoys a high level of dependability.
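The analysis described above rests on a persons × items (p × i) generalizability study: variance components for persons, items, and their interaction are estimated from two-way ANOVA mean squares, and a generalizability coefficient is computed from them. The sketch below illustrates that standard computation on simulated dichotomous data; the function name, the simulated responses, and the sample sizes are illustrative assumptions, not the study's actual data or software (the study used EduG).

```python
import numpy as np

def g_study_p_x_i(scores):
    """Estimate variance components for a fully crossed persons x items
    (p x i) design from expected mean squares, and return the
    generalizability (relative) coefficient for an n_i-item test."""
    n_p, n_i = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    item_means = scores.mean(axis=0)
    # Sums of squares for persons, items, and the residual (p x i) term
    ss_p = n_i * ((person_means - grand) ** 2).sum()
    ss_i = n_p * ((item_means - grand) ** 2).sum()
    ss_total = ((scores - grand) ** 2).sum()
    ss_pi = ss_total - ss_p - ss_i
    # Mean squares
    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_pi = ss_pi / ((n_p - 1) * (n_i - 1))
    # Solve the expected-mean-square equations for the variance components,
    # truncating any negative estimates at zero
    var_pi = ms_pi
    var_p = max((ms_p - ms_pi) / n_i, 0.0)
    var_i = max((ms_i - ms_pi) / n_p, 0.0)
    # G coefficient for relative decisions: universe-score variance over
    # universe-score variance plus relative error variance
    g_coef = var_p / (var_p + var_pi / n_i)
    return var_p, var_i, var_pi, g_coef

# Simulated dichotomous responses: 200 persons x 40 items
rng = np.random.default_rng(0)
ability = rng.normal(0, 1, size=(200, 1))
difficulty = rng.normal(0, 1, size=(1, 40))
prob = 1 / (1 + np.exp(-(ability - difficulty)))
scores = (rng.random((200, 40)) < prob).astype(float)

var_p, var_i, var_pi, g_coef = g_study_p_x_i(scores)
print(f"persons: {var_p:.4f}  items: {var_i:.4f}  "
      f"p x i: {var_pi:.4f}  G: {g_coef:.3f}")
```

In a study of gender differential performance, this decomposition would be run separately for each gender group; comparable variance-component profiles across groups, as reported above, are evidence against bias.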
Notes
The EduG software, along with its manual, is freely available from http://www.irdp.ch/edumetrie/englishprogram.htm.
Cite this article
Karami, H. An investigation of the gender differential performance on a high-stakes language proficiency test in Iran. Asia Pacific Educ. Rev. 14, 435–444 (2013). https://doi.org/10.1007/s12564-013-9272-y