
Actualidades en Psicología, 29(119), 2015, 131-139
Sireci, Sukin & Ong
Introduction
Diversity in the languages spoken by students
within and across countries has necessitated
adapting educational tests for use
across multiple languages (Hambleton, Merenda,
& Spielberger, 2005). International assessments
such as the Trends in International Mathematics
and Science Study (TIMSS; Mullis, Martin, &
Foy, 2008), Program for International Student
Assessment (PISA; Organization for Economic
Cooperation and Development (OECD), 2006),
Progress in International Reading Literacy Study (PIRLS;
Baer, Baldi, Ayotte, Green, & McGrath, 2007), and
the Program for the International Assessment of
Adult Competencies (PIAAC; Statistics Canada &
OECD, 2005) are examples of large-scale tests
that are administered in multiple languages so that
comparisons can be made across examinees who
function in different languages.
Within many countries, cross-lingual assessment
is also necessary. Adapted tests based on test
translation are used in Canada (Gierl & Khaliq,
2001), the United States (Sireci & Khaliq, 2002),
and many other countries.
Measurement of educational or psychological
constructs across languages typically involves
translation. The process of translating a test from
one language to another is known as adaptation
because the intent is to reproduce the meaning
and intent of each item in the target language,
as opposed to a literal word-by-word translation
(Hambleton, 2005). Although test adaptation
facilitates the assessment and comparison of
students who operate in different languages,
it cannot be assumed that the different language
versions of a test are equivalent with respect to
their psychometric properties. Adapting tests for use across
multiple languages may result in differences in
difficulty across the different language versions
of a test or in the different versions measuring
different constructs altogether (International Test
Commission, 2010; Sireci, 1997; Sireci, Rios, &
Powers, in press; van de Vijver & Poortinga, 2005).
The degree to which adapted versions of tests
are equivalent across languages is an important
issue in considering the validity of tests used
across different language groups. The Standards
for Educational and Psychological Testing
(American Educational Research Association
[AERA], American Psychological Association, &
National Council on Measurement in Education,
2014), the Guidelines for Translating and Adapting
Tests (International Test Commission, 2010), and
many researchers (e.g., Hambleton, 2005; Sireci,
2011; van de Vijver & Poortinga, 2005) argue that
empirical evidence must be put forward to support
the validity of inferences derived from cross-lingual
assessments, especially when comparisons of test
performance are made across different language
groups.
However, providing data to support the validity
of cross-lingual assessments is difficult because
one cannot assume that the items on the different
language versions of the test are equivalent, nor
can one assume that the different groups of
examinees are equivalent. Thus, there is nothing to anchor
a true comparison of test difficulty or construct
equivalence across languages (Sireci, 1997).
One way around this problem is to administer
different language versions of an assessment to a
sample of examinees who are proficient in both
languages (Sireci, 2005). Bilingual examinees may
represent a common group upon which comparisons
of tests and items can be made. In this paper, the
authors explore the utility of bilingual examinees
for evaluating the factorial invariance (i.e., structural
equivalence) of two different language versions of
a ninth-grade math test administered in Malaysia.
Both English and Malay versions of the test were
administered in counterbalanced order to English-
Malay bilingual students. Bilingual students have
been used to evaluate cross-lingual invariance
of survey items (Sireci & Berberoglu, 2000) and
to link educational tests across languages (Boldt,
1969; CTB, 1988). However, the use of bilinguals
for these purposes is rare, and there has been little