[Paleopsych] PNAS: Cultural variation in eye movements during scene perception

Tue Aug 23 22:40:10 UTC 2005

Cultural variation in eye movements during scene perception

Hannah Faye Chua, Julie E. Boland, and Richard E. Nisbett* Department of 
Psychology, University of Michigan, 530 Church Street, Ann Arbor, MI 48109-1043 
Contributed by Richard E. Nisbett, July 20, 2005 *To whom correspondence should 
be addressed. E-mail: nisbett at umich.edu.

http://www.pnas.org/cgi/doi/10.1073/pnas.0506162102 Proceedings of the National 
Academy of Sciences August 30, 2005 vol. 102 no. 35 12629-12633

[This is a Big Mac psychology article, but an important one. Nisbett is a big 
AntiRacist and wrote the best response to the Rushton-Jensen article "Thirty 
Years of Research on Race Differences in Cognitive Ability," which appeared in 
the same issue of _Psychology, Public Policy, and Law_, the whole issue being 
unreported in both the mainstream and alternative press.

[In fact, "racial differences" could have replaced "cultural differences" 
throughout the paper! The study found that East Asians look at the background 
in pictures more than Americans and claimed their "findings provide clear 
evidence that cultural differences in eye-movement patterns mirror and probably 
underlie the cultural differences in judgment and memory tasks." Toward the end 
the authors added

["In the past decade, cultural differences in perceptual judgment and memory 
have been observed: Westerners attend more to focal objects, whereas East 
Asians attend more to contextual information. However, the underlying 
mechanisms for the apparent differences in cognitive processing styles have not 
been known. In the present study, we examined the possibility that the cultural 
differences arise from culturally different viewing patterns when confronted 
with a naturalistic scene. We measured the eye movements of American and 
Chinese participants while they viewed photographs with a focal object on a 
complex background. In fact, the Americans fixated more on focal objects than did the 
Chinese, and the Americans tended to look at the focal object more quickly. In 
addition, the Chinese made more saccades to the background than did the 
Americans. Thus, it appears that differences in judgment and memory may have 
their origins in differences in what is actually attended as people view a 
scene."

[I see here a mixing of psychological and social layers. Once the idea of 
gene-culture co-evolution becomes acceptable (that is, after the battle between 
Big Med and Big Ed resolves in Big Ed's favor and Big Ed discovers that as big 
a cash cow can come with "race-based education" as with "race-based medicine"), 
we'll get something like this:

[For some reason or another, the *physical* environment in East Asia selected 
those whose visual systems focused upon the background more than the physical 
environment in Europe did. Weather patterns, reflectivity in snow or in the 
atmosphere, something like that. A byproduct of this difference in perceptual 
*psychology* was a psychology of greater attentiveness to holistic phenomena 
in other aspects of the environment, including the social environment (other 
people). This byproduct had an impact on social organization as well, since 
those being attuned to other people are more likely to think in collectivist 
terms.

[The authors, to the contrary, assert that more collectivist societies somehow 
affect visual processing, a nice AntiRacist claim but an ad hoc one.

[I'm not sure the authors were really thinking about the co-evolution question, 
though. And they might have presented data about Chinese-Americans instead of 
just Chinese Chinese and White Americans. Nisbett did give data on Japanese 
Americans iirc in his excellent (but also AntiRacist) book, _The Geography of 
Thought_, cited at the end. If Chinese Americans performed exactly the same as 
Chinese Chinese in these experiments, we'd have something that is likely to be 
mostly the result of racial differences. But social organization can affect 
individual psychology (cultural anthropologists cite instances of this all the 
time), and it would have been fascinating to have had this extra information.

[But it would be safer not to have included Chinese Americans, as the SSSM 
collapses, which collapse I reported on earlier. AntiRacists are getting more 
and more like Creationists every day. They needn't, for the race issue is no 
longer that of superiority and inferiority. It is that of pluralism and whether 
differences among the world's cultures are deep enough to put a brake on the 
American democratic capitalism juggernaut.

[Invoking "culture" as an all-purpose explanation of everything is 
spiritualism, really, for such invocations brush aside any material substrate 
upon which culture can act! This is worse than Creationism.

[Thanks to Peter for passing on the reference to this article. I can supply the 
PDF if you want to see the graphics.

------------

Summary:

In the past decade, cultural differences in perceptual judgment and memory have 
been observed: Westerners attend more to focal objects, whereas East Asians 
attend more to contextual information. However, the underlying mechanisms for 
the apparent differences in cognitive processing styles have not been known. In 
the present study, we examined the possibility that the cultural differences 
arise from culturally different viewing patterns when confronted with a 
naturalistic scene. We measured the eye movements of American and Chinese 
participants while they viewed photographs with a focal object on a complex 
background. In fact, the Americans fixated more on focal objects than did the 
Chinese, and the Americans tended to look at the focal object more quickly. In 
addition, the Chinese made more saccades to the background than did the 
Americans. Thus, it appears that differences in judgment and memory may have 
their origins in differences in what is actually attended as people view a scene. 

A growing literature suggests that people from different cultures have 
differing cognitive processing styles (1, 2). Westerners, in particular North 
Americans, tend to be more analytic than East Asians. That is, North Americans 
attend to focal objects more than do East Asians, analyzing their attributes 
and assigning them to categories. In contrast, East Asians have been held to be 
more holistic than Westerners and are more likely to attend to contextual 
information and make judgments based on relationships and similarities.

---------------------------

Causal attributions for events reflect these differences in analytic vs. 
holistic thought. For example, Westerners tend to explain events in terms that 
refer primarily or entirely to salient objects (including people) whereas East 
Asians are more inclined to explain events in terms of contextual factors (3-5). 
There also are differences in performance on perceptual judgment and memory 
tasks (6-8). For example, Masuda and Nisbett (6) asked participants to report 
what they saw in underwater scenes. Americans emphasized focal objects, that 
is, large, brightly colored, rapidly moving objects. Japanese reported 60% more 
information about the background (e.g., rocks, color of water, small nonmoving 
objects) than did Americans. After viewing scenes containing a single animal 
against a realistic background, Japanese and American participants were asked to 
make old/new recognition judgments for animals in a new series of pictures. 
Sometimes the focal animal was shown against the original background; other 
times the focal animal was shown against a new background. Japanese and 
Americans were equally accurate in detecting the focal animal when it was 
presented in its original background. However, Americans were more accurate 
than East Asians when the animal was displayed against a new background. A 
plausible interpretation is that, compared with Americans, the Japanese encoded 
the scenes more holistically, binding information about the objects with the 
backgrounds, so that the unfamiliar new background adversely affected the 
retrieval of the familiar animal.

The difference in attending to objects vs. context also was shown in a perceptual 
judgment task, the Rod and Frame test (7). American and Chinese participants 
looked down a long box. At the end of the box was a rod whose orientation could be 
changed and a frame around the rod that could be moved independently of the rod. 
The participants' task was to judge when the rod was vertical. Chinese 
participants' judgments of verticality were more dependent on the context, in 
that their judgments were more influenced by the position of the frame than 
were those of American participants. In a change blindness study, Masuda and 
Nisbett asked American and Japanese participants to view a sequence of still 
photos and also to view animated vignettes of complex visual scenes 
(unpublished data). Changes in focal object information (e.g., color and shape of 
foregrounded objects) and contextual information (e.g., location of background 
details) were introduced during the sequence of presentations. Overall, the 
Japanese reported more changes in the contextual details than did the 
Americans, whereas the Americans reported more changes in the focal objects 
than did the Japanese. This finding has at least two possible explanations (see 
ref. 9). On one account, the Asian participants had more detailed mental 
representations of the backgrounds, whereas the Westerners had more detailed 
representations of the focal objects. On the other account, the mental 
representations did not differ with culture, but the two groups differed in 
their accuracy for detecting deviation between their mental representation of 
the background/focal object and the current stimulus.

Clearly, there were systematic differences between the Americans’ and the East 
Asians’ performance in the causal perception, memory, and judgment studies. 
However, it is unclear whether the effects occur at the level of encoding, 
retrieval, mental comparison, or differences in reporting bias. To identify the 
stages in perceptual-cognitive processing at which the cultural differences 
might arise, consider what is known about scene perception: (i) Within 100 ms of 
first viewing a scene, people can often encode the gist of the scene, e.g., 
"picnic" or "building" (10). (ii) People then construct a mental model of the 
scene in working memory (11). The mental representation is not an exact 
rendering of the original scene and is usually incomplete in detail 
(12-13). (iii) Although the initial eye fixation may not be related to the 
configuration of the scene, the following fixations are to the most informative 
regions of the scene for the task at hand (14). The fixation positions are 
important because foveated regions are likely to be encoded in greater detail 
than peripheral regions (15). (iv) The mental representation of the scene is 
then transferred to and consolidated in long-term memory. (v) Successful 
retrieval from long-term memory relies on appropriate retrieval cues. (vi) 
During retrieval, the recalled information may be filtered by 
experimental demands and cultural expectations. Past studies (3-8) have failed 
to establish whether the effects are due to differences in perception, 
encoding, consolidation, recall, comparison judgments, or reporting bias.

To address this issue, we monitored eye movements of the American and the 
Chinese participants while they viewed scenes containing objects on relatively 
complex backgrounds. We chose this measure because eye fixations reflect the 
allocation of attention in a fairly direct manner. Moreover, we have relatively 
little awareness of how our eyes move under normal viewing conditions. If 
differences in culture influence how participants actually view and encode the 
scenes, there will be differences in the pattern of saccades and fixations in 
the eye movements of the members of the two cultures. [Saccades are rapid, 
ballistic eye movements that shift gaze from one fixation to another (15).] In 
particular, we would expect Americans to spend more time looking at the focal 
objects and less time looking at the context than the Chinese participants. 
Furthermore, if the Chinese participants perceive the picture more holistically 
and bind contextual features with features of the focal object, they might make 
more total saccades when surveying the scene than the Americans. On the other 
hand, if no eye movement differences emerge between the two cultures, then 
previous findings of memory and judgment differences are likely due to what 
happens at later stages, e.g. during memory retrieval or during reporting.

Fig. 1. (omitted) Sample pictures presented in the study. Thirty-six pictures 
with a single foregrounded object (animals or nonliving entities) on realistic 
backgrounds were presented to participants.

Methods

Participants.

Twenty-five European American graduate students (10 males, 15 females) and 27 
international Chinese graduate students (14 males, 12 females, 1 data missing) at 
the University of Michigan participated in the study. The mean ages of 
Americans and Chinese were 24.3 and 25.4 years, respectively. All of the 
Chinese participants were born in China and had completed their undergraduate 
degrees there. Participants from the two cultures were matched on age and 
graduate fields of study. Participants were graduate students from engineering, 
life sciences, business programs, and, in a few cases, from the social sciences. 
Recruitment e-mails were sent to a Chinese student organization as well as to 
different graduate academic departments. Volunteers were each paid $14.00 for 
their participation in the study.

Materials.

A collection of animals, nonliving things, and background scenes was obtained 
from the COREL image collection (Corel, Eden Prairie, MN), and a few were obtained 
from a previous study (6). The pictures were manipulated by using PHOTOSHOP 
software (Adobe Systems, San Jose, CA) to create 36 pictures of single, focal, 
foregrounded objects (animal or nonliving thing) with realistic complex 
backgrounds. The final set of pictures contained 20 foregrounded animals and 16 
foregrounded nonliving entities, e.g., cars, planes, and boats (see Fig. 1 for 
examples of the pictures shown). The set was composed mostly of culturally 
neutral photos, plus some Western and Asian objects and backgrounds. This set 
of 36 pictures was used in the study phase, during which the eye movement data 
were collected.

For the recognition-memory task, the original 36 objects and backgrounds 
together with 36 new objects and backgrounds were manipulated to create a set of 
72 pictures. Half of the original objects were presented with old backgrounds 
and the other half with new backgrounds. Similarly, half of the new objects 
were presented with old backgrounds and the other half with new backgrounds. 
This procedure resulted in four picture combinations: (i) 18 previously seen 
objects with original backgrounds, (ii) 18 previously seen objects with new 
backgrounds, (iii) 18 new objects with original backgrounds, and (iv) 18 new 
objects with new backgrounds. This set of 72 pictures was used in the 
object-recognition phase. All participants saw the same set and sequence of 
trials so that performance could be compared across participants.
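
The 2 x 2 crossing of object status and background status described above can be sketched in a few lines of Python. This is only an illustration of the design's arithmetic, not the authors' stimulus-preparation procedure; the field names (`object`, `background`, `item`, `correct_response`) are hypothetical labels chosen for the example.

```python
from itertools import product

N_PER_CELL = 18  # from the text: 18 pictures in each of the four combinations


def build_recognition_set(n_per_cell=N_PER_CELL):
    """Return one trial record per picture in the object x background design."""
    trials = []
    for obj_status, bg_status in product(("old", "new"), repeat=2):
        for i in range(n_per_cell):
            trials.append({
                "object": obj_status,        # old = seen in the study phase
                "background": bg_status,     # old = original background
                "item": i,                   # hypothetical within-cell index
                # the task asks about the object only, so the correct answer
                # ignores the background
                "correct_response": "seen" if obj_status == "old" else "new",
            })
    return trials


trials = build_recognition_set()
n_old_objects = sum(t["object"] == "old" for t in trials)
print(len(trials), n_old_objects)
```

The sketch yields 72 trials, of which 36 carry old objects and 36 carry lures, matching the counts given in the text.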

Procedure.

Study phase.

The participants sat on a chair and placed their chin on a chin rest to standardize 
the distance of the head from the computer monitor. The distance of the chin 
rest from the monitor was 52.8 cm. The size of the monitor was 37.4 cm.
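
As a rough guide to what this geometry implies, the visual angle subtended by the display can be estimated from the two numbers given. The text does not say whether 37.4 cm is the width or the diagonal of the monitor, so the sketch below simply treats it as a linear extent centered on the line of sight; that is an assumption, and the function name is hypothetical.

```python
import math


def visual_angle_deg(extent_cm, distance_cm):
    """Visual angle (degrees) of an extent centered on the line of sight."""
    return math.degrees(2 * math.atan(extent_cm / (2 * distance_cm)))


# 37.4 cm extent (assumed to be one linear dimension of the monitor)
# viewed from the 52.8 cm chin-rest distance reported in the text
angle = visual_angle_deg(37.4, 52.8)
print(round(angle, 1))
```

Under that assumption the display spans roughly 39 degrees of visual angle, so most of each picture falls outside the fovea at any one fixation.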

At the start of the session, participants wore a 120-Hz head-mounted 
eye-movement tracker (ISCAN, Burlington, MA), and eye-tracking calibration was 
established before the presentation of stimuli. After this calibration, 
participants were given instructions on the screen. They were informed that 
they would be viewing several pictures, one at a time. Before each picture was 
presented, a blank screen with a cross sign (+) was to appear. Participants were 
told to make sure that they looked at that cross sign. Once the picture 
appeared, they could freely move their eyes to look at the picture. For each of 
the pictures, participants verbally said a number between 1 and 7, indicating the 
degree to which they liked the picture (1, don't like at all; 4, neutral; 7, 
like very much).^ These instructions were followed by several screens showing 
a sample of how the task would proceed. Once ready, participants started the 
actual task of viewing the 36 pictures. Each picture was presented for 3 s. 
Afterward, participants engaged in several distracter tasks for about 10 min. 
Participants were moved to a different room and, for example, asked to do a 
backward-counting task, subtracting starting from 100 until they reached zero.

[^The Chinese participants gave higher liking ratings than did the Americans 
(Ms, 4.64 vs. 4.16; 0.005).]

Object-recognition phase.

Participants were brought back to the computer room to complete a 
recognition-memory task. Participants were told that they would be viewing 
pictures. Their task was to judge as fast as they could whether they had seen 
an object before, that is, whether they had seen the particular animal, car, 
train, boat, etc. in the pictures during the study phase. Participants pressed 
a key if they believed that they had seen the object before, and they pressed 
another key if they believed that it was new. If participants were unsure, they 
were told to make a guess. Participants then were shown a sample picture informing 
them which item in the picture was the object and that the rest of the visual 
scene was the background. Participants were informed that each picture would be 
shown only for a specified period. In the event that the picture had already left 
the screen, they could still input their response. Seventy-two pictures, 
including 36 original objects and 36 lure objects, were presented. The objects 
were presented with either an old or new background. Each picture was again 
presented for 3 s, and a fixation screen was presented between the picture 
presentations.

Fig. 2. (not shown) Mean accuracy rates from the object-recognition phase (22 
Americans and 24 Chinese). Data shown refer to correct recognition of old 
objects, when the old objects were presented in old backgrounds, compared with 
when old objects were presented in new backgrounds. Object refers to the single 
foregrounded animal or nonliving entity on the picture; background refers to 
the rest of the realistic, complex spatial area on the visual scene.

Demographic questionnaire and debriefing.

At the end of the study, participants engaged in an object-familiarity task. 
All 72 objects were presented against a white screen on a computer. 
Participants circled "yes" if they thought they had seen the object in real life 
or in pictorial information before coming to the study and "no" if they had 
not. This procedure was similar to that in a previous study (6). We repeated the 
analyses reported in this paper with familiarity as a covariate, and there were 
no changes in the statistical patterns. Participants also completed a demographic 
questionnaire asking information about their age, education, family history, 
and English language ability. Participants were debriefed and paid.

Data analysis.

Six participants had a hit rate of <0.5 on the object-recognition task, averaged 
across conditions. These participants' data were excluded in all statistical 
analyses. One additional European American had poor eye-tracking data. These 
exclusions resulted in data for 21 European American and 24 international 
Chinese participants being included in the eye-tracking analyses.
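
The sample sizes quoted in different places can be cross-checked against one another. The sketch below uses only numbers stated in the text and figure captions; the one assumption is that the error degrees of freedom in the reported F tests equal the number of participants minus the number of culture groups, which is the usual value for a two-group between-subjects factor but is not spelled out by the authors.

```python
# Sample sizes as stated in the Participants section and figure captions
recruited = {"American": 25, "Chinese": 27}
memory_sample = {"American": 22, "Chinese": 24}  # Fig. 2 caption
eye_sample = {"American": 21, "Chinese": 24}     # Fig. 3 caption

# Exclusions implied by those counts
excluded_low_hit = {g: recruited[g] - memory_sample[g] for g in recruited}
excluded_tracking = {g: memory_sample[g] - eye_sample[g] for g in recruited}


def error_df(sample, n_groups=2):
    """Assumed error df: participants minus number of culture groups."""
    return sum(sample.values()) - n_groups


print(sum(excluded_low_hit.values()), excluded_tracking)
print(error_df(memory_sample), error_df(eye_sample))
```

Under that assumption the counts agree with the reported statistics: six low-hit-rate exclusions, one further American dropped for poor tracking data, 44 error degrees of freedom for the recognition analysis, and 43 for the eye-tracking analyses.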

Results

The results for the object-recognition task were consistent with previous 
findings (6) indicating that East Asians are less likely to correctly recognize 
old foregrounded objects when presented in new backgrounds [F(1,44)=5.72, 
P=0.02] (Fig. 2). Thus, we have additional evidence for relatively holistic 
perception by East Asians: they appear to "bind" object with background in 
perception.

The eye-movement patterns of American and Chinese participants differed in 
several ways. As summarized in Fig. 3, the American participants looked at the 
foregrounded object sooner and longer than the Chinese, whereas the Chinese 
looked more at the background than did the Americans, confirming our 
predictions. Overall, both groups fixated the background more than the objects 
(Fig. 3A), probably because the background occupied a greater area of the visual 
scene [F(1,43)=72.46, P=0.001]. The Chinese made more fixations during each 
picture presentation than the Americans [F(1,43)=4.43, P=0.05], but this was 
entirely due to the fact that the Chinese made more fixations on the background 
[F(1,43)=9.50, P=0.005]. The Americans looked at foregrounded objects 118 ms 
sooner than did the Chinese [t(43)=2.41, P=0.02] (Fig. 3B). Participants from 
both cultures had longer fixations on the objects than on the backgrounds (Fig. 
3C) [F(1,43)=17.27, P=0.001], but this was far more true for the Americans than 
for the Chinese [F(1,43)=5.97, P=0.02]. In short, the cultural difference in the 
memory study was reflected in the eye movements as well.^

[^Across both groups and for each participant group, we examined the 
correlation between six eye-movement variables and the object-memory index, 
i.e., the difference score between old object-old background memory and old 
object-new background memory. Of the 18 correlations, only 2 were marginally 
significant, and neither of these was readily interpretable.]
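
The object-memory index in this footnote is a simple difference score, and the 18 correlations follow from six eye-movement variables crossed with three groupings (both groups together, Americans, Chinese). A minimal sketch follows, assuming a plain Pearson correlation (the footnote does not name the coefficient used) and using made-up numbers purely for illustration; none of the values below are data from the study.

```python
import math

EYE_VARIABLES = 6  # "six eye-movement variables" in the footnote
GROUPINGS = ("both groups", "Americans", "Chinese")
N_CORRELATIONS = EYE_VARIABLES * len(GROUPINGS)


def memory_index(acc_old_background, acc_new_background):
    """Difference score: old object-old background minus old object-new background."""
    return acc_old_background - acc_new_background


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-participant values, invented for this example only
accuracy = [(0.80, 0.70), (0.75, 0.55), (0.90, 0.85), (0.70, 0.60)]
background_fixations = [9.0, 12.0, 8.0, 10.0]

indices = [memory_index(old_bg, new_bg) for old_bg, new_bg in accuracy]
r = pearson_r(background_fixations, indices)
print(N_CORRELATIONS, round(r, 2))
```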

The cultural difference in eye-movement patterns emerged very early. At the 
onset of the picture slide, 32-35% of the time both the Americans and the 
Chinese happened to be looking at the object, but the first saccade increased 
that percentage by 42.8% for the Americans and only by 26.7% for the Chinese 
[t(43)=2.46, P=0.02].

To better understand the time course of cultural differences, we examined the 
fixation patterns across the 3-s duration of picture presentations. Fig. 4 shows 
that whereas the Americans were most likely to be looking at the object for 
about 600 ms of the first second, the Chinese exhibited a very different 
eye-movement pattern. For the first 300-400 ms, no cultural differences were 
observed; at picture onset, both Americans and Chinese fixated the backgrounds 
more than the focal objects [F(1,43)=235.91, P=0.001]. By about 420 ms after 
picture onset, the Americans were equally likely to be looking at the 
background and the focal object. At this point, there was an interaction of 
culture and fixation region, with only the Chinese fixating the backgrounds 
more than the objects [F(1,43)=6.43, P=0.02]. Based on Fig. 4, the region during 
which the Americans attended preferentially to the object spanned 420-1,100 ms. 
Averaging the data across this interval, the Americans fixated the objects 
proportionately more than the backgrounds, whereas this was not at all true for 
the Chinese [F(1,43)=7.31, P=0.01]. There was no time point at which the Chinese 
were fixating the objects significantly more than the backgrounds during the 3-s 
presentation. Averaging the data from 1,100 to 3,000 ms, the Chinese looked 
more at the backgrounds than at the objects, whereas this was much less true 
for the Americans [F(1,43)=6.64, P=0.02]. Taken together with the summary data 
from Fig. 3, these findings provide clear evidence that cultural differences in 
eye-movement patterns mirror and probably underlie the cultural differences in 
judgment and memory tasks.

Discussion

The present findings demonstrate that eye movements can differ as a function of 
culture. Easterners and Westerners allocated attentional resources differently 
as they viewed the scenes. Apparently, Easterners and Westerners differ in 
attributing informativeness to foregrounded objects vs. backgrounds in the 
context of a generic "How much do you like this picture?" task. The Americans' 
propensity to fixate sooner and longer on the foregrounded objects suggests 
that they encoded more visual details for the objects than did the Chinese. If 
so, this could explain the Americans' more accurate recognition of the objects, 
even against a new background. The Chinese pattern of more balanced fixations 
to the foreground object and background is consistent with previous 
reports of holistic processing of visual scenes (6-8). Thus, previous findings 
of cultural differences in visual memory are likely due to how people from 
Eastern and Western cultures view scenes and are not solely due to cultural 
norms or expectations for reporting knowledge about scenes.

Fig. 3. Eye movement data. (A) Number of fixations to object or background by 
culture (21 Americans and 24 Chinese). Each picture was presented for 3 s. (B) 
Onset time to object by culture. Time was measured from onset of each picture 
to first fixation to object, comparing Americans and Chinese. (C) Average 
fixation times to object and background as a function of culture. All figures 
represent mean scores over 36 trials and SEM.
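
The three measures in Fig. 3 (fixation counts per region, onset time to the object, and average fixation time per region) can all be derived from a per-trial list of fixations. The sketch below assumes each fixation has already been labeled as falling on the object or the background; the `Fixation` record, its field names, and the example trial are hypothetical and are not the authors' data format or data.

```python
from collections import namedtuple

# Hypothetical record: times in ms from picture onset, region is
# "object" or "background"
Fixation = namedtuple("Fixation", "start_ms end_ms region")


def summarize_trial(fixations):
    """Return (fixation counts, onset time to object, mean fixation duration)."""
    counts = {"object": 0, "background": 0}
    durations = {"object": [], "background": []}
    onset_to_object = None
    for fix in sorted(fixations, key=lambda f: f.start_ms):
        counts[fix.region] += 1
        durations[fix.region].append(fix.end_ms - fix.start_ms)
        if fix.region == "object" and onset_to_object is None:
            onset_to_object = fix.start_ms  # first fixation landing on the object
    mean_duration = {
        region: (sum(d) / len(d) if d else None)
        for region, d in durations.items()
    }
    return counts, onset_to_object, mean_duration


# Invented example trial, for illustration only
trial = [
    Fixation(0, 250, "background"),
    Fixation(290, 700, "object"),
    Fixation(740, 1000, "background"),
    Fixation(1040, 1500, "object"),
]
counts, onset, mean_duration = summarize_trial(trial)
print(counts, onset, mean_duration)
```

Averaging these per-trial summaries over the 36 trials for each participant, and then over participants within each culture, would give values of the kind plotted in the three panels.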

Fig. 4. Proportion of fixations to object or background, across the 3-s time 
course of a trial. Data points are sampled every 10 ms for 0-1,500 ms, and 
every 50 ms for 1,500-3,000 ms, averaging over all 36 trials. The sum of 
percentages at each time point may not total 100% because, at times, 
participants were in the process of making a saccade, thus they were in between 
fixations. The graph illustrates distinct eye tracking patterns of Americans 
and Chinese during the 3-s period. Cultural differences begin by 420 ms after 
onset, when an interaction of culture and region was observed, with the 
Chinese, but not the Americans, continuing to fixate the background more than 
the focal object. Averaging the data from 420 to 1,100 ms, Americans were 
fixating focal objects at a greater proportion than backgrounds, compared with 
Chinese. Averaging the data from 1,100 to 3,000 ms, Chinese were fixating more 
often to the backgrounds and less to the objects, compared with Americans.
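
The sampling scheme in the Fig. 4 caption, and the reason the two proportions need not sum to 100%, can be illustrated with a short sketch. It assumes each trial is a list of `(start_ms, end_ms, region)` tuples; that layout, the function names, and the two example trials are hypothetical and only stand in for whatever the authors' analysis actually used.

```python
def sample_times():
    """Every 10 ms for 0-1,500 ms, then every 50 ms for 1,500-3,000 ms."""
    return list(range(0, 1500, 10)) + list(range(1500, 3001, 50))


def region_at(fixations, t_ms):
    """Region fixated at time t_ms, or None if the eye is between fixations."""
    for start_ms, end_ms, region in fixations:
        if start_ms <= t_ms < end_ms:
            return region
    return None  # mid-saccade (or no data) at this sample


def proportions_over_time(trials, times):
    """Map each sample time to the proportion of trials on object/background."""
    result = {}
    for t in times:
        regions = [region_at(trial, t) for trial in trials]
        n = len(regions)
        result[t] = {
            "object": regions.count("object") / n,
            "background": regions.count("background") / n,
        }
    return result


# Two invented trials, for illustration only
trials = [
    [(0, 250, "background"), (290, 700, "object")],
    [(0, 400, "background"), (430, 900, "object")],
]
curve = proportions_over_time(trials, sample_times())
print(curve[0], curve[270], curve[500])
```

In this toy example, at 270 ms one trial is mid-saccade, so the object and background proportions sum to 0.5 rather than 1, which is the effect the caption describes.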

Cultural differences in eye movements, memory for scenes, and perceptual and 
causal judgments could stem from several sources, including differences in 
experience, expertise, or socialization. It is common to consider such factors 
in high-level cognition, but because such factors can influence the allocation 
of attention, they influence lower level cognition as well. Our hypothesis is 
that differential attention to context and object is stressed through 
socialization practices, as demonstrated in studies on childrearing practices 
by East Asians and Americans (16, 17). The childrearing practices are, in turn, 
influenced by societal differences. East Asians live in relatively complex 
social networks with prescribed role relations (18, 19). Attention to context 
is, therefore, important for effective functioning. In contrast, Westerners 
live in less constraining social worlds that stress independence and allow them 
to pay less attention to context.

The present results provide a useful warning in a world where opportunities to meet 
people from other cultural backgrounds continue to increase: people from 
different cultures may allocate attention differently, even within a shared 
environment. The result is that we see different aspects of the world, in 
different ways.

We thank Chi-yue Chiu and Daniel Simons for their reviews of this paper and 
Meghan Carr Ahern, Chirag Patel, Jason Taylor, Holly Templeton, and Jeremy 
Phillips for their assistance in the study. This work was supported by the 
Culture and Cognition Program at the University of Michigan and National 
Science Foundation Grant 0132074.

1. Nisbett, R. E. Peng, K. Choi, I. Norenzayan, A. (2001) Psychol. Rev. 108, 
291-310.

2. Nisbett, R. E. Masuda, T. (2003) Proc. Natl. Acad. Sci. USA 100, 
11163-11170.

3. Choi, I. Nisbett, R. E. (1998) Pers. Soc. Psychol. Bull. 24, 949-960.

4. Morris, M.W. Peng, K. (1994) J. Pers. Soc. Psychol. 67, 949-971.

5. Chua, H. F. Leu, J. Nisbett, R. E. (2005) Pers. Soc. Psychol. Bull. 31, 
925-934.

6. Masuda, T. Nisbett, R. E. (2001) J. Pers. Soc. Psychol. 81, 922-934.

7. Ji, L. Peng, K. Nisbett, R. E. (2000) J. Pers. Soc. Psychol. 78, 943-955.

8. Kitayama, S. Duffy, S. Kawamura, T. Larsen, J. T. (2003) Psychol. Sci. 14, 
201-206.

9. Simons, D. J. Rensink, R. A. (2005) Trends Cognit. Sci. 9, 16-20.

10. Potter, M. C. (1976) J. Exp. Psychol. Hum. Learn. Mem. 2, 509-522.

11. Enns, J. T. (2004) The Thinking Eye, the Seeing Brain: Explorations in 
Visual Cognition (Norton, New York)

12. Intraub, H. (1997) Trends Cognit. Sci. 1, 217-222.

13. Potter, M. C. O'Connor, D. H. Oliva, A. (2002) J. Vision 2, 516.

14. Henderson, J. M. Hollingworth, A. (1999) Annu. Rev. Psychol. 50, 243-271.

15. Smith, E. E. Fredrickson, B. Loftus, G. Nolen-Hoeksema, S. (2002) Atkinson 
and Hilgard's Introduction to Psychology (Wadsworth, Belmont, CA) 14th Ed.

16. Fernald, A. Morikawa, H. (1993) Child Dev. 64, 637-656.

17. Tardif, T. Gelman, S. A. Xu, F. (1999) Child Dev. 70, 620-635.

18. Markus, H. R. Kitayama, S. (1991) Psychol. Rev. 98, 224-253.

19. Nisbett, R. E. (2003) The Geography of Thought: How Asians and Westerners 
Think Differently ... and Why (Free Press, New York)