[Paleopsych] NS: Why your brain has a Jennifer Aniston cell
Premise Checker
checker at panix.com
Thu Jun 23 14:31:59 UTC 2005
Why your brain has a Jennifer Aniston cell
http://www.newscientist.com/article.ns?id=dn7567&print=true
[The articles from Nature are appended. I can supply the PDFs.]
* 19:00 22 June 2005
* Anna Gosline
Obsessed with reruns of the TV sitcom Friends? Well, then you probably
have at least one Jennifer Aniston cell in your brain, suggests
research on the activity patterns of single neurons in memory-linked
areas of the brain. The results lend new support to a decades-old,
long-dismissed theory tying single neurons to individual concepts, and
could help neuroscientists understand the elusive workings of human memory.
"For things that you see over and over again, your family, your
boyfriend, or celebrities, your brain wires up and fires very
specifically to them. These neurons are very, very specific, much more
than people think," says Christof Koch at the California Institute of
Technology in Pasadena, US, one of the researchers.
In the 1960s, neuroscientist Jerry Lettvin suggested that people have
neurons that respond to a single concept, such as their grandmother.
The notion of these hyper-specific neurons, dubbed "grandmother cells",
was quickly rejected by psychologists as laughably simplistic.
But Rodrigo Quiroga, at the University of Leicester, UK, who led the
new study, and his colleagues have found some very grandmother-like
cells. Previous unpublished findings from the team showed tantalising
results: a neuron that fired only in response to pictures of former US
president Bill Clinton, or another to images of the Beatles. But for
such grandmother cells to exist, they must invariably respond to the
concept of Bill Clinton, not just similar pictures.
Wired up, fired up
To investigate further, the team turned to eight patients currently
undergoing treatment for epilepsy. In an attempt to locate the brain
areas responsible for their seizures, each patient had around 100 tiny
electrodes implanted in their brain. Many of the wires were placed in
the hippocampus - an area of the brain vital to long-term memory
formation.
They first gave each subject a screening test, showing them between 71
and 114 images of famous people, places, and even food items. For each
subject, the researchers measured the electrical activity or firing of
the neurons connected to the electrodes. Of the 993 neurons sampled,
132 fired to at least one image.
The team then went back for a testing phase, this time showing
participants three to seven different pictures of each of the people,
places and objects whose initial photos had elicited responses. For
example, one woman saw seven different photos of Jennifer Aniston
alongside 80 other photos of animals, buildings or additional famous
people such as Julia Roberts. The neuron all but ignored the other
photos, but fired steadily each time Aniston appeared on screen.
Conceptual connections
The team found similar results with another woman, who had a neuron for
pictures of Halle Berry, including a drawing of her face and an image
of just the words of her name. "This neuron is responding to the
concept, the abstract entity, of Halle Berry," says Quiroga. "If you
show a line drawing or a profile, it's the same response. We also
showed pictures of her as Catwoman, and you can hardly see her because
of the mask. But if you know it is Halle Berry then the neurons still
fire."
Given more time and an exhaustive list of images, the team may well
have landed upon other images that spiked the activity of the Halle
Berry neuron. In one participant, the Jennifer Aniston neuron also fired in
response to a picture of her former Friends cast-mate, Lisa Kudrow.
The pattern suggests that the actresses are tied together in the
memory associations of this particular woman, says Charles Connor, a
neuroscientist at Johns Hopkins University in Baltimore, US.
These object-specific neurons may be at the core of how we make
memories, says Connor. "I think that's the excitement of these results,"
he says. "You are looking at the far end of the transformation from
metric, visual shapes to conceptual memory-related information. It is
that transformation that underlies our ability to understand the
world. It's not enough to see something familiar and match it. It's the
fact that you plug visual information into the rich tapestry of memory
that brings it to life."
Journal reference: Nature (vol 435 p 1102)
Weblinks
Rodrigo Quiroga, University of Leicester
http://www.vis.caltech.edu/~rodri/
Christof Koch's Lab, California Institute of Technology
http://www.klab.caltech.edu/
Charles E. Connor's Lab, Johns Hopkins University
http://www.mb.jhu.edu/connor.asp
Nature http://www.nature.com
--------
News and Views
Nature 435, 1036-1037 (23 June 2005) | doi: 10.1038/4351036a
Neuroscience: Friends and grandmothers
Charles E. Connor1
Abstract
How do neurons in the brain represent movie stars, famous buildings and other
familiar objects? Rare recordings from single neurons in the human brain
provide a fresh perspective on the question.
'Grandmother cell' is a term coined by J. Y. Lettvin to parody the simplistic
notion that the brain has a separate neuron to detect and represent every
object (including one's grandmother)1. The phrase has become a shorthand for
invoking all of the overwhelming practical arguments against a one-to-one
object coding scheme2. No one wants to be accused of believing in grandmother
cells. But on page 1102 of this issue, Quiroga et al.3 describe a neuron in the
human brain that looks for all the world like a 'Jennifer Aniston' cell. Ms
Aniston could well become a grandmother herself someday. Are vision scientists
now forced to drop their dismissive tone when discussing the neural
representation of matriarchs?
A more technical term for the grandmother issue is 'sparseness' (Fig. 1). At
earlier stages in the brain's object-representation pathway, the neural code
for an object is a broad activity pattern distributed across a population of
neurons, each responsive to some discrete visual feature4. At later processing
stages, neurons become increasingly selective for combinations of features5,
and the code becomes increasingly sparse; that is, fewer neurons are activated
by a given stimulus, although the code is still population-based6. Sparseness
has its advantages, especially for memory, because compact coding maximizes
total storage capacity, and some evidence suggests that 'sparsification' is a
defining goal of visual information processing7, 8. Grandmother cells are the
theoretical limit of sparseness, where the representation of an object is
reduced to a single neuron.
Figure 1: Sparseness and invariance in neural coding of visual stimuli.
The blue and yellow pixel plots represent a hypothetical neural population.
Each pixel represents a neuron with low (blue) or high (yellow) activity. In
distributed coding schemes (left column), many neurons are active in response
to each stimulus. In sparse coding schemes (right column), few neurons are
active. If the neural representation is invariant (top row), different views of
the same person or object evoke identical activity patterns. If the neural
representation is not invariant (bottom row), different views evoke different
activity patterns. The implication of Quiroga and colleagues' results3, at
least as far as vision is concerned, is that neural representation is extremely
sparse and invariant.
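[To make the idea of sparseness concrete, here is a rough Python sketch, not
taken from either paper: it computes two toy sparseness measures for a
hypothetical population, the fraction of active neurons and the Treves-Rolls
index. All numbers are invented for illustration.]

    import numpy as np

    def activity_fraction(rates, threshold=0.0):
        """Fraction of neurons firing above threshold for one stimulus
        (lower values correspond to a sparser code)."""
        rates = np.asarray(rates, dtype=float)
        return float(np.mean(rates > threshold))

    def treves_rolls_sparseness(rates):
        """Treves-Rolls index: close to 1 for a dense code and close to
        1/N when a single neuron out of N carries the whole response."""
        rates = np.asarray(rates, dtype=float)
        return float(rates.mean() ** 2 / np.mean(rates ** 2))

    # Hypothetical population of 1,000 neurons responding to one image.
    rng = np.random.default_rng(0)
    dense = rng.exponential(5.0, 1000)           # most neurons somewhat active
    sparse = np.zeros(1000)
    sparse[:10] = 40.0                           # only 10 neurons active

    print(activity_fraction(dense), activity_fraction(sparse))
    print(treves_rolls_sparseness(dense), treves_rolls_sparseness(sparse))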
Quiroga and colleagues3 report what seems to be the closest approach yet to
that limit. They recorded neural activity from structures in the human medial
temporal lobe that are associated with late-stage visual processing and
long-term memory. The structures concerned were the entorhinal cortex, the
parahippocampal gyrus, the amygdala and the hippocampus, and the recordings
were made in the course of clinical procedures to treat epilepsy.
The first example cell responded significantly to seven different images of
Jennifer Aniston but not to 80 other stimuli, including pictures of Julia
Roberts and even pictures of Jennifer Aniston with Brad Pitt. The second
example cell preferred Halle Berry in the same way. Altogether, 44 units (out
of 137 with significant visual responses) were selective in this way for a
single object out of those tested.
The striking aspect of these results is the consistency of responses across
different images of the same person or object. This relates to another major
issue in visual coding, 'invariance' (Fig. 1). One of the most difficult
aspects of vision is that any given object must be recognizable from the front
or side, in light or shadow, and so on. Somehow, given those very different
retinal images, the brain consistently invokes the same set of memory
associations that give the object meaning. According to 'view-invariant'
theories, this is achieved in the visual cortex by some kind of neural
calculation that transforms the visual structure in different images into a
common format9, 10, 11. According to 'view-dependent' theories, it is achieved
by learning temporal associations between different views and storing those
associations in the memory12, 13, 14.
Quiroga and colleagues' results3 set a new benchmark for both sparseness and
invariance, at least from a visual perspective. Most of the invariant
structural characteristics in images of Jennifer Aniston (such as relative
positions of eyes, nose and mouth) would be present in images of Julia Roberts
as well. Thus, any distributed visual coding scheme would predict substantial
overlap in the neural groups representing Aniston and Roberts; cells responding
to one and not the other would be rare. The clean, visually invariant
selectivity of the neurons described by Quiroga et al. implies a sparseness
bordering on grandmotherliness.
However, as the authors discuss, these results may be best understood in a
somewhat non-visual context. The brain structures that they studied stand at
the far end of the object-representation pathway or beyond, and their responses
may be more memory-related than strictly visual. In fact, several example cells
responded not only to pictures but also to the printed name of a particular
person or object. Clearly, this is a kind of invariance based on learned
associations, not geometric transformation of visual structure, and these cells
encode memory-based concepts rather than visual appearance.
How do you measure sparseness in conceptual space? It's a difficult
proposition, requiring knowledge of how the subject associates different
concepts in memory. The authors did their best (within the constraints of
limited recording time) to test images that might be conceptually related. In
one tantalizing example, a neuron responded to both Jennifer Aniston and Lisa
Kudrow, her co-star on the television show Friends. What seems to be a sparse
representation in visual space may be a distributed representation in sitcom
space! In another example, a neuron responded to two unrelated stimuli commonly
used by Quiroga et al.: pictures of Jennifer Aniston with Brad Pitt and
pictures of the Sydney Opera House. This could reflect a new memory association
produced by the close temporal proximity of these stimuli during the recording
sessions, consistent with similar phenomena observed in monkey temporal
cortex15.
Thus, Quiroga and colleagues' findings may say less about visual representation
as such than they do about memory representation and how it relates to visual
inputs. Quiroga et al. have shown that, at or near the end of the
transformation from visual information about object structure to memory-related
conceptual information about object identity, the neural representation seems
extremely sparse and invariant in the visual domain. As the authors note, these
are predictable characteristics of an abstract, memory-based representation.
But I doubt that anyone would have predicted such striking confirmation at the
level of individual neurons.
References
1. Rose, D. Perception 25, 881-886 (1996).
2. Barlow, H. B. Perception 1, 371-394 (1972).
3. Quiroga, R. Q., Reddy, L., Kreiman, G., Koch, C. & Fried, I. Nature 435, 1102-1107 (2005).
4. Pasupathy, A. & Connor, C. E. Nature Neurosci. 5, 1332-1338 (2002).
5. Brincat, S. L. & Connor, C. E. Nature Neurosci. 7, 880-886 (2004).
6. Young, M. P. & Yamane, S. Science 256, 1327-1331 (1992).
7. Olshausen, B. A. & Field, D. J. Nature 381, 607-609 (1996).
8. Vinje, W. E. & Gallant, J. L. Science 287, 1273-1276 (2000).
9. Biederman, I. Psychol. Rev. 94, 115-147 (1987).
10. Marr, D. & Nishihara, H. K. Proc. R. Soc. Lond. B 200, 269-294 (1978).
11. Booth, M. C. & Rolls, E. T. Cereb. Cortex 8, 510-523 (1998).
12. Bulthoff, H. H., Edelman, S. Y. & Tarr, M. J. Cereb. Cortex 5, 247-260 (1995).
13. Vetter, T., Hurlbert, A. & Poggio, T. Cereb. Cortex 5, 261-269 (1995).
14. Logothetis, N. K. & Pauls, J. Cereb. Cortex 5, 270-288 (1995).
15. Sakai, K. & Miyashita, Y. Nature 354, 152-155 (1991).
1. Charles E. Connor is in the Department of Neuroscience and the Zanvyl
Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, Maryland
21218, USA.
Email: connor at jhu.edu
---------------
Letter
Nature 435, 1102-1107 (23 June 2005) | doi: 10.1038/nature03687
Invariant visual representation by single neurons in the human brain
R. Quian Quiroga1,2,5, L. Reddy1, G. Kreiman3, C. Koch1 and I. Fried2,4
Abstract
It takes a fraction of a second to recognize a person or an object even when
seen under strikingly different conditions. How such a robust, high-level
representation is achieved by neurons in the human brain is still unclear1, 2,
3, 4, 5, 6. In monkeys, neurons in the upper stages of the ventral visual
pathway respond to complex images such as faces and objects and show some
degree of invariance to metric properties such as the stimulus size, position
and viewing angle2, 4, 7, 8, 9, 10, 11, 12. We have previously shown that
neurons in the human medial temporal lobe (MTL) fire selectively to images of
faces, animals, objects or scenes13, 14. Here we report on a remarkable subset
of MTL neurons that are selectively activated by strikingly different pictures
of given individuals, landmarks or objects and in some cases even by letter
strings with their names. These results suggest an invariant, sparse and
explicit code, which might be important in the transformation of complex visual
percepts into long-term and more abstract memories.
The subjects were eight patients with pharmacologically intractable epilepsy
who had been implanted with depth electrodes to localize the focus of seizure
onset. For each patient, the placement of the depth electrodes, in combination
with micro-wires, was determined exclusively by clinical criteria13. We
analysed responses of neurons from the hippocampus, amygdala, entorhinal cortex
and parahippocampal gyrus to images shown on a laptop computer in 21 recording
sessions. Stimuli were different pictures of individuals, animals, objects and
landmark buildings presented for 1 s in pseudo-random order, six times each. An
unpublished observation in our previous recordings was the sometimes surprising
degree of invariance inherent in the neuron's (that is, unit's) firing
behaviour. For example, in one case, a unit responded only to three completely
different images of the ex-president Bill Clinton. Another unit (from a
different patient) responded only to images of The Beatles, another one to
cartoons from The Simpsons television series and another one to pictures of
the basketball player Michael Jordan. This suggested that neurons might encode
an abstract representation of an individual. We here ask whether MTL neurons
can represent high-level information in an abstract manner characterized by
invariance to the metric characteristics of the images. By invariance we mean
that a given unit is activated mainly, albeit not necessarily uniquely, by
different pictures of a given individual, landmark or object.
To investigate further this abstract representation, we introduced several
modifications to optimize our recording and data processing conditions (see
Supplementary Information) and we designed a paradigm to systematically search
for and characterize such invariant neurons. In a first recording session,
usually done early in the morning (screening session), a large number of images
of famous persons, landmark buildings, animals and objects were shown. This set
was complemented by images chosen after an interview with the patient. The mean
number of images in the screening session was 93.9 (range 71-114). The data
were quickly analysed offline to determine the stimuli that elicited responses
in at least one unit (see definition of response below). Subsequently, in later
sessions (testing sessions) between three and eight variants of all the stimuli
that had previously elicited a response were shown. If not enough stimuli
elicited significant responses in the screening session, we chose those stimuli
with the strongest responses. On average, 88.6 (range 70-110) different images
showing distinct views of 14 individuals or objects (range 7-23) were used in
the testing sessions. Single views of random stimuli (for example, famous and
non-famous faces, houses, animals, etc) were also included. The total number of
stimuli was determined by the time available with the patient (about 30 min on
average). Because in our clinical set-up the recording conditions can sometimes
change within a few hours, we always tried to perform the testing sessions
shortly after the screening sessions in order to maximize the probability of
recording from the same units. Unless explicitly stated otherwise, all the data
reported in this study are from the testing sessions. To hold their attention,
patients had to perform a simple task during all sessions (indicating with a
key press whether a human face was present in the image). Performance was close
to 100%.
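[A trivial sketch, not the authors' code, of how a pseudo-random presentation
order of this kind can be built: each image appears a fixed number of times
(six in the study) and the full sequence is shuffled. The image labels below
are placeholders.]

    import random

    def build_presentation_order(image_ids, repetitions=6, seed=None):
        """Return a pseudo-random order in which each image appears
        `repetitions` times."""
        order = [img for img in image_ids for _ in range(repetitions)]
        random.Random(seed).shuffle(order)
        return order

    # e.g. a screening session with about 90 stimuli, each shown six times
    stimuli = ["img_%03d" % i for i in range(90)]
    order = build_presentation_order(stimuli, repetitions=6, seed=1)
    print(len(order), order[:5])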
We recorded from a total of 993 units (343 single units and 650 multi-units),
with an average of 47.3 units per session (16.3 single units and 31.0
multi-units). Of these, 132 (14%; 64 single units and 68 multi-units) showed a
statistically significant response to at least one picture. A response was
considered significant if it was larger than the mean plus 5 standard
deviations (s.d.) of the baseline and had at least two spikes in the
post-stimulus time interval considered (300-1,000 ms). All these responses were
highly selective: for the responsive units, an average of only 2.8% of the
presented pictures (range: 0.9-22.8%) showed significant activations according
to this criterion. This high selectivity was also present in the screening
sessions, where only 3.1% of the pictures shown elicited responses (range:
0.9-18.0%). There was no significant difference between the relative number of
responsive pictures obtained in the screening and testing sessions (t-test, P =
0.40). Responses started around 300 ms after stimulus onset and had mainly
three non-exclusive patterns of activation (with about one-third of the cells
having each type of response): the response disappeared with stimulus offset, 1
s after stimulus onset; it consisted of a rapid sequence of about 6 spikes
(s.d. = 5) between 300 and 600 ms after stimulus onset; or it was prolonged and
continued up to 1 s after stimulus offset. For this study, we calculated the
responses in a time window between 300 and 1,000 ms after stimulus onset. In a
few cases we also observed cells that responded selectively only after the
image was removed from view (that is, after 1 s). These are not further
analysed here.
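[The response criterion just described can be captured in a few lines. The
following Python sketch is illustrative only; the spike times, baseline values
and window boundaries are placeholders, not data from the study.]

    import numpy as np

    def is_significant_response(trial_spike_times, baseline_mean, baseline_sd,
                                window=(0.300, 1.000), min_spikes=2):
        """A picture counts as eliciting a response if the median spike count
        across trials in the post-stimulus window exceeds the baseline mean
        plus 5 s.d. and is at least `min_spikes`."""
        counts = [int(np.sum((t >= window[0]) & (t < window[1])))
                  for t in trial_spike_times]
        median_count = np.median(counts)
        return (median_count > baseline_mean + 5 * baseline_sd
                and median_count >= min_spikes)

    # Six trials of one picture; spike times in seconds after stimulus onset.
    trials = [np.array([0.32, 0.35, 0.41, 0.55]), np.array([0.33, 0.48]),
              np.array([0.31, 0.39, 0.52]), np.array([]),
              np.array([0.36, 0.44, 0.61]), np.array([0.34, 0.50])]
    print(is_significant_response(trials, baseline_mean=0.02, baseline_sd=0.16))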
Figure 1a shows the responses of a single unit in the left posterior
hippocampus to a selection of 30 out of the 87 pictures presented to the
patient. None of the other pictures elicited a statistically significant
response. This unit fired to all pictures of the actress Jennifer Aniston
alone, but not (or only very weakly) to other famous and non-famous faces,
landmarks, animals or objects. Interestingly, the unit did not respond to
pictures of Jennifer Aniston together with the actor Brad Pitt (but see
Supplementary Fig. 2 ). Pictures of Jennifer Aniston elicited an average of
4.85 spikes (s.d. = 3.59) between 300 and 600 ms after stimulus onset. Notably,
this unit was nearly silent during baseline (average of 0.02 spikes in a 700-ms
pre-stimulus time window) and during the presentation of most other pictures
(Fig. 1b). Figure 1b plots the median number of spikes (across trials) in the
300-1,000-ms post-stimulus interval for all 87 pictures shown to the patient.
The histogram shows a marked differential response to pictures of Jennifer
Aniston (red bars).
Figure 1: A single unit in the left posterior hippocampus activated exclusively
by different views of the actress Jennifer Aniston.
a, Responses to 30 of the 87 images are shown. There were no statistically
significant responses to the other 57 pictures. For each picture, the
corresponding raster plots (the order of trial number is from top to bottom)
and post-stimulus time histograms are given. Vertical dashed lines indicate
image onset and offset (1 s apart). Note that owing to insurmountable copyright
problems, all original images were replaced in this and all subsequent figures
by very similar ones (same subject, animal or building, similar pose, similar
colour, line drawing, and so on). b, The median responses to all pictures. The
image numbers correspond to those in a. The two horizontal lines show the mean
baseline activity (0.02 spikes) and the mean plus 5 s.d. (0.82 spikes).
Pictures of Jennifer Aniston are denoted by red bars. c, The associated ROC
curve (red trace) testing the hypothesis that the cell responded in an
invariant manner to all seven photographs of Jennifer Aniston (hits) but not to
other images (including photographs of Jennifer Aniston and Brad Pitt together;
false positives). The grey lines correspond to the same ROC analysis for 99
surrogate sets of 7 randomly chosen pictures (P < 0.01). The area under the red
curve is 1.00.
Next, we quantified the degree of invariance using a receiver operating
characteristic (ROC) framework15. We considered as the hit rate (y axis) the
relative number of responses to pictures of a specific individual, object,
animal or landmark building, and as the false positive rate (x axis) the
relative number of responses to other pictures. The ROC curve corresponds to
the performance of a linear binary classifier for different values of a
response threshold. Decreasing the threshold increases the probability of hits
but also of false alarms. A cell responding to a large set of pictures of
different individuals will have a ROC curve close to the diagonal (with an area
under the curve of 0.5), whereas a cell that responds to all pictures of an
individual but not to others will have a convex ROC curve far from the
diagonal, with an area close to 1. In Fig. 1c we show the ROC curve for all
seven pictures of Jennifer Aniston (red trace, with an area equal to 1). The
grey lines show 99 ROC surrogate curves, testing invariance to randomly
selected groups of pictures (see Methods). As expected, these curves are close
to the diagonal, having an area of about 0.5. None of the 99 surrogate curves
had an area equal to or larger than that of the original ROC curve, implying that it is
unlikely (P < 0.01) that the responses to Jennifer Aniston were obtained by
chance. A responsive unit was defined to have an invariant representation if
the area under the ROC curve was larger than the area of the 99 surrogate
curves.
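[The ROC and surrogate procedure described above amounts to a standard
permutation test on the area under the curve. Below is a minimal Python sketch
of that logic, not the authors' code; the response values and picture labels
are invented.]

    import numpy as np

    def roc_area(responses, is_target):
        """Area under the ROC curve obtained by sweeping a response threshold;
        1.0 means the target pictures are perfectly separated from the rest."""
        responses = np.asarray(responses, dtype=float)
        is_target = np.asarray(is_target, dtype=bool)
        pos, neg = responses[is_target], responses[~is_target]
        # Equivalent to the threshold-swept area: the probability that a random
        # target picture evokes a larger response than a random non-target.
        greater = np.sum(pos[:, None] > neg[None, :])
        ties = np.sum(pos[:, None] == neg[None, :])
        return (greater + 0.5 * ties) / (len(pos) * len(neg))

    def surrogate_test(responses, is_target, n_surrogates=99, seed=0):
        """Compare the observed area with areas for randomly chosen picture
        sets of the same size (the 99 surrogate curves in the text)."""
        rng = np.random.default_rng(seed)
        observed = roc_area(responses, is_target)
        n_targets = int(np.sum(is_target))
        n_exceeding = 0
        for _ in range(n_surrogates):
            fake = np.zeros(len(responses), dtype=bool)
            fake[rng.choice(len(responses), n_targets, replace=False)] = True
            if roc_area(responses, fake) >= observed:
                n_exceeding += 1
        return observed, (n_exceeding + 1) / (n_surrogates + 1)

    # Toy unit: 7 'Jennifer Aniston' pictures out of 87 evoke strong responses.
    rng = np.random.default_rng(1)
    responses = rng.poisson(0.05, 87).astype(float)
    responses[:7] = rng.poisson(5.0, 7)
    is_aniston = np.arange(87) < 7
    print(surrogate_test(responses, is_aniston))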
Figure 2 shows another single unit located in the right anterior hippocampus of
a different patient. This unit was selectively activated by pictures of the
actress Halle Berry as well as by a drawing of her (but not by other drawings;
for example, picture no. 87). This unit was also activated by several pictures
of Halle Berry dressed as Catwoman, her character in a recent film, but not by
other images of Catwoman that were not her (data not shown). Notably, the unit
was selectively activated by the letter string 'Halle Berry'. Such an invariant
pattern of activation goes beyond common visual features of the different
stimuli. As with the previous unit, the responses were mainly localized between
300 and 600 ms after stimulus onset. Figure 2c shows the ROC curve for the
pictures of Halle Berry (red trace) and for 99 surrogates (grey lines). The
area under the ROC curve was 0.99, larger than that of the surrogates.
Figure 2: A single unit in the right anterior hippocampus that responds to
pictures of the actress Halle Berry (conventions as in Fig. 1).
a-c, Strikingly, this cell also responds to a drawing of her, to herself
dressed as Catwoman (a recent movie in which she played the lead role) and to
the letter string 'Halle Berry' (picture no. 96). Such an invariant response
cannot be attributed to common visual features of the stimuli. This unit also
had a very low baseline firing rate (0.06 spikes). The area under the red curve
in c is 0.99.
Figure 3 illustrates a multi-unit in the left anterior hippocampus responding
to pictures of the Sydney Opera House and the Baha'i Temple. Because the
patient identified both landmark buildings as the Sydney Opera House, all these
pictures were considered as a single landmark building for the ROC analysis.
This unit also responded to the letter string 'Sydney Opera' (pictures no. 2
and 8) but not to other letter strings, such as 'Eiffel Tower' (picture no. 1).
More examples of invariant responses are shown in the Supplementary Figs 2-11.
Figure 3: A multi-unit in the left anterior hippocampus that responds to
photographs of the Sydney Opera House and the Baha'i Temple (conventions as in
Fig. 1).
a-c, The patient identified all pictures of both of these buildings as the
Sydney Opera, and we therefore considered them as a single landmark. This unit
also responded to the presentation of the letter string 'Sydney Opera'
(pictures no. 2 and 8), but not to other strings, such as 'Eiffel Tower'
(picture no. 1). In contrast to the previous two figures, this unit had a
higher baseline firing rate (2.64 spikes). The area under the red curve in c is
0.97.
Out of the 132 responsive units, 51 (38.6%; 30 single units and 21 multi-units)
showed invariance to a particular individual (38 units responding to Jennifer
Aniston, Halle Berry, Julia Roberts, Kobe Bryant, and so on), landmark building
(6 units responding to the Tower of Pisa, the Baha'i Temple and the Sydney
Opera House), animal (5 units responding to spiders, seals and horses) or
object (2 units responding to specific food items), with P < 0.01 as defined
above by means of the surrogate tests. A one-way analysis of variance (ANOVA)
yielded similar results (see Methods). Eight of these units (two single units
and six multi-units) responded to two different individuals (or to an
individual and an object). Figure 4 presents the distribution of the areas
under the ROC curves for all 51 units that showed an invariant representation
to individuals or objects. The areas ranged from 0.76 to 1.00, with a median of
0.94. These units were located in the hippocampus (27 out of 60 responsive
units; 45%), parahippocampal gyrus (11 out of 20 responsive units; 55%),
amygdala (8 out of 30 responsive units; 27%) and entorhinal cortex (5 out of 22
responsive units; 23%). There were no clear differences in the latencies and
firing patterns among the different areas. However, more data are needed before
making a conclusive claim about systematic differences between the various
structures of the MTL.
Figure 4: Distribution of the area under the ROC curves for the 51 units (out
of 132 responsive units) showing an invariant representation.
Of these, 43 responded to a single individual or object and 8 to two
individuals or objects. The dashed vertical line marks the median of the
distribution (0.94).
As shown in Figs 2 and 3, one of the most extreme cases of an abstract
representation is the one given by responses to pictures of a particular
individual (or object) and to the presentation of the corresponding letter
string with its name. In 18 of the 21 testing sessions we also tested responses
to letter strings with the names of the individuals and objects. Eight of the
132 responsive units (6.1%) showed a selective response to an individual and
its name (with no response to other names). Six of these were in the
hippocampus, one was in the entorhinal cortex and one was in the amygdala.
These neuronal responses cannot be attributed to any particular movement
artefact, because selective responses started around 300 ms after image onset,
whereas key presses occurred at 1 s or later, and neuronal responses were very
selective. About one-third of the responsive units had a response localized
between 300 and 600 ms. This interval corresponds to the latency of
event-related responses correlated with the recognition of 'oddball' stimuli in
scalp electroencephalogram, namely, the P300 (ref. 16). Some studies argue for
a generation of the P300 in the hippocampal formation and amygdala17, 18,
consistent with our findings.
What are the common features that activate these neurons? Given the great
diversity of distinct images of a single individual (pencil sketches,
caricatures, letter strings, coloured photographs with different backgrounds)
that these cells can selectively respond to, it is unlikely that this degree of
invariance can be explained by a simple set of metric features common to these
images. Indeed, our data are compatible with an abstract representation of the
identity of the individual or object shown. The existence of such high-level
visual responses in medial temporal lobe structures, usually considered to be
involved in long-term memory formation and consolidation, should not be
surprising given the following: (1) the known anatomical connections between
the higher stages of the visual hierarchy in the ventral pathway and the MTL19,
20; (2) the well-characterized reactivity of the cortical stages feeding into
the MTL to the sight of faces, objects, or spatial scenes (as ascertained using
functional magnetic resonance imaging (fMRI) in humans21, 22 and
electrophysiology in monkeys2, 4, 7, 8, 9, 10, 11); and (3) the observation
that any visual percept that will be consciously remembered later on will have
to be represented in the hippocampal system23, 24, 25. This is true even
though patients with bilateral loss of parts of the MTL do not, in general,
have a deficit in the perception of images25. Neurons in the MTL might have a
fundamental role in learning associations between abstract representations26.
Thus, our observed invariant responses probably arise from experiencing very
different pictures, words or other visual stimuli in association with a given
individual or object.
How neurons encode different percepts is one of the most intriguing questions
in neuroscience. Two extreme hypotheses are schemes based on the explicit
representations by highly selective (cardinal, gnostic or grandmother) neurons
and schemes that rely on an implicit representation over a very broad and
distributed population of neurons1, 2, 3, 4, 6. In the latter case,
recognition would require the simultaneous activation of a large number of
cells and therefore we would expect each cell to respond to many pictures with
similar basic features. This is in contrast to the sparse firing we observe,
because most MTL cells do not respond to the great majority of images seen by
the patient. Furthermore, cells signal a particular individual or object in an
explicit manner27, in the sense that the presence of the individual can, in
principle, be reliably decoded from a very small number of neurons. We do not
mean to imply the existence of single neurons coding uniquely for discrete
percepts for several reasons: first, some of these units responded to pictures
of more than one individual or object; second, given the limited duration of
our recording sessions, we can only explore a tiny portion of stimulus space;
and third, the fact that we can discover in this short time some images, such
as photographs of Jennifer Aniston, that drive the cells suggests that each
cell might represent more than one class of images. Yet, this subset of MTL cells is
selectively activated by different views of individuals, landmarks, animals or
objects. This is quite distinct from a completely distributed population code
and suggests a sparse, explicit and invariant encoding of visual percepts in
MTL. Such an abstract representation, in contrast to the metric representation
in the early stages of the visual pathway, might be important in the storage of
long-term memories. Other factors, including emotional responses towards some
images, could conceivably influence the neuronal activity as well. The
responses of these neurons are reminiscent of the behaviour of hippocampal
place cells in rodents28 that only fire if the animal moves through a
particular spatial location, with the actual place field defined independently
of sensory cues. Notably, place cells have been found recently in the human
hippocampus as well29. Both classes of neurons, place cells and the cells in
the present study, have a very low baseline activity and respond in a highly
selective manner. Future research might show that this similarity has
functional implications, enabling mammals to encode behaviourally important
features of the environment and to transition between them, either in physical
space or in a more conceptual space13.
Methods
The data in the present study come from 21 sessions in 8 patients with
pharmacologically intractable epilepsy (eight right handed; 3 male; 17-47 years
old). Extensive non-invasive monitoring did not yield concordant data
corresponding to a single resectable epileptogenic focus. Therefore, the
patients were implanted with chronic depth electrodes for 7-10 days to
determine the seizure focus for possible surgical resection13 . Here we report
data from sites in the hippocampus, amygdala, entorhinal cortex and
parahippocampal gyrus. All studies conformed to the guidelines of the Medical
Institutional Review Board at UCLA. The electrode locations were based
exclusively on clinical criteria and were verified by MRI or by computer
tomography co-registered to preoperative MRI. Each electrode probe had a total
of nine micro-wires at its end13 , eight active recording channels and one
reference. The differential signal from the micro-wires was amplified using a
64-channel Neuralynx system, filtered between 1 and 9,000 Hz. We computed the
power spectrum for every unit after spike sorting. Units that showed evidence
of line noise were excluded from subsequent analysis14. Signals were sampled at
28 kHz. Each recording session lasted about 30 min.
Subjects lay in bed, facing a laptop computer. Each image covered about 1.5°
and was presented at the centre of the screen six times for 1 s. The order of
the pictures was randomized. Subjects had to respond, after image offset,
according to whether the picture contained a human face or something else by
pressing the 'Y' and 'N' keys, respectively. This simple task, on which
performance was virtually flawless, required them to attend to the pictures.
After the experiments, patients gave feedback on whether they recognized the
images or not. Pictures included famous and unknown individuals, animals,
landmarks and objects. We tried to maximize the differences between pictures of
the individuals (for example, different clothing, size, point of view, and so
on). In 18 of the 21 sessions, we also presented letter strings with names of
individuals or objects.
The data from the screening sessions were rapidly processed to identify
responsive units and images. All pictures that elicited a response in the
screening session were included in the later testing sessions. Three to eight
different views of seven to twenty-three different individuals or objects were
used in the testing sessions with a mean of 88.6 images per session (range
70-110). Spike detection and sorting was applied to the continuous recordings
using a novel clustering algorithm30 (see Supplementary Information). The
response to a picture was defined as the median number of spikes across trials
between 300 and 1,000 ms after stimulus onset. Baseline activity was the
average spike count for all pictures between 1,000 and 300 ms before stimulus
onset. A unit was considered responsive if the activity to at least one picture
fulfilled two criteria: (1) the median number of spikes was larger than the
average number of spikes for the baseline plus 5 s.d.; and (2) the median
number of spikes was at least two.
The classification between single unit and multi-unit was done visually based
on the following: (1) the spike shape and its variance; (2) the ratio between
the spike peak value and the noise level; (3) the inter-spike interval
distribution of each cluster; and (4) the presence of a refractory period for
the single units (that is, less than 1% of inter-spike intervals shorter
than 3 ms).
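[Criterion (4), the refractory-period check, is the easiest of these to
express in code. A hypothetical Python sketch, with an invented spike train:]

    import numpy as np

    def passes_refractory_check(spike_times_s, min_isi_s=0.003,
                                max_violation_frac=0.01):
        """A putative single unit should have fewer than 1% of inter-spike
        intervals shorter than 3 ms."""
        spikes = np.sort(np.asarray(spike_times_s, dtype=float))
        if spikes.size < 2:
            return True
        isis = np.diff(spikes)
        return float(np.mean(isis < min_isi_s)) < max_violation_frac

    # Invented spike train: roughly 30 min of firing at about 1 Hz.
    rng = np.random.default_rng(0)
    spike_times = np.cumsum(rng.exponential(1.0, 1800))
    print(passes_refractory_check(spike_times))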
Whenever a unit had a response to a given stimulus, we further analysed the
responses to other pictures of the same individual or object by a ROC analysis.
This tested whether cells responded selectively to pictures of a given
individual. The hit rate (y axis) was defined as the number of responses to the
individual divided by the total number of pictures of this individual. The
false positive rate (x axis) was defined as the number of responses to the
other pictures divided by the total number of other pictures. The ROC curve was
obtained by gradually lowering the threshold of the responses (the median
number of spikes in Figs 1b, 2b and 3b). Starting with a very high threshold
(no hits, no false positives, lower left-hand corner in the ROC diagram), if a
unit responds exclusively to an image of a particular individual or object, the
ROC curve will show a steep increase when lowering the threshold (a hit rate of
1 and no false positives). If a unit responds to a random selection of
pictures, it will have a similar relative number of hits and false positives
and the ROC curve will fall along the diagonal. In the first case, for a highly
invariant unit, the area under the ROC curve will be close to 1, whereas in the
latter case it will be about 0.5. To evaluate the statistical significance, we
created 99 surrogate curves for each responsive unit, testing the null
hypothesis that the unit responded preferentially to n randomly chosen pictures
(with n being the number of pictures of the individual for which invariance was
tested). A unit was considered invariant to a certain individual or object if
the area under the ROC curve was larger than the area of all of the 99
surrogates (that is, with a confidence of P < 0.01). Alternatively, the ROC
analysis can be done with the single trial responses instead of the median
responses across trials. Here, responses to the trials corresponding to any
picture of the individual tested are considered as hits and responses to trials
to other pictures as false positives. This trial-by-trial analysis led to very
similar results, with 55 of the 132 responsive units showing an invariant
representation. A one-way ANOVA also yielded similar results. In particular, we
tested whether the distribution of median firing rates for all responsive units
showed a dependence on the factor identity (that is, the individual, landmark
or object shown). The different views of each individual were the repeated
measures. As with the ROC analysis, an ANOVA test was performed on all
responsive units. Overall, the results were very similar to those obtained with
the ROC analysis: of 132 responsive units, 49 had a significant effect for
factor identity with P < 0.01, compared to 51 units showing an invariant
representation with the ROC analysis. The ANOVA analysis, however, does not
demonstrate that the invariant responses were very selective, whereas the ROC
analysis explicitly tests the presence of an invariant as well as sparse
representation.
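[For completeness, a sketch of the ANOVA check on a single responsive unit,
using a plain one-way test rather than the repeated-measures design described
above; the spike counts are invented and scipy is assumed to be available.]

    import numpy as np
    from scipy.stats import f_oneway

    # Invented median-like spike counts for different views, grouped by identity.
    rng = np.random.default_rng(2)
    aniston = rng.poisson(5.0, 7).astype(float)    # seven views of one person
    roberts = rng.poisson(0.1, 5).astype(float)
    landmark = rng.poisson(0.1, 6).astype(float)

    f_stat, p_value = f_oneway(aniston, roberts, landmark)
    print(f_stat, p_value)   # the unit would be flagged if p < 0.01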
Images were obtained from Corbis and Photorazzi, with licensed rights to
reproduce them in this paper and in the Supplementary Information.
Acknowledgments
We thank all patients for their participation; P. Sinha for drawing some faces;
colleagues for providing pictures; I. Wainwright for administrative assistance;
and E. Behnke, T. Fields, E. Ho, E. Isham, A. Kraskov, P. Steinmetz, I.
Viskontas and C. Wilson for technical assistance. This work was supported by
grants from the NINDS, NIMH, NSF, DARPA, the Office of Naval Research, the W.M.
Keck Foundation Fund for Discovery in Basic Medical Research, a Whiteman
fellowship (to G.K.), the Gordon Moore Foundation, the Sloan Foundation, and
the Swartz Foundation for Computational Neuroscience.
Competing interests statement: The authors declared no competing interests.
Supplementary information accompanies this paper.
References
1. Barlow, H. Single units and sensation: a neuron doctrine for perception. Perception 1, 371-394 (1972).
2. Gross, C. G., Bender, D. B. & Rocha-Miranda, C. E. Visual receptive fields of neurons in inferotemporal cortex of the monkey. Science 166, 1303-1306 (1969).
3. Konorski, J. Integrative Activity of the Brain (Univ. Chicago Press, Chicago, 1967).
4. Logothetis, N. K. & Sheinberg, D. L. Visual object recognition. Annu. Rev. Neurosci. 19, 577-621 (1996).
5. Riesenhuber, M. & Poggio, T. Neural mechanisms of object recognition. Curr. Opin. Neurobiol. 12, 162-168 (2002).
6. Young, M. P. & Yamane, S. Sparse population coding of faces in the inferior temporal cortex. Science 256, 1327-1331 (1992).
7. Logothetis, N. K., Pauls, J. & Poggio, T. Shape representation in the inferior temporal cortex of monkeys. Curr. Biol. 5, 552-563 (1995).
8. Logothetis, N. K. & Pauls, J. Psychophysical and physiological evidence for viewer-centered object representations in the primate. Cereb. Cortex 5, 270-288 (1995).
9. Perrett, D., Rolls, E. & Caan, W. Visual neurons responsive to faces in the monkey temporal cortex. Exp. Brain Res. 47, 329-342 (1982).
10. Schwartz, E. L., Desimone, R., Albright, T. D. & Gross, C. G. Shape recognition and inferior temporal neurons. Proc. Natl Acad. Sci. USA 80, 5776-5778 (1983).
11. Tanaka, K. Inferotemporal cortex and object vision. Annu. Rev. Neurosci. 19, 109-139 (1996).
12. Miyashita, Y. & Chang, H. S. Neuronal correlate of pictorial short-term memory in the primate temporal cortex. Nature 331, 68-71 (1988).
13. Fried, I., MacDonald, K. A. & Wilson, C. Single neuron activity in human hippocampus and amygdala during recognition of faces and objects. Neuron 18, 753-765 (1997).
14. Kreiman, G., Koch, C. & Fried, I. Category-specific visual responses of single neurons in the human medial temporal lobe. Nature Neurosci. 3, 946-953 (2000).
15. Macmillan, N. A. & Creelman, C. D. Detection Theory: A User's Guide (Cambridge Univ. Press, New York, 1991).
16. Picton, T. The P300 wave of the human event-related potential. J. Clin. Neurophysiol. 9, 456-479 (1992).
17. Halgren, E., Marinkovic, K. & Chauvel, P. Generators of the late cognitive potentials in auditory and visual oddball tasks. Electroencephalogr. Clin. Neurophysiol. 106, 156-164 (1998).
18. McCarthy, G., Wood, C. C., Williamson, P. D. & Spencer, D. D. Task-dependent field potentials in human hippocampal formation. J. Neurosci. 9, 4253-4268 (1989).
19. Saleem, K. S. & Tanaka, K. Divergent projections from the anterior inferotemporal area TE to the perirhinal and entorhinal cortices in the macaque monkey. J. Neurosci. 16, 4757-4775 (1996).
20. Suzuki, W. A. Neuroanatomy of the monkey entorhinal, perirhinal and parahippocampal cortices: Organization of cortical inputs and interconnections with amygdala and striatum. Semin. Neurosci. 8, 3-12 (1996).
21. Kanwisher, N., McDermott, J. & Chun, M. M. The fusiform face area: A module in human extrastriate cortex specialized for face perception. J. Neurosci. 17, 4302-4311 (1997).
22. Haxby, J. V. et al. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293, 2425-2430 (2001).
23. Eichenbaum, H. A cortical-hippocampal system for declarative memory. Nature Rev. Neurosci. 1, 41-50 (2000).
24. Hampson, R. E., Pons, P. P., Stanford, T. R. & Deadwyler, S. A. Categorization in the monkey hippocampus: A possible mechanism for encoding information into memory. Proc. Natl Acad. Sci. USA 101, 3184-3189 (2004).
25. Squire, L. R., Stark, C. E. L. & Clark, R. E. The medial temporal lobe. Annu. Rev. Neurosci. 27, 279-306 (2004).
26. Miyashita, Y. Neuronal correlate of visual associative long-term memory in the primate temporal cortex. Nature 335, 817-820 (1988).
27. Koch, C. The Quest for Consciousness: A Neurobiological Approach (Roberts, Englewood, Colorado, 2004).
28. Wilson, M. A. & McNaughton, B. L. Dynamics of the hippocampal ensemble code for space. Science 261, 1055-1058 (1993).
29. Ekstrom, A. D. et al. Cellular networks underlying human spatial navigation. Nature 425, 184-187 (2003).
30. Quian Quiroga, R., Nadasdy, Z. & Ben-Shaul, Y. Unsupervised spike detection and sorting with wavelets and super-paramagnetic clustering. Neural Comput. 16, 1661-1687 (2004).
1. Computation and Neural Systems, California Institute of Technology,
Pasadena, California 91125, USA
2. Division of Neurosurgery and Neuropsychiatric Institute, University of
California, Los Angeles (UCLA), California 90095, USA
3. Brain and Cognitive Sciences, Massachusetts Institute of Technology,
Cambridge, Massachusetts 02142, USA
4. Functional Neurosurgery Unit, Tel-Aviv Medical Center and Sackler Faculty
of Medicine, Tel-Aviv University, Tel-Aviv 69978, Israel
5. Present address: Department of Engineering, University of Leicester, LE1
7RH, UK
Correspondence to: R. Quian Quiroga1,2,5. Correspondence and requests for
materials should be addressed to R.Q.Q. (Email: rodri at vis.caltech.edu).
Received 1 December 2004; Accepted 3 February 2005