Sensation and perception: Goldstein

UPDATED, ENHANCED, and packaged with
each new text—with more than 75 new
media exercises to help students
learn interactively!
Virtual Lab CD-ROM
By Bruce Goldstein, Colin Ryan, and John Baro
The Virtual Lab CD-ROM includes more than 200
demonstrations and activities that help students become more
engaged with sensation and perception topics, comprehend
concepts through interactive exercises, and get the most out
of the course:
• Media Interactions
include drag-and-drop
animations, experiments,
and interactive exercises
that illustrate principles
in this book, as well as
auditory demonstrations.
Thomas V. Papathomas, Rolling Eyes on a Hollow Mask
• Media Experiments and Demonstrations are designed
so students can practice gathering data, varying many
parameters to determine how changes to parameters
affect perception, and
analyzing the results.
The more than 75 new
additions include work
by researchers from
around the globe, and
feature many new
illusions (visual and
auditory), eye movement
records, hearing loss
demonstrations,
and more.
Ted Adelson, White’s Illusion
Michael Bach, Dalmatian Hidden Figure
Completely integrated with the text
Throughout the text, a Virtual Lab icon directs students
to specific animations and videos designed to help them
visualize the material about which they are reading. The
number beside each icon indicates the number of the
relevant media element. At the end of each chapter, the
titles of related Virtual Lab exercises are listed.
Accessible in three convenient ways!
Virtual Labs can be accessed via the
CD-ROM that is packaged with each new
text, through CengageNOW™ for Sensation
and Perception, Eighth Edition, and through
WebTutor™ on WebCT® or Blackboard®.
Instructors: Contact your local Cengage Learning
representative to help create the package that’s just
right for you and your students.
Accompanied by the Virtual Lab Manual
The streamlined Virtual Lab Manual (available digitally on the CD-ROM and in a
printed version) includes worksheets for the Virtual Lab experiments to encourage
students to take a closer look at the labs and engage in analysis of the results.
Instructors—If you would like the printed version of the Virtual Lab Manual to be
packaged with each new text, please use these ISBNs when placing your textbook order:
ISBN-10: 0-495-76050-1 • ISBN-13: 978-0-495-76050-4.
Sensation
and Perception
Eighth Edition
Sensation
and Perception
E. Bruce Goldstein
University of Pittsburgh
University of Arizona
Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States
Sensation and Perception, Eighth Edition
E. Bruce Goldstein
Senior Publisher: Linda Schreiber
Editors: Jon-David Hague, Jaime A. Perkins
Managing Development Editor: Jeremy Judson
Assistant Editor: Trina Tom
Editorial Assistant: Sarah Worrell
Media Editor: Lauren Keyes
Marketing Manager: Elisabeth Rhoden
Marketing Assistant: Molly Felz
Marketing Communications Manager: Talia Wise
Project Managers, Editorial Production: Mary Noel, Rita Jaramillo
Creative Director: Rob Hugel
Art Director: Vernon T. Boes
Print Buyer: Judy Inouye
Permissions Editors: Mandy Groszko, Tim Sisler
Production Service: Scratchgravel Publishing Services
Text Designer: Lisa Buckley
Art Editor: Lisa Torri
Photo Researcher: Laura Cordova
Copy Editor: Margaret C. Tropp
Cover Designer: Irene Morris
Cover Image: Color Blocks #40 by Nancy Crow, photographed by J. Kevin Fitzsimons
Compositor: Newgen
© 2010, 2007 Wadsworth, Cengage Learning
ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may
be reproduced, transmitted, stored, or used in any form or by any means graphic,
electronic, or mechanical, including but not limited to photocopying, recording,
scanning, digitizing, taping, Web distribution, information networks, or information
storage and retrieval systems, except as permitted under Section 107 or 108 of
the 1976 United States Copyright Act, without the prior written permission of the
publisher.
For product information and technology assistance, contact us at
Cengage Learning Customer & Sales Support, 1-800-354-9706.
For permission to use material from this text or product,
submit all requests online at www.cengage.com/permissions.
Further permissions questions can be e-mailed to
permissionrequest@cengage.com.
Library of Congress Control Number: 2008940684
ISBN-13: 978-0-495-60149-4
ISBN-10: 0-495-60149-7
Wadsworth
10 Davis Drive
Belmont, CA 94002-3098
USA
Cengage Learning is a leading provider of customized learning solutions
with office locations around the globe, including Singapore, the
United Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local
office at www.cengage.com/global.
Cengage Learning products are represented in Canada by
Nelson Education, Ltd.
To learn more about Wadsworth, visit www.cengage.com/wadsworth.
Purchase any of our products at your local college store or at our preferred online
store www.ichapters.com.
Printed in Canada
1 2 3 4 5 6 7 12 11 10 09
To my wife, Barbara, more than ever
and
To all of the students and teachers whose suggestions
helped shape this edition
About the Author
E. Bruce Goldstein is Professor Emeritus of Psychology at the University of Pittsburgh and Adjunct Professor of Psychology at the University of
Arizona. He has received the Chancellor’s Distinguished Teaching Award from the University of
Pittsburgh for his classroom teaching and textbook writing. He received his bachelor’s degree in
chemical engineering from Tufts University and
his PhD in experimental psychology from Brown University;
he was a postdoctoral fellow in the Biology Department at
Harvard University before joining the faculty at the University of Pittsburgh. Bruce has published papers on a wide variety of topics, including retinal and cortical physiology, visual
attention, and the perception of pictures. He is the author of
Cognitive Psychology: Connecting Mind, Research, and Everyday Experience, 2nd Edition (Wadsworth, 2008), and the editor of the
Blackwell Handbook of Perception (Blackwell, 2001) and the forthcoming two-volume Sage Encyclopedia of Perception (Sage, 2010).
Brief Contents
1 Introduction to Perception 3
2 Introduction to the Physiology of Perception 23
3 Introduction to Vision 43
4 The Visual Cortex and Beyond 73
5 Perceiving Objects and Scenes 99
6 Visual Attention 133
7 Taking Action 155
8 Perceiving Motion 177
9 Perceiving Color 201
10 Perceiving Depth and Size 229
11 Sound, the Auditory System, and Pitch Perception 259
12 Sound Localization and the Auditory Scene 291
13 Speech Perception 311
14 The Cutaneous Senses 329
15 The Chemical Senses 355
16 Perceptual Development 379
Appendix: Signal Detection Theory 401
Glossary 407
References 425
Name Index 443
Subject Index 449
Contents
1 Introduction to Perception 3
WHY READ THIS BOOK? 4
THE PERCEPTUAL PROCESS 5
The Stimulus 5
Electricity 7
Experience and Action 8
Knowledge 9
DEMONSTRATION: Perceiving a Picture 10
HOW TO APPROACH THE STUDY OF PERCEPTION 11
MEASURING PERCEPTION 12
Description 13
Recognition 13
METHOD: Recognition 13
Detection 13
METHOD: Determining the Absolute Threshold 13
METHOD: Determining the Difference Threshold 15
Magnitude Estimation 16
METHOD: Magnitude Estimation 16
Search 17
Other Methods of Measurement 17
SOMETHING TO CONSIDER: THRESHOLD MEASUREMENT CAN BE INFLUENCED BY HOW A PERSON CHOOSES TO RESPOND 18
❚ TEST YOURSELF 1.1 18
Think About It 19
If You Want to Know More 19
Key Terms 19
Media Resources 19
VL VIRTUAL LAB 20

2 Introduction to the Physiology of Perception 23
THE BRAIN: THE MIND’S COMPUTER 24
Brief History of the Physiological Approach 24
Basic Structure of the Brain 26
NEURONS: CELLS THAT CREATE AND TRANSMIT ELECTRICAL SIGNALS 26
Structure of Neurons 26
Recording Electrical Signals in Neurons 27
METHOD: Recording From a Neuron 27
Chemical Basis of Action Potentials 29
Basic Properties of Action Potentials 30
Events at the Synapse 30
❚ TEST YOURSELF 2.1 32
NEURAL PROCESSING: EXCITATION, INHIBITION, AND INTERACTIONS BETWEEN NEURONS 32
Excitation, Inhibition, and Neural Responding 32
Introduction to Receptive Fields 34
METHOD: Determining a Neuron’s Receptive Field 34
THE SENSORY CODE: HOW THE ENVIRONMENT IS REPRESENTED BY THE FIRING OF NEURONS 36
Specificity Coding: Representation by the Firing of Single Neurons 36
Distributed Coding: Representation by the Firing of Groups of Neurons 38
Sparse Coding: Distributed Coding With Just a Few Neurons 38
SOMETHING TO CONSIDER: THE MIND–BODY PROBLEM 39
❚ TEST YOURSELF 2.2 39
Think About It 40
If You Want to Know More 40
Key Terms 40
Media Resources 40
VL VIRTUAL LAB 40

3 Introduction to Vision 43
FOCUSING LIGHT ONTO THE RETINA 44
Light: The Stimulus for Vision 44
The Eye 44
Light Is Focused by the Eye 44
DEMONSTRATION: Becoming Aware of What Is in Focus 45
TRANSFORMING LIGHT INTO ELECTRICITY 47
The Visual Receptors and Transduction 47
How Does Transduction Occur? 47
PIGMENTS AND PERCEPTION 50
Distribution of the Rods and Cones 50
DEMONSTRATION: Becoming Aware of the Blind Spot 52
DEMONSTRATION: Filling in the Blind Spot 52
Dark Adaptation of the Rods and Cones 52
METHOD: Measuring Dark Adaptation 53
Spectral Sensitivity of the Rods and Cones 56
❚ TEST YOURSELF 3.1 57
NEURAL CONVERGENCE AND PERCEPTION 58
Why Rods Result in Greater Sensitivity Than Cones 58
Why We Use Our Cones to See Details 60
DEMONSTRATION: Foveal Versus Peripheral Acuity 60
LATERAL INHIBITION AND PERCEPTION 61
What the Horseshoe Crab Teaches Us About Inhibition 62
Lateral Inhibition and Lightness Perception 62
DEMONSTRATION: Creating Mach Bands in Shadows 64
DEMONSTRATION: Simultaneous Contrast 66
A Display That Can’t Be Explained by Lateral Inhibition 67
SOMETHING TO CONSIDER: PERCEPTION IS INDIRECT 68
❚ TEST YOURSELF 3.2 68
Think About It 68
If You Want to Know More 69
Key Terms 69
Media Resources 70
VL VIRTUAL LAB 70

4 The Visual Cortex and Beyond 73
FOLLOWING THE SIGNALS FROM RETINA TO CORTEX 74
The Visual System 74
Processing in the Lateral Geniculate Nucleus 75
METHOD: Determining Retinotopic Maps by Recording From Neurons 76
Receptive Fields of Neurons in the Striate Cortex 77
DO FEATURE DETECTORS PLAY A ROLE IN PERCEPTION? 79
Selective Adaptation and Feature Detectors 79
METHOD: Selective Adaptation to Orientation 80
Selective Rearing and Feature Detectors 80
MAPS AND COLUMNS IN THE STRIATE CORTEX 82
Maps in the Striate Cortex 82
METHOD: Brain Imaging 82
Columns in the Striate Cortex 84
How Is an Object Represented in the Striate Cortex? 86
❚ TEST YOURSELF 4.1 87
STREAMS: PATHWAYS FOR WHAT, WHERE, AND HOW 87
Streams for Information About What and Where 88
METHOD: Brain Ablation 88
Streams for Information About What and How 89
METHOD: Dissociations in Neuropsychology 89
MODULARITY: STRUCTURES FOR FACES, PLACES, AND BODIES 91
Face Neurons in the Monkey’s IT Cortex 92
Areas for Faces, Places, and Bodies in the Human Brain 92
SOMETHING TO CONSIDER: HOW DO NEURONS BECOME SPECIALIZED? 94
Is Neural Selectivity Shaped by Evolution? 94
How Neurons Can Be Shaped by Experience 94
❚ TEST YOURSELF 4.2 95
Think About It 95
If You Want to Know More 96
Key Terms 96
Media Resources 97
VL VIRTUAL LAB 97

5 Perceiving Objects and Scenes 99
WHY IS IT SO DIFFICULT TO DESIGN A PERCEIVING MACHINE? 101
The Stimulus on the Receptors Is Ambiguous 101
Objects Can Be Hidden or Blurred 102
Objects Look Different From Different Viewpoints 102
THE GESTALT APPROACH TO OBJECT PERCEPTION 104
DEMONSTRATION: Making Illusory Contours Vanish 104
The Gestalt Laws of Perceptual Organization 105
DEMONSTRATION: Finding Faces in a Landscape 107
Perceptual Segregation: How Objects Are Separated From the Background 108
The Gestalt “Laws” as Heuristics 109
RECOGNITION-BY-COMPONENTS THEORY 110
DEMONSTRATION: Non-Accidental Properties 111
❚ TEST YOURSELF 5.1 113
PERCEIVING SCENES AND OBJECTS IN SCENES 114
Perceiving the Gist of a Scene 114
METHOD: Using a Mask to Achieve Brief Stimulus Presentations 114
Regularities in the Environment: Information for Perceiving 115
DEMONSTRATION: Shape From Shading 116
DEMONSTRATION: Visualizing Scenes and Objects 117
The Role of Inference in Perception 118
Revisiting the Science Project: Designing a Perceiving Machine 119
THE PHYSIOLOGY OF OBJECT AND SCENE PERCEPTION 120
Neurons That Respond to Perceptual Grouping and Figure–Ground 120
How Does the Brain Respond to Objects? 121
Connecting Neural Activity and Perception 122
METHOD: Region-of-Interest Approach 122
SOMETHING TO CONSIDER: MODELS OF BRAIN ACTIVITY THAT CAN PREDICT WHAT A PERSON IS LOOKING AT 124
❚ TEST YOURSELF 5.2 127
Think About It 127
If You Want to Know More 128
Key Terms 129
Media Resources 129
VL VIRTUAL LAB 129

6 Visual Attention 133
ATTENTION AND PERCEIVING THE ENVIRONMENT 134
Why Is Selective Attention Necessary? 134
How Is Selective Attention Achieved? 135
What Determines How We Scan a Scene? 135
HOW DOES ATTENTION AFFECT OUR ABILITY TO PERCEIVE? 137
Perception Can Occur Without Focused Attention 137
Perception Can Be Affected by a Lack of Focused Attention 138
DEMONSTRATION: Change Detection 139
❚ TEST YOURSELF 6.1 141
DOES ATTENTION ENHANCE PERCEPTION? 141
Effects of Attention on Information Processing 141
Effects of Attention on Perception 142
ATTENTION AND EXPERIENCING A COHERENT WORLD 143
Why Is Binding Necessary? 143
Feature Integration Theory 144
DEMONSTRATION: Searching for Conjunctions 145
The Physiological Approach to Binding 146
THE PHYSIOLOGY OF ATTENTION 146
SOMETHING TO CONSIDER: ATTENTION IN AUTISM 148
❚ TEST YOURSELF 6.2 150
Think About It 150
If You Want to Know More 151
Key Terms 152
Media Resources 152
VL VIRTUAL LAB 152

7 Taking Action 155
THE ECOLOGICAL APPROACH TO PERCEPTION 156
The Moving Observer and Information in the Environment 156
Self-Produced Information 157
The Senses Do Not Work in Isolation 158
DEMONSTRATION: Keeping Your Balance 158
NAVIGATING THROUGH THE ENVIRONMENT 159
Other Strategies for Navigating 159
The Physiology of Navigation 161
❚ TEST YOURSELF 7.1 165
ACTING ON OBJECTS: REACHING AND GRASPING 165
Affordances: What Objects Are Used For 165
The Physiology of Reaching and Grasping 166
OBSERVING OTHER PEOPLE’S ACTIONS 168
Mirroring Others’ Actions in the Brain 168
Predicting People’s Intentions 169
Mirror Neurons and Experience 170
SOMETHING TO CONSIDER: CONTROLLING MOVEMENT WITH THE MIND 171
❚ TEST YOURSELF 7.2 172
Think About It 173
If You Want to Know More 173
Key Terms 173
Media Resources 174
VL VIRTUAL LAB 174

8 Perceiving Motion 177
FUNCTIONS OF MOTION PERCEPTION 178
Motion Helps Us Understand Events in Our Environment 178
Motion Attracts Attention 179
Motion Provides Information About Objects 179
DEMONSTRATION: Perceiving a Camouflaged Bird 179
STUDYING MOTION PERCEPTION 180
When Do We Perceive Motion? 180
Comparing Real and Apparent Motion 181
What We Want to Explain 182
MOTION PERCEPTION: INFORMATION IN THE ENVIRONMENT 183
NEURAL FIRING TO MOTION ACROSS THE RETINA 184
Motion of a Stimulus Across the Retina: The Aperture Problem 184
DEMONSTRATION: Motion of a Bar Across an Aperture 185
Motion of Arrays of Dots on the Retina 186
METHOD: Microstimulation 188
❚ TEST YOURSELF 8.1 188
TAKING EYE MOTIONS INTO ACCOUNT: THE COROLLARY DISCHARGE 189
Corollary Discharge Theory 189
Behavioral Demonstrations of Corollary Discharge Theory 190
DEMONSTRATION: Eliminating the Image Displacement Signal With an Afterimage 190
DEMONSTRATION: Seeing Motion by Pushing on Your Eyelid 190
Physiological Evidence for Corollary Discharge Theory 191
PERCEIVING BIOLOGICAL MOTION 192
Brain Activation by Point-Light Walkers 192
Linking Brain Activity and the Perception of Biological Motion 193
METHOD: Transcranial Magnetic Stimulation (TMS) 193
SOMETHING TO CONSIDER: GOING BEYOND THE STIMULUS 194
Implied Motion 194
Apparent Motion 195
❚ TEST YOURSELF 8.2 195
Think About It 196
If You Want to Know More 196
Key Terms 197
Media Resources 197
VL VIRTUAL LAB 197

9 Perceiving Color 201
INTRODUCTION TO COLOR 202
What Are Some Functions of Color Vision? 202
What Colors Do We Perceive? 203
Color and Wavelength 204
Wavelengths Do Not Have Color! 206
TRICHROMATIC THEORY OF COLOR VISION 207
Behavioral Evidence for the Theory 207
The Theory: Vision Is Trichromatic 207
Physiology of Trichromatic Theory 207
❚ TEST YOURSELF 9.1 211
COLOR DEFICIENCY 211
Monochromatism 212
Dichromatism 212
Physiological Mechanisms of Receptor-Based Color Deficiency 213
OPPONENT-PROCESS THEORY OF COLOR VISION 213
Behavioral Evidence for the Theory 213
DEMONSTRATION: The Colors of the Flag 214
DEMONSTRATION: Afterimages and Simultaneous Contrast 214
DEMONSTRATION: Visualizing Colors 214
The Theory: Vision Is An Opponent Process 215
The Physiology of Opponent-Process Vision 215
COLOR IN THE CORTEX 217
❚ TEST YOURSELF 9.2 217
PERCEIVING COLORS UNDER CHANGING ILLUMINATION 217
DEMONSTRATION: Color Perception Under Changing Illumination 218
Chromatic Adaptation 219
DEMONSTRATION: Adapting to Red 219
The Effect of the Surroundings 220
DEMONSTRATION: Color and the Surroundings 220
Memory and Color 220
LIGHTNESS CONSTANCY 220
Intensity Relationships: The Ratio Principle 221
Lightness Perception Under Uneven Illumination 221
DEMONSTRATION: The Penumbra and Lightness Perception 222
DEMONSTRATION: Perceiving Lightness at a Corner 223
SOMETHING TO CONSIDER: EXPERIENCES THAT ARE CREATED BY THE NERVOUS SYSTEM 224
❚ TEST YOURSELF 9.3 224
Think About It 224
If You Want to Know More 225
Key Terms 226
Media Resources 226
VL VIRTUAL LAB 227

10 Perceiving Depth and Size 229
OCULOMOTOR CUES 231
DEMONSTRATION: Feelings in Your Eyes 231
MONOCULAR CUES 231
Pictorial Cues 231
Motion-Produced Cues 233
DEMONSTRATION: Deletion and Accretion 234
BINOCULAR DEPTH INFORMATION 235
Binocular Disparity 235
DEMONSTRATION: Two Eyes: Two Viewpoints 235
Connecting Disparity Information and the Perception of Depth 238
DEMONSTRATION: Binocular Depth From a Picture, Without a Stereoscope 238
The Correspondence Problem 240
DEPTH INFORMATION ACROSS SPECIES 240
THE PHYSIOLOGY OF DEPTH PERCEPTION 242
Neurons That Respond to Pictorial Depth 242
Neurons That Respond to Binocular Disparity 242
Connecting Binocular Depth Cells and Depth Perception 242
❚ TEST YOURSELF 10.1 243
PERCEIVING SIZE 243
The Holway and Boring Experiment 244
Size Constancy 246
DEMONSTRATION: Perceiving Size at a Distance 247
DEMONSTRATION: Size–Distance Scaling and Emmert’s Law 247
VISUAL ILLUSIONS 249
The Müller-Lyer Illusion 249
DEMONSTRATION: Measuring the Müller-Lyer Illusion 249
DEMONSTRATION: The Müller-Lyer Illusion With Books 250
The Ponzo Illusion 251
The Ames Room 251
The Moon Illusion 252
SOMETHING TO CONSIDER: DISTANCE PERCEPTION AND PERCEIVED EFFORT 253
❚ TEST YOURSELF 10.2 254
Think About It 254
If You Want to Know More 255
Key Terms 256
Media Resources 256
VL VIRTUAL LAB 256

11 Sound, the Auditory System, and Pitch Perception 259
[Chapter-opening figure: decibel (dB) scale. Threshold of hearing (0), whisper at 5 feet (20), library (40), normal conversation (60), heavy traffic (80), loud basketball or hockey crowd (100), rock concert in front row (120), pain threshold (130), military jet on runway (140), Space Shuttle launch at ground zero (150).]
THE SOUND STIMULUS 261
Sound as Pressure Changes 261
Pressure Changes: Pure Tones 262
Pressure Changes: Complex Tones 263
PERCEIVING SOUND 264
Loudness 264
Pitch 265
The Range of Hearing 265
Timbre 267
❚ TEST YOURSELF 11.1 268
THE EAR 268
The Outer Ear 268
The Middle Ear 268
The Inner Ear 270
THE REPRESENTATION OF FREQUENCY IN THE COCHLEA 272
Békésy’s Place Theory of Hearing 273
Evidence for Place Theory 274
METHOD: Neural Frequency Tuning Curves 274
METHOD: Auditory Masking 275
How the Basilar Membrane Vibrates to Complex Tones 276
Updating Békésy 277
How the Timing of Neural Firing Can Signal Frequency 277
Hearing Loss Due to Hair Cell Damage 278
❚ TEST YOURSELF 11.2 279
CENTRAL AUDITORY PROCESSING 280
Pathway From the Cochlea to the Cortex 280
Auditory Areas in the Cortex 280
What and Where Streams for Hearing 281
PITCH AND THE BRAIN 283
Linking Physiological Responding and Perception 283
How the Auditory Cortex Is Shaped by Experience 284
SOMETHING TO CONSIDER: COCHLEAR IMPLANTS—WHERE SCIENCE AND CULTURE MEET 285
The Technology 286
The Controversy 287
❚ TEST YOURSELF 11.3 287
Think About It 287
If You Want to Know More 287
Key Terms 288
Media Resources 288
VL VIRTUAL LAB 289

12 Sound Localization and the Auditory Scene 291
AUDITORY LOCALIZATION 292
DEMONSTRATION: Sound Localization 292
Binaural Cues for Sound Location 293
Monaural Cue for Localization 295
THE PHYSIOLOGY OF AUDITORY LOCALIZATION 297
Narrowly Tuned ITD Neurons 297
Broadly Tuned ITD Neurons 298
❚ TEST YOURSELF 12.1 298
PERCEPTUALLY ORGANIZING SOUNDS IN THE ENVIRONMENT 299
Auditory Scene Analysis 299
Principles of Auditory Grouping 300
HEARING INSIDE ROOMS 303
Perceiving Two Sounds That Reach the Ears at Different Times 304
DEMONSTRATION: The Precedence Effect 305
Architectural Acoustics 305
SOMETHING TO CONSIDER: INTERACTIONS BETWEEN VISION AND HEARING 306
❚ TEST YOURSELF 12.2 307
Think About It 307
If You Want to Know More 308
Key Terms 308
Media Resources 308
VL VIRTUAL LAB 308

13 Speech Perception 311
THE SPEECH STIMULUS 312
The Acoustic Signal 312
Basic Units of Speech 313
THE VARIABLE RELATIONSHIP BETWEEN PHONEMES AND THE ACOUSTIC SIGNAL 315
Variability From Context 315
Variability From Different Speakers 315
INFORMATION FOR PHONEME PERCEPTION 316
Categorical Perception 316
Information Provided by the Face 318
Information From Our Knowledge of Language 318
❚ TEST YOURSELF 13.1 319
INFORMATION FOR SPOKEN WORD PERCEPTION 319
Information From Sentence Context 319
DEMONSTRATION: Perceiving Degraded Sentences 319
DEMONSTRATION: Organizing Strings of Sounds 320
Information From Speaker Characteristics 322
SPEECH PERCEPTION AND THE BRAIN 323
Cortical Location of Speech Perception 323
Experience-Dependent Plasticity 324
SOMETHING TO CONSIDER: SPEECH PERCEPTION AND ACTION 324
❚ TEST YOURSELF 13.2 325
Think About It 325
If You Want to Know More 326
Key Terms 326
Media Resources 326
VL VIRTUAL LAB 327

14 The Cutaneous Senses 329
OVERVIEW OF THE CUTANEOUS SYSTEM 330
The Skin 330
Mechanoreceptors 331
Pathways From Skin to Cortex 331
Maps of the Body on the Cortex 332
The Plasticity of Cortical Body Maps 333
PERCEIVING DETAILS 334
METHOD: Measuring Tactile Acuity 335
Receptor Mechanisms for Tactile Acuity 335
DEMONSTRATION: Comparing Two-Point Thresholds 335
Cortical Mechanisms for Tactile Acuity 336
PERCEIVING VIBRATION 337
PERCEIVING TEXTURE 338
DEMONSTRATION: Perceiving Texture With a Pen 339
❚ TEST YOURSELF 14.1 339
PERCEIVING OBJECTS 340
DEMONSTRATION: Identifying Objects 340
Identifying Objects by Haptic Exploration 340
The Physiology of Tactile Object Perception 341
PAIN 343
Questioning the Direct Pathway Model of Pain 343
The Gate Control Model 345
Cognition and Pain 345
The Brain and Pain 346
SOMETHING TO CONSIDER: PAIN IN SOCIAL SITUATIONS 349
❚ TEST YOURSELF 14.2 349
Think About It 350
If You Want to Know More 350
Key Terms 351
Media Resources 351
VL VIRTUAL LAB 351

15 The Chemical Senses 355
THE OLFACTORY SYSTEM 356
Functions of Olfaction 356
Detecting Odors 357
METHOD: Measuring the Detection Threshold 357
Identifying Odors 358
DEMONSTRATION: Naming and Odor Identification 358
The Puzzle of Olfactory Quality 358
THE NEURAL CODE FOR OLFACTORY QUALITY 359
The Olfactory Mucosa 359
Olfactory Receptor Neurons 359
Activating Olfactory Receptor Neurons 361
METHOD: Calcium Imaging 361
Activating the Olfactory Bulb 361
METHOD: Optical Imaging 362
METHOD: 2-Deoxyglucose Technique 362
HIGHER-ORDER OLFACTORY PROCESSING 364
Olfaction in the Environment 364
The Physiology of Higher-Order Processing 365
❚ TEST YOURSELF 15.1 366
THE TASTE SYSTEM 366
Functions of Taste 366
Basic Taste Qualities 367
THE NEURAL CODE FOR TASTE QUALITY 367
Structure of the Taste System 367
Distributed Coding 369
Specificity Coding 370
THE PERCEPTION OF FLAVOR 372
Flavor = Taste + Olfaction 373
DEMONSTRATION: “Tasting” With and Without the Nose 373
The Physiology of Flavor Perception 373
SOMETHING TO CONSIDER: INDIVIDUAL DIFFERENCES IN TASTING 374
❚ TEST YOURSELF 15.2 376
Think About It 376
If You Want to Know More 376
Key Terms 377
Media Resources 377
VL VIRTUAL LAB 377

16 Perceptual Development 379
BASIC VISUAL CAPACITIES 380
Visual Acuity 380
METHODS: Preferential Looking and Visual Evoked Potential 380
Contrast Sensitivity 383
Perceiving Color 384
METHOD: Habituation 385
Perceiving Depth 386
PERCEIVING FACES 387
Recognizing Their Mother’s Face 387
Is There a Special Mechanism for Perceiving Faces? 388
❚ TEST YOURSELF 16.1 389
PERCEIVING OBJECT UNITY 389
HEARING 391
Threshold for Hearing a Tone 391
Recognizing Their Mother’s Voice 391
PERCEIVING SPEECH 392
The Categorical Perception of Phonemes 393
Experience and Speech Perception 394
INTERMODAL PERCEPTION 394
OLFACTION AND TASTE 395
SOMETHING TO CONSIDER: THE UNITY OF PERCEPTION 396
METHOD: Paired Comparison 396
❚ TEST YOURSELF 16.2 397
Think About It 397
If You Want to Know More 398
Key Terms 399
Media Resources 399
VL VIRTUAL LAB 399

APPENDIX: Signal Detection Theory 401
A SIGNAL DETECTION EXPERIMENT 401
SIGNAL DETECTION THEORY 403
Signal and Noise 403
Probability Distributions 404
The Criterion 404
The Effect of Sensitivity on the ROC Curve 405

Glossary 407
References 425
Name Index 443
Subject Index 449
Virtual Lab
Contents
Chapter 1
1. The Method of Limits 13
2. Measuring Illusions 13
3. Measurement Fluctuation and Error 14
4. Adjustment and PSE 14
5. Method of Constant Stimuli 14
6. Just Noticeable Difference 15
7. Weber’s Law and Weber Fraction 15
8. DL vs. Weight 15
Chapter 2
1. Structure of a Neuron 27
2. Oscilloscopes and Intracellular Recording 28
3. Resting Potential 28
4. Phases of Action Potential 29
5. Nerve Impulse Coding and Stimulus Strength 30
6. Synaptic Transmission 30
7. Excitation and Inhibition 32
8. Simple Neural Circuits 32
9. Receptive Fields of Retinal Ganglion Cells 35
10. Mapping Receptive Fields 35
11. Receptive Field Mapping 35
12. Stimulus Size and Receptive Fields 35
13. Receptive Fields and Stimulus Size and Shape 35
Chapter 3
1. A Day Without Sight 44
2. The Human Eye 44
3. Filling In 52
4. Types of Cones 57
5. Cross Section of the Retina 58
6. Visual Path Within the Eyeball 58
7. Receptor Wiring and Sensitivity 59
8. Receptor Wiring and Acuity 61
9. Lateral Inhibition 62
10. Lateral Inhibition in the Hermann Grid 63
11. Receptive Fields of Retinal Ganglion Cells 63
12. Intensity and Brightness 64
13. Vasarely Illusion 64
14. Pyramid Illusion 64
15. Simultaneous Contrast 66
16. Simultaneous Contrast: Dynamic 66
17. Simultaneous Contrast 2 66
18. White’s Illusion 67
19. Craik-O’Brien-Cornsweet Effect 67
20. Criss-Cross Illusion 67
21. Haze Illusion 67
22. Knill and Kersten’s Illusion 67
23. Koffka Ring 67
24. The Corrugated Plaid 67
25. Snake Illusion 67
26. Hermann Grid, Curving 67
Chapter 4
1. The Visual Pathways 75
2. Visual Cortex of the Cat 77
3. Simple Cells in the Cortex 78
4. Complex Cells in the Cortex 78
5. Contrast Sensitivity 79
6. Orientation Aftereffect 80
7. Size Aftereffect 80
8. Development in the Visual Cortex 81
9. Retinotopy Movie: Ring 82
10. Retinotopy Movie: Wedge 82
11. What and Where Streams 89
Chapter 5
1. Robotic Vehicle Navigation: DARPA Urban Challenge 101
2. Apparent Movement 104
3. Linear and Curved Illusory Contours 105
4. Enhancing Illusory Contours 105
5. Context and Perception: The Hering Illusion 105
6. Context and Perception: The Poggendorf Illusion 105
7. Ambiguous Reversible Cube 105
8. Perceptual Organization: The Dalmatian Dog 105
9. Law of Simplicity or Good Figure 105
10. Law of Similarity 106
11. Law of Good Continuation 106
12. Law of Closure 106
13. Law of Proximity 106
14. Law of Common Fate 107
15. Real-World Figure–Ground Ambiguity 108
16. Figure–Ground Ambiguity 109
17. Perceiving Rapidly Flashed Stimuli 114
18. Rotating Mask 1 117
19. Rotating Mask 2 117
20. Rotating Mask 3 117
21. Global Precedence 128
Chapter 6
1. Eye Movements While Viewing a Scene 135
2. Task-Driven Eye Movements 137
3. Perception Without Focused Attention 137
4. Inattentional Blindness Stimuli 138
5. Change Detection: Gradual Changes 140
6. Change Detection: Airplane 140
7. Change Detection: Farm 140
8. Change Blindness: Harborside 140
9. Change Detection: Money 140
10. Change Detection: Sailboats 140
11. Change Detection: Tourists 140
12. Feature Analysis 152
Chapter 7
1. Flow From Walking Down a Hallway 157
2. Stimuli Used in Warren Experiment 159
3. Pierno Stimuli 169
4. Neural Prosthesis 172
Chapter 8
1. Motion Providing Organization: The Hidden Bird 179
2. Perceptual Organization: The Dalmatian Dog 179
3. Motion Parallax and Object Form 180
4. Shape From Movement 180
5. Form and Motion 180
6. Motion Reference 180
7. Motion Binding 180
8. The Phi Phenomenon, Space, and Time 180
9. Illusory Contour Motion 180
10. Apparent Movement and Figural Selection 180
11. Motion Capture 180
12. Induced Movement 181
13. Waterfall Illusion 181
14. Spiral Motion Aftereffect 181
15. Flow From Walking Down a Hallway 184
16. Aperture Problem 185
17. Barberpole Illusion 185
18. Cortical Activation by Motion 187
19. Corollary Discharge Model 189
20. Biological Motion 1 192
21. Biological Motion 2 192
22. Motion and Introduced Occlusion 195
23. Field Effects and Apparent Movement 195
24. Line-Motion Effect 195
25. Context and Apparent Speed 195
Chapter 9
1. Color Mixing 205
2. Cone Response Profiles and Hue 207
3. Cone Response Profiles and Perceived Color 207
4. Color Arrangement Test 211
5. Rod Monochromacy 212
6. Dichromacy 212
7. Missing Blue–Yellow Channel 213
8. “Oh Say Can You See” Afterimage Demonstration 214
9. Mixing Complementary Colors 214
10. Strength of Blue–Yellow Mechanisms 216
11. Strength of Red–Green Mechanism 216
12. Opponent-Process Coding of Hue 216
13. Checker-Shadow Illusion 223
14. Corrugated Plaid Illusion 1 223
15. Corrugated Plaid Illusion 2 223
16. Impossible Steps 223
17. Troxler Effect 224
Chapter 10
1. Convergence 231
2. Shape From Shading 233
3. The Horopter and Corresponding Points 236
4. Disparity and Retinal Location 237
5. Pictures 238
6. Outlines 238
7. Depth Perception 238
8. Random-Dot Stereogram 239
9. The Müller-Lyer Illusion 249
10. The Ponzo Illusion 251
11. Size Perception and Depth 251
12. Horizontal–Vertical Illusion 253
13. Zollner Illusion 253
14. Context and Perception: The Hering Illusion 253
15. Context and Perception: The Poggendorf Illusion 253
16. Poggendorf Illusion 253
Chapter 11
1. Decibel Scale 263
2. Loudness Scaling 264
3. Tone Height and Tone Chroma 265
4. Periodicity Pitch: Eliminating the Fundamental and Lower
Harmonics 265
5. Periodicity Pitch: St. Martin’s Chimes With Harmonics
Removed 265
6. Frequency Response of the Ear 265
7. Harmonics of a Gong 267
8. Effect of Harmonics on Timbre 267
9. Timbre of a Piano Tone Played Backward 267
10. Cochlear Mechanics: Cilia Movement 271
11. Cochlear Mechanics: Traveling Waves 273
12. Masking High and Low Frequencies 275
13. Cochlear Mechanics: Cochlear Amplifier 277
14. Hearing Loss 278
15. Cochlear Implant: Environmental Sounds 287
16. Cochlear Implant: Music 287
17. Cochlear Implant: Speech 287
Chapter 12
1. Interaural Level Difference as a Cue for Sound
Localization 294
2. Grouping by Similarity of Timbre: The Wessel
Demonstration 300
3. Grouping by Pitch and Temporal Closeness 301
4. Effect of Repetition on Grouping by Pitch 301
5. Captor Tone Demonstration 301
6. Grouping by Similarity of Pitch 302
7. Octave Illusion 302
8. Chromatic Scale Illusion 302
9. Auditory Good Continuation 303
10. Melody Schema 303
11. Perceiving Interleaved Melodies 303
12. Layering Naturalistic Sounds 303
13. The Precedence Effect 304
14. Reverberation Time 305
15. Sound and Vision 1: Crossing or Colliding Balls 307
16. Sound and Vision 2: Rolling Ball 307
17. Sound and Vision 3: Flashing Dot 307
Chapter 13
1. Categorical Perception 316
2. The McGurk Effect 318
3. Speechreading 318
4. Statistical Learning Stimuli 321
5. Phantom Words 323
Chapter 14
1. Anatomy of the Skin 331
2. Surfing the Web With Touch 338
3. Gate Control System 345
4. Children and Chronic Pain 345
Chapter 15
1. The Sense of Smell 357
2. Olfactory System 359
3. Taste System 367
4. Anti–Sweet Tooth Gum 370
Chapter 16
1. Preferential Looking Procedure 381
2. Rod Moving Behind Occluder 390
3. Eye Movements Following Moving Ball 391
4. Testing Intermodal Perception in Infants 394
Demonstrations
Perceiving a Picture 10
Becoming Aware of What Is in Focus 45
Becoming Aware of the Blind Spot 52
Filling in the Blind Spot 52
Foveal Versus Peripheral Acuity 60
Creating Mach Bands in Shadows 64
Simultaneous Contrast 66
Making Illusory Contours Vanish 104
Finding Faces in a Landscape 107
Non-Accidental Properties 111
Shape From Shading 116
Visualizing Scenes and Objects 117
Change Detection 139
Searching for Conjunctions 145
Keeping Your Balance 158
Perceiving a Camouflaged Bird 179
Motion of a Bar Across an Aperture 185
Eliminating the Image Displacement Signal With an
Afterimage 190
Seeing Motion by Pushing on Your Eyelid 190
The Colors of the Flag 214
Afterimages and Simultaneous Contrast 214
Visualizing Colors 214
Color Perception Under Changing Illumination 218
Adapting to Red 219
Color and the Surroundings 220
The Penumbra and Lightness Perception 222
Perceiving Lightness at a Corner 223
Feelings in Your Eyes 231
Deletion and Accretion 234
Two Eyes: Two Viewpoints 235
Binocular Depth From a Picture, Without a Stereoscope 238
Perceiving Size at a Distance 247
Size–Distance Scaling and Emmert’s Law 247
Measuring the Müller-Lyer Illusion 249
The Müller-Lyer Illusion With Books 250
Sound Localization 292
The Precedence Effect 305
Perceiving Degraded Sentences 319
Organizing Strings of Sounds 320
Comparing Two-Point Thresholds 335
Perceiving Texture With a Pen 339
Identifying Objects 340
Naming and Odor Identification 358
“Tasting” With and Without the Nose 373
Preface
When I first began working on this book, Hubel and
Wiesel were mapping orientation columns in the
striate cortex and were five years away from receiving their
Nobel Prize; Amoore’s stereochemical theory, based largely
on psychophysical evidence, was a prominent explanation
for odor recognition; and one of the hottest new discoveries
in perception was that the response properties of neurons
could be influenced by experience. Today, specialized areas
in the human brain have been mapped using brain imaging,
olfactory receptors have been revealed using genetic methods, and the idea that the perceptual system is tuned to regularities in the environment is now supported by a wealth of
both behavioral and physiological research.
But some things haven’t changed. Teachers still stand
in front of classrooms to teach students about perception,
and students still read textbooks that reinforce what they
are learning in the classroom. Another thing that hasn’t
changed is that teachers prefer texts that are easy for students to read, that present both classic studies and up-to-date research, and that present both the facts of perception
and overarching themes and principles.
When I began teaching perception, I looked at the textbooks that were available and was disappointed, because
none of them seemed to be written for students. They presented “the facts,” but not in a way that seemed very interesting or inviting. I therefore wrote the first edition of Sensation
and Perception with the idea of involving students in their
study of perception by presenting the material as a story. The
story is a fascinating one, because it is a narrative of one discovery following from another, and a scientific “whodunit”
in which the goal is to uncover the hidden mechanisms responsible for our ability to perceive.
While my goal of writing this book has been to tell a
story, this is, after all, a textbook designed for teaching. So
in addition to presenting the story of perceptual research,
this book also contains a number of features, all of which
appeared in the seventh edition, that are designed to highlight specific material and to help students learn.
Features
■ Demonstrations have been a popular feature of this
book for many editions. They are integrated into the
flow of the text and are easy enough to be carried
out with little trouble, thereby maximizing the probability that students will do them. Some examples:
Becoming Aware of the Blind Spot (Chapter 3); Non-Accidental Properties (Chapter 5, new); The Penumbra and Lightness Perception (Chapter 9); The Precedence Effect (Chapter 12); Perceiving Texture With a
Pen (Chapter 14).
■ Methods It is important not only to present the facts
of perception, but also to make students aware of how
these facts were obtained. Highlighted Methods sections, which are integrated into the ongoing discussion, emphasize the importance of methods, and the
highlighting makes it easier to refer back to them
when referenced later in the book. Examples: Measuring Dark Adaptation (Chapter 3); Dissociations
in Neuropsychology (Chapter 4); Auditory Masking
(Chapter 11).
■ Something to Consider This end-of-chapter feature
offers the opportunity to consider especially interesting new findings. Examples: The Mind–Body
Problem (Chapter 2, new); How Do Neurons Become
Specialized? (Chapter 4); Interactions Between Vision
and Hearing (Chapter 12); Individual Differences in
Tasting (Chapter 15).
■ Test Yourself questions appear in the middle and at the
end of each chapter. These questions are broad enough
so students have to unpack the questions themselves,
thereby making them more active participants in their
studying.
■ Think About It The Think About It section at the end of
each chapter poses questions that require students to
apply what they have learned and that take them beyond the material in the chapter.
■ If You Want to Know More appears at the end of each
chapter, and invites students to look into topics that
were not fully covered in the chapter. A specific finding
is described and key references are presented to provide a starting point for further investigation.
■ Virtual Lab The Virtual Lab feature of this book enables students to view demonstrations and become
participants in mini-experiments. The Virtual Lab
has been completely revamped in this edition. More
than 80 new items have been added to the 150 items
carried over from the seventh edition. Most of these
new items have been generously provided by researchers in vision, hearing, and perceptual development.
Each item is indicated in the chapter by this numbered icon: VL . Students can access the Virtual Lab
in a number of ways: the CD-ROM, Perception PsychologyNow, or WebTutor resource at www.cengage.com/psychology/goldstein.
■
Full-Color Illustrations Perception, of all subjects,
should be illustrated in color, and so I was especially
pleased when the seventh edition became “full-color.”
What pleases me about the illustrations is not only
how beautiful the color looks, but how well it serves
pedagogy. The 535 figures in this edition (140 of them
new) include photographs, which use color to illustrate both stimuli from experiments and perception
in real-world contexts; graphs and diagrams; anatomical diagrams; and the results of brain-imaging
experiments.
Supplement Package
CengageNOW™ for Goldstein’s Sensation
and Perception, Eighth Edition
0-495-80731-1
CengageNOW™ is an online teaching and learning resource
that gives you more control in less time and delivers better
outcomes—NOW. Flexible assignment and gradebook options provide you more control while saving you valuable
time in planning and managing your course assignments.
CengageNOW™ Personalized Study is a diagnostic tool consisting of chapter-specific pre- and post-tests and study plans
that utilize multimedia resources to help students master
the book’s concepts. The study plans direct students to interactive Virtual Labs featuring animations, experiments,
demonstrations, videos, and eBook pages from the text. Students can use the program on their own, or you can assign it
and track their progress in your online gradebook.
Instructor’s Manual With Test Bank
0-495-60151-9
Written by Stephen Wurst of SUNY at Oswego. For each
chapter, this manual contains a detailed chapter outline,
learning objectives, a chapter summary, key terms with page
references, summary of labs on the Virtual Lab CD-ROM,
and suggested websites, films, demonstrations, activities,
and lecture topics. The test bank includes 40 multiple-choice
questions (with correct answer, page reference, and question
type) and 7 to 8 essay questions per chapter.
PowerLecture With JoinIn™ and ExamView®
0-495-60319-8
This one-stop lecture and class preparation tool contains
ready-to-use Microsoft® PowerPoint® slides written by
Terri Bonebright of DePauw University, and allows you to
assemble, edit, publish, and present custom lectures for
your course. PowerLecture lets you bring together text-specific lecture outlines along with videos of your own materials, culminating in a powerful, personalized, media-enhanced presentation. The CD-ROM also includes JoinIn™,
an interactive tool that lets you pose book-specific questions and display students’ answers seamlessly within the
Microsoft® PowerPoint® slides of your own lecture, in conjunction with the “clicker” hardware of your choice, as well
as the ExamView® assessment and tutorial system, which
guides you step by step through the process of creating
tests.
Changes in This Edition
Here are some of the changes in this edition, which have
been made both to make the book easier to read and to keep
current with the latest research.
Taking Student Feedback Into Account
In past revisions I have made changes based on feedback
that professors have provided based on their knowledge of
the field and their experience in teaching from the book. In
this edition, I have, for the first time, made use of extensive
feedback provided by students based on their experience in
using the book. I asked each of the 150 students in my class
to write a paragraph in which they identified one thing in
each chapter they felt could be made clearer. My students
identified where and why they were having problems, and
often suggested changes in wording or organization. When
just one or two students commented on a particular section,
I often used their comments to make improvements, but I
paid the most attention when many students commented on
the same material. I could write a “Top Ten” list of sections
students thought should be revised, but instead I’ll just say
that student feedback resulted in numerous changes to every chapter in the book. Because of these changes, this is the
most “student friendly” edition yet.
Improving Organization
The organization of material within every chapter has been
evaluated with an eye toward improving clarity of presentation. A few examples:
■ Chapters 2–4: These chapters set the stage for the
rest of the book by introducing students to the basic principles of vision and physiology. Responding
to feedback from users of the seventh edition, I now
introduce basic physiological processes in Chapter 2.
This means that topics such as sensory coding, neural circuits, and receptive fields that were formerly in
Chapters 3 and 4 are now introduced at the beginning
of the book. Vision is introduced in Chapter 3, focusing on the retina, and higher-order visual processes
are described in Chapter 4. This sequence of three
chapters now flows more smoothly than in the seventh
edition.
■ Chapter 5: Material on the physiology of object perception, which was formerly in the middle of the chapter,
has been moved to the end, allowing for an uninterrupted discussion of behavioral approaches to understanding the perception of objects and scenes.
■ Chapter 14: Discussion of gate control theory is no
longer at the end of the section on pain, but is now
introduced early in the section. We first consider what
motivated Melzack and Wall to propose the theory by
describing how pain perception was explained in the
early 1960s; then the theory is described, followed by a
discussion of new research on cognitive influences on
pain perception.
If you have used this book before, you will notice that
the final chapter of the sixth edition, “Clinical Aspects of
Vision and Hearing,” is no longer in the book. This chapter,
which was eliminated in the seventh edition to make room
for other material, such as a new chapter on visual attention,
described how vision and hearing can become impaired,
what happens during eye and ear examinations, and some
of the medical procedures that have been used to deal with
these problems. Some of this material has been included
in this edition, but for a fuller treatment, go to the book’s
website at www.cengage.com/psychology/goldstein for a
reprint of that chapter.
Adding New Content
The updating of this edition is reflected in the inclusion of
more than 100 new references, most to recent research. In
addition, some earlier research has been added, and some
descriptions from the seventh edition have been updated.
Here are a few of these new additions.
Chapter 2: Introduction to the Physiology of Perception
■ Sparse coding
■ The mind–body problem
Chapter 4: The Visual Cortex and Beyond
■ Information flow in the lateral geniculate nucleus
Chapter 5: Perceiving Objects and Scenes
■ What is a scene?
■ Perceiving the gist of a scene
■ Perceiving objects in scenes (the effect of context on
object perception)
■ Regularities in the environment
■ Will robot vision ever be as good as human vision?
■ Models of brain activity that can predict what a person
is seeing
Chapter 6: Visual Attention
■ Perception without attention (updated)
■ Attention in autism
Chapter 7: Taking Action
■ Cortical response to the intention to take action
■ Neuropsychology of affordances
■ Mirror neurons and predicting another person’s
intentions
■ Behavioral and physiological responses during navigation by London taxi drivers
■ Neural prostheses: controlling movement with the
mind
Chapter 8: Perceiving Motion
■ Aperture problem (updated)
■ Transcranial magnetic stimulation and biological
motion
Chapter 9: Perceiving Color
■ Why two types of cones are necessary for color vision
(clarified)
■ Information that opponent neurons add to the trichromatic receptor response
■ Memory color (updated)
Chapter 10: Perceiving Depth and Size
■ Relative disparity added to discussion of absolute
disparity
■ Depth information across species
■ Is there a depth area in the brain?
Chapter 11: Sound, the Auditory System,
and Pitch Perception
■ Ion flow and bending of inner hair cell cilia
■ Cochlear amplifier action of outer hair cells (updated)
■ Conductive hearing loss, sensorineural hearing loss,
presbycusis, and noise-induced hearing loss
■ Potential for hearing loss from listening to MP3 players
■ “Pitch neurons” in the cortex that respond to fundamental frequency even if the fundamental is missing
Chapter 12: Sound Localization and the Auditory Scene
■ Cone of confusion
■ Jeffress “coincidence detector” circuit for localization
■ Broadly tuned ITD neurons and localization added to
discussion of narrowly tuned neurons
■ Architectural acoustics expanded, including acoustics
in classrooms
Chapter 13: Speech Perception
■ Transitional probabilities as providing information
for speech segmentation
■ Dual-stream model of speech perception
■ Speech perception and action
Chapter 14: Cutaneous Senses
■ The case of Ian Waterman, who lost his senses of touch
and proprioception
■ Gate control theory placed in historical perspective
■ Brain activity in physically produced pain and pain
induced by hypnosis
Chapter 15: The Chemical Senses
■ Glomeruli as information-collecting units (updated)
■ Higher-level olfactory processing, including the perceptual organization of smell
■ Piriform cortex and perceptual learning
■ How the orbitofrontal cortex response to pleasantness
is affected by cognitive factors
Chapter 16: Perceptual Development
■ Measurement of contrast sensitivity function (clarified)
Acknowledgments
It is a pleasure to acknowledge the following people who
worked tirelessly to turn my manuscript into an actual
book! Without these people, this book would not exist, and I
am grateful to all of them.
■ Jeremy Judson, my developmental editor, for keeping
me on schedule, shepherding the book through the production process, and all those phone conversations.
■ Anne Draus of Scratchgravel Production Services, for
taking care of the amazing number of details involved
in turning my manuscript into a book in her usual
efficient and professional way.
■ Peggy Tropp, for her expert and creative copy editing.
■ Lauren Keyes for her attention to detail, for her technical expertise, and for being a pleasure to work with
on the updating of the Virtual Labs.
■ Armira Rezec, Saddleback College, for obtaining new
items for the Virtual Lab, and for revising the Virtual
Lab manual.
■ Trina Tom, for her work on the ancillaries, especially
the new Virtual Lab manual.
■ Laura Cordova for her relentless quest for photo
permissions.
■ Lisa Torri, my art editor, for continuing the tradition
of working on my book, which started many editions
ago, and for all the care and creativity that went into
making all of the illustrations happen.
■ Mary Noel and Rita Jaramillo, senior content project
managers, who coordinated all of the elements of the
book during production and made sure everything
happened when it was supposed to so the book would
get to the printer on time.
■ Vernon Boes, art guru, who directed the design for the
book. Thanks, Vernon, for the therapeutic conversations, not to mention the great design and cover.
■ Lisa Buckley for the elegant design and Irene Morris
for the striking cover.
■ Precision Graphics for the beautiful art renderings.
■ Stephen Wurst, SUNY Oswego, for revising the Instructor’s Manual and Test Bank, which he wrote for
the previous edition.
In addition to the help I received from all of the above
people on the editorial and production side, I also received
a great deal of help from researchers and teachers. One of
the things I have learned in my years of writing is that other
people’s advice is crucial. The field of perception is a broad
one, and I rely heavily on the advice of experts in specific areas to alert me to emerging new research and to check my
writing for accuracy. Equally important are all of the teachers of perception who rely on textbooks in their courses.
They have read groups of chapters (and in a few cases, the
whole book), with an eye to both accuracy of the material
and pedagogy. I owe a great debt of thanks to this group of
reviewers for their advice about how to present the material
to their students. The following is a list of those who provided advice about content and teachability for this edition
of the book.
Christopher Brown
Arizona State University
Carol Colby
University of Pittsburgh
John Culling
University of Cardiff
Stuart Derbyshire
University of Birmingham
Diana Deutsch
University of California, San Diego
Laura Edelman
Muhlenberg College
Jack Gallant
University of California, Berkeley
Robert T. Weathersby
Eastern University
Mel Goodale
University of Western Ontario
Shannon N. Whitten
University of Central Florida
Mark Hollins
University of North Carolina
Donald Wilson
New York University
Marcel Just
Carnegie-Mellon University
Takashi Yamauchi
Texas A & M
Kendrick Kay
University of California, Berkeley
Thomas Yin
University of Wisconsin
Jeremy Loebach
Macalester College
William Yost
Arizona State University
David McAlpine
University College, London
I also thank the following people who donated photographs
and research records for illustrations that are new to this
edition.
Eriko Miyahara
California State University, Fullerton
Moshe Bar
Harvard University
Sam Musallam
University of Toronto
William Bosking
University of Texas
John Neuhoff
The College of Wooster
Mary Bravo
Rutgers University
Crystal Oberle
Texas State University
Paul Breslin
Monell Chemical Senses Center
Aude Oliva
Massachusetts Institute of Technology
Beatriz Calvo-Merino
University College, London
Andrew Parker
University of Oxford
Joseph Carroll
University of Wisconsin
Mary Peterson
University of Arizona
Stuart Derbyshire
University of Birmingham
David Pisoni
Indiana University
John Donahue
Brown University and Cyberkinetics, Inc.
Jan Schnupp
University of Oxford
Marc Ericson
Wright-Patterson Air Force Base
Bennett Schwartz
Florida International University
Li Fei-Fei
Princeton University
Alan Searleman
St. Lawrence University
David Furness
Keele University
Marc Sommer
University of Pittsburgh
Gregory Hickok
University of California, Irvine
Frank Tong
Vanderbilt University
Andrew Hollingworth
University of Iowa
Chris Urmson
Carnegie-Mellon University
David Laing
University of New South Wales
Eleanor Maguire
University College, London
John Donahue
Brown University and Cyberkinetics, Inc.
Pascal Mamassian
Université Paris Descartes
Li Fei-Fei
Princeton University
Edward Morrison
Auburn University
Claire Murphy
San Diego State University
Aude Oliva
Massachusetts Institute of Technology
Kevin Pelphrey
Yale University
Andrea Pierno
University of Padua
Maryam Shahbake
University of Western Sydney
Frank Tong
Vanderbilt University
Antonio Torralba
Massachusetts Institute of Technology
Mary Hayhoe
University of Texas
Laurie Heller
Brown University
John Henderson
University of Edinburgh
George Hollich
Purdue University
Scott Johnson
University of California, Los Angeles
James Kalat
North Carolina State University
Stephen Neely
Boys Town Hospital, Omaha
Chris Urmson
Tartan Racing, Carnegie-Mellon University
Thomas Papathomas
Rutgers University
Brian Wandell
Stanford University
Phonak Corporation
Stafa, Switzerland
Donald Wilson
New York University
Andrea Pierno
University of Padua
Finally, I thank all of the people and organizations who generously provided demonstrations and movies for the revised
Virtual Lab CD-ROM.
ABC News
New York, New York
Edward Adelson
Massachusetts Institute of Technology
Leila Reddy
Massachusetts Institute of Technology
Ronald Rensink
University of British Columbia
Sensimetrics Corporation
Malden, Massachusetts
Michael Bach
University of Freiburg
Ladan Shams
University of California, Los Angeles
Colin Blakemore
Cambridge University
Nikolaus Troje
Queen’s University
Geoffrey Boynton
University of Washington
Chris Urmson
Tartan Racing, Carnegie-Mellon University
Diana Deutsch
University of California, San Diego
Peter Wenderoth
Macquarie University
Sensation
and Perception
CHAPTER 1
Introduction to Perception
Chapter Contents
WHY READ THIS BOOK?
THE PERCEPTUAL PROCESS
The Stimulus
Electricity
Experience and Action
Knowledge
DEMONSTRATION: Perceiving a Picture
HOW TO APPROACH THE STUDY OF PERCEPTION
MEASURING PERCEPTION
Description
Recognition
METHOD: Recognition
Detection
METHOD: Determining the Absolute Threshold
METHOD: Determining the Difference Threshold
Magnitude Estimation
METHOD: Magnitude Estimation
Search
Other Methods of Measurement
SOMETHING TO CONSIDER: THRESHOLD MEASUREMENT CAN BE INFLUENCED BY HOW A PERSON CHOOSES TO RESPOND
❚ TEST YOURSELF 1.1
Think About It
If You Want to Know More
Key Terms
Media Resources
VL VIRTUAL LAB
VL The Virtual Lab icons direct you to specific animations and videos
designed to help you visualize what you are reading about. The number
beside each icon indicates the number of the clip you can access through
your CD-ROM or your student website.
Some Questions We Will Consider:
❚ Why should you read this book? (p. 4)
❚ How are your perceptions determined by processes that
you are unaware of? (p. 5)
❚ What is the difference between perceiving something
and recognizing it? (p. 8)
❚ How can we measure perception? (p. 12)
Imagine that you have been given the following hypothetical science project.
Science project:
Design a device that can locate, describe, and identify all
objects in the environment, including their distance
from the device and their relationships to each other. In
addition, make the device capable of traveling from one
point to another, avoiding obstacles along the way.
Extra credit:
Make the device capable of having conscious experience,
such as what people experience when they look out at a
scene.
Warning:
This project, should you decide to accept it, is extremely
difficult. It has not yet been solved by the best computer
scientists, even though they have access to the world’s
most powerful computers.
Hint:
Humans and animals have solved the problems above
in a particularly elegant way. They use (1) two spherical sensors called “eyes,” which contain a light-sensitive
chemical, to sense light; (2) two detectors on the sides
of the head, which are fitted with tiny vibrating hairs
to sense pressure changes in the air; (3) small pressure
detectors of various shapes imbedded under the skin to
sense stimuli on the skin; and (4) two types of chemical
detectors to detect gases that are inhaled and solids and
liquids that are ingested.
Additional note:
Designing the detectors is just the first step in designing the system. An information processing system is
also needed. In the case of the human, this information
processing system is a “computer” called the brain, with
100 billion active units and interconnections so complex that they have still not been completely deciphered.
Although the detectors are an important part of the
project, the design of the computer is crucial, because
the information that is picked up by the detectors needs
to be analyzed. Note that operation of the human system is still not completely understood and that the best
scientific minds in the world have made little progress
with the extra credit part of the problem. Focus on the
main problem first, and leave conscious experience until
later.
The “science project” above is what this book is about.
Our goal is to understand the human model, starting with
the detectors—the eyes, ears, skin receptors, and receptors
in the nose and mouth—and then moving on to the computer—
the brain. We want to understand how we sense things in
the environment and interact with them. The paradox we face
in searching for this understanding is that although we still
don’t understand perception, perceiving is something that
occurs almost effortlessly. In most situations, we simply open
our eyes and see what is around us, or listen and hear sounds,
without expending any particular effort.
Because of the ease with which we perceive, many people
see perception as something that “just happens,” and don’t
see the feats achieved by our senses as complex or amazing. “After all,” the skeptic might say, “for vision, a picture
of the environment is focused on the back of my eye, and
that picture provides all the information my brain needs to
duplicate the environment in my consciousness.” But the
idea that perception is not complex is exactly what misled
computer scientists in the 1950s and 1960s to propose that
it would take only about a decade or so to create “perceiving machines” that could negotiate the environment with
humanlike ease. That prediction, made half a century ago,
has yet to come true, even though a computer defeated the
world chess champion in 1997. From a computer’s point of
view, perceiving a scene is more difficult than playing world
championship chess.
In this chapter we will begin by introducing some basic principles to help us understand the complexities of
perception. We will first consider a few practical reasons for
studying perception, then examine how perception occurs
in a sequence of steps, and finally consider how to measure
perception.
Why Read This Book?
The most obvious answer to the question “Why read this
book?” is that it is required reading for a course you are
taking. Thus, it is probably an important thing to do if
you want to get a good grade. But beyond that, there are
a number of other reasons for reading this book. For one
thing, the material will provide you with information that
may be helpful in other courses and perhaps even your future career. If you plan to go to graduate school to become
a researcher or teacher in perception or a related area, this
book will provide you with a solid background to build on.
In fact, a number of the research studies you will read about
were carried out by researchers who were introduced to the
field of perception by earlier editions of this book.
The material in this book is also relevant to future studies in medicine or related fields, since much of our discussion
is about how the body operates. A few medical applications
that depend on knowledge of perception are devices to restore perception to people who have lost vision or hearing,
and treatments for pain. Other applications include robotic
vehicles that can find their way through unfamiliar environments, speech recognition systems that can understand
what someone is saying, and highway signs that are visible to
drivers under a variety of conditions.
But reasons to study perception extend beyond the
possibility of useful applications. Because perception is
something you experience constantly, knowing about how
it works is interesting in its own right. To appreciate why,
consider what you are experiencing right now. If you touch
the page of this book, or look out at what’s around you, you
might get the feeling that you are perceiving exactly what
is “out there” in the environment. After all, touching this
page puts you in direct contact with it, and it seems likely
that what you are seeing is what is actually there. But one
of the things you will learn as you study perception is that
everything you see, hear, taste, feel, or smell is created by
the mechanisms of your senses. This means that what you
are perceiving is determined not only by what is “out there,”
but also by the properties of your senses. This concept has
fascinated philosophers, researchers, and students for hundreds of years, and is even more meaningful now because of
recent advances in our understanding of the mechanisms
responsible for our perceptions.
Another reason to study perception is that it can help
you become more aware of the nature of your own perceptual experiences. Many of the everyday experiences that
you take for granted—such as listening to someone talking,
tasting food, or looking at a painting in a museum—can be
appreciated at a deeper level by considering questions such as
“Why does an unfamiliar language sound as if it is one continuous stream of sound, without breaks between words?”
“Why do I lose my sense of taste when I have a cold?” and
“How do artists create an impression of depth in a picture?”
This book will not only answer these questions but will
answer other questions that you may not have thought of,
such as “Why don’t I see colors at dusk?” and “How come the
scene around me doesn’t appear to move as I walk through
it?” Thus, even if you aren’t planning to become a physician
or a robotic vehicle designer, you will come away from reading this book with a heightened appreciation of both the
complexity and the beauty of the mechanisms responsible
for your perceptual experiences, and perhaps even with an
enhanced awareness of the world around you.
In one of those strange coincidences that occasionally
happen, I received an e-mail from a student (not one of my
own, but from another university) at exactly the same time
that I was writing this section of the book. In her e-mail,
“Jenny” made a number of comments about the book, but
the one that struck me as being particularly relevant to the
question “Why read this book?” is the following: “By reading your book, I got to know the fascinating processes that
take place every second in my brain, that are doing things I
don’t even think about.” Your reasons for reading this book
may turn out to be totally different from Jenny’s, but hopefully you will find out some things that will be useful, or
fascinating, or both.
The Perceptual Process
One of the messages of this book is that perception does not
just happen, but is the end result of complex “behind the
scenes” processes, many of which are not available to your
awareness. An everyday example of the idea of behind-the-scenes processes is provided by what’s happening as you
watch a play in the theater. While your attention is focused
on the drama created by the characters in the play, another
drama is occurring backstage. An actress is rushing to complete her costume change, an actor is pacing back and forth
to calm his nerves just before he goes on, the stage manager
is checking to be sure the next scene change is ready to go,
and the lighting director is getting ready to make the next
lighting change.
Just as the audience sees only a small part of what is
happening during a play, your perception of the world
around you is only a small part of what is happening as you
perceive. One way to illustrate the behind-the-scenes processes involved in perception is by describing a sequence of
steps, which we will call the perceptual process.
The perceptual process, shown in Figure 1.1, is a sequence of processes that work together to determine our experience of and reaction to stimuli in the environment. We
will consider each step in the process individually, but first
let’s consider the boxes in Figure 1.1, which divide the process into four categories: Stimulus, Electricity, Experience and
Action, and Knowledge.
Stimulus refers to what is out there in the environment,
what we actually pay attention to, and what stimulates our
receptors. Electricity refers to the electrical signals that are
created by the receptors and transmitted to the brain. Experience and Action refers to our goal—to perceive, recognize,
and react to the stimuli. Knowledge refers to knowledge we
bring to the perceptual situation. This box is located above
the other three boxes because it can have its effect at many
different points in the process. We will consider each box in
detail, beginning with the stimulus.
The Stimulus
The stimulus exists both “out there,” in the environment,
and within the person’s body.
Environmental Stimuli and Attended Stimuli
These two aspects of the stimulus are in the environment.
The environmental stimulus is all of the things in our
environment that we can potentially perceive. Consider,
for example, the potential stimuli that are presented to
Ellen, who is taking a walk in the woods. As she walks along
the trail she is confronted with a large number of stimuli
(Figure 1.2a)—trees, the path on which she is walking, rustling noises made by a small animal scampering through the
leaves. Because there is far too much happening for Ellen to
take in everything at once, she scans the scene, looking from
one place to another at things that catch her interest.
Figure 1.1 ❚ The perceptual process. The steps in this process are arranged in a circle to emphasize that the process is dynamic and continually changing. See text for descriptions of each step in the process.
When Ellen’s attention is captured by a particularly distinctive-looking tree off to the right, she doesn’t notice the
interesting pattern on the tree trunk at first, but suddenly
realizes that what she at first took to be a patch of moss is
actually a moth (Figure 1.2b). When Ellen focuses on this
moth, making it the center of her attention, it becomes the
attended stimulus. The attended stimulus changes from
moment to moment, as Ellen shifts her attention from place
to place.
Figure 1.2 ❚ (a) We take the woods as the starting point for our description of the perceptual process. Everything in the woods is the environmental stimulus. (b) Ellen focuses on the moth, which becomes the attended stimulus. (c) An image of the moth is formed on Ellen’s retina.
The Stimulus on the Receptors When Ellen
focuses her attention on the moth, she looks directly at it,
and this creates an image of the moth and its immediate
surroundings on the receptors of her retina, a 0.4-mm-thick
network of light-sensitive receptors and other neurons that
line the back of the eye (Figure 1.2c). (We will describe the
retina and neurons in more detail in Chapters 2 and 3.) This
step is important because the stimulus—the moth—is transformed into another form—an image on Ellen’s retina.
Because the moth has been transformed into an image,
we can describe this image as a representation of the moth.
It’s not the actual moth, but it stands for the moth. The
next steps in the perceptual process carry this idea of representation a step further, when the image is transformed
into electricity.
Electricity
One of the central principles of perception is that everything
we perceive is based on electrical signals in our nervous system. These electrical signals are created in the receptors,
which transform energy from the environment (such as the
light on Ellen’s retina) into electrical signals in the nervous
system—a process called transduction.
Transduction Transduction is the transformation of
one form of energy into another form of energy. For example, when you touch the “withdrawal” button on an ATM,
the pressure exerted by your finger is transduced
into electrical energy, which causes a device that uses mechanical energy to push your money out of the machine.
Transduction occurs in the nervous system when energy
in the environment—such as light energy, mechanical pressure, or chemical energy—is transformed into electrical energy. In our example, the pattern of light created on Ellen’s
retina by the moth is transformed into electrical signals in
thousands of her visual receptors (Figure 1.3a).
Transmission After the moth’s image has been transformed
into electrical signals in Ellen’s receptors, these
signals activate other neurons, which in turn activate more
neurons (Figure 1.3b). Eventually these signals travel out of
the eye and are transmitted to the brain. The transmission
step is crucial because if signals don’t reach the brain, there
is no perception.
Processing As electrical signals are transmitted
through Ellen’s retina and then to the brain, they undergo
neural processing, which involves interactions between
neurons (Figure 1.3c). What do these interactions between
neurons accomplish? To answer this question, we will compare how signals are transmitted in the nervous system to
how signals are transmitted by your cell phone.
Let’s first consider the phone. When a person says
“hello” into a cell phone (right phone in Figure 1.4a), this
voice signal is changed into electrical signals, which are sent
out from the cell phone. This electrical signal, which represents the sound “hello,” is relayed by a tower to the receiving
cell phone (on the left), which transforms the signal into the
sound “hello.” An important property of cell phone transmission is that the signal that is received is the same as the
signal that was sent.
The nervous system works in a similar way. The image
of the moth is changed into electrical signals in the receptors, which eventually are sent out the back of the eye (Figure 1.4b). This signal, which represents the moth, is relayed
through a series of neurons to the brain, which transforms
this signal into a perception of the moth. Thus, with a cell
phone, electrical signals that represent a stimulus (“hello”)
are transmitted to a receiver (another cell phone), and in the
nervous system, electrical signals representing a stimulus
(the moth) are also transmitted to a receiver (the brain).
Figure 1.3 ❚ (a) Transduction occurs when the receptors create electrical energy in response to light. (b) Transmission occurs as one neuron activates the next one. (c) This electrical energy is processed through networks of neurons.
There are, however, differences between information
transmission in cell phones and in the nervous system. With
cell phones, the signal received is the same as the signal
sent. The goal for cell phones is to transmit an exact copy of
the original signal. However, in the nervous system, the signal that reaches the brain is transformed so that, although
it represents the original stimulus, it is usually very different
from the original signal.
The transformation that occurs between the receptors
and the brain is achieved by neural processing, which happens as the signals that originate in the receptors travel
through a maze of interconnected pathways between the receptors and the brain and within the brain. In the nervous
system, the original electrical representation of the stimulus
that is created by the receptors is transformed by processing
into a new representation of the stimulus in the brain. In
Chapter 2 we will describe how this transformation occurs.
Figure 1.4 ❚ Comparison of signal transmission by cell phones and the nervous system. (a) The sending cell phone on the right sends an electrical signal that stands for “hello.” The signal that reaches the receiving cell phone on the left is the same as the signal sent. (b) The nervous system sends electrical signals that stand for the moth. The nervous system processes these electrical signals, so the signal responsible for perceiving the moth is different from the original signal sent from the eye.
Experience and Action
We have now reached the third box of the perceptual process, where the “backstage activity” of transduction, transmission, and processing is transformed into things we are
aware of—perceiving, recognizing, and acting on objects in
the environment.
Perception Perception is conscious sensory experience.
It occurs when the electrical signals that represent the
moth are transformed by Ellen’s brain into her experience of
seeing the moth (Figure 1.5a). In the past, some accounts of
the perceptual process have stopped at this stage. After all,
once Ellen sees the moth, hasn’t she perceived it? The answer
to this question is yes, she has perceived it, but other things
have happened as well—she has recognized the form as a
“moth” and not a “butterfly,” and she has taken action based
on her perception by walking closer to the tree to get a better look at the moth. These two additional steps—recognition
and action—are behaviors that are important outcomes of
the perceptual process.
Recognition Recognition is our ability to place an
object in a category, such as “moth,” that gives it meaning
(Figure 1.5b). Although we might be tempted to group perception and recognition together, researchers have shown
that they are separate processes. For example, consider
the case of Dr. P., a patient described by neurologist Oliver
Sacks (1985) in the title story of his book The Man Who Mistook His Wife for a Hat.
Figure 1.5 ❚ (a) Ellen has conscious perception of the moth. (b) She recognizes the moth. (c) She takes action by walking toward the tree to get a better view.
Dr. P., a well-known musician and music teacher, first
noticed a problem when he began having trouble recognizing his students visually, although he could immediately
identify them by the sound of their voices. But when Dr. P.
began misperceiving common objects, for example addressing a parking meter as if it were a person or expecting a carved
knob on a piece of furniture to engage him in conversation,
it became clear that his problem was more serious than just
a little forgetfulness. Was he blind, or perhaps crazy? It was
clear from an eye examination that he could see well and, by
many other criteria, it was obvious that he was not crazy.
Dr. P.’s problem was eventually diagnosed as visual
form agnosia—an inability to recognize objects—that
was caused by a brain tumor. He perceived the parts of objects but couldn’t identify the whole object, so when Sacks
showed him a glove, Dr. P. described it as “a continuous surface unfolded on itself. It appears to have five outpouchings,
if this is the word.” When Sacks asked him what it was, Dr. P.
hypothesized that it was “a container of some sort. It could
be a change purse, for example, for coins of five sizes.” The
normally easy process of object recognition had, for Dr. P.,
been derailed by his brain tumor. He could perceive the object and recognize parts of it, but couldn’t perceptually assemble the parts in a way that would enable him to recognize
the object as a whole. Cases such as this show that it is important to distinguish between perception and recognition.
Action Action includes motor activities such as moving
the head or eyes and locomoting through the environment. In our example, Ellen looks directly at the moth and
walks toward it (Figure 1.5c). Some researchers see action
as an important outcome of the perceptual process because
of its importance for survival. David Milner and Melvyn
Goodale (1995) propose that early in the evolution of animals the major goal of visual processing was not to create a
conscious perception or “picture” of the environment, but
to help the animal control navigation, catch prey, avoid obstacles, and detect predators—all crucial functions for the
animal’s survival.
The fact that perception often leads to action—whether
it be an animal’s increasing its vigilance when it hears a twig
snap in the forest or a person’s deciding to look more closely
at something that looks interesting—means that perception
is a continuously changing process. For example, the scene
that Ellen is observing changes every time she shifts her
attention to something else or moves to a new location, or
when something in the scene moves.
The changes that occur as people perceive are the reason the steps of the perceptual process in Figure 1.1 are arranged in a circle. Although we can describe the perceptual
process as a series of steps that “begin” with the environmental stimulus and “end” with perception, recognition,
and action, the overall process is so dynamic and continually changing that it doesn’t really have a beginning point
or an ending point.
Knowledge
Our diagram of the perceptual process also includes a
fourth box—Knowledge. Knowledge is any information
that the perceiver brings to a situation. Knowledge is placed
above the circle because it can affect a number of the steps
in the perceptual process. Information that a person brings
to a situation can be things learned years ago, such as when
Ellen learned to tell the difference between a moth and a
butterfly, or knowledge obtained from events that have just
happened. The following demonstration provides an example of how perception can be influenced by knowledge that
has just been acquired.
Figure 1.6 ❚ See Perceiving a Picture in the Demonstration box below for instructions. (Adapted from Bugelski & Alampay, 1961.)
DEMONSTRATION
Perceiving a Picture
After looking at the drawing in Figure 1.6, close your eyes,
turn the page, and open and shut your eyes rapidly to briefly
expose the picture that is in the same location on the
page as the picture above. Decide what the picture is; then
read the explanation below it. Do this now, before reading
further. ❚
Did you identify Figure 1.9 as a rat (or a mouse)? If you
did, you were influenced by the clearly rat- or mouselike
figure you observed initially. But people who first observe
Figure 1.11 (page 14) instead of Figure 1.6 usually identify
Figure 1.9 as a man. (Try this on someone else.) This demonstration, which is called the rat–man demonstration,
shows how recently acquired knowledge (“that pattern is a
rat”) can influence perception.
An example of how knowledge acquired years ago can
influence the perceptual process is the ability to categorize
objects. Thus, Ellen can say “that is a moth” because of her
knowledge of what moths look like. In addition, this knowledge can have perceptual consequences because it might
help her distinguish the moth from the tree trunk. Someone with little knowledge of moths might just see a tree
trunk, without becoming aware of the moth at all.
Another way to describe the effect of information that
the perceiver brings to the situation is by distinguishing
between bottom-up processing and top-down processing.
Bottom-up processing (also called data-based processing) is processing that is based on incoming data. Incoming data always provide the starting point for perception
because without incoming data, there is no perception. For
Ellen, the incoming data are the patterns of light and dark
on her retina created by light reflected from the moth and
the tree (Figure 1.7a).
Figure 1.7 ❚ Perception is determined by an interaction between bottom-up processing, which starts with the image on the receptors, and top-down processing, which brings the observer’s knowledge into play. In this example, (a) the image of the moth on Ellen’s retina initiates bottom-up processing; and (b) her prior knowledge of moths contributes to top-down processing.
Top-down processing (also called knowledge-based
processing) refers to processing that is based on knowledge (Figure 1.7b). For Ellen, this knowledge includes what
she knows about moths. Knowledge isn’t always involved in
perception but, as we will see, it often is—sometimes without our even being aware of it.
Bottom-up processing is essential for perception because the perceptual process usually begins with stimulation of the receptors.1 Thus, when a pharmacist reads
what to you might look like an unreadable scribble on your
doctor’s prescription, she starts with the patterns that the
doctor’s handwriting creates on her retina. However, once
these bottom-up data have triggered the sequence of steps
of the perceptual process, top-down processing can come
into play as well. The pharmacist sees the squiggles the doctor made on the prescription and then uses her knowledge
of the names of drugs, and perhaps past experience with
this particular doctor’s writing, to help understand the
squiggles. Thus, bottom-up and top-down processing often
work together to create perception.
My students often ask whether top-down processing is
always involved in perception. The answer to this question
is that it is “very often” involved. There are some situations,
typically involving very simple stimuli, in which top-down
processing is probably not involved. For example, perceiving
a single flash of easily visible light is probably not affected
by a person’s prior experience. However, as stimuli become
more complex, the role of top-down processing increases. In
fact, a person’s past experience is usually involved in perception of real-world scenes, even though in most cases the person is unaware of this influence. One of the themes of this
book is that our knowledge of how things usually appear in
the environment can play an important role in determining
what we perceive.
How to Approach the Study
of Perception
The goal of perceptual research is to understand each of the
steps in the perceptual process that lead to perception, recognition, and action. (For simplicity, we will use the term
perception to stand for all of these outcomes in the discussion that follows.) To accomplish this goal, perception has
been studied using two approaches: the psychophysical approach and the physiological approach.
The psychophysical approach to perception was introduced by Gustav Fechner, a physicist who, in his book
Elements of Psychophysics (1860/1966), coined the term psychophysics to refer to the use of quantitative methods to
measure relationships between stimuli (physics) and perception (psycho). These methods are still used today, but because
a number of other, nonquantitative methods are also used,
we will use the term psychophysics more broadly in this book
to refer to any measurement of the relationship between
stimuli and perception (PP in Figure 1.8). An example of
research using the psychophysical approach would be measuring the stimulus–perception relationship (PP) by asking
an observer to decide whether two very similar patches of
color are the same or different (Figure 1.10a).
Figure 1.8 ❚ Psychophysical (PP) and physiological (PH) approaches to perception. The three boxes represent the three major components of the perceptual process (see Figure 1.1). The three relationships that are usually measured to study the perceptual process are the psychophysical (PP) relationship between stimuli and perception, the physiological (PH1) relationship between stimuli and physiological processes, and the physiological (PH2) relationship between physiological processes and perception.
1 Occasionally perception can occur without stimulation of the receptors. For example, being hit on the head might cause you to “see stars,” or closing your eyes and imagining something may cause an experience called “imagery,” which shares many characteristics of perception (Kosslyn, 1994).
The physiological approach to perception involves
measuring the relationship between stimuli and physiological processes (PH1 in Figure 1.8) and between physiological
processes and perception (PH2 in Figure 1.8). These physiological processes are most often studied by measuring electrical responses in the nervous system, but can also involve
studying anatomy or chemical processes.
An example of measuring the stimulus–physiology
relationship (PH1) is measuring how different colored
lights result in electrical activity generated in neurons in
a cat’s cortex (Figure 1.10b).2 An example of measuring the
physiology–perception relationship (PH2) would be a study
in which a person’s brain activity is measured as the person
describes the color of an object he is seeing (Figure 1.10c).
2 Because a great deal of physiological research has been done on cats and monkeys, students often express concerns about how these animals are treated. All animal research in the United States follows strict guidelines for the care of animals established by organizations such as the American Psychological Association and the Society for Neuroscience. The central tenet of these guidelines is that every effort should be made to ensure that animals are not subjected to pain or distress. Research on animals has provided essential information for developing aids to help people with sensory disabilities such as blindness and deafness and for helping develop techniques to ease severe pain.
Figure 1.9 ❚ Did you see a “rat” or a “man”? Looking at the more ratlike picture in Figure 1.6 increased the chances that you would see this one as a rat. But if you had first seen the man version (Figure 1.11), you would have been more likely to perceive this figure as a man. (Adapted from Bugelski & Alampay, 1961.)
You will see that although we can distinguish between the psychophysical approach and the physiological approach, these approaches are both working toward
a common goal—to explain the mechanisms responsible for
perception. Thus, when we measure how a neuron responds
to different colors (relationship PH1) or the relationship
between a person’s brain activity and that person’s perception of colors (relationship PH2), our goal is to explain
the physiology behind how we perceive colors. Anytime we
measure physiological responses, our goal is not simply to
understand how neurons and the brain work; our goal is to
understand how neurons and the brain create perceptions.
As we study perception using both psychophysical and
physiological methods, we will also be concerned with how
the knowledge, memories, and expectations that people
bring to the situation influence their perceptions. These
factors, which we have described as the starting place for
top-down processing, are called cognitive influences on
perception. Researchers study cognitive influences by measuring how knowledge and other factors, such as memories
and expectations, affect each of the three relationships in
Figure 1.8.
For example, consider the rat–man demonstration. If
we were to measure the stimulus–perception relationship
by showing just Figure 1.9 to a number of people, we would
probably find that some people see a rat and some people see
a man. But if we add some “knowledge” by first presenting
the more ratlike picture in Figure 1.6, most people say “rat”
when we present Figure 1.9. Thus, in this example, knowledge has affected the stimulus–perception relationship. As
we describe research using the physiological approach, beginning in Chapter 2, we will see that knowledge can also
affect physiological responding.
One of the things that becomes apparent when we step
back and look at the psychophysical and physiological approaches is that each one provides information about different aspects of the perceptual process. Thus, to truly understand perception, we have to study it using both approaches,
and later in this book we will see how some researchers have
used both approaches in the same experiment. In the remainder of this chapter, we are going to describe some ways
to measure perception at the psychophysical level. In Chapter 2 we will describe basic principles of the physiological
approach.
Figure 1.10 ❚ Experiments that measure the relationships
indicated by the arrows in Figure 1.8. (a) The psychophysical
relationship (PP) between stimulus and perception: Two
colored patches are judged to be different. (b) The physiological relationship (PH1) between the stimulus and the
physiological response: A light generates a neural response
in the cat’s cortex. (c) The physiological relationship (PH2)
between the physiological response and perception:
A person’s brain activity is monitored as the person indicates
what he is seeing.
Measuring Perception
We have seen that the psychophysical approach to perception focuses on the relationship between the physical properties of stimuli and the perceptual responses to these stimuli. There
are a number of possible perceptual responses to a stimulus.
Here are some examples taken from experiences that might
occur while watching a college football game.
■ Describing: Indicating characteristics of a stimulus. “All of the people in the student section are wearing red.”
■ Recognizing: Placing a stimulus in a specific category. “Number 12 is the other team’s quarterback.”
■ Detecting: Becoming aware of a barely detectable aspect of a stimulus. “That lineman moved slightly just before the ball was snapped.”
■ Perceiving magnitude: Being aware of the size or intensity of a stimulus. “That lineman looks twice as big as our quarterback.”
■ Searching: Looking for a specific stimulus among a number of other stimuli. “I’m looking for Susan in the student section.”
We will now describe some of the methods that perception researchers have used to measure each of these ways of
responding to stimuli.
Description
When a researcher asks a person to describe what he or she
is perceiving or to indicate when a particular perception
occurs, the researcher is using the phenomenological
method. This method is a first step in studying perception
because it describes what we perceive. This description can
be at a very basic level, such as when we notice that we can
perceive some objects as being farther away than others,
or that there is a perceptual quality we call “color,” or that
there are different qualities of taste, such as bitter, sweet,
and sour. These are such common observations that we
might take them for granted, but this is where the study
of perception begins, because these are the basic properties
that we are seeking to explain.
Recognition
When we categorize a stimulus by naming it, we are measuring recognition.
METHOD ❚ Recognition
Every so often we will introduce a new method by describing
it in a “Method” section like this one. Students are sometimes
tempted to skip these sections because they think the content is
unimportant. However, you should resist this temptation because
these methods are essential tools for the study of perception. These
“Method” sections will help you understand the experiment that
follows and will provide the background for understanding other
experiments later in the book.
The procedure for measuring recognition is simple: A
stimulus is presented, and the observer indicates what it
is. Your response to the rat–man demonstration involved
recognition because you were asked to name what you
saw. This procedure is widely used in testing patients
with brain damage, such as the musician Dr. P. with visual agnosia, described earlier. Often the stimuli in these
experiments are pictures of objects rather than the actual object (thereby avoiding having to bring elephants
and other large objects into the laboratory!).
Describing perceptions using the phenomenological
method and determining a person’s ability to recognize objects provide information about what a person is perceiving. Often, however, it is useful to establish a quantitative
relationship between the stimulus and perception. One way
this has been achieved is by methods designed to measure
the amount of stimulus energy necessary for detecting a
stimulus.
Detection
In Gustav Fechner’s book Elements of Psychophysics, he described a number of quantitative methods for measuring
the relationship between stimuli and perception. These
methods—limits, adjustment, and constant stimuli—are
called the classical psychophysical methods because they
were the original methods used to measure the stimulus–
perception relationship.
The Absolute Threshold The absolute threshold
is the smallest amount of stimulus energy necessary to detect a stimulus. For example, the smallest amount of light
energy that enables a person to just barely detect a flash of
light would be the absolute threshold for seeing that light.
METHOD ❚ Determining the Absolute Threshold
There are three basic methods for determining the absolute threshold: the methods of limits, adjustment, and constant stimuli. In the method of limits, the experimenter
presents stimuli in either ascending order (intensity is
increased) or descending order (intensity is decreased), as
shown in Figure 1.12, which indicates the results of an
experiment that measures a person’s threshold for hearing a tone (Virtual Lab 1, 2).
On the first series of trials, the experimenter presents a tone with an intensity of 103, and the observer indicates by a “yes” response that he hears the tone. This
response is indicated by a Y at an intensity of 103 on the
table. The experimenter then presents another tone, at a
lower intensity, and the observer responds to this tone.
This procedure continues, with the observer making a
judgment at each intensity, until he responds “no,” that
he did not hear the tone. This change from “yes” to “no,”
indicated by the dashed line, is the crossover point, and the
threshold for this series is taken as the mean between 99
and 98, or 98.5. The next series of trials begins below the
observer’s threshold, so that he says “no” on the first trial
(intensity 95), and continues until he says “yes” (when
the intensity reaches 100). By repeating this procedure a
number of times, starting above the threshold half the
time and starting below the threshold half the time, the
experimenter can determine the threshold by calculating
the average of all of the crossover points.
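The crossover calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the (intensity, response) data layout are assumptions made here, not part of the original procedure. The two example series follow the description in the text, and the eight crossover values are those reported in Figure 1.12.

```python
from statistics import mean

def crossover_point(series):
    """Return the crossover value for one series of trials.

    `series` is a list of (intensity, response) pairs in presentation
    order, where response is "Y" or "N". The crossover is taken as the
    mean of the two intensities on either side of the first change in
    response.
    """
    for (prev_i, prev_r), (cur_i, cur_r) in zip(series, series[1:]):
        if prev_r != cur_r:
            return (prev_i + cur_i) / 2
    raise ValueError("the observer's response never changed in this series")

def method_of_limits_threshold(all_series):
    """Average the crossover values across all series."""
    return mean(crossover_point(s) for s in all_series)

# The first two series as described in the text.
descending = [(103, "Y"), (102, "Y"), (101, "Y"), (100, "Y"), (99, "Y"), (98, "N")]
ascending = [(95, "N"), (96, "N"), (97, "N"), (98, "N"), (99, "N"), (100, "Y")]

# Crossover values for the eight series reported in Figure 1.12.
figure_crossovers = [98.5, 99.5, 97.5, 99.5, 98.5, 98.5, 98.5, 97.5]
threshold = mean(figure_crossovers)
```

Averaging over series that start above and below the threshold is what keeps a single unusual series from dominating the estimate.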
Figure 1.11 ❚ Man version of the rat–man stimulus.
(Adapted from Bugelski & Alampay, 1961.)
In the method of adjustment, the observer or the experimenter adjusts the stimulus intensity continuously
until the observer can just barely detect the stimulus.
This method differs from the method of limits because
the observer does not say “yes” or “no” as each tone intensity is presented. Instead, the observer simply adjusts
the intensity until he or she can just barely hear the tone.
For example, the observer might be told to turn a knob to
decrease the intensity of a sound, until the sound can no
longer be heard, and then to turn the knob back again so
the sound is just barely audible. This just barely audible intensity is taken as the absolute threshold. This procedure
can be repeated several times and the threshold
determined by taking the average setting.
In the method of constant stimuli, the experimenter
presents five to nine stimuli with different intensities in
random order. The results of a hypothetical determination of the threshold for seeing a light are shown in
Figure 1.13. The data points in this graph were determined by presenting six light intensities 10 times each
and determining the percentage of times that the observer perceived each intensity. The results indicate that
the light with an intensity of 150 was never detected, the
light with an intensity of 200 was always detected, and
lights with intensities in between were sometimes detected and sometimes not detected. The threshold is usually defined as the intensity that results in detection on
50 percent of the trials. Applying this definition to the results in Figure 1.13 indicates that the threshold is
an intensity of 180.
[Figure 1.12 data: the observer’s “yes” (Y) and “no” (N) responses at tone intensities from 103 down to 95 in each of eight series of trials, with the crossover value for each series.]
Figure 1.13 ❚ Results of a hypothetical experiment in
which the threshold for seeing a light is measured by the
method of constant stimuli. The threshold (the intensity at
which the light is seen on half of its presentations) is 180
in this experiment. [Axes: light intensity, 150 to 200 (horizontal); percentage of stimuli detected, 0 to 100 (vertical).]
The choice among the methods of limits, adjustment, and constant stimuli is usually determined by the
accuracy needed and the amount of time available. The
method of constant stimuli is the most accurate method
because it involves many observations and stimuli are
presented in random order, which minimizes how presentation on one trial can affect the observer’s judgment
of the stimulus presented on the next trial. The disadvantage of this method is that it is time-consuming. The
method of adjustment is faster because observers can determine their threshold in just a few trials by adjusting
the intensity themselves.
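The 50 percent definition used with the method of constant stimuli can be sketched as a linear interpolation between tested intensities. This is a minimal sketch under stated assumptions: the function name is hypothetical, and only the endpoints (150 never detected, 200 always detected) and the threshold of 180 come from the text. The intermediate percentages are invented for illustration.

```python
def threshold_50_percent(intensities, percent_detected, criterion=50.0):
    """Linearly interpolate the intensity detected on `criterion` percent of trials.

    Assumes intensities are listed in ascending order and that the
    detection percentages cross the criterion within the tested range.
    """
    points = list(zip(intensities, percent_detected))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= criterion <= y1 and y0 != y1:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("detection never crosses the criterion in this range")

intensities = [150, 160, 170, 180, 190, 200]
# Hypothetical percentages (each intensity presented 10 times, so values
# are multiples of 10); only the 0% and 100% endpoints are from the text.
percent_detected = [0, 10, 30, 50, 80, 100]

threshold = threshold_50_percent(intensities, percent_detected)
```

Interpolation matters when no tested intensity happens to fall exactly at 50 percent; real experiments often fit a smooth curve instead of straight segments.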
Crossover values for series 1 through 8: 98.5, 99.5, 97.5, 99.5, 98.5, 98.5, 98.5, 97.5
Threshold = Mean of crossovers = 98.5
Figure 1.12 ❚ The results of an experiment to determine
the threshold using the method of limits. The dashed lines
indicate the crossover point for each sequence of stimuli.
The threshold (the average of the crossover values) is
98.5 in this experiment.
When Fechner published Elements of Psychophysics, he
not only described his methods for measuring the absolute
threshold but also described the work of Ernst Weber (1795–
1878), a physiologist who, a few years before the publication
of Fechner’s book, measured another type of threshold, the
difference threshold.
The Difference Threshold The difference threshold (called DL from the German Differenze Limen, which is
translated as “difference threshold”) is the smallest difference between two stimuli that a person can detect.
METHOD ❚ Determining the Difference Threshold
Fechner’s methods can be used to determine the difference threshold, except that instead of being asked to
indicate whether they detect a stimulus, participants
are asked to indicate whether they detect a difference between two stimuli. For example, the procedure for measuring the difference threshold for sensing weight is as
follows: Weights are presented to each hand, as shown in
Figure 1.14; one is a standard weight, and the other is a
comparison weight. The observer judges, based on weight
alone (he doesn’t see the weights), whether the weights
are the same or different. Then the comparison weight is
increased slightly, and the observer again judges “same”
or “different.” This continues (randomly varying the side
on which the comparison is presented) until the observer
says “different.” The difference threshold is the difference
between the standard and comparison weights
when the observer first says “different.”
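The stepping procedure just described can be sketched as a simple loop. Everything named here is an assumption for illustration: `says_different` is a hypothetical stand-in for the observer's judgment, and the 1-gram step size is arbitrary.

```python
def measure_difference_threshold(standard, says_different, step=1, max_steps=1000):
    """Increase the comparison weight until the observer first says "different".

    `says_different(standard, comparison)` stands in for the observer's
    "same"/"different" judgment. Returns the difference threshold (DL)
    in the same units as `standard`.
    """
    comparison = standard
    for _ in range(max_steps):
        if says_different(standard, comparison):
            return comparison - standard
        comparison += step
    raise RuntimeError("observer never reported a difference")

# Hypothetical observer who notices a difference of 2 grams or more.
def observer(standard, comparison):
    return comparison - standard >= 2

dl = measure_difference_threshold(100, observer)
```

In a real experiment the side on which the comparison is presented would also be varied randomly, which this sketch leaves out.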
Weber found that when the difference between the standard and comparison weights was small, his observers found
it difficult to detect the difference in the weights, but they
easily detected larger differences. That much is not surprising, but Weber went further. He found that the size of the DL
depended on the size of the standard weight. For example, if
the DL for a 100-gram weight was 2 grams (an observer could
tell the difference between a 100- and a 102-gram weight,
but could not detect smaller differences), then the DL for a
200-gram weight was 4 grams. Thus, as the magnitude of
the stimulus increases, so does the size of the DL.
Research on a number of senses has shown that over
a fairly large range of intensities, the ratio of the DL to the
standard stimulus is constant. This relationship, which
is based on Weber’s research, was stated mathematically
by Fechner as $DL/S = K$ and was called Weber’s law. K is a
constant called the Weber fraction, and S is the value of the
standard stimulus. Applying this equation to our previous
example of lifted weights, we find that for the 100-gram
standard, $K = 2\,\text{g}/100\,\text{g} = 0.02$, and for the 200-gram standard, $K = 4\,\text{g}/200\,\text{g} = 0.02$. Thus, in this example, the Weber
fraction (K) is constant. In fact, numerous modern investigators have found that Weber’s law is true for most senses,
as long as the stimulus intensity is not too close
to the threshold (Engen, 1972; Gescheider, 1976).
The Weber fraction remains relatively constant for a
particular sense, but each type of sensory judgment has its
own Weber fraction. For example, from Table 1.1 we can see
that people can detect a 1 percent change in the intensity of
an electric shock but that light intensity must be increased
by 8 percent before they can detect a difference.
TABLE 1.1 ❚ Weber Fractions for a Number of Different Sensory Dimensions
Sensory dimension     Weber fraction
Electric shock        0.01
Lifted weight         0.02
Sound intensity       0.04
Light intensity       0.08
Taste (salty)         0.08
Source: Teghtsoonian (1971).
Figure 1.14 ❚ The difference threshold (DL). (a) The
person can detect the difference between a 100-gram
standard weight and a 102-gram weight but cannot detect
a smaller difference, so the DL is 2 grams. (b) With a 200-gram
standard weight, the comparison weight must be 204
grams before the person can detect a difference, so the
DL is 4 grams. The Weber fraction, which is the ratio of DL
to the weight of the standard, is constant.
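Weber's law, $DL/S = K$, can be checked against the lifted-weight example with a short sketch. The function names are hypothetical, the Weber fractions are those listed in Table 1.1, and the 400-gram prediction is an extrapolation that assumes the law holds at that weight.

```python
import math

# Weber fractions from Table 1.1 (Teghtsoonian, 1971).
WEBER_FRACTIONS = {
    "electric shock": 0.01,
    "lifted weight": 0.02,
    "sound intensity": 0.04,
    "light intensity": 0.08,
    "taste (salty)": 0.08,
}

def weber_fraction(dl, standard):
    """K = DL / S."""
    return dl / standard

def predicted_dl(standard, k):
    """Rearranged Weber's law: DL = K * S."""
    return k * standard

k_100 = weber_fraction(2, 100)   # 100-gram standard, DL of 2 grams
k_200 = weber_fraction(4, 200)   # 200-gram standard, DL of 4 grams

# Assumed extrapolation: predicted DL for a 400-gram standard.
dl_400 = predicted_dl(400, WEBER_FRACTIONS["lifted weight"])
```

Both standards give the same fraction, which is the constancy that Weber's law states.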
Fechner’s proposal of three psychophysical methods
for measuring the threshold and his statement of Weber’s
law for the difference threshold were extremely important
events in the history of scientific psychology because they
demonstrated that mental activity could be measured quantitatively, which many people in the 1800s thought was impossible. But perhaps the most significant thing about these
methods is that even though they were proposed in the
1800s, they are still used today. In addition to being used
to determine thresholds in research laboratories, simplified
versions of the classical psychophysical methods have been
used to measure people’s detail vision when determining
prescriptions for glasses and measuring people’s hearing
when testing for possible hearing loss.
The classical psychophysical methods were developed
to measure absolute and difference thresholds. But what
about perceptions that occur above threshold? Most of
our everyday experience consists of perceptions that are
far above threshold, when we can easily see and hear what
is happening around us. Measuring these above-threshold
perceptions involves a technique called magnitude estimation.
Magnitude Estimation
If we double the intensity of a tone, does it sound twice
as loud? If we double the intensity of a light, does it look
twice as bright? Although a number of researchers, including Fechner, proposed equations that related perceived
magnitude and stimulus intensity, it wasn’t until 1957 that
S. S. Stevens developed a technique called scaling, or magnitude estimation, that accurately measured this relationship
(S. S. Stevens, 1957, 1961, 1962).
METHOD ❚ Magnitude Estimation
The procedure for a magnitude estimation experiment
is relatively simple: The experimenter first presents a
“standard” stimulus to the observer (let’s say a light of
moderate intensity) and assigns it a value of, say, 10; he
or she then presents lights of different intensities, and
the observer is asked to assign a number to each of these
lights that is proportional to the brightness of the standard stimulus. If the light appears twice as bright as
the standard, it gets a rating of 20; half as bright, a 5;
and so on. Thus, each light intensity has a brightness
assigned to it by the observer. There are also magnitude
estimation procedures in which no “standard” is used.
But the basic principle is the same: The observer assigns
numbers to stimuli that are proportional to perceived
magnitude.
The results of a magnitude estimation experiment on
brightness are plotted as the red curve in Figure 1.15. This
graph plots the average magnitude estimates made by
a number of observers of the brightness of a light. This curve
indicates that doubling the intensity does not necessarily
double the perceived brightness. For example, when intensity
is 20, perceived brightness is 28. If we double the intensity to
40, perceived brightness does not double, to 56, but instead
increases only to 36. This result is called response compression. As intensity is increased, the magnitude increases, but
not as rapidly as the intensity. To double the brightness, it is
necessary to multiply the intensity by about 9.
Figure 1.15 ❚ The relationship between perceived
magnitude and stimulus intensity for electric shock, line
length, and brightness. (Adapted from Stevens, 1962.) [Axes: stimulus intensity, 0 to 100 (horizontal); magnitude estimate, 0 to 80 (vertical); curves: brightness, line length, electric shock.]
Figure 1.15 also shows the results of magnitude estimation experiments for the sensation caused by an electric shock presented to the finger and for the perception of
length of a line. The electric shock curve bends up, indicating that doubling the strength of a shock more than doubles the sensation of being shocked. Increasing the intensity
from 20 to 40 increases perception of shock sensation from
6 to 49. This is called response expansion. As intensity is
increased, perceptual magnitude increases more than intensity. The curve for estimating line length is straight, with
a slope of close to 1.0, meaning that the magnitude of the
response almost exactly matches increases in the stimulus
(i.e., if the line length is doubled, an observer says it appears
to be twice as long).
The beauty of the relationships derived from magnitude estimation is that the relationship between the intensity of a stimulus and our perception of its magnitude
follows the same general equation for each sense. These
functions, which are called power functions, are described
by the equation $P = KS^{n}$. Perceived magnitude, P, equals
a constant, K, times the stimulus intensity, S, raised to a
power, n. This relationship is called Stevens’s power law.
For example, if the exponent, n, is 2.0 and the constant,
K, is 1.0, the perceived magnitude, P, for intensities 10 and
20 would be calculated as follows:
Intensity 10: $P = (1.0) \times (10)^{2} = 100$
Intensity 20: $P = (1.0) \times (20)^{2} = 400$
In this example, doubling the intensity results in a
fourfold increase in perceived magnitude, an example of response expansion.
Figure 1.16 ❚ The three functions
from Figure 1.15 plotted on log-log
coordinates. Taking the logarithm of the
magnitude estimates and the logarithm of
the stimulus intensity turns the functions
into straight lines. (Adapted from Stevens,
1962.) [Axes: log stimulus intensity (horizontal); log magnitude estimate (vertical); curves: brightness, line length, electric shock.]
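The worked example for Stevens's power law can be reproduced with a small sketch. The function names are hypothetical. The values K = 1.0 and n = 2.0 are the ones used in the example above, and `doubling_ratio` simply restates that doubling S multiplies P by 2 raised to the power n, whatever K is.

```python
def perceived_magnitude(intensity, k, n):
    """Stevens's power law: P = K * S**n."""
    return k * intensity ** n

def doubling_ratio(n):
    """Factor by which P changes when S is doubled (independent of K and S)."""
    return 2 ** n

p10 = perceived_magnitude(10, k=1.0, n=2.0)
p20 = perceived_magnitude(20, k=1.0, n=2.0)
ratio = p20 / p10
```

An exponent below 1.0 gives a doubling ratio below 2 (response compression), and an exponent above 1.0 gives a ratio above 2 (response expansion).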
One of the properties of power functions is that taking the logarithm of the terms on the left and right sides
of the equation changes the function into a straight line.
This is shown in Figure 1.16. Plotting the logarithm of the
magnitude estimates in Figure 1.15 versus the logarithm of
the stimulus intensities causes all three curves to become
straight lines. The slopes of the straight lines indicate n, the
exponent of the power function. Remembering our discussion of the three types of curves in Figure 1.15, we can see that
the curve for brightness has a slope of less than 1.0 (response
compression), the curve for estimating line length has a slope
of about 1.0, and the curve for electric shock has a slope of
greater than 1.0 (response expansion). Thus, the relationship
between response magnitude and stimulus intensity is described by a power law for all senses, and the exponent of the
power law indicates whether doubling the stimulus intensity
causes more or less than a doubling of the response.
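The claim that the slope of the log-log line equals the exponent n can be illustrated by fitting a straight line to the logarithms of the data. This is a sketch under assumptions: the function name is hypothetical, and the data are synthetic values generated from a power function with n = 2.0 and K = 1.0 rather than measurements from Figure 1.15.

```python
import math

def fit_power_law(intensities, magnitudes):
    """Least-squares line through (log S, log P).

    Because log P = log K + n * log S, the slope of the line estimates
    the exponent n and the intercept estimates log K.
    Returns (n, K).
    """
    xs = [math.log10(s) for s in intensities]
    ys = [math.log10(p) for p in magnitudes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, 10 ** intercept

# Synthetic data generated from P = 1.0 * S**2.0 (not measured values).
intensities = [10, 20, 40, 80]
magnitudes = [1.0 * s ** 2.0 for s in intensities]

n, k = fit_power_law(intensities, magnitudes)
```

With real magnitude estimates the points scatter around the line, and the fitted slope is the estimate of the exponent.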
These exponents not only illustrate that all senses follow the same basic relationship, they also illustrate how the
operation of each sense is adapted to how organisms function in their environment. Consider, for example, your experience of brightness. Imagine you are inside looking at
a page in a book that is brightly illuminated by a lamp on
your desk. Now imagine that you are looking out the window at a bright sidewalk that is brightly illuminated by sunlight. Your eye may be receiving thousands of times more
light from the sidewalk than from the page of your book,
but because the curve for brightness bends down (exponent 0.6), the sidewalk does not appear thousands of times
brighter than the page. It does appear brighter, but not so
much that you are blinded by the sunlit sidewalk. 3
The opposite situation occurs for electric shock, which
has an exponent of 3.5, meaning that small increases in
shock intensity cause large increases in pain. This rapid
increase in pain even to small increases in shock intensity
serves to warn us of impending danger, and we therefore
tend to withdraw even from weak shocks.
Search
So far, we have been describing methods in which the observer is able to make a relatively leisurely perceptual judgment. When a person is asked to indicate whether he or she
can see a light or tell the difference between two weights,
the accuracy of the judgment is what is important, not the
speed at which it is made. However, some perceptual research uses methods that require the observer to respond as
quickly as possible. One example of such a method is visual
search, in which the observer’s task is to find one stimulus
among many, as quickly as possible.
An everyday example of visual search would be searching for a friend’s face in a crowd. If you’ve ever done this, you
know that sometimes it is easy (if you know your friend is
wearing a bright red hat, and no one else is), and sometimes
it is difficult (if there are lots of people and your friend
doesn’t stand out). When we consider visual attention in
Chapter 6, we will describe visual search experiments in
which the observer’s task is to find, as rapidly as possible,
a target stimulus that is hidden among a number of other
stimuli. We will see that measuring reaction time—the time
between presentation of the stimulus and the observer’s response to the stimulus—has provided important information about mechanisms responsible for perception.
Other Methods of Measurement
Numerous other methods have been used to measure the
stimulus–perception relationship. For example, in some
experiments, observers are asked to decide whether two
stimuli are the same or different, or to adjust the brightness
or the colors of two lights so they appear the same, or to
close their eyes and walk, as accurately as possible, to a distant target stimulus in a field. We will encounter methods
such as these, and others as well, as we describe perceptual
research in the chapters that follow.
Footnote 3: Another mechanism that keeps you from being blinded by high-intensity
lights is that your eye adjusts its sensitivity in response to different light levels.
Something to Consider:
Threshold Measurement
Can Be Influenced by How
a Person Chooses to Respond
We’ve seen that we can use psychophysical methods to determine the absolute threshold. For example, by randomly
presenting lights of different intensities, we can use the
method of constant stimuli to determine the intensity to
which a person reports “I see the light” 50 percent of the
time. What determines this threshold intensity? Certainly,
the physiological workings of the person’s eye and visual
system are important. But some researchers have pointed
out that perhaps other characteristics of the person may
also influence the determination of threshold intensity.
To illustrate this idea, let’s consider a hypothetical experiment in which we use the method of constant stimuli
to measure Julie’s and Regina’s thresholds for seeing a light.
We pick five different light intensities, present them in random order, and ask Julie and Regina to say “yes” if they see
the light and “no” if they don’t see it. Julie thinks about
these instructions and decides that she wants to be sure she
doesn’t miss any presentations of the light. She therefore
decides to say “yes” if there is even the slightest possibility that she sees the light. However, Regina responds more
conservatively because she wants to be totally sure that she
sees the light before saying “yes.” She is not willing to report
that she sees the light unless it is clearly visible.
The results of this hypothetical experiment are shown
in Figure 1.17. Julie gives many more “yes” responses than
Regina and therefore ends up with a lower threshold. But
given what we know about Julie and Regina, should we conclude that Julie’s visual system is more sensitive to the lights
than Regina’s? It could be that their actual sensitivity to the
lights is exactly the same, but Julie’s apparently lower threshold
occurs because she is more willing than Regina to report
that she sees a light. A way to describe this difference between these two people is that each has a different response
criterion. Julie’s response criterion is low (she says “yes” if
there is the slightest chance a light is present), whereas Regina’s response criterion is high (she says “yes” only when she
is sure that she sees the light).
Figure 1.17 ❚ Data from experiments in which the threshold
for seeing a light is determined for Julie (green points) and
Regina (red points) by means of the method of constant
stimuli. These data indicate that Julie’s threshold is lower
than Regina’s. But is Julie really more sensitive to the light
than Regina, or does she just appear to be more sensitive
because she is a more liberal responder? [Axes: light intensity, low to high (horizontal); percent “yes” responses, 0 to 100 (vertical).]
What are the implications of the fact that people
may have different response criteria? If we are interested
in how one person responds to different stimuli (for example, measuring how a particular person’s threshold varies for
different colors of light), then we don’t need to take response
criterion into account because we are comparing responses
within the same person. Response criterion is also not very
important if we are testing many people and averaging their
responses. However, if we wish to compare two people’s responses, their differing response criteria could influence the
results. Luckily, there is a way to take differing response criteria
into account. This procedure is described in the Appendix,
which discusses signal detection theory.
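The Julie and Regina example can be sketched with a simple model in which both observers have identical sensitivity and differ only in their response criterion. This is an illustrative assumption, not the procedure described in the Appendix: the internal response is modeled as the light intensity plus normally distributed noise, the observer says “yes” when that response exceeds her criterion, and all names and numbers are hypothetical.

```python
import math

def p_yes(intensity, criterion, noise_sd=1.0):
    """Probability of a "yes" response in the assumed model.

    Internal response = intensity + Gaussian noise (same noise_sd for
    both observers, i.e. the same sensitivity). The observer says "yes"
    when the internal response exceeds her criterion.
    """
    z = (intensity - criterion) / noise_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical criteria: Julie is the liberal responder, Regina the conservative one.
JULIE_CRITERION = 4.0
REGINA_CRITERION = 6.0

intensities = [2, 4, 6, 8, 10]  # five light intensities, arbitrary units
julie = [round(100 * p_yes(i, JULIE_CRITERION)) for i in intensities]
regina = [round(100 * p_yes(i, REGINA_CRITERION)) for i in intensities]
```

In this model each observer reaches 50 percent “yes” responses at the intensity equal to her own criterion, so Julie's measured threshold comes out lower even though the noise term, and therefore the underlying sensitivity, is the same for both.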
TEST YOURSELF 1.1
1. What are some of the reasons for studying
perception?
2. Describe the process of perception as a series of
steps, beginning with the environmental stimulus
and culminating in the behavioral responses of perceiving, recognizing, and acting.
3. What is the role of higher-level or “cognitive”
processes in perception? Be sure you understand
the difference between bottom-up and top-down
processing.
4. What does it mean to say that perception can be
studied using different approaches?
5. Describe the different ways people respond perceptually to stimuli and how each of these types of
perceptual response can be measured.
6. What does it mean to say that a person’s threshold
may be determined by more than the physiological
workings of his or her sensory system?
THINK ABOUT IT
1. This chapter argues that although perception seems
simple, it is actually extremely complex when we consider “behind the scenes” activities that are not obvious
as a person is experiencing perception. Cite an example
of a similar situation from your own experience, in
which an “outcome” that might seem as though it was
achieved easily actually involved a complicated process
that most people are unaware of. (p. 5)
2. Describe a situation in which you initially thought you
saw or heard something but then realized that your
initial perception was in error. What was the role of
bottom-up and top-down processing in this example of
first having an incorrect perception and then realizing
what was actually there? (p. 10)
IF YOU WANT TO KNOW MORE
1. History. The study of perception played an extremely
important role in the development of scientific psychology in the first half of the 20th century. (p. 16)
Boring, E. G. (1942). Sensation and perception in
the history of experimental psychology. New York:
Appleton-Century-Crofts.
2. Disorders of recognition. Dr. P.’s case, in which he had
problems recognizing people, is just one example of
many such cases of people with brain damage. In addition to reports in the research literature, there are
a number of popular accounts of these cases written
for the general public. (p. 13)
Sacks, O. (1985). The man who mistook his wife for a hat.
London: Duckworth.
Kolb, B., & Whishaw, I. Q. (2003). Fundamentals of
human neuropsychology (5th ed.). New York: Worth.
(Especially see Chapters 13–17, which contain numerous descriptions of how brain damage affects
sensory functioning.)
3. Phenomenological method. David Katz’s book provides
excellent examples of how the phenomenological
method has been used to determine the experiences
that occur under various stimulus conditions. He
also describes how surfaces, color, and light combine
to create many different perceptions. (p. 13)
Katz, D. (1935). The world of color (2nd ed., R. B. MacLeod & C. W. Fox, Trans.). London: Kegan Paul,
Trench, Trubner.
4. Top-down processing. There are many examples of how
people’s knowledge can influence perception, ranging
from early research, which focused on how people’s
motivation can influence perception, to more recent
research, which has emphasized the effects of context
and past learning on perception. (p. 10)
Postman, L., Bruner, J. S., & McGinnis, E. (1948).
Personal values as selective factors in perception. Journal of Abnormal and Social Psychology, 43,
142–154.
Vernon, M. D. (1962). The psychology of perception.
Baltimore: Penguin. (See Chapter 11, “The Relation
of Perception to Motivation and Emotion.”)
KEY TERMS
Absolute threshold (p. 13)
Action (p. 9)
Attended stimulus (p. 6)
Bottom-up processing (data-based
processing) (p. 10)
Classical psychophysical methods
(p. 13)
Cognitive influences on perception
(p. 12)
Difference threshold (p. 15)
Environmental stimulus (p. 5)
Knowledge (p. 9)
Magnitude estimation (p. 16)
Method of adjustment (p. 14)
Method of constant stimuli (p. 14)
Method of limits (p. 13)
Neural processing (p. 7)
Perception (p. 8)
Perceptual process (p. 5)
Phenomenological method (p. 13)
Physiological approach to perception
(p. 11)
Power function (p. 16)
Psychophysical approach to
perception (p. 11)
Psychophysics (p. 11)
Rat–man demonstration (p. 10)
Reaction time (p. 17)
Recognition (p. 8)
Response compression (p. 16)
Response criterion (p. 18)
Response expansion (p. 16)
Signal detection theory (p. 18)
Stevens’s power law (p. 16)
Top-down processing (knowledge-based processing) (p. 10)
Transduction (p. 7)
Visual form agnosia (p. 9)
Visual search (p. 17)
Weber fraction (p. 15)
Weber’s law (p. 15)
MEDIA RESOURCES
The Sensation and Perception
Book Companion Website
www.cengage.com/psychology/goldstein
See the companion website for flashcards, practice quiz
questions, Internet links, updates, critical thinking
exercises, discussion forums, games, and more!
CengageNOW
www.cengage.com/cengagenow
Go to this site for the link to CengageNOW, your one-stop
shop. Take a pre-test for this chapter, and CengageNOW
will generate a personalized study plan based on your test
results. The study plan will identify the topics you need
to review and direct you to online resources to help you
master those topics. You can then take a post-test to help
you determine the concepts you have mastered and what
you will still need to work on.
VL
Virtual Lab
Your Virtual Lab is designed to help you get the most out
of this course. The Virtual Lab icons direct you to specific
media demonstrations and experiments designed to help
you visualize what you are reading about. The number
beside each icon indicates the number of the media element
you can access through your CD-ROM, CengageNOW, or
WebTutor resource.
The following lab exercises are related to material in
this chapter:
1. The Method of Limits How a “typical” observer might
respond using the method of limits procedure to measure
absolute threshold.
2. Measuring Illusions An experiment that enables you to
measure the size of the Müller-Lyer, horizontal–vertical,
and simultaneous contrast illusions using the method
of constant stimuli. The simultaneous contrast illusion
is described in Chapter 3, and the Müller-Lyer illusion is
described in Chapter 10.
3. Measurement Fluctuation and Error How our judgments
of size can vary from trial to trial.
4. Adjustment and PSE Measuring the point of subjective
equality for line length using the method of adjustment.
5. Method of Constant Stimuli Measuring the difference
threshold for line length using the method of constant
stimuli.
6. Just Noticeable Difference Measuring the just noticeable
difference (roughly the same thing as difference threshold)
for area, length, and saturation of color.
7. Weber’s Law and Weber Fraction Plotting the graph that
shows how Weber’s fraction remains constant for different
weights.
8. DL vs. Weight Plotting the graph that shows how the difference threshold changes for different weights.
CHAPTER 2
Introduction to the Physiology of Perception
Chapter Contents
THE BRAIN: THE MIND’S COMPUTER
Brief History of the Physiological Approach
Basic Structure of the Brain
NEURONS: CELLS THAT CREATE AND
TRANSMIT ELECTRICAL SIGNALS
Structure of Neurons
Recording Electrical Signals in Neurons
METHOD: Recording From a Neuron
Chemical Basis of Action Potentials
Basic Properties of Action Potentials
Events at the Synapse
❚ TEST YOURSELF 2.1
NEURAL PROCESSING: EXCITATION,
INHIBITION, AND INTERACTIONS
BETWEEN NEURONS
Excitation, Inhibition, and Neural
Responding
Introduction to Receptive Fields
METHOD: Determining a Neuron’s
Receptive Field
THE SENSORY CODE: HOW THE
ENVIRONMENT IS REPRESENTED BY
THE FIRING OF NEURONS
Specificity Coding: Representation by the
Firing of Single Neurons
Distributed Coding: Representation by the
Firing of Groups of Neurons
Sparse Coding: Distributed Coding With
Just a Few Neurons
SOMETHING TO CONSIDER: THE
MIND–BODY PROBLEM
❚ TEST YOURSELF 2.2
Think About It
If You Want to Know More
Key Terms
Media Resources
VIRTUAL LAB
OPPOSITE PAGE Neurons, such as the ones shown here, form the
communication and processing network of the nervous system.
Understanding how neurons respond to perceptual stimuli is central
to our understanding of the physiological basis of perception.
Copyright 2006 National Academy of Sciences, USA.
VL The Virtual Lab icons direct you to specific animations and videos
designed to help you visualize what you are reading about. The number beside
each icon indicates the number of the clip you can access through your
CD-ROM or your student website.
23
Some Questions We Will Consider:
❚ How are physiological processes involved in perception? (p. 24)
❚ How can electrical signals in the nervous system represent objects in the environment? (p. 36)

In Chapter 1 we saw that electrical signals are the link between the environment and perception. The stimulus, first in the environment and then on the receptors, creates electrical signals in the nervous system, which through a miraculous and still not completely understood process become transformed into experiences like the colors of a sunset, the roughness of sandpaper, or smells from the kitchen. Much of the research you will read about in this book is concerned with understanding the connection between electrical signals in the nervous system and perception. The purpose of this chapter is to introduce you to the physiological approach to the study of perception and to provide the background you will need to understand the physiological material in the rest of the book.

The Brain: The Mind’s Computer
Today we take it for granted that the brain is the seat of the mind: the structure responsible for mental functions such as memory, thoughts, language, and, of particular interest to us, perceptions. But the idea that the brain controls mental functioning is a relatively modern idea.

Brief History of the Physiological Approach
Early thinking about the physiology of the mind focused on determining the anatomical structures involved in the operation of the mind.

Early Hypotheses About the Seat of the Mind In the fourth century B.C. the philosopher Aristotle (384–322 B.C.) stated that the heart was the seat of the mind and the soul (Figure 2.1a). The Greek physician Galen (ca. A.D. 130–200) saw human health, thoughts, and emotions as being determined by four different “spirits” flowing from the ventricles, cavities in the center of the brain (Figure 2.1b). This idea was accepted all the way through the Middle Ages and into the Renaissance in the 1500s and early 1600s. In the early 1630s the philosopher René Descartes, although still accepting the idea of flowing spirits, specified the pineal gland, which was thought to be located over the ventricles, as the seat of the soul (Figure 2.1c; Zimmer, 2004).

The Brain As the Seat of the Mind In 1664 Thomas Willis, a physician at the University of Oxford, published a book titled The Anatomy of the Brain, which was based on dissections of the brains of humans, dogs, sheep, and other animals. Willis concluded that the brain was responsible for mental functioning, that different functions
were located in different regions of the brain, and that disorders of the brain were disorders of chemistry (Figure 2.1d). Although these conclusions were correct, details of the mechanisms involved had to await the development of new technologies that would enable researchers to see the brain’s microstructure and record its electrical signals.

Figure 2.1 ❚ Some notable ideas and events regarding the physiological workings of the mind. (a) Aristotle, 4th century B.C.: heart. (b) Galen, 2nd century: “spirits” in the ventricles. (c) Descartes, 1630s: pineal gland. (d) Willis, 1664: brain. (e) Golgi, 1870s: Golgi stained neuron. (f) Adrian, 1920s: single-neuron recording. (g) Modern: neural networks.

CHAPTER 2
Introduction to the Physiology of Perception
Signals Traveling in Neurons One of the most important problems to be solved was determining the structure of the nervous system. In the 1800s, there were two opposing ideas about the nervous system. One idea, called reticular theory, held that the nervous system consisted of a large network of fused nerve cells. The other idea, neuron theory, stated that the nervous system consisted of distinct elements or cells.
An important development that led to the acceptance of neuron theory was the discovery of staining, a chemical technique that caused nerve cells to become colored so they stood out from surrounding tissue. Camillo Golgi (1873) developed a technique in which immersing a thin slice of brain tissue in a solution of silver nitrate created pictures like the one in Figure 2.2 in which individual cells were randomly stained (Figure 2.1e). What made this technique so useful was that only a few cells were stained, and the ones that were stained were stained completely, so it was possible to see the structure of the entire neuron. Golgi received the Nobel Prize for his research in 1906.
What about the signals in these neurons? By the late 1800s, researchers had shown that a wave of electricity is transmitted in groups of neurons, such as the optic nerve. To explain how these electrical signals result in different perceptions, Johannes Mueller in 1842 proposed the doctrine of specific nerve energies, which stated that our perceptions depend on “nerve energies” reaching the brain and that the specific quality we experience depends on which nerves are stimulated. Thus he proposed that activity in the optic nerve results in seeing, activity in the auditory nerve results in hearing, and so on. By the end of the 1800s, this idea had expanded to conclude that nerves from each of these senses reach different areas of the brain. This idea of separating different functions is still a central principle of nervous system functioning.

Figure 2.2 ❚ A portion of the brain that has been treated with Golgi stain shows the shapes of a few neurons. The arrow points to a neuron’s cell body. The thin lines are dendrites or axons (see Figure 2.4). © Clouds Hill Imaging Ltd./CORBIS

Recording From Neurons Details about how single neurons operate had to await the development of electronic amplifiers that were powerful enough to make visible the extremely small electrical signals generated by the neuron. In the 1920s Edgar Adrian (1928, 1932) was able to record electrical signals from single sensory neurons, an achievement for which he was awarded the Nobel Prize in 1932 (Figure 2.1f).
We can appreciate the importance of being able to record from single neurons by considering the following analogy: You walk into a large room in which hundreds of people are talking about a political speech they have just heard. There is a great deal of noise and commotion in the room as people react to the speech. However, based on just hearing this “crowd noise,” all you can say about what is going on is that the speech seems to have generated a great deal of excitement. To get more specific information about the speech, you need to listen to what individual people are saying.
Just as listening to individual people provides valuable information about what is happening in a large crowd,
recording from single neurons provides valuable information about what is happening in the nervous system. Recording from single neurons is like listening to individual voices.
It is, of course, important to record from as many neurons
as possible because just as individual people may have different opinions about the speech, different neurons may respond differently to a particular stimulus or situation.
The ability to record electrical signals from individual
neurons ushered in the modern era of brain research, and
in the 1950s and 1960s development of more sophisticated
electronics and the availability of computers and the electron microscope made more detailed analysis of how neurons function possible. Most of the physiological research
we will describe in this book had its beginning at this point,
when it became possible to determine how individual neurons respond to stimuli in the environment and how neurons work together in neural networks (Figure 2.1g). We will
now briefly describe the overall layout of the brain, to give
you an overview of the entire system, and then zoom in to
look in detail at some basic principles of neuron structure
and operation.
Basic Structure of the Brain
In the first American textbook of psychology, Harvard
psychologist William James (1890/1981) described the
brain as the “most mysterious thing in the world” because
of the amazing feats it achieves and the intricacies of how
it achieves them. Although we are far from understanding
all of the details of how the brain operates, we have learned
a tremendous amount about how the brain determines our
perceptions. Much of the research on the connection between the brain and perception has focused on activity in
the cerebral cortex, the 2-mm-thick layer that covers the
surface of the brain and contains the machinery for creating perception, as well as for other functions, such as language, memory, and thinking. A basic principle of cortical
function is modular organization—specific functions are
served by specific areas of the cortex.
One example of modular organization is how the senses
are organized into primary receiving areas, the first areas
in the cerebral cortex to receive the signals initiated by each
sense’s receptors (Figure 2.3). The primary receiving area for
vision occupies most of the occipital lobe; the area for hearing is located in part of the temporal lobe; and the area for
the skin senses—touch, temperature, and pain—is located in
an area in the parietal lobe. The frontal lobe receives signals
from all of the senses, and plays an important role in perceptions that involve the coordination of information received
through two or more senses. As we study each sense in detail,
we will see that other areas in addition to the primary receiving areas are also associated with each sense. For example, there is an area in the temporal lobe concerned with the perception of form. We will consider the functioning of various areas of the brain in Chapters 3 and 4. In this chapter we will focus on describing the properties of neurons.

Figure 2.3 ❚ The human brain, showing the locations of the primary receiving areas for the senses in the temporal (hearing), occipital (vision), and parietal (skin senses) lobes, and the frontal lobe, which is involved with integrating sensory functions.
Neurons: Cells That Create and
Transmit Electrical Signals
One purpose of neurons that are involved in perception is
to respond to stimuli from the environment, and transduce
these stimuli into electrical signals (see the transduction step
of the perceptual process, page 7). Another purpose is to
communicate with other neurons, so that these signals can
travel long distances (see the transmission step of the perceptual process, page 7).
Structure of Neurons
The key components of neurons are shown in the neuron on
the right in Figure 2.4. The cell body contains mechanisms
to keep the cell alive; dendrites branch out from the cell
body to receive electrical signals from other neurons; and
the axon, or nerve fiber, is filled with fluid that conducts
electrical signals. There are variations on this basic neuron
structure: Some neurons have long axons; others have short
axons or none at all. Especially important for perception
are a type of neuron called receptors, which are specialized
to respond to environmental stimuli such as pressure for
touch, as in the neuron on the left in Figure 2.4. Figure 2.5
shows examples of receptors that are specialized for
responding to (a) light (vision); (b) pressure changes in the
air (hearing); (c) pressure on the skin (touch); (d) chemicals
in the air (smell); and (e) chemicals in liquid form (taste).
Although these receptors look different, they all have something in common: Part of each receptor, indicated by the star, reacts to environmental stimuli and triggers the generation of electrical signals, which eventually are transmitted to neurons with axons, like the one on the right in Figure 2.4. (VL 1)

Figure 2.4 ❚ The neuron on the right consists of a cell body, dendrites, and an axon, or nerve fiber. The neuron on the left that receives stimuli from the environment has a receptor in place of the cell body.

Figure 2.5 ❚ Receptors for (a) vision, (b) hearing, (c) touch, (d) smell, and (e) taste. Each of these receptors is specialized to transduce a specific type of environmental energy into electricity. Stars indicate the place on the receptor neuron where the stimulus acts to begin the process of transduction.
Recording Electrical Signals in Neurons
We will be particularly concerned with recording the electrical signals from the axons of neurons. It is important to distinguish between single neurons, like the ones shown in Figure 2.4, and nerves. A nerve, such as the optic nerve, which carries signals out the back of the eye, consists of the axons (or nerve fibers) of many neurons (Figure 2.6), just as many individual wires make up a telephone cable. Thus, recording from an optic nerve fiber involves recording not from the optic nerve as a whole, but from one of the small fibers within the optic nerve.

METHOD ❚ Recording From a Neuron
Microelectrodes, small shafts of glass or metal with very fine tips, are used to record signals from single neurons. The key principle for understanding electrical signals in neurons is that we are always measuring the difference in charge between two electrodes. One of these electrodes, located where the electrical signals will occur, is the recording electrode, shown on the left in Figure 2.7a.¹ The other one, located some distance away so it is not affected by the electrical signals, is the reference electrode. The difference in charge between the recording and reference electrodes is displayed on an oscilloscope, which indicates the difference in charge by a small dot that creates a line as it moves across the screen, as shown on the right in Figure 2.7a. (VL 2)

¹ In practice, most recordings are achieved with the tip of the electrode positioned just outside the neuron because it is technically difficult to insert electrodes into the neuron, especially if it is small. However, if the electrode tip is close enough to the neuron, the electrode can pick up the signals generated by the neuron.
When the nerve fiber is at rest, the oscilloscope records a difference in potential of −70 millivolts (where a millivolt is 1/1000 of a volt), as shown on the right in Figure 2.7a. This value, which stays the same as long as there are no signals in the neuron, is called the resting potential. In other words, the inside of the neuron is 70 mV negative compared to the outside, and remains that way as long as the neuron is at rest. (VL 3)
Figure 2.7b shows what happens when the neuron’s receptor is stimulated so that a signal is transmitted down the axon. As the signal passes the recording electrode, the charge inside the axon rises to +40 millivolts compared to the outside. As the signal continues past the electrode, the charge inside the fiber reverses course and starts becoming negative again (Figure 2.7c), until it returns to the resting level (Figure 2.7d). This signal, which is called the action potential, lasts about 1 millisecond (1/1000 second).
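The time course just described can be sketched in a few lines of Python. This is only an illustration: the values −70 mV, +40 mV, and 1 ms come from the text, but the straight-line rise and fall, the equal split of the 1 ms between the two phases, and the function name membrane_potential are assumptions made to keep the sketch short. Real action potentials are not triangular.

```python
REST_MV = -70.0     # resting potential given in the text
PEAK_MV = 40.0      # peak of the action potential given in the text
DURATION_MS = 1.0   # approximate duration given in the text


def membrane_potential(t_ms: float, onset_ms: float = 0.0) -> float:
    """Charge inside the fiber relative to outside (mV) at time t_ms.

    Assumption: the rising and falling phases are linear and each take
    half of the 1 ms duration.
    """
    elapsed = t_ms - onset_ms
    if elapsed <= 0.0 or elapsed >= DURATION_MS:
        return REST_MV                      # fiber is at rest
    half = DURATION_MS / 2.0
    if elapsed <= half:
        fraction = elapsed / half           # rising phase
    else:
        fraction = (DURATION_MS - elapsed) / half   # falling phase
    return REST_MV + fraction * (PEAK_MV - REST_MV)


if __name__ == "__main__":
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"{t:.2f} ms: {membrane_potential(t):+.1f} mV")
```

Sampling the function before, during, and after the impulse gives the same sequence as Figures 2.7a through 2.7d: resting level, rise, fall, and back to the resting level.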
Figure 2.6 ❚ Nerves contain many nerve fibers. The optic nerve transmits signals out the back of the eye. Shown here schematically in cross section, the optic nerve actually contains about 1 million nerve fibers.

Figure 2.7 ❚ (a) When a nerve fiber is at rest, there is a difference in charge of −70 mV between the inside and the outside of the fiber. This difference is measured by the meter on the left; the difference in charge measured by the meter is displayed on the right. (b) As the nerve impulse, indicated by the red band, passes the electrode, the inside of the fiber near the electrode becomes more positive. This positivity is the rising phase of the action potential. (c) As the nerve impulse moves past the electrode, the charge inside the fiber becomes more negative. This is the falling phase of the action potential. (d) Eventually the neuron returns to its resting state.
Chemical Basis of Action Potentials
When most people think of electrical signals, they imagine signals conducted along electrical power lines or along the wires used for household appliances. We learn as young children that we should keep electrical wires away from liquid. But the electrical signals in neurons are created by and conducted through liquid.
The key to understanding the “wet” electrical signals transmitted by neurons is understanding the components of the neuron’s liquid environment. Neurons are surrounded by a solution rich in ions, molecules that carry an electrical charge (Figure 2.8). Ions are created when molecules gain or lose electrons, as happens when compounds are dissolved in water. For example, adding table salt (sodium chloride, NaCl) to water creates positively charged sodium ions (Na+) and negatively charged chloride ions (Cl−). The solution outside the axon of a neuron is rich in positively charged sodium (Na+) ions, whereas the solution inside the axon is rich in positively charged potassium (K+) ions.
Remember that the action potential is a rapid increase in positive charge until the inside of the neuron is +40 mV compared to the outside, followed by a rapid return to the baseline of −70 mV. These changes are caused by the flow of sodium and potassium ions across the cell membrane. Figure 2.9 shows the action potential from Figure 2.7, and also shows how the action potential is created by the flow of sodium and potassium ions. First sodium flows into the fiber, then potassium flows out, and this sequence of sodium-in, potassium-out continues as the action potential travels down the axon. (VL 4)
The records on the right in Figure 2.9 show how this flow of sodium and potassium is translated into a change of the charge inside the axon. The upward phase of the action potential (the change from −70 to +40 mV) occurs when positively charged sodium ions rush into the axon (Figure 2.9a). The downward phase of the potential (the change from +40 back to −70 mV) occurs when positively charged potassium ions rush out of the axon (Figure 2.9b). Once the action potential has passed the electrode, the charge inside the fiber returns to the resting potential of −70 mV (Figure 2.9c).
The changes in sodium and potassium flow that create the action potential are caused by changes in the fiber’s permeability to sodium and potassium. Permeability is a property of the cell membrane that refers to the ease with which
a molecule can pass through the membrane. Selective permeability occurs when a membrane is highly permeable to one specific type of molecule, but not to others.
Before the action potential occurs, the membrane’s permeability to sodium and potassium is low, so there is little flow of these molecules across the membrane. Stimulation of the receptor triggers a process that causes the membrane to become selectively permeable to sodium, so sodium flows into the axon. When the action potential reaches +40 mV, the membrane suddenly becomes selectively permeable to potassium, so potassium flows out of the axon. The action potential, therefore, is caused by changes in the axon’s selective permeability to sodium and potassium.²

Figure 2.8 ❚ A nerve fiber, showing the high concentration of sodium outside the fiber and potassium inside the fiber. Other ions, such as negatively charged chloride, are not shown.

Figure 2.9 ❚ How the flow of sodium and potassium creates the action potential. (a) As positively charged sodium (Na+) flows into the axon, the inside of the neuron becomes more positive (rising phase of the action potential). (b) As positively charged potassium (K+) flows out of the axon, the inside of the axon becomes more negative (falling phase of the action potential). (c) The fiber’s charge returns to the resting level after the flow of Na+ and K+ has moved past the electrode.
Basic Properties of Action Potentials
An important property of the action potential is that it is
a propagated response—once the response is triggered,
it travels all the way down the axon without decreasing in
size. This means that if we were to move our recording electrode in Figure 2.7 or 2.9 to a position nearer the end of the
axon, the response recorded as the action potential passed
the electrode would still be an increase from −70 mV to +40 mV and then a decrease back to −70 mV. This is an extremely important property of the action potential because it enables neurons to transmit signals over long distances.
Another property is that the action potential remains the same size no matter how intense the stimulus is. We can demonstrate this by determining how the neuron fires to different stimulus intensities. Figure 2.10 shows what happens when we do this. Each action potential appears as a sharp spike in these records because we have compressed the time scale to display a number of action potentials.
The three records in Figure 2.10 represent the axon’s response to three intensities of pushing on the skin. Figure 2.10a shows how the axon responds to gentle stimulation applied to the skin, and Figures 2.10b and 2.10c show how the response changes as the pressure is increased. Comparing these three records leads to an important conclusion: Changing the stimulus intensity does not affect the size of the action potentials but does affect the rate of firing. (VL 5)

Figure 2.10 ❚ Response of a nerve fiber to (a) soft, (b) medium, and (c) strong stimulation. Increasing the stimulus strength increases both the rate and the regularity of nerve firing in this fiber.

² If this process were to continue, there would be a buildup of sodium inside the axon and potassium outside the axon. This buildup is prevented by a mechanism called the sodium–potassium pump, which is constantly transporting sodium out of the axon and potassium into the axon.
Although increasing the stimulus intensity can increase
the rate of firing, there is an upper limit to the number of
nerve impulses per second that can be conducted down an
axon. This limit occurs because of a property of the axon
called the refractory period—the interval between the time
one nerve impulse occurs and the next one can be generated
in the axon. Because the refractory period for most neurons
is about 1 ms, the upper limit of a neuron’s firing rate is
about 500 to 800 impulses per second.
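The arithmetic behind this upper limit can be sketched as follows. The sketch assumes that each impulse ties up the axon for a fixed number of milliseconds before the next one can start. The values 2 ms and 1.25 ms are chosen only because they reproduce the 500 and 800 impulses per second quoted above; they are not measurements. The second function, firing_rate, and its parameters (spontaneous, gain, ceiling) are hypothetical and simply illustrate a rate that grows with stimulus intensity until it reaches the ceiling.

```python
def max_firing_rate(ms_per_impulse: float) -> float:
    """Impulses per second if each impulse occupies ms_per_impulse milliseconds."""
    if ms_per_impulse <= 0:
        raise ValueError("ms_per_impulse must be positive")
    return 1000.0 / ms_per_impulse


def firing_rate(intensity: float, spontaneous: float = 5.0,
                gain: float = 50.0, ceiling: float = 500.0) -> float:
    """Hypothetical rate code: firing grows with intensity, then saturates.

    The linear growth and all default values are illustrative assumptions.
    """
    return min(ceiling, spontaneous + gain * max(0.0, intensity))


if __name__ == "__main__":
    print(max_firing_rate(2.0))     # one impulse every 2 ms
    print(max_firing_rate(1.25))    # one impulse every 1.25 ms
    for intensity in (0.0, 2.0, 100.0):
        print(intensity, firing_rate(intensity))
```

Note that the size of each impulse never appears in the sketch: only the rate changes with intensity, and the rate cannot exceed the ceiling no matter how strong the stimulus becomes.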
Another important property of action potentials is
illustrated by the beginning of each of the records in Figure
2.10. Notice that a few action potentials are occurring even
before the pressure stimulus is applied. Action potentials that occur in the absence of stimuli from the environment are called spontaneous activity. This spontaneous activity establishes a baseline level of firing for the neuron. The
presence of stimulation usually causes an increase in activity above this spontaneous level, but under some conditions
it can cause firing to decrease below the spontaneous level.
What do these properties of the action potential
mean in terms of their function for perceiving? The action
potential’s function is to communicate information. Increasing the stimulation of a receptor can cause a change in
the rate of nerve firing, usually an increase in firing above
the baseline level, but sometimes a decrease below the baseline level. These changes in nerve firing can therefore provide
information about the intensity of a stimulus. But if this
information remains within a single neuron, it serves no
function. In order to be meaningful, this information must
be transmitted to other neurons and eventually to the brain
or other organs that can react to the information.
The idea that the action potential in one neuron must
be transmitted to other neurons poses the following problem: Once an action potential reaches the end of the axon,
how is the message that the action potential carries transmitted to other neurons? The problem is that there is a very
small space between the neurons, known as a synapse (Figure 2.11). The discovery of the synapse raised the question of
how the electrical signals generated by one neuron are transmitted across the space separating the neurons. As we will
see, the answer lies in a remarkable chemical process that involves molecules called neurotransmitters. (VL 6)
Events at the Synapse
Early in the 1900s, it was discovered that when action potentials reach the end of a neuron, they trigger the release
of chemicals called neurotransmitters that are stored in
structures called synaptic vesicles in the sending neuron
(Figure 2.11b). The neurotransmitter molecules flow into
the synapse to small areas on the receiving neuron called
receptor sites that are sensitive to specific neurotransmitters (Figure 2.11c). These receptor sites exist in a variety of
shapes that match the shapes of particular neurotransmitter molecules. When a neurotransmitter makes contact with
a receptor site matching its shape, it activates the receptor
site and triggers a voltage change in the receiving neuron.
A neurotransmitter is like a key that fits a specific lock. It
has an effect on the receiving neuron only when its shape
matches that of the receptor site.
Thus, when an electrical signal reaches the synapse, it
triggers a chemical process that in turn triggers a change in
voltage in the receiving neuron. The direction of this voltage
change depends on the type of transmitter that is released
and the nature of the cell body of the receiving neuron.
Excitatory transmitters cause the inside of the neuron
to become more positive, a process called depolarization.
Figure 2.12a shows this effect. In this example, the neuron
becomes slightly more positive. Notice, however, that this
response is much smaller than the positive action potential. To generate an action potential, enough excitatory neurotransmitter must be released to increase depolarization to
the level indicated by the dashed line. Once depolarization
reaches that level, an action potential is triggered. Because
depolarization can trigger an action potential, it is called
an excitatory response.
Inhibitory transmitters cause the inside of the neuron
to become more negative, a process called hyperpolarization. Figure 2.12b shows this effect. Hyperpolarization is
considered an inhibitory response because it can prevent
the neuron from reaching the level of depolarization needed
to generate action potentials.
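The idea of a level that depolarization must reach can be sketched in Python. The resting value of −70 mV comes from the text. The threshold of −55 mV is an assumption (the text shows the level only as a dashed line in Figure 2.12), as is the idea that each input shifts the charge by a fixed number of millivolts and that the shifts simply add. The function names are hypothetical.

```python
REST_MV = -70.0        # resting potential given in the text
THRESHOLD_MV = -55.0   # assumed value for the dashed line in Figure 2.12


def receiving_neuron_charge(excitatory_mv, inhibitory_mv=()):
    """Charge inside the receiving neuron after its inputs are combined.

    Assumption: excitatory inputs depolarize and inhibitory inputs
    hyperpolarize by the listed amounts, and the effects add linearly.
    """
    return REST_MV + sum(excitatory_mv) - sum(inhibitory_mv)


def triggers_action_potential(excitatory_mv, inhibitory_mv=()):
    """True if depolarization reaches the assumed threshold."""
    return receiving_neuron_charge(excitatory_mv, inhibitory_mv) >= THRESHOLD_MV


if __name__ == "__main__":
    print(triggers_action_potential([5.0]))                    # small depolarization
    print(triggers_action_potential([5.0, 5.0, 5.0]))          # enough excitation
    print(triggers_action_potential([5.0, 5.0, 5.0], [5.0]))   # inhibition added
```

With these made-up numbers, one excitatory input leaves the neuron below threshold, three inputs reach it, and adding one inhibitory input pulls the neuron back below threshold, which is the sense in which hyperpolarization is an inhibitory response.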
We can summarize this description of the effects of excitatory and inhibitory transmitters as follows: The release of
excitatory transmitters increases the chances that a neuron
will generate action potentials and is associated with high
rates of nerve firing. The release of inhibitory transmitters
decreases the chances that a neuron will generate action potentials and is associated with lowering rates of nerve firing.
Since a typical neuron receives both excitatory and inhibitory transmitters, the response of the neuron is determined by the interplay of excitation and inhibition, as illustrated in Figure 2.13. In Figure 2.13a excitation (E) is much stronger than inhibition (I), so the neuron’s firing rate is high. However, as inhibition becomes stronger and excitation becomes weaker, the neuron’s firing decreases, until in Figure 2.13e inhibition has eliminated the neuron’s spontaneous activity and has decreased firing to zero. (VL 7)

Figure 2.11 ❚ Synaptic transmission from one neuron to another. (a) A signal traveling down the axon of a neuron reaches the synapse at the end of the axon. (b) The nerve impulse causes the release of neurotransmitter molecules from the synaptic vesicles of the sending neuron. (c) The neurotransmitters fit into receptor sites and cause a voltage change in the receiving neuron.

Figure 2.12 ❚ (a) Excitatory transmitters cause depolarization, an increased positive charge inside the neuron. (b) Inhibitory transmitters cause hyperpolarization, an increased negative charge inside the axon. The charge inside the axon must reach the dashed line to trigger an action potential.
Why does inhibition exist? If one of the functions of a
neuron is to transmit its information to other neurons, why
would the action potentials in one neuron cause the release
of inhibitory transmitter that decreases or eliminates nerve
firing in the next neuron? The answer to this question is
that the function of neurons is not only to transmit information but also to process it (see the processing step of the
perceptual process, page 7), and both excitation and inhibition are necessary for this processing.
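The interplay of excitation and inhibition shown in Figure 2.13 can be sketched with a simple rule: firing equals the spontaneous rate plus a gain times the difference between excitation and inhibition, and it cannot fall below zero. This rule, the function name net_firing_rate, the spontaneous rate, the gain, and the input strengths for panels (a) through (e) are all assumptions chosen for illustration; the text describes the pattern only qualitatively.

```python
def net_firing_rate(excitation: float, inhibition: float,
                    spontaneous: float = 10.0, gain: float = 3.0) -> float:
    """Firing rate (arbitrary units) from the balance of E and I input.

    Assumption: excitation raises and inhibition lowers firing relative to
    the spontaneous level, linearly, and firing cannot go below zero.
    """
    return max(0.0, spontaneous + gain * (excitation - inhibition))


# Hypothetical input strengths (excitation, inhibition) for each panel.
PANELS = {"a": (4.0, 0.0), "b": (3.0, 1.0), "c": (2.0, 2.0),
          "d": (1.0, 3.0), "e": (0.0, 4.0)}

if __name__ == "__main__":
    for panel, (e, i) in PANELS.items():
        print(panel, net_firing_rate(e, i))
```

As inhibition grows relative to excitation, the rate falls from well above the spontaneous level, through the spontaneous level when the two inputs balance, to zero in the last panel.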
TEST YOURSELF 2.1
1. Describe the history of the physiological approach, starting with the idea that the heart is the seat of the mind and leading to recording from single neurons.
2. Define “modular organization of the brain” and give some examples.
3. Describe the basic structure of a neuron.
4. Describe how to record electrical signals from a neuron.
5. Describe what happens when an action potential travels along an axon. In your description indicate how the flow of chemicals causes electrical signals.
6. What are some of the basic properties of action potentials?
7. How are electrical signals transmitted from one neuron to another? Be sure you understand the difference between excitatory and inhibitory responses.
Neural Processing: Excitation, Inhibition, and Interactions Between Neurons
In our description of perceptual processing in Chapter 1 we said that neural processing can transform the signals generated by the receptors (see page 7). The first step in understanding this process is to look at how excitation and inhibition work together in neural circuits. Neural circuits are groups of interconnected neurons. A neural circuit can consist of just a few neurons or many hundreds or thousands of neurons. To introduce the basic principles of neural circuits we will describe a few simple circuits. For our example we will use circuits that have receptors that respond to light.

Figure 2.13 ❚ Effect of excitatory (E) and inhibitory (I) input on the firing rate of a neuron. The amount of excitatory and inhibitory input to the neuron is indicated by the size of the arrows at the synapse. The responses recorded by the electrode are indicated by the records on the right. The firing that occurs before the stimulus is presented is spontaneous activity. In (a) the neuron receives only excitatory transmitter, which causes the neuron to fire. In (b) to (e) the amount of excitatory transmitter decreases while the amount of inhibitory transmitter increases. As inhibition becomes stronger relative to excitation, firing rate decreases, until eventually the neuron stops firing.
Excitation, Inhibition, and
Neural Responding
We will first describe a simple neural circuit and then increase the complexity of this circuit in two stages, noting
how this increased complexity affects the circuit’s response
to the light. First, consider the circuit in Figure 2.14. This
circuit shows seven receptors (indicated by blue ellipses),
each of which synapses with another neuron (cell bodies,
indicated by red circles). All seven of the synapses are excitatory (indicated by Y’s). (VL 8)
We begin by illuminating receptor 4 with a spot of light and recording the response of neuron B. We then change this spot into a bar of light by adding light to illuminate receptors 3, 4, and 5 (3 through 5), then receptors 2 through 6, and finally receptors 1 through 7. The response of neuron
B, indicated on the graph in Figure 2.14, indicates that this
neuron fires when we stimulate receptor 4 but that stimulating the other receptors has no effect on neuron B because
it is only connected to receptor 4. Thus, the firing of neuron
B simply indicates that its receptor has been stimulated and
doesn’t provide any further information about the light.
We now increase the complexity of the circuit by adding
a property called convergence—the synapsing of more than
one neuron onto a single neuron. In this circuit, shown
in Figure 2.15, receptors 1 and 2 converge onto neuron A;
6 and 7 converge onto C; and 3, 4, and 5, and A and C converge onto B. As in the previous circuit, all of the synapses
are excitatory; but, with the addition of convergence, cell B
now collects information from all of the receptors. When we
monitor the firing rate of neuron B, we find that each time
we increase the length of the stimulus, neuron B’s firing rate
increases, as shown in the graph in Figure 2.15. This occurs
because stimulating more receptors increases the amount of
excitatory transmitter released onto neuron B. Thus, in this
circuit, neuron B’s response provides information about the
length of the stimulus.
We now increase the circuit’s complexity further by
adding two inhibitory synapses (indicated by T’s) to create
the circuit in Figure 2.16, in which neurons A and C inhibit
neuron B. Now consider what happens as we increase the
number of receptors stimulated. The spot of light stimulates receptor 4, which, through its excitatory connection,
increases the firing rate of neuron B. Extending the illumination to include receptors 3 through 5 adds the output
of two more excitatory synapses to B and increases its firing. So far, this circuit is behaving similarly to the circuit
in Figure 2.15. However, when we extend the illumination further to also include receptors 2 and 6, something
different happens: Receptors 2 and 6 stimulate neurons A
and C, which, in turn, release inhibitory transmitter onto
neuron B, which decreases its firing rate. Increasing the size
of the stimulus again to also illuminate receptors 1 and 7
increases the amount of inhibition and further decreases
the response of neuron B.
Figure 2.14 ❚ Left: A circuit with no convergence.
Right: Response of neuron B as we increase the number
of receptors stimulated.
Figure 2.15 ❚ Circuit with convergence added.
Neuron B now receives inputs from all of the
receptors, so increasing the size of the stimulus
increases the size of neuron B’s response.
Figure 2.16 ❚ Circuit with convergence and
inhibition. Because stimulation of the receptors on
the side (1, 2, 6, and 7) sends inhibition to neuron B,
neuron B responds best when just the center
receptors (3–5) are stimulated.
In this circuit, neuron B fires weakly to small stimuli (a
spot illuminating only receptor 4) or longer stimuli (a long
bar illuminating receptors 1 through 7) and fires best to a
stimulus of medium length (a shorter bar illuminating receptors 3 through 5). The combination of convergence and
inhibition has therefore caused neuron B to respond best to a
light stimulus of a specific size. The neurons that synapse with
neuron B are therefore doing much more than simply transmitting electrical signals; they are acting as part of a neural
circuit that enables the firing of neuron B to provide information about the stimulus falling on the receptors. The firing
of a neuron like B might, for example, help signal the presence of a small spot of light, or a detail of a larger pattern.
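The behavior of the three circuits can be sketched in a few lines of Python. This is only a toy model, not something taken from the figures: it assumes that every synapse has the same strength (one unit), that neuron B simply adds its excitatory inputs and subtracts its inhibitory inputs, and that its firing rate cannot fall below zero. The names `firing_rate_b`, `CENTER`, and `SIDES` are hypothetical and chosen only for illustration.

```python
# Toy model of the circuits in Figures 2.14-2.16.
# Assumptions (not from the text): unit synaptic weights, linear
# summation at neuron B, and a firing rate that cannot go below zero.

CENTER = {3, 4, 5}    # receptors that synapse directly on B (Figs. 2.15, 2.16)
SIDES = {1, 2, 6, 7}  # receptors that reach B through neurons A and C


def firing_rate_b(stimulated, circuit):
    """Return a toy firing rate for neuron B for a set of lit receptors."""
    stimulated = set(stimulated)
    if circuit == "no_convergence":
        # Figure 2.14: B is connected to receptor 4 only.
        return 1 if 4 in stimulated else 0
    center_input = len(stimulated & CENTER)
    side_input = len(stimulated & SIDES)
    if circuit == "convergence":
        # Figure 2.15: A and C excite B, so every lit receptor adds to B.
        return center_input + side_input
    if circuit == "convergence_inhibition":
        # Figure 2.16: A and C inhibit B, so side receptors subtract from B.
        return max(0, center_input - side_input)
    raise ValueError(f"unknown circuit: {circuit}")


# The four stimuli used in the text: a spot on receptor 4, then longer bars.
STIMULI = [{4}, {3, 4, 5}, {2, 3, 4, 5, 6}, {1, 2, 3, 4, 5, 6, 7}]


def responses(circuit):
    """Firing rate of B for each stimulus, from shortest to longest."""
    return [firing_rate_b(s, circuit) for s in STIMULI]


for name in ("no_convergence", "convergence", "convergence_inhibition"):
    print(name, responses(name))
```

Under these assumptions the first circuit gives the same response to every stimulus, the second gives a response that grows with stimulus length, and the third responds most to the medium-length bar (receptors 3 through 5), which is the qualitative pattern described above.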
So far our example has been theoretical. However, there
is evidence that processing involving convergence and
inhibition, as shown in Figures 2.15 and 2.16, occurs in
the nervous system. This evidence has been obtained by
measuring a property of neurons called the neuron’s receptive field.
Figure 2.17 ❚ Recording electrical signals from a fiber in
the optic nerve of an anesthetized cat. Each point on the
screen corresponds to a point on the cat’s retina.
Introduction to Receptive Fields
The receptive field of a neuron is the area on the receptors
that influences the firing rate of the neuron. To describe
receptive fields we will use the example of visual receptors,
which line the back of the eye within the retina, and the optic nerve, which transmits signals out of the eye (Figure 2.6).
METHOD ❚ Determining a Neuron’s Receptive Field
We will measure a receptive field of a neuron by stimulating a cat’s retina with light and recording from a nerve
fiber in the cat’s optic nerve. Our goal is to determine the
areas of the retina that, when stimulated, affect the firing of this neuron. The cat is anesthetized, and its eyes
are focused on a screen like the one shown in Figure 2.17.
Stimuli are presented on the screen, and since the cat’s
eye remains stationary, each of the stimuli on the screen
is imaged on points on the cat’s retina that correspond
to points on the screen. Thus, a stimulus at point A on
the screen creates an image on point A' on the retina, B
creates an image on B', and C on C'.
The first step in determining the receptive field is
to flash a small spot of light on the screen. Figure 2.18a
shows that when light is flashed anywhere within area
A on the screen, nothing happens (the signal shown is
spontaneous activity). However, flashing spots of light in
area B causes an increase in the neuron’s firing (indicated
by +), and flashing lights in area C causes a decrease in
firing (indicated by –). Since stimulating anywhere in
area B causes an increase in the neuron’s firing rate, this
is called the excitatory area of the neuron’s receptive field.
Since stimulating in area C causes a decrease in firing rate,
this is called the inhibitory area of the neuron’s receptive
field. Remembering that the definition of the receptive
field is any area, stimulation of which influences the firing of
the neuron, we conclude that areas B and C make up the
receptive field of the neuron, as shown in
Figure 2.18b. VL 9, 10, 11
Figure 2.18 ❚ (a) Response of a ganglion cell in the cat’s
retina to stimulation: outside the cell’s receptive field
(area A on the screen); inside the excitatory area of the
cell’s receptive field (area B); inside the inhibitory area of
the cell’s receptive field (area C). (b) The receptive field is
shown without the screen.
The receptive field in Figure 2.18 is called a center-surround receptive field because the areas of the receptive field are arranged in a center region that responds
one way and a surround region that responds in the opposite way. This particular receptive field is an excitatory-center-inhibitory-surround receptive field. There are also
inhibitory-center-excitatory-surround receptive fields in
which stimulating the center decreases firing and stimulating the surround increases firing.
The fact that the center and the surround of the receptive field respond in opposite ways causes an effect called
center-surround antagonism. This effect is illustrated in
Figure 2.19, which shows what happens as we increase the
size of a spot of light presented to the neuron’s receptive
field. A small spot that is presented to the excitatory center of the receptive field causes a small increase in the rate
of nerve firing (a), and increasing the light’s size so that it
covers the entire center of the receptive field increases the
cell’s response, as shown in (b). (We have used the term cell
in place of neuron here. Because neurons are a type of cell,
the word cell is often substituted for neuron in the research
literature. In this book, we will often use these
terms interchangeably.) VL 12, 13
Center-surround antagonism comes into play when the
spot of light becomes large enough so that it begins to cover
the inhibitory area, as in (c) and (d). Stimulation of the inhibitory surround counteracts the center’s excitatory response, causing a decrease in the neuron’s firing rate. Thus,
this neuron responds best to a spot of light that is the size of
the excitatory center of the receptive field. Notice that this
sequence of increased firing when the spot size is increased
and then decreased firing when the spot size is increased further is similar to what happened when we increased the number of receptors stimulated in the circuit of Figure 2.16. The
neural circuit that created the receptive field in Figure 2.18
is a more complex version of our hypothetical circuit in
Figure 2.16, but the basic principle of how excitation, inhibition, and convergence can determine how a neuron responds to stimuli is the same.
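Center-surround antagonism can also be sketched numerically. The following is a minimal illustration, not a model of any real ganglion cell: it assumes a circular spot centered on a circular receptive field, excitation proportional to the illuminated area of the center, inhibition proportional to the illuminated area of the surround, and a firing rate that cannot fall below zero. The function name `response` and all parameter values (radii, weights, baseline) are hypothetical.

```python
import math

# Toy excitatory-center-inhibitory-surround receptive field.
# Assumptions (not from the text): concentric circular regions, response
# proportional to illuminated area, arbitrary weights and baseline rate.


def response(spot_radius, center_radius=1.0, surround_radius=2.0,
             excite_per_area=1.0, inhibit_per_area=0.3, baseline=1.0):
    """Toy firing rate for a spot of light centered on the receptive field."""
    # Part of the spot that falls on the excitatory center.
    lit_center = math.pi * min(spot_radius, center_radius) ** 2
    # Part of the spot that falls on the inhibitory surround (a ring).
    outer = min(spot_radius, surround_radius)
    lit_surround = math.pi * max(0.0, outer ** 2 - center_radius ** 2)
    rate = baseline + excite_per_area * lit_center - inhibit_per_area * lit_surround
    return max(0.0, rate)  # a firing rate cannot be negative


# Spot sizes corresponding roughly to panels (a) through (d) of Figure 2.19.
spot_radii = [0.5, 1.0, 1.5, 2.0]
rates = [response(r) for r in spot_radii]
for r, rate in zip(spot_radii, rates):
    print(f"spot radius {r:.1f}: rate {rate:.2f}")
```

With these made-up values the response rises until the spot just fills the center and then falls as the spot spreads into the surround, mirroring the sequence in Figure 2.19.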
Center-surround receptive fields also occur in neurons
in the skin. Figure 2.20 shows the receptive field of a neuron
receiving signals from receptors in a monkey’s skin that increases its firing when the center area of a monkey’s arm is
touched, and decreases firing when the surrounding area is
touched. Just as for the visual neuron, center-surround antagonism also occurs for this neuron. This means that this
neuron responds best to a small stimulus presented to the
center of the neuron’s receptive field on the skin.
Figure 2.19 ❚ Response of an excitatory-center-inhibitory-surround receptive field as
stimulus size is increased. Shading indicates the area stimulated with light. The response
to the stimulus is indicated below each receptive field. The largest response occurs
when the entire excitatory area is illuminated, as in (b). Increasing stimulus size further
causes a decrease in firing due to center-surround antagonism. (Adapted from Hubel
and Wiesel, 1961.)
Figure 2.20 ❚ An excitatory-center-inhibitory-surround
receptive field of a neuron in the monkey’s thalamus. Note
that just as in the visual system, the receptive field of the
neuron is the area on the receptors (which are located
just below the skin in this example) that, when stimulated,
influences the responding of the neuron.
The Sensory Code: How the
Environment Is Represented
by the Firing of Neurons
We have seen that receptive fields enable us to specify a neuron’s response. A neuron’s receptive field indicates the location on the receptor surface (the retina or skin) that causes a
neuron to respond and the size or shape of the stimulus that
causes the best response.
But while acknowledging the importance of how receptive fields indicate the properties of neurons, let’s not
lose sight of the fact that we are interested in perception!
We are interested not just in how neurons work, but in how
the information in nerve impulses represents the things we
perceive in the environment. The idea that nerve impulses
can represent things in the environment is what is behind
the following statement, written by a student in my class,
Bernita Rabinowitz:
A human perceives a stimulus (a sound, a taste,
etc.). This is explained by the electrical impulses
sent to the brain. This is so incomprehensible,
so amazing. How can one electrical impulse be
perceived as the taste of a sour lemon, another
impulse as a jumble of brilliant blues and greens
and reds, and still another as bitter, cold wind?
Can our whole complex range of sensations be
explained by just the electrical impulses stimulating the brain? How can all of these varied and
very concrete sensations—the ranges of perceptions of heat and cold, colors, sounds, fragrances
and tastes—be merely and so abstractly explained
by differing electrical impulses?
Bernita’s question is an eloquent statement of the problem of
sensory coding: How does the firing of neurons represent various characteristics of the environment? One answer that
has been proposed to this question is called specificity coding.
Specificity Coding: Representation
by the Firing of Single Neurons
Specificity coding is the representation of particular objects
in the environment by the firing of neurons that are tuned
to respond specifically to that object. To illustrate how this
works, let’s consider how specificity coding might signal the
presence of the people’s faces in Figure 2.21. According to
specificity coding, Bill’s face would be represented by the firing of a neuron that responds only to Bill’s face (Figure 2.21a).
Notice that neuron 1, which we could call a “Bill neuron,”
does not respond to Mary’s face (Figure 2.21b) or Raphael’s
face (Figure 2.21c). In addition, other faces or types of objects
would not affect this neuron. It fires only to Bill’s face.
Figure 2.21 ❚ How faces could be coded
by specificity coding. Each face causes one
specialized neuron to respond.
From left to right: © Evan Agostini/Getty Images; © Doane Gregory/Warner Bros./Bureau L.A. Collection/Corbis
One of the requirements of specificity theory is that
there are neurons that are specifically tuned to each object in
the environment. This means that there would also have to
be a neuron that fires only to Mary’s face (Figure 2.21b) and
another neuron that fires only to Raphael’s face (right column of Figure 2.21c). The idea that there are single neurons
that each respond only to a specific stimulus was proposed
in the 1960s by Jerzy Konorski (1967) and Jerry Lettvin (see
Barlow, 1995; Gross, 2002; Rose, 1996). Lettvin coined the
term “grandmother cell” to describe this highly specific type
of cell. A grandmother cell, according to Lettvin, is a neuron that responds only to a specific stimulus. This stimulus
could be a specific image, such as a picture of your grandmother, or a concept, such as the idea of grandmothers in
general (Gross, 2002). The neurons in Figure 2.21 would
qualify as grandmother cells.
Is there any evidence for grandmother-type cells? Until
recently, there was little evidence for neurons that respond
to only one specific stimulus. However, recently, R. Quian
Quiroga and coworkers (2005) recorded from neurons that
respond to very specific stimuli. They recorded from eight
patients with epilepsy who had electrodes implanted in
their hippocampus or medial temporal lobe (MTL) to help
localize precisely where their seizures originated, in preparation for surgery (Figure 2.22).
Patients saw a number of different views of specific individuals and objects plus pictures of other things, such as
faces, buildings, and animals. Not surprisingly, a number
of neurons responded to some of these stimuli. What was
surprising, however, was that some neurons responded to a
number of different views of just one person or building. For
example, one neuron responded to all pictures of the actress
Jennifer Aniston alone, but did not respond to faces of other
famous people, nonfamous people, landmarks, animals, or
other objects. Another neuron responded to pictures of actress Halle Berry. This neuron responded not only to photographs of Halle Berry, but to drawings of her, pictures of her
dressed as Catwoman from Batman, and also to the words
“Halle Berry” (Figure 2.22b). A third neuron responded to
numerous views of the Sydney Opera House.
What is amazing about these neurons is that they respond to many different views of the stimulus, different
modes of depiction, and even words signifying the stimulus. These neurons, therefore, are not responding to visual
features of the pictures, but to concepts—“Jennifer Aniston,”
“Halle Berry,” “Sydney Opera House”—that the stimuli
represent.
It is important to note that these neurons were in the
hippocampus and MTL—areas associated with the storage
of memories. This function of these structures, plus the fact
that the people who were tested had a history of past experiences with these stimuli, may mean that familiar, well-remembered objects may be represented by the firing of just
a few very specialized neurons.
Figure 2.22 ❚ (a) Location of the hippocampus and some
of the other structures that were studied by Quiroga and
coworkers (2005). (b) Some of the stimuli that caused a
neuron in the hippocampus to fire.
But are these neurons grandmother cells? According
to Quiroga and coworkers (2008), the answer is “no.” They
point out that it is unlikely that these neurons respond to
only a single object or concept, for a number of reasons. First,
if there were only one (or a few) neurons that responded to
a particular person or concept, it would be extremely difficult to find it among the many hundreds of millions of neurons in the structures they were studying. One way to think
about this is in terms of the proverbial difficulty of finding
a needle in a haystack. Just as it would be extremely difficult
to find a needle in a haystack, it would also be very difficult
to find a neuron that responded only to Jennifer Aniston
among the huge number of neurons in the hippocampus.
Quiroga and coworkers (2005, 2008) also point out that
if they had had time to present more pictures, they might
have found other stimuli that caused their neurons to fire;
they estimate that cells in the areas they were studying probably respond to 50–150 different individuals or objects. The
idea that neurons respond to a number of different stimuli
is the basis of distributed coding, the idea that a particular object is represented not by the firing of a single neuron, but
by the firing of groups of neurons.
Distributed Coding: Representation
by the Firing of Groups of Neurons
Distributed coding is the representation of a particular object by the pattern of firing of groups of neurons. According
to this idea, Bill’s face might be represented by the pattern
of firing of neurons 1, 2, and 3 shown in Figure 2.23a. Reading across the top row, we see that neuron 1 has a high firing
rate, and neurons 2 and 3 have lower firing rates. Mary’s face
would be represented by a different pattern (Figure 2.23b),
and Raphael’s face by another pattern (Figure 2.23c). One
of the advantages of distributed coding is that it doesn’t require a specialized neuron for every object in the environment, as specificity coding does. Instead, distributed coding allows the representation of a large number of stimuli
by the firing of just a few neurons. In our example, the firing
of three neurons signals three faces, but these three neurons
could help signal other faces as well. For example, these
neurons can also signal Roger’s face, with another pattern
of firing (Figure 2.23d).
Sparse Coding: Distributed Coding
With Just a Few Neurons
One question we can ask about distributed coding is, “If an
object is represented by the pattern of firing in a group of
neurons, how many neurons are there in this group?” Is a
particular face indicated by the pattern of firing in thousands of neurons or in just a few? The idea that a particular
object is represented by the firing of a relatively small number of neurons is called sparse coding.
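The difference between these coding schemes can be made concrete by counting how many distinct firing patterns each one allows. The sketch below is a rough illustration under strong simplifying assumptions that are not from the text: each neuron's firing rate is reduced to a small number of distinguishable levels (with level 0 meaning silent), noise is ignored, and every distinct pattern is treated as able to stand for a different object. The three function names are hypothetical.

```python
from math import comb

# Counting distinguishable firing patterns under three coding schemes.
# Assumptions (not from the text): discrete firing levels (0 = silent),
# no noise, and one object per distinct pattern. These are upper bounds.


def specificity_capacity(n_neurons):
    """One dedicated neuron per object, so one object per neuron."""
    return n_neurons


def distributed_capacity(n_neurons, n_levels):
    """Every neuron may take any level, so levels ** neurons patterns."""
    return n_levels ** n_neurons


def sparse_capacity(n_neurons, n_levels, max_active):
    """Patterns in which at most max_active neurons are firing (level > 0)."""
    return sum(
        comb(n_neurons, active) * (n_levels - 1) ** active
        for active in range(max_active + 1)
    )


# Three neurons, as in Figure 2.23, with three firing levels each.
print(specificity_capacity(3))        # objects with one neuron per object
print(distributed_capacity(3, 3))     # patterns using all three neurons
print(sparse_capacity(3, 3, 1))       # patterns with at most one active neuron

# A larger population shows how quickly even a sparse code outgrows
# one-neuron-per-object coding.
print(specificity_capacity(100), sparse_capacity(100, 3, 5))
```

Allowing every neuron to be active makes the sparse count equal to the fully distributed count, which is one way of seeing sparse coding as distributed coding restricted to a small number of active neurons.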
One way to describe how sparse coding works in the
nervous system is that it is somewhere between specificity
coding, in which an object is represented by the firing of one
type of very specialized neuron, and distributed coding, in
which an object is represented by the pattern of firing of a
large group of neurons. The neurons described by Quiroga
and coworkers (2005, 2008) that probably respond to a small
number of objects provide an example of sparse coding, and
there is other evidence that the code for representing objects in the visual system, tones in the auditory system, and
odors in the olfactory system may, in fact, involve the pattern of activity across a relatively small number of neurons,
as sparse coding suggests (Olshausen & Field, 2004).
Figure 2.23 ❚ How faces could be coded
by distributed coding. Each face causes all
the neurons to fire, but the pattern of firing
is different for each face. One advantage
of this method of coding is that many faces
could be represented by the firing of the
three neurons.
Something to Consider:
The Mind–Body Problem
One of the most famous problems in science is called the
mind–body problem: How do physical processes such as
nerve impulses or sodium and potassium molecules flowing
across membranes (the body part of the problem) become
transformed into the richness of perceptual experience (the
mind part of the problem)?
The mind–body problem is what my student Bernita
was asking about when she posed her question about how
heat and cold, colors, sounds, fragrances and tastes can be
explained by differing electrical impulses. One way to answer Bernita’s question would be to explain how different
perceptions might be represented by the firing of specialized neurons or by the pattern of firing of groups of neurons. Research that focuses on determining connections between stimuli in the environment and the firing of neurons
is often referred to as research on the neural correlate of
consciousness (NCC), where consciousness can be roughly
defined as our experiences.
Does determining the NCC qualify as a solution to
the mind–body problem? Researchers often call finding
the NCC the easy problem of consciousness because it has
been possible to discover many connections between neural firing and experience (Figure 2.24a). But if NCC is the
“easy” problem, what is the “hard” problem? We encounter
the hard problem when we approach Bernita’s question at
a deeper level by asking not how physiological responses
correlate with experience, but how physiological responses
cause experience. To put it another way, how do physiological responses become transformed into experience? We can
appreciate why this is called the hard problem of consciousness by stating it in molecular terms: How do sodium and potassium ions flowing across a membrane or the
nerve impulses that result from this flow become the perception of a person’s face or the experience of the color red
(Figure 2.24b)?
Although researchers have been working to determine
the physiological basis of perception for more than a century, the hard version of the mind–body problem is still
unsolved. The first difficulty lies in figuring out how to
go about studying the problem. Just looking for correlations may not be enough to determine how physiological
processes cause experience. Because of the difficulty, most
researchers have focused on determining the NCC. That
doesn’t mean the hard version of the mind–body problem
will never be solved. Many researchers believe that doing research on the easy problem (which, after all, isn’t really that
easy) will eventually lead to a solution to the hard problem (see Baars, 2001; Block, in press; Crick & Koch, 2003).
Figure 2.24 ❚ (a) Solving the “easy” problem of
consciousness involves looking for connections between
physiological responding and experiences such as perceiving
“red” or “Susan’s face.” This is also called the search
for the neural correlate of consciousness. (b) Solving the
“hard” problem of consciousness involves determining how
physiological processes such as ions flowing across the
nerve membrane cause us to have experiences.
For now, there is quite a bit of work to be done on the easy
problem. This approach to the physiology of perception is
what the rest of this book is about.
TEST YOURSELF 2.2
1. Why is interaction between neurons necessary for
neural processing? Be sure to understand how convergence and inhibition work together to achieve
processing.
2. What is a neuron’s receptive field, and how is it
measured?
3. Describe a center-surround receptive field, and explain how center-surround antagonism affects firing
as a stimulus spot is made bigger.
4. What is the sensory code? Describe specificity coding and distributed coding. Which type of coding is
most likely to operate in sensory systems?
5. What is sparse coding? Can coding be both distributed and sparse?
6. What is the mind–body problem? What is the difference between the “easy” problem of consciousness
and the “hard” problem of consciousness?
THINK ABOUT IT
1. Because the long axons of neurons look like electrical
wires, and both neurons and electrical wires conduct
electricity, it is tempting to equate the two. Compare
the functioning of axons and electrical wires in terms
of their structure and the nature of the electrical signals they conduct. (p. 29)
IF YOU WANT TO KNOW MORE
1. Beyond “classic” receptive fields. We defined a neuron’s receptive field as “the area on the receptors that,
when stimulated, influences the firing of the neuron.”
But recent research has revealed that for some neurons, there are areas outside of the “classic” receptive
field that cause no change in the firing of a neuron
when stimulated alone, but can nonetheless influence
the neuron’s response to stimulation of an area inside
the “classic” receptive field. (p. 34)
Vinje, B. V., & Gallant, J. L. (2002). Natural stimulation of the non-classical receptive field increases
information transmission efficiency in V1. Journal
of Neuroscience, 22, 2904–2915.
Zipser, K., Lamme, V. A. F., & Schiller, P. H. (1996).
Contextual modulation in primary visual cortex.
Journal of Neuroscience, 16, 7376–7389.
KEY TERMS
Action potential (p. 28)
Axon (p. 26)
Cell body (p. 26)
Center-surround antagonism (p. 35)
Center-surround receptive field (p. 35)
Cerebral cortex (p. 26)
Convergence (p. 33)
Dendrites (p. 26)
Depolarization (p. 31)
Distributed coding (p. 38)
Doctrine of specific nerve energies (p. 25)
Easy problem of consciousness (p. 39)
Excitatory area (p. 34)
Excitatory response (p. 31)
Excitatory transmitter (p. 31)
Excitatory-center-inhibitory-surround receptive field (p. 35)
Frontal lobe (p. 26)
Grandmother cell (p. 37)
Hard problem of consciousness (p. 39)
Hyperpolarization (p. 31)
Inhibitory area (p. 34)
Inhibitory response (p. 31)
Inhibitory transmitter (p. 31)
Inhibitory-center-excitatory-surround receptive field (p. 35)
Ions (p. 29)
Microelectrode (p. 27)
Mind–body problem (p. 39)
Modular organization (p. 26)
Nerve (p. 27)
Nerve fiber (p. 26)
Neural circuits (p. 32)
Neural correlate of consciousness (NCC) (p. 39)
Neuron theory (p. 25)
Neurotransmitter (p. 31)
Occipital lobe (p. 26)
Parietal lobe (p. 26)
Permeability (p. 29)
Pineal gland (p. 24)
Primary receiving areas (p. 26)
Propagated response (p. 30)
Receptive field (p. 34)
Receptor sites (p. 31)
Receptors (p. 26)
Refractory period (p. 30)
Resting potential (p. 28)
Reticular theory (p. 25)
Selective permeability (p. 30)
Sparse coding (p. 38)
Specificity coding (p. 36)
Spontaneous activity (p. 30)
Staining (p. 25)
Synapse (p. 30)
Temporal lobe (p. 26)
Ventricles (p. 24)
MEDIA RESOURCES
The Sensation and Perception
Book Companion Website
www.cengage.com/psychology/goldstein
See the companion website for flashcards, practice quiz
questions, Internet links, updates, critical thinking exercises, discussion forums, games, and more!
CengageNOW
www.cengage.com/cengagenow
Go to this site for the link to CengageNOW, your one-stop
shop. Take a pre-test for this chapter, and CengageNOW
will generate a personalized study plan based on your test
results. The study plan will identify the topics you need to
review and direct you to online resources to help you master those topics. You can then take a post-test to help you
determine the concepts you have mastered and what you
will still need to work on.
Virtual Lab VL
Your Virtual Lab is designed to help you get the most out
of this course. The Virtual Lab icons direct you to specific
media demonstrations and experiments designed to help
you visualize what you are reading about. The number
beside each icon indicates the number of the media element
you can access through your CD-ROM, CengageNOW, or
WebTutor resource.
The following lab exercises are related to material in
this chapter:
1. Structure of a Neuron Functions of different parts of a
neuron.
2. Oscilloscopes and Intracellular Recording How nerve
potentials are displayed on an oscilloscope.
3. Resting Potential Demonstrates the difference in charge
between the inside and outside of the neuron when it is not
conducting impulses.
4. Phases of Action Potential How sodium and potassium flow across the axon membrane during the action
potential.
5. Nerve Impulse Coding and Stimulus Strength How neural
activity changes as the intensity of a stimulus is varied.
6. Synaptic Transmission How electrical signals are transmitted from one neuron to another.
7. Excitation and Inhibition How excitation and inhibition
interact to determine the firing rate of the postsynaptic
neuron.
8. Simple Neural Circuits Presenting lights to receptors in
three neural circuits illustrates how adding convergence
and inhibition influences neural responding.
9. Receptive Fields of Retinal Ganglion Cells A classic 1972
film in which vision research pioneer Colin Blakemore describes the neurons in the retina and how center-surround
receptive fields of ganglion cells are recorded from the cat’s
retina.
10. Mapping Receptive Fields Mapping the receptive field of
a retinal ganglion cell.
11. Receptive Field Mapping Mapping the receptive fields of
ganglion cells, LGN neurons, and cortical neurons.
12. Stimulus Size and Receptive Fields How the size of a
stimulus relative to the receptive field affects the size of the
neural response.
CHAPTER 3
Introduction to Vision
Chapter Contents
FOCUSING LIGHT ONTO THE RETINA
Light: The Stimulus for Vision
The Eye
Light Is Focused by the Eye
DEMONSTRATION : Becoming Aware of
What Is in Focus
TRANSFORMING LIGHT INTO
ELECTRICITY
The Visual Receptors and Transduction
How Does Transduction Occur?
PIGMENTS AND PERCEPTION
Distribution of the Rods and Cones
DEMONSTRATION : Becoming Aware of the
Blind Spot
DEMONSTRATION : Filling in the Blind
Spot
Dark Adaptation of the Rods and Cones
METHOD : Measuring Dark Adaptation
Spectral Sensitivity of the Rods and Cones
❚ TEST YOURSELF 3.1
NEURAL CONVERGENCE AND
PERCEPTION
Why Rods Result in Greater Sensitivity
Than Cones
Why We Use Our Cones to See Details
DEMONSTRATION : Foveal Versus
Peripheral Acuity
LATERAL INHIBITION AND
PERCEPTION
What the Horseshoe Crab Teaches Us About
Inhibition
Lateral Inhibition and Lightness
Perception
DEMONSTRATION : Creating Mach Bands
in Shadows
DEMONSTRATION : Simultaneous Contrast
A Display That Can’t Be Explained by
Lateral Inhibition
SOMETHING TO CONSIDER:
PERCEPTION IS INDIRECT
❚ TEST YOURSELF 3.2
Think About It
If You Want to Know More
Key Terms
Media Resources
OPPOSITE PAGE This painting, Arcturus II by Victor Vasarely, consists
of colored squares stacked one on top of the other. The diagonals we
perceive radiating from the center of these patterns are not actually in
the physical stimulus, but they are perceived because of interactions
between excitation and inhibition in the visual system.
Hirshhorn Museum and Sculpture Garden, Smithsonian Institution, Gift of Joseph H. Hirshhorn, 1972.
Photographer, Lee Stalsworth.
VL VIRTUAL LAB The Virtual Lab icons direct you to specific animations and videos
designed to help you visualize what you are reading about. The number beside
each icon indicates the number of the clip you can access through your
CD-ROM or your student website.
Some Questions We Will Consider:
❚ How do chemicals in the eye called visual pigments
affect our perception? (p. 47)
❚ How does the way neurons are “wired up” affect our
perception? (p. 58)
❚ What do we mean when we say that perception is
indirect? (p. 68)
Now that we know something about the psychophysical approach to perception (Chapter 1) and basic
physiological principles (Chapter 2), we are ready to apply
these approaches to the study of perception. In this chapter we describe what happens at the very beginning of the
visual system, beginning when light enters the eye, and in
Chapter 4 we will consider processes that occur in
the visual areas of the brain. VL 1
Focusing Light Onto the Retina
Vision begins when visible light is reflected from objects
into the eye.
Light: The Stimulus for Vision
Vision is based on visible light, which is a band of energy
within the electromagnetic spectrum. The electromagnetic spectrum is a continuum of electromagnetic energy
that is produced by electric charges and is radiated as waves
(Figure 3.1). The energy in this spectrum can be described by
its wavelength, the distance between the peaks of the electromagnetic waves. The wavelengths in the electromagnetic
spectrum range from extremely short-wavelength gamma
rays (wavelength about 10⁻¹² meters, or one trillionth
of a meter) to long-wavelength radio waves (wavelength about 10⁴ meters, or 10,000 meters).
Visible light, the energy within the electromagnetic
spectrum that humans can perceive, has wavelengths ranging from about 400 to 700 nanometers (nm), where 1 nanometer = 10⁻⁹ meters. For humans and some other animals,
the wavelength of visible light is associated with the different colors of the spectrum. Although we will usually specify
light in terms of its wavelength, light can also be described as
consisting of small packets of energy called photons, with one
photon being the smallest possible packet of light energy.
The Eye
The eye is where vision begins. Light reflected from objects
in the environment enters the eye through the pupil and is
focused by the cornea and lens to form sharp images of the
objects on the retina, which contains the receptors
for vision (Figure 3.2a). VL 2
There are two kinds of visual receptors, rods and
cones, which contain light-sensitive chemicals called visual
pigments that react to light and trigger electrical signals.
These signals flow through the network of neurons that
make up the retina (Figure 3.2b). The signals then emerge
from the back of the eye in the optic nerve, which conducts
signals toward the brain. The cornea and lens at the front of
the eye and the receptors and neurons in the retina lining the
back of the eye shape what we see by creating the transformations that occur at the beginning of the perceptual process.
Light Is Focused by the Eye
Once light is reflected from an object into the eye, it needs
to be focused onto the retina. The cornea, the transparent
covering of the front of the eye, accounts for about 80 percent of the eye’s focusing power, but like the lenses in eyeglasses, it is fixed in place, so can’t adjust its focus. The lens,
which supplies the remaining 20 percent of the eye’s focusing power, can change its shape to adjust the eye’s focus for
stimuli located at different distances.
[Figure 3.1: wavelength axis in nm, from 10⁻³ to 10¹⁵, spanning gamma rays, X-rays, ultraviolet rays, visible light (400 to 700 nm), infrared rays, radar, FM, TV, AM, and AC circuits.]
Figure 3.1 ❚ The electromagnetic spectrum, showing the wide range of energy in the environment and the
small range within this spectrum, called visible light, that we can see. The wavelength is in nanometers (nm),
where 1 nm = 10⁻⁹ meters.
CHAPTER 3
Introduction to Vision
[Figure 3.2: (a) the eye, labeled light, pupil, cornea, lens, retina, fovea (point of central focus), and optic nerve; (b) close-up of the retina, labeled receptor cells (rods and cones), rod, cone, optic nerve fibers, pigment epithelium, and back of eye.]
Figure 3.2 ❚ An image of the cup is focused on the retina, which lines the back of the eye. The close-up of the retina on the
right shows the receptors and other neurons that make up the retina.
We can understand how the lens adjusts its focus by first
considering what happens when the eye is relaxed and a person views a small object that is far away. If the object is located more than about 20 feet away, the light rays that reach
the eye are essentially parallel (Figure 3.3a), and these parallel
rays are brought to a focus on the retina at point A. But if the
object moves closer to the eye, the light rays reflected from
this object enter the eye at more of an angle, which pushes the
focus point back to point B (Figure 3.3b). However, the light
is stopped by the back of the eye before it reaches point B, so
the image on the retina is out of focus. If things remained in
this state, the person would see the object as blurred.
A process called accommodation keeps this from happening. The ciliary muscles at the front of the eye tighten
and increase the curvature of the lens so that it gets thicker
(Figure 3.3c). This increased curvature bends the light rays
passing through the lens to pull the focus point back to A
to create a sharp image on the retina.
[Figure 3.3: three panels showing the cornea, lens, and retina. (a) Object far, eye relaxed: focus on retina at A. (b) Object near, eye relaxed: moving the object closer pushes the focus point back to B, behind the retina. (c) Object near, accommodation: accommodation brings the focus point forward to A, on the retina.]
Figure 3.3 ❚ Focusing of light rays by the eye. (a) Rays
of light coming from a small light source that is more than
20 feet away are approximately parallel. The focus point for
parallel light is at A on the retina. (b) Moving an object closer
to the relaxed eye pushes the focus point back. Here the
focus point is at B, but light is stopped by the back of the
eye. (c) Accommodation of the eye (indicated by the fatter
lens) increases the focusing power of the lens and brings the
focus point for a near object back to A on the retina.
DEMONSTRATION
Becoming Aware of What Is in Focus
Accommodation occurs unconsciously, so you are usually
unaware that the lens is constantly changing its focusing
power so you can see clearly at different distances. This
unconscious focusing process works so efficiently that most
people assume that everything, near and far, is always in
focus. You can demonstrate that this is not so by holding a
pencil point up, at arm’s length, and looking at an object that
is at least 20 feet away. As you look at the faraway object,
move the pencil point toward you without actually looking at
it (stay focused on the far object). The pencil will probably
appear blurred.
Then move the pencil closer, while still looking at the far
object, and notice that the point becomes more blurred and
appears double. When the pencil is about 12 inches away,
focus on the pencil point. You now see the point sharply, but
the faraway object you were focusing on before has become
blurred. Now, bring the pencil even closer until you can’t
see the point sharply no matter how hard you try. Notice the
strain in your eyes as you try unsuccessfully to bring the point
into focus. ❚
When you changed focus during this demonstration,
you were changing your accommodation. Accommodation
enables you to bring both near and far objects into focus,
although objects at different distances are not in focus at
the same time. But accommodation has its limits. When the
pencil was too close, you couldn’t see it clearly, even though
you were straining to accommodate. The distance at which
your lens can no longer adjust to bring close objects into focus is called the near point.
The distance of the near point increases as a person
gets older, a condition called presbyopia (for “old eye”). The
near point for most 20-year-olds is at about 10 cm, but it
increases to 14 cm by age 30, 22 cm at 40, and 100 cm at 60
(Figure 3.4). This loss of ability to accommodate occurs because the lens hardens with age, and the ciliary muscles become weaker. These changes make it more difficult for the
lens to change its shape for vision at close range.
Though this gradual decrease in accommodative ability poses little problem for most people before the age of 45,
at around that age the ability to accommodate begins to decrease rapidly, and the near point moves beyond a comfortable reading distance. There are two solutions to this problem. One is to hold reading material farther away. If you’ve
ever seen someone holding a book or newspaper at arm’s
length, the person is employing this solution. The other solution is to wear glasses that add to the eye’s focusing power,
so it can bring light to a focus on the retina.
Of course, many people who are far younger than 45
need to wear glasses to see clearly. Most of these people
have myopia, or nearsightedness, an inability to see distant objects clearly. The reason for this difficulty, which
affects more than 70 million Americans, is illustrated in
Figure 3.5a: The myopic eye brings parallel rays of light
into focus at a point in front of the retina so that the image
reaching the retina is blurred. This problem can be caused
by either of two factors: (1) refractive myopia, in which the
cornea and/or the lens bends the light too much, or (2) axial
myopia, in which the eyeball is too long. Either way, images
of faraway objects are not focused sharply, so objects look
blurred.
How can we deal with this problem? One way to create
a focused image on the retina is to move the stimulus closer.
This pushes the focus point farther back (see Figure 3.3b),
and if we move the stimulus close enough, we can push
the focus point onto the retina (Figure 3.5b). The distance
at which the spot of light becomes focused on the retina
is called the far point; when the spot of light is at the far
point, a myope can see it clearly.
Although a person with myopia can see nearby objects
clearly (which is why a myopic person is called nearsighted),
objects beyond the far point are still out of focus. The solution to this problem is well known to anyone with myopia: corrective eyeglasses or contact lenses. These corrective
lenses bend incoming light so that it is focused as if it were
at the far point, as illustrated in Figure 3.5c. Notice that the
lens placed in front of the eye causes the light to enter the
eye at exactly the same angle as light coming from the far
point in Figure 3.5b.
Although glasses or contact lenses are the major route
to clear vision for people with myopia, surgical procedures
in which lasers are used to change the shape of the cornea have been introduced that enable people to experience
good vision without corrective lenses. More than 1 million
Americans a year have laser-assisted in situ keratomileusis
(LASIK) surgery. LASIK involves sculpting the cornea with
a type of laser called an excimer laser, which does not heat
tissue. A small flap, less than the thickness of a human hair,
is cut into the surface of the cornea. The flap is folded out of
the way, the cornea is sculpted by the laser so that it focuses
light onto the retina, and the flap is then folded back into
place. The result, if the procedure is successful, is good vision without the need for glasses.
A person with hyperopia, or farsightedness, can see
distant objects clearly but has trouble seeing nearby objects.
[Figure 3.4: distance of the near point (cm) plotted against age in years, with the comfortable reading distance marked.]
Figure 3.4 ❚ Vertical lines show how the distance of the near point (green numbers) increases with
increasing age. When the near point becomes farther than a comfortable reading distance, corrective
lenses (reading glasses) become necessary.
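One way to put numbers on presbyopia is to convert the near-point distances quoted in the text (10 cm at age 20, 14 cm at 30, 22 cm at 40, and 100 cm at 60) into diopters, the reciprocal of distance in meters. The short Python sketch below is an illustration, not part of the original discussion. The function name is hypothetical, and it assumes an eye whose relaxed focus is at optical infinity, so that the reciprocal of the near point approximates the most focusing power the lens can add.

```python
# Near-point distances (cm) by age, as quoted in the text.
NEAR_POINT_CM = {20: 10, 30: 14, 40: 22, 60: 100}


def accommodation_diopters(near_point_cm: float) -> float:
    """Approximate maximum accommodation in diopters (1 / distance in meters).

    Assumption: the relaxed eye is focused at optical infinity.
    """
    return 100.0 / near_point_cm


# Print the near point and the corresponding accommodation for each age.
for age, near_cm in sorted(NEAR_POINT_CM.items()):
    power = accommodation_diopters(near_cm)
    print(f"age {age}: near point {near_cm} cm -> about {power:.1f} diopters")
```

Under this assumption, the drop in available focusing power between ages 20 and 60 is what moves the near point beyond a comfortable reading distance.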
[Figure 3.5: three panels. (a) Focus in front of retina. (b) Far point, with angle A marked. (c) Corrective lens, with angle A marked.]
Figure 3.5 ❚ Focusing of light by the myopic (nearsighted)
eye. (a) Parallel rays from a distant spot of light are brought
to a focus in front of the retina, so distant objects appear
blurred. (b) As the spot of light is moved closer to the eye,
the focus point is pushed back until, at the far point, the
rays are focused on the retina, and vision becomes clear.
(c) A corrective lens, which bends light so that it enters the
eye at the same angle as light coming from the far point,
brings light to a focus on the retina. Angle A is the same in
(b) and (c).
In the hyperopic eye, the focus point for parallel rays of light
is located behind the retina, usually because the eyeball is
too short. By accommodating to bring the focus point back
to the retina, people with hyperopia are able to see distant
objects clearly.
Nearby objects, however, are more difficult for a person with hyperopia to deal with because a great deal of accommodation is required to return the focus point to the
retina. The constant need to accommodate when looking at
nearby objects (as in reading or doing close-up work) results
in eyestrain and, in older people, headaches. Headaches do
not usually occur in young people because they can accommodate easily, but older people, who have more difficulty
accommodating because of presbyopia, are more likely to
experience headaches and may therefore require a corrective
lens that brings the focus point forward onto the retina.
Focusing the image clearly onto the retina is the initial
step in the process of vision. But it is important to realize that although a sharp image on the retina is essential
for clear vision, we do not see the image on the retina. Vision occurs not in the retina, but in the brain, and before
the brain can create vision, the light on the retina must be
transformed into electricity.
Transforming Light Into Electricity
The Visual Receptors and Transduction
The transformation of light into electricity is the process of
transduction we introduced in Chapter 1 (p. 7).
Transduction is carried out by receptors, neurons specialized
for receiving environmental energy and transforming this
energy into electricity (see page 7). The receptors for vision
are the rods and the cones. As we will see shortly, the rods and
cones have different properties that affect our perception.
However, they both function similarly during transduction,
so to describe transduction we will focus on the rod receptor shown in Figure 3.6.
The key part of the rod for transduction is the outer
segment, because it is here that the light acts to create
electricity. Rod outer segments contain stacks of discs
(Figure 3.6a). Each disc contains thousands of visual pigment molecules, one of which is highlighted in Figure 3.6b.
Zooming in on an individual molecule, we can see that the
molecule is a long strand of protein called opsin, which
loops back and forth across the disc membrane seven times
(Figure 3.6c). Our main concern is one particular place
where a molecule called retinal is attached. Each visual pigment molecule contains only one of these tiny retinal molecules. The retinal is crucial for transduction, because it is
the part of the visual pigment that is sensitive to light.
Transduction is triggered when the light-sensitive retinal absorbs one photon of light. (Remember that a photon is
the smallest possible packet of light energy.) Figure 3.7 shows
what happens. Before light is absorbed, the retinal is next to
the opsin (Figure 3.7a). (Only a small part of the opsin, where
the retinal is attached, is shown here.) When a photon of light
hits the retinal, the retinal changes shape so that it sticks out from
the opsin. This change in shape is called isomerization, and
it is this step that triggers the transformation of the light entering the eye into electricity in the receptors.
[Figure 3.6: (a) rod receptor, labeled outer segment, inner segment, synaptic pedicle, and discs; (b) one disc, labeled outer segment wall, disc membrane, disc interior, and visual pigment molecule; (c) opsin protein strand crossing the disc membrane, with the place where the light-sensitive retinal is attached.]
Figure 3.6 ❚ (a) Rod
receptor showing discs
in the outer segment.
(b) Close-up of one disc
showing one visual pigment
molecule in the membrane.
(c) Close-up showing how
the protein opsin in one
visual pigment molecule
crosses the disc membrane
seven times. The light-sensitive retinal molecule is
attached to the opsin at the
place indicated.
[Figure 3.7: two models labeled retinal and opsin; left, molecule in dark; right, isomerized by light. Photo credit: Bruce Goldstein.]
Figure 3.7 ❚ Model of a visual pigment molecule. The horizontal part of the model shows a tiny portion of
the huge opsin molecule near where the retinal is attached. The smaller molecule on top of the opsin is the
light-sensitive retinal. The model on the left shows the retinal molecule’s shape before it absorbs light. The
model on the right shows the retinal molecule’s shape after it absorbs light. This change in shape is one of
the steps that results in the generation of an electrical response in the receptor.
How Does Transduction Occur?
Saying that isomerization of the visual pigment results in
transduction is just the first step in explaining how light is
transformed into electricity. Because isomerization of the
visual pigment molecule is a chemical process, one way to
approach the problem of transduction would be to study
the chemistry of visual pigments in a chemistry or physiology laboratory or to study physiological relationships PH1
and PH2 in Figure 3.8, which is our diagram of the perceptual process from Chapter 1 (Figure 1.8). But there is also
another way to approach this problem. We can learn something about the physiological process of transduction by
doing psychophysical experiments, in which we measure relationship PP to provide information about the underlying
physiology.
How can measuring a psychophysical relationship tell
us about physiology? We can appreciate how this is possible by considering what happens when a doctor listens to
a person’s heartbeat during a physical exam. As the doctor
listens, he is using his perception of the heartbeat to draw
conclusions about the physiological condition of the heart.
For example, a clicking sound in the heartbeat can indicate
that one or more of the heart’s valves may not be operating
properly.
Just as a doctor can draw conclusions about the physiology of the heart by listening to the sounds the heart
is making, the psychologist Selig Hecht (Hecht, Shlaer, &
Pirenne, 1942) was able to draw conclusions about the physiology of transduction by determining a person’s ability to
see dim flashes of light.
[Figure 3.8: diagram labeled Stimuli, Physiological processes, and Experience and action, with relationships PH1, PH2, and PP; Hecht’s experiment is marked as psychophysical and Hecht’s conclusions as physiological.]
Figure 3.8 ❚ The three main components of the perceptual
process (see Figures 1.1 and 1.10). Hecht was able to draw
physiological (PH) conclusions based on the measurement of
a psychophysical (PP) relationship.
[Figure 3.9: an eye, labeled 100 photons to eye; 50 photons reflected or absorbed by eye structures; vitreous humor; 50 photons reach retina; 7 photons absorbed by visual pigment.]
Figure 3.9 ❚ The observer in Hecht et al.’s (1942)
experiment could see a spot of light containing 100 photons.
Of these, 50 photons reached the retina, and 7 photons were
absorbed by visual pigment molecules.
Hecht’s Psychophysical Experiment
The starting point for Hecht’s experiment was his knowledge
that transduction is triggered by the isomerization of visual
pigment molecules and that it takes just one photon of light
to isomerize a visual pigment molecule. With these facts in
hand, Hecht did a psychophysical experiment that enabled
him to determine how many visual pigment molecules need
to be isomerized for a person to see. He accomplished this
by using the method of constant stimuli (see page 14) to determine a person’s absolute threshold for seeing a brief flash
of light. What was special about this experiment is that
Hecht used a precisely calibrated light source, so he could
determine the threshold in terms of the number of photons
needed to see.
Hecht found that a person could detect a flash of light
that contained 100 photons. To determine how many visual
pigment molecules were isomerized by this flash, he considered what happened to those 100 photons before they
reached the visual pigment. The first thing that happens is
that about half the photons bounce off the cornea or are
absorbed by the lens and by the vitreous humor, a jellylike
substance that fills the inside of the eye (Figure 3.9). Thus,
only 50 of the original 100 photons actually reach the retina at the back of the eye. But of these 50, only about 7 are
absorbed by the light-sensitive retinal part of the visual pigment. The rest hit the larger opsin (which is not sensitive to
light) or may slip between the visual receptors. This means
that a person sees a flash of light when only 7 visual pigment molecules are isomerized (also see Sackett, 1972, who
obtained a similar result).
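The photon bookkeeping above can be restated as simple arithmetic. The Python sketch below is only an illustration using the three counts reported in the text (100 photons in the flash, 50 reaching the retina, 7 absorbed by visual pigment). The constant and function names are hypothetical, not taken from Hecht’s paper.

```python
# Photon counts at threshold, as reported in the text for Hecht's experiment.
PHOTONS_TO_EYE = 100     # photons in the just-detectable flash
PHOTONS_AT_RETINA = 50   # about half are lost before reaching the retina
PHOTONS_ABSORBED = 7     # absorbed by visual pigment


def survival_fraction(remaining: int, starting: int) -> float:
    """Fraction of photons that survive one stage of the path."""
    return remaining / starting


# Stage-by-stage losses, then the overall fraction that is absorbed.
reach_retina = survival_fraction(PHOTONS_AT_RETINA, PHOTONS_TO_EYE)
absorbed_at_retina = survival_fraction(PHOTONS_ABSORBED, PHOTONS_AT_RETINA)
overall = reach_retina * absorbed_at_retina

print(f"fraction reaching the retina: {reach_retina:.2f}")
print(f"fraction of those absorbed by pigment: {absorbed_at_retina:.2f}")
print(f"overall fraction absorbed: {overall:.2f}")
```

Multiplying the two stages gives the overall yield, which matches dividing the 7 absorbed photons directly by the 100 that entered the eye.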
But Hecht wasn’t satisfied just to show that a person
sees a light when 7 visual pigment molecules are activated.
[Figure 3.10: a light flash delivering 7 photons, each of which will isomerize a single visual pigment molecule, above 500 receptors.]
Figure 3.10 ❚ How Hecht reasoned about what happened
at threshold, when observers were able to see a flash of light
when 7 photons were absorbed by visual pigment molecules.
The 7 photons that were absorbed are shown poised above
500 rod receptors. Hecht reasoned that because there were
only 7 photons but 500 receptors, it is likely that each photon
entered a separate receptor. Thus, only one visual pigment
molecule was isomerized per rod. Because the observer
perceived the light, each of 7 rods must have been activated.
He also wanted to determine how many visual pigment molecules must be isomerized to activate a single rod receptor.
We can understand how he determined this by looking at
Figure 3.10, which shows that the light flash Hecht’s observers saw covered about 500 receptors. Because Hecht had determined that the observers saw the light when only 7 visual
pigment molecules were isomerized, the figure shows the
7 photons that cause this isomerization approaching the
500 receptors.
With this picture of 7 photons approaching 500 receptors in mind, Hecht asked the following question: What is
the likelihood that any two of these photons would enter the
same receptor? The answer to this question is “very small.” It
is therefore unlikely that 2 of the 7 visual pigment molecules
that each absorbed a photon in Hecht’s experiment would be in
the same receptor. Hecht concluded that only 1 visual pigment molecule per receptor was isomerized when his observers reported seeing the light; therefore, a rod receptor can be
activated by the isomerization of only 1 visual pigment molecule. Hecht’s conclusions can be summarized as follows:
1. A person can see a light if 7 rod receptors are activated
simultaneously.
2. A rod receptor can be activated by the isomerization
of just 1 visual pigment molecule.
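The claim that two photons are unlikely to share a receptor can be checked with a short calculation. The Python sketch below is an illustration under a simplifying assumption that is not stated in the text: each of the 7 absorbed photons lands independently in one of the 500 receptors, with every receptor equally likely. The function name is hypothetical, and this is not presented as Hecht’s own calculation.

```python
from math import prod


def prob_all_separate(photons: int, receptors: int) -> float:
    """Probability that every photon lands in a different receptor.

    Assumption: each photon independently hits a uniformly random receptor.
    """
    if photons > receptors:
        return 0.0  # more photons than receptors forces at least one overlap
    # The k-th photon must avoid the k receptors that are already occupied.
    return prod((receptors - k) / receptors for k in range(photons))


p_separate = prob_all_separate(7, 500)
p_shared = 1.0 - p_separate

print(f"all 7 photons in separate receptors: {p_separate:.3f}")
print(f"at least two photons share a receptor: {p_shared:.3f}")
```

Under these assumptions the chance that any two photons share a receptor comes out at roughly 4 percent, which is in line with the “very small” likelihood described above.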
The beauty of Hecht’s experiment is that he used the
psychophysical approach, measuring relationship PP in
Figure 3.8, to draw conclusions about the physiological
operation of the visual system. You will see, as you read
this book, that this technique of discovering physiological
mechanisms from psychophysical results has been used to
study the physiological mechanisms responsible for perceptions ranging from color and motion in vision to the pitch
of sounds for hearing to the ability to perceive textures with
the sense of touch.
The Physiology of Transduction
Hecht’s demonstration that it takes only one photon to activate a rod
receptor posed a challenge for physiological researchers,
because they needed to explain how isomerization of just
one of the millions of visual pigment molecules in a rod can
activate the receptor. Hecht carried out his experiment in
the 1940s when physiological and chemical tools were not
available to solve this problem, so it wasn’t until 30 years
later that researchers in physiology and chemistry laboratories were able to discover the mechanism that explained
Hecht’s result.
Physiological and chemical research determined that
isomerization of a single visual pigment molecule triggers
thousands of chemical reactions, which in turn trigger
thousands more (Figure 3.11). A biological chemical that in
small amounts facilitates chemical reactions in this way is
called an enzyme; therefore, the sequence of reactions triggered by the activated visual pigment molecule is called the
enzyme cascade. Just as lighting one match to a fuse can
trigger a fireworks display consisting of thousands of points
of light, isomerizing one visual pigment molecule can cause
a chemical effect that is large enough to activate the entire
rod receptor. For more specific details as to how this is accomplished, see “If You Want to Know More #3” at the end
of this chapter.
Pigments and Perception
Vision can occur only if the rod and cone visual pigments
transform the light entering the eye into electricity. We will
now see, however, that these pigments not only determine
whether or not we see, but also shape specific aspects of
our perceptions. We will show how the properties of visual
pigments help determine how sensitive we are to light, by
comparing perception determined by the rod receptors to
perception determined by the cone receptors. To accomplish this, we need to consider how the rods and cones are
distributed in the retina.
[Figure 3.11: the enzyme cascade spreading from a single visual pigment molecule.]
Figure 3.11 ❚ This sequence symbolizes the enzyme
cascade that occurs when a single visual pigment molecule
is activated by absorption of a quantum of light. In the actual
sequence of events, each visual pigment molecule activates
hundreds more molecules, which, in turn, each activate about
a thousand more molecules. Thus, isomerization of one visual
pigment molecule activates about a million other molecules.
Distribution of the Rods and Cones
From the cross section of the retina in Figure 3.2b you can see
that the rods and cones are interspersed in the retina. In the
part of the retina shown in this picture, there are more rods
than cones. The ratio of rods and cones depends, however,
on location in the retina. Figure 3.12, which shows how the
rods and cones are distributed in the retina, indicates that
1. There is one small area, the fovea, that contains only
cones. When we look directly at an object, its image
falls on the fovea.
2. The peripheral retina, which includes all of the retina
outside of the fovea, contains both rods and cones.
It is important to note that although the fovea is the
place where there are only cones, there are many cones
in the peripheral retina. The fovea is so small (about
the size of this “o”) that it contains only about 1 percent, or 50,000, of the 6 million cones in the retina
(Tyler, 1997a, 1997b).
3. There are many more rods than cones in the peripheral retina because most of the retina’s receptors are
located there and because there are about 120 million
rods and 6 million cones.
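The receptor counts in this list can be combined into a few ratios. The Python sketch below simply restates the figures quoted above (about 120 million rods, 6 million cones, and 50,000 cones in the fovea). The variable names are hypothetical, and the totals are the rounded values given in the text, so the results are approximate.

```python
# Receptor counts as quoted in the text (rounded values).
TOTAL_RODS = 120_000_000
TOTAL_CONES = 6_000_000
FOVEAL_CONES = 50_000

# Share of all cones that sit in the fovea (the text rounds this to 1 percent).
foveal_cone_share = FOVEAL_CONES / TOTAL_CONES

# Overall rod-to-cone ratio across the whole retina.
rods_per_cone = TOTAL_RODS / TOTAL_CONES

# The fovea has no rods, so every rod and every non-foveal cone is peripheral.
peripheral_cones = TOTAL_CONES - FOVEAL_CONES
peripheral_rods_per_cone = TOTAL_RODS / peripheral_cones

print(f"foveal share of cones: {foveal_cone_share:.2%}")
print(f"rods per cone, whole retina: {rods_per_cone:.0f}")
print(f"rods per cone, peripheral retina: {peripheral_rods_per_cone:.1f}")
```

Because so few cones are in the fovea, the peripheral rod-to-cone ratio is nearly the same as the ratio for the retina as a whole.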
[Figure 3.12: left, an eye with retinal locations marked in degrees relative to the fovea (0° to 80°), showing the fovea, the blind spot (no receptors), and the optic nerve; right, number of receptors per square millimeter (0 to 180,000) plotted against angle in degrees, with separate curves for rods and cones.]
Figure 3.12 ❚ The distribution of rods and cones in the retina. The eye on the left indicates locations in
degrees relative to the fovea. These locations are repeated along the bottom of the chart on the right. The
vertical brown bar near 20 degrees indicates the place on the retina where there are no receptors because this
is where the ganglion cells leave the eye to form the optic nerve. (Adapted from Lindsay & Norman, 1977.)
One way to appreciate the fact that the rods and cones
are distributed differently in the retina is by considering
what happens when functioning receptors are missing from
one area of the retina. A condition called macular degeneration, which is most common in older people, destroys the
cone-rich fovea and a small area that surrounds it. This creates a “blind spot” in central vision, so when a person looks
at something he or she loses sight of it (Figure 3.13a).
Another condition, called retinitis pigmentosa, is a degeneration of the retina that is passed from one generation
to the next (although not always affecting everyone in a family). This condition first attacks the peripheral rod receptors and results in poor vision in the peripheral visual field
(Figure 3.13b). Eventually, in severe cases, the foveal cone
receptors are also attacked, resulting in complete blindness.
[Figure 3.13: two photographs, (a) and (b). Photo credit: Bruce Goldstein.]
Figure 3.13 ❚ (a) In a condition called macular degeneration, the fovea and surrounding area degenerate, so the person
cannot see whatever he or she is looking at. (b) In retinitis pigmentosa, the peripheral retina initially degenerates and
causes loss of vision in the periphery. The resulting condition is sometimes called “tunnel vision.”
Before leaving the rod–cone distribution shown in
Figure 3.12, note that there is one area in the retina, indicated by the vertical brown bar on the graph, where there
are no receptors. Figure 3.14 shows a close-up of the place
where this occurs, which is where the optic nerve leaves the
eye. Because of the absence of receptors, this place is called
the blind spot. Although you are not normally aware of the
blind spot, you can become aware of it by doing the following demonstration.
[Figure 3.14: close-up of the retina, labeled receptors, ganglion cell fibers, blind spot, and optic nerve.]
Figure 3.14 ❚ There are no receptors at the place where
the optic nerve leaves the eye. This enables the receptor’s
ganglion cell fibers to flow into the optic nerve. The absence
of receptors in this area creates the blind spot.
DEMONSTRATION
Becoming Aware of the Blind Spot
Place the book on your desk. Close your right eye, and position yourself above the book so that the cross in Figure 3.15 is
aligned with your left eye. Be sure the page is flat and, while
looking at the cross, slowly move closer. As you move closer,
be sure not to move your eye from the cross, but at the same
time keep noticing the circle off to the side. At some point,
around 3 to 9 inches from the book, the circle should disappear. When this happens the image of the circle is falling on
your blind spot. ❚
Figure 3.15 ❚
Why aren’t we usually aware of the blind spot? One reason is that the blind spot is located off to the side of our
visual field, where objects are not in sharp focus. Because of
this and because we don’t know exactly where to look for it
(as opposed to the demonstration, in which we are focusing
our attention on the circle), the blind spot is hard to detect.
But the most important reason that we don’t see
the blind spot is that some mechanism in the brain “fills
in” the place where the image disappears (Churchland &
Ramachandran, 1996). The next demonstration illustrates
an important property of this filling-in process.
DEMONSTRATION
Filling in the Blind Spot
Close your right eye and, with the cross in Figure 3.16 lined
up with your left eye, move the “wheel” toward you. When
the center of the wheel falls on your blind spot, notice how
the spokes of the wheel fill in the hole (Ramachandran,
1992). VL 3 ❚
Figure 3.16 ❚ View the pattern as described in the text, and
observe what happens when the center of the wheel falls on
your blind spot. (From Ramachandran, 1992.)
These demonstrations show that the brain does not fill in
the area served by the blind spot with “nothing”; rather,
it creates a perception that matches the surrounding pattern: the white page in the first demonstration, and the
spokes of the wheel in the second one.
Dark Adaptation of the Rods and Cones
A recent episode of the Mythbusters program on the Discovery Channel (2007) was devoted to investigating myths
about pirates (Figure 3.17). One of the myths explored was
that pirates wore eye patches to preserve night vision in one
eye, so when they went from the bright light outside to the
darkness belowdecks they could see with their previously
patched eye. To determine whether this works, the mythbusters carried out some tasks in a dark room just after
both of their eyes had been in the light and did some different tasks with an eye that had just previously been covered
with a patch for 30 minutes. It isn’t surprising that they
completed the tasks much more rapidly when using the eye
that had been patched. Anyone who has taken sensation
and perception could have told the mythbusters that the eye
patch would work because keeping an eye in the dark triggers a process called dark adaptation, which causes the eye
to increase its sensitivity in the dark. (Whether pirates actually used patches to dark adapt their eyes to help them see
when going belowdecks remains a plausible, but unproven,
hypothesis.) We are going to describe dark adaptation and
show how it can be used to illustrate a difference between
the rods and cones.
Figure 3.17 ❚ Why did pirates wear eye patches? Did they
all have exactly the same eye injury? Were they trying to look
scary? Or were they dark adapting the patched eye?
[Figure 3.18: an eye viewing a fixation point and a test light, labeled fovea and peripheral retina.]
Figure 3.18 ❚ Viewing conditions for a dark adaptation
experiment. The image of the fixation point falls on the fovea,
and the image of the test light falls in the peripheral retina.
You may have noticed that when the lights are turned
off it is difficult to see at first, but that eventually you begin
seeing things that were previously not visible. However, as
you experience your eye’s increasing sensitivity in the dark,
it is probably not obvious that your eyes increase their sensitivity in two distinct stages: an initial rapid stage and a later,
slower stage. These two stages are revealed by measurement
of the dark adaptation curve—a plot of how visual sensitivity changes in the dark, beginning with when the lights are
extinguished.
We will now describe three ways of measuring the dark
adaptation curve, to show that the initial rapid stage is due
to adaptation of the cone receptors and the second, slower
stage is due to adaptation of the rod receptors. We will first
describe how to measure a two-stage dark adaptation curve
that is caused by both the rods and the cones. We will then
measure the dark adaptation of the cones alone and of the
rods alone and show how the different adaptation rates of
the rods and the cones can be explained by differences in
their visual pigments.
In all of our dark adaptation experiments, we ask our
observer to adjust the intensity of a small, flashing test light
so that he or she can just barely see it. This is similar to the
psychophysical method of adjustment that we described in
Chapter 1 (see page 14). In the first experiment, our observer
looks at a small fixation point while paying attention to a
flashing test light that is off to the side (Figure 3.18). Because the observer is looking directly at the fixation point,
its image falls on the fovea, and the image of the test light
falls in the periphery. Thus, the test light stimulates both
rods and cones. The dark adaptation curve is measured as
follows.
METHOD ❚ Measuring Dark Adaptation
The first step in measuring dark adaptation is to light
adapt the observer by exposure to light. While the adapting light is on, the observer indicates his or her sensitivity by adjusting the intensity of a test light so it can just
barely be seen. This is called the light-adapted sensitivity,
because it is measured while the eyes are adapted to the
light. Once the light-adapted sensitivity is determined,
the adapting light is extinguished, so the observer is in
the dark. The course of dark adaptation is usually measured by having the observer turn a knob to adjust the
intensity of the test light so it can just barely be seen.
Because the observer is becoming more sensitive to the
light, he or she must continually decrease the light’s intensity to keep it just barely visible. The result, shown as
the red curve in Figure 3.19, is a dark adaptation curve.
The dark adaptation curve shows that as dark adaptation proceeds, the observer becomes more sensitive to the
light. Note that higher sensitivity is at the bottom of this
graph, so as the dark adaptation curve moves downward,
the observer’s sensitivity is increasing.
The dark adaptation curve indicates that the observer’s sensitivity increases in two phases. It increases rapidly
for the first 3 to 4 minutes after the light is extinguished
and then levels off; it begins increasing again at about
7 to 10 minutes and continues to do so until about 20 or
30 minutes after the light was extinguished (red curve in
Figure 3.19). The sensitivity at the end of dark adaptation,
labeled dark-adapted sensitivity, is about 100,000 times
greater than the light-adapted sensitivity measured before
dark adaptation began.
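Because Figure 3.19 plots the logarithm of sensitivity, the 100,000-fold increase can be restated in log units. The short conversion below assumes base-10 logarithms, which the text does not state explicitly; the symbols S_dark and S_light are introduced here only for illustration.

```latex
\log_{10}\!\left(\frac{S_{\text{dark}}}{S_{\text{light}}}\right)
  = \log_{10}(100{,}000)
  = \log_{10}\!\left(10^{5}\right)
  = 5
```

Under that assumption, the full course of dark adaptation spans about 5 log units on the vertical axis.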
Measuring Cone Adaptation To measure dark
adaptation of the cones alone, we have to ensure that the
image of the test light stimulates only cones. We achieve
this by having the observer look directly at the test light so
its image will fall on the all-cone fovea, and by making the
test light small enough so that its entire image falls within
the fovea. The dark adaptation curve determined by this
procedure is indicated by the green line in Figure 3.19. This
curve, which reflects only the activity of the cones, matches
the initial phase of our original dark adaptation curve but
does not include the second phase. Does this mean that the
second part of the curve is due to the rods? We can show
that the answer to this question is “yes” by doing another
experiment.
Measuring Rod Adaptation We know that the
green curve of Figure 3.19 is due only to cone adaptation
because our test light was focused on the all-cone fovea. Because the cones are more sensitive to light at the beginning
of dark adaptation, they control our vision during the early
stages of dark adaptation, so we don’t see what is happening to the rods. In order to reveal how the sensitivity of the
rods is changing at the very beginning of dark adaptation,
we need to measure dark adaptation in a person who has no
cones. Such people, who have no cones due to a rare genetic
defect, are called rod monochromats. Their all-rod retinas
provide a way for us to study rod dark adaptation without
interference from the cones. (Students sometimes wonder
why we can’t simply present the test flash to the peripheral
retina, which contains mostly rods. The answer is that there
are a few cones in the periphery, which influence the beginning of the dark adaptation curve.)
Because the rod monochromat has no cones, the light-adapted sensitivity we measure just before we turn off the
lights is determined by the rods. The sensitivity we determine, which is labeled “rod light-adapted sensitivity” in
Figure 3.19, is much lower than the light-adapted sensitivity
we measured in the original experiment. Once dark adaptation begins, the rods increase their sensitivity and reach
their final dark-adapted level in about 25 minutes (purple
curve in Figure 3.19) (Rushton, 1961).
Figure 3.19 ❚ Three dark adaptation curves. The red line is the
two-stage dark adaptation curve, with an initial cone branch and a
later rod branch. The green line is the cone adaptation curve. The
purple curve is the rod adaptation curve. Note that the downward
movement of these curves represents an increase in sensitivity.
The curves actually begin at the points indicating “light-adapted
sensitivity,” but there is a slight delay between the time the lights are
turned off and when measurement of the curves begins. (Partial data from
“Rhodopsin Measurement and Dark Adaptation in a Subject Deficient in
Cone Vision,” by W. A. H. Rushton, 1961, Journal of Physiology, 156,
193–205. Copyright © 1961 by Wiley–Blackwell. All rights reserved.
Reproduced by permission.)
[Graph labels: vertical axis, logarithm of sensitivity, from low (top) to high (bottom); horizontal axis, time in dark (min). Curves: pure cone curve, pure rod curve, both rods and cones. Marked points: cone light-adapted sensitivity, rod light-adapted sensitivity, rod–cone break, maximum cone sensitivity (C), maximum rod sensitivity (R), dark-adapted sensitivity.]
Based on the results of our three dark adaptation experiments, we can summarize the process of dark adaptation
in a normal observer as follows: As soon as the light is extinguished, the sensitivity of both the cones and the rods begins
increasing. However, because our vision is controlled by the
receptor system that is most sensitive, the cones, which are
more sensitive at the beginning of dark adaptation, determine the early part of the dark adaptation curve.
But what is happening to the sensitivity of the rods during this early part of dark adaptation? The rods are increasing their sensitivity in the dark during the cone part of the
dark adaptation curve. After about 3 to 5 minutes, the cones
are finished adapting, so their curve levels off. Meanwhile,
the rods’ sensitivity continues to increase, until by about
7 minutes of dark adaptation the rods have caught up to
the cones and then become more sensitive than the cones.
Once the rods become more sensitive, they begin controlling the person’s vision, and the course of rod dark adaptation becomes visible. The place where the rods begin to
determine the dark adaptation curve is called the rod–cone
break.
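The summary above can be sketched as a toy model in which each receptor system's threshold falls in the dark and the observer's threshold is set by whichever system is more sensitive at that moment. This is only an illustration: the exponential form, the function names, and every parameter value are assumptions chosen so that the curve has the two-stage shape described in the text. They are not fitted to Rushton's data.

```python
import math

# Hypothetical parameters (log threshold units and minutes), chosen only so
# the curve resembles the description in the text; not measured values.
CONE_START, CONE_FLOOR, CONE_TAU = 5.0, 3.0, 1.0
ROD_START, ROD_FLOOR, ROD_TAU = 7.0, 0.0, 8.0


def log_threshold(t_min, start, floor, tau):
    """Log threshold after t_min minutes in the dark, assuming an
    exponential approach from `start` to `floor` with time constant `tau`."""
    return floor + (start - floor) * math.exp(-t_min / tau)


def cone_log_threshold(t_min):
    return log_threshold(t_min, CONE_START, CONE_FLOOR, CONE_TAU)


def rod_log_threshold(t_min):
    return log_threshold(t_min, ROD_START, ROD_FLOOR, ROD_TAU)


def observed_log_threshold(t_min):
    """Vision is controlled by the more sensitive system, which is the one
    with the lower threshold (sensitivity = 1 / threshold)."""
    return min(cone_log_threshold(t_min), rod_log_threshold(t_min))


def rod_cone_break(step=0.1, t_max=30.0):
    """First sampled time at which the rods are more sensitive than the cones."""
    n_steps = int(round(t_max / step))
    for i in range(n_steps + 1):
        t = i * step
        if rod_log_threshold(t) < cone_log_threshold(t):
            return t
    return None


if __name__ == "__main__":
    for t in (0, 2, 5, 10, 20, 30):
        print(f"{t:2d} min: log threshold = {observed_log_threshold(t):.2f}")
    print(f"rod-cone break near {rod_cone_break():.1f} min")
```

With these assumed values the cone branch levels off within the first few minutes and the rods take over at roughly 7 minutes, which is the qualitative pattern of the red curve in Figure 3.19.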
Why do the rods take about 20 to 30 minutes to reach
their maximum sensitivity (point R on the curve), compared
to only 3 to 4 minutes for the cones (point C)? The answer
to this question involves a process called visual pigment
regeneration, which occurs more rapidly in the cones than in
the rods.
Visual Pigment Regeneration When light hits
the light-sensitive retinal part of the visual pigment molecule, it is isomerized and triggers the transduction process (Figure 3.7). It then separates from the opsin part of
the molecule. This separation causes the retina to become
lighter in color, a process called visual pigment bleaching.
This bleaching is shown in Figure 3.20, which shows a picture of a frog retina that was taken moments after it was illuminated with light (Figure 3.20a). The red color is the visual pigment. As the light remains on, more and more of the
pigment’s retinal is isomerized and breaks away from the
opsin, so the retina’s color changes (Figures 3.20b and c).
Does this mean that all of our pigment eventually becomes bleached if we stay in the light? This would be a bad
situation because we need intact visual pigment molecules
to see. Luckily, even in the light, as some molecules are absorbing light, isomerizing, and splitting apart, molecules
that have been split apart are undergoing a process called
visual pigment regeneration in which the retinal and opsin
become rejoined.
As you look at the page of this book, some of your visual
pigment molecules are isomerizing and bleaching, as shown
in Figure 3.20, and others are regenerating. This means that
under most normal light levels your eye always contains
some bleached visual pigment and some intact visual pigment. If you were to turn out the lights, then bleached visual pigment would continue to regenerate, but there would
be no more isomerization, so eventually your retina would
contain only intact (unbleached) visual pigment molecules.
As retinal combines with opsin in the dark, the pigment
regains its darker red color. William Rushton (1961) devised
a procedure to measure the regeneration of visual pigment
in humans by measuring this darkening of the visual pigment that occurs during dark adaptation. Rushton’s measurements showed that cone pigment takes 6 minutes to
regenerate completely, whereas rod pigment takes more
than 30 minutes. When he compared the course of pigment
regeneration to the rate of psychophysical dark adaptation,
he found that the rate of cone dark adaptation matched the
rate of cone pigment regeneration and the rate of rod dark
adaptation matched the rate of rod pigment regeneration.
Rushton’s result demonstrated two important connections between perception and physiology:
1. Our sensitivity to light depends on the concentration
of a chemical: the visual pigment.
2. The speed at which our sensitivity is adjusted in the
dark depends on a chemical reaction: the regeneration of the visual pigment.
We can appreciate the fact that the increase in sensitivity we experience during dark adaptation is caused by visual
pigment regeneration by considering what happens when
the visual pigment can’t regenerate because of a condition
called detached retina. A major cause of detached retinas
is traumatic injuries of the eye or head, as when a baseball player is hit in the eye by a line drive. When part of the
retina becomes detached, it has become separated from
a layer that it rests on, called the pigment epithelium, which
contains enzymes that are necessary for pigment regeneration (see Figure 3.2b). The result is that once visual pigments
are bleached, so the retinal and opsin are separated, they
can no longer be recombined in the detached part of the retina, and the person becomes blind in the area of the visual
field served by this area of the retina.
Figure 3.20 ❚ A frog retina was dissected from the eye in
the dark and then exposed to light. (a) This picture was taken
just after the light was turned on. The dark red color is caused by
the high concentration of visual pigment in the receptors that
are still in the unbleached state, as indicated by the closeness
of the retinal and opsin in the diagram above the retina. Only a
small part of the opsin molecule is shown. (b, c) As the pigment
isomerizes, the retinal and opsin break apart, and the retina
becomes bleached, as indicated by the lighter color. (Bruce Goldstein)
Spectral Sensitivity of the Rods and Cones
Another way to show that perception is determined by the
properties of the visual pigments is to compare rod and
cone spectral sensitivity: an observer’s sensitivity to light
at each wavelength across the visible spectrum.
Measuring Spectral Sensitivity In our dark
adaptation experiments, we used a white test light, which
contains all wavelengths in the visible spectrum. To determine spectral sensitivity, we use flashes of monochromatic
light, light that contains only a single wavelength. We determine the threshold for seeing these monochromatic lights
for wavelengths across the visible spectrum (see Figure 3.1).
For example, we might first determine the threshold for seeing a 420-nm (nanometer) light, then a 440-nm light, and
so on, using one of the psychophysical methods for measuring threshold described in Chapter 1. The result is the curve
in Figure 3.21a, which shows that the threshold for seeing
light is lowest in the middle of the spectrum; that is, less
light is needed to see wavelengths in the middle of the spectrum than to see wavelengths at either the short- or long-wavelength ends of the spectrum.
The ability to see wavelengths across the spectrum is
often plotted not in terms of threshold versus wavelength
as in Figure 3.21a, but in terms of sensitivity versus wavelength. We can convert threshold to sensitivity with the following equation: sensitivity = 1/threshold. When we do this
for the curve in Figure 3.21a, we obtain the curve in Figure
3.21b, which is called the spectral sensitivity curve.
We measure the cone spectral sensitivity curve by having people look directly at the test light, so that it stimulates only the cones in the fovea, and presenting test flashes
of wavelengths across the spectrum. We measure the rod
spectral sensitivity curve by measuring sensitivity after the
eye is dark adapted (so the rods control vision because they
are the most sensitive receptors) and presenting test flashes
off to the side of the fixation point.
The cone and rod spectral sensitivity curves, shown in
Figure 3.22, show that the rods are more sensitive to short-wavelength light than are the cones, with the rods being
most sensitive to light of 500 nm and the cones being most
sensitive to light of 560 nm. This difference in the sensitivity of the cones and the rods to different wavelengths means
that as vision shifts from the cones to the rods during dark
adaptation, we become relatively more sensitive to short-wavelength light; that is, light nearer the blue and green
end of the spectrum.
You may have noticed an effect of this shift to short-wavelength sensitivity if you have observed how green foliage seems to stand out more near dusk. The shift from cone
vision to rod vision that causes this enhanced perception
of short wavelengths during dark adaptation is called the
Purkinje (Pur-kin'-jee) shift, after Johann Purkinje, who
described this effect in 1825. You can experience this shift
in color sensitivity that occurs during dark adaptation by
closing one of your eyes for about 5–10 minutes, so it dark
adapts, and then switching back and forth between your eyes
and noticing how the blue flower in Figure 3.23 is brighter
compared to the red flower in your dark-adapted eye.
Figure 3.21 ❚ (a) The threshold for seeing a light versus
wavelength. (b) Relative sensitivity versus wavelength (the
spectral sensitivity curve). (Adapted from Wald, 1964.)
[Graph labels: (a) threshold curve, relative threshold (low to high) versus wavelength (400 to 700 nm); (b) spectral sensitivity curve, relative sensitivity (low to high) versus wavelength (400 to 700 nm).]
Figure 3.22 ❚ Spectral sensitivity curves for rod vision
(left) and cone vision (right). The maximum sensitivities of
these two curves have been set equal to 1.0. However, the
relative sensitivities of the rods and the cones depend on the
conditions of adaptation: The cones are more sensitive in the
light, and the rods are more sensitive in the dark. The circles
plotted on top of the rod curve are the absorption spectrum
of the rod visual pigment. (From Wald, 1964; Wald &
Brown, 1958.)
[Graph labels: relative sensitivity (0 to 1.0) versus wavelength (400 to 700 nm); curves for rod vision and cone vision.]
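The conversion used to turn the threshold curve of Figure 3.21a into the spectral sensitivity curve of Figure 3.21b (sensitivity = 1/threshold) can be sketched in a few lines. The threshold values below are hypothetical numbers in arbitrary units, invented for illustration rather than taken from Wald (1964), and the function names are likewise assumptions.

```python
def threshold_to_sensitivity(thresholds):
    """Apply sensitivity = 1 / threshold to each wavelength."""
    return {wavelength: 1.0 / threshold
            for wavelength, threshold in thresholds.items()}


def relative_sensitivity(thresholds):
    """Scale sensitivities so the most sensitive wavelength equals 1.0."""
    sensitivity = threshold_to_sensitivity(thresholds)
    peak = max(sensitivity.values())
    return {wavelength: value / peak
            for wavelength, value in sensitivity.items()}


# Hypothetical thresholds (arbitrary units) keyed by wavelength in nm.
# Invented for illustration; lowest in the middle of the spectrum.
example_thresholds = {420: 20.0, 500: 4.0, 560: 2.0, 620: 5.0, 680: 40.0}

if __name__ == "__main__":
    for wavelength, value in relative_sensitivity(example_thresholds).items():
        print(f"{wavelength} nm: relative sensitivity {value:.2f}")
```

Because the threshold is lowest in the middle of the spectrum, the reciprocal is highest there, so the sensitivity curve is the threshold curve turned upside down.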
Rod and Cone Absorption Spectra The difference
between the rod and cone spectral sensitivity curves
is caused by differences in the absorption spectra of the rod
and cone visual pigments. An absorption spectrum is a
plot of the amount of light absorbed by a substance versus
the wavelength of the light. The absorption spectra of the
rod and cone pigments are shown in Figure 3.24. The rod
pigment absorbs best at 500 nm, the blue-green area of the
spectrum.
There are three absorption spectra for the cones because
there are three different cone pigments, each contained in
its own receptor. The short-wavelength pigment (S) absorbs
light best at about 419 nm; the medium-wavelength pigment (M) absorbs light best at about 531 nm; and the long-wavelength pigment (L) absorbs light best at about
558 nm. [VL 4]
The absorption of the rod visual pigment closely
matches the rod spectral sensitivity curve (Figure 3.22),
and the short-, medium-, and long-wavelength cone pigments that absorb best at 419, 531, and 558 nm, respectively,
add together to result in a psychophysical spectral sensitivity curve that peaks at 560 nm. Because there are fewer
short-wavelength receptors and therefore much less of the
short-wavelength pigment, the spectral sensitivity curve is
determined mainly by the medium- and long-wavelength
pigments (Bowmaker & Dartnall, 1980; Stiles, 1953).
Figure 3.23 ❚ Flowers for demonstrating the Purkinje shift.
See text for explanation.
Figure 3.24 ❚ Absorption spectra of the rod
pigment (R), and the short- (S), medium- (M),
and long-wavelength (L) cone pigments. (From
Dartnall, Bowmaker, & Mollon, 1983.)
[Graph labels: relative proportion of light absorbed (0 to 1.0) versus wavelength (400 to 650 nm); curves S, R, M, and L.]
It is clear from the evidence we have presented that the
rod and cone sensitivity in the dark (dark adaptation) and
sensitivity to different wavelengths (spectral sensitivity)
are determined by the properties of the rod and cone visual
pigments. But, of course, perception is not determined just
by what is happening in the receptors. Signals travel from
the receptors through a network of neurons in the retina,
and then leave the eye in the optic nerve. Next we consider how what happens in this network of neurons affects
perception.
TEST YOURSELF 3.1
1. Describe the structure of the eye and how moving an
object closer to the eye affects how light entering the
eye is focused on the retina.
2. How does the eye adjust the focusing of light by accommodation? Describe the following conditions
that can cause problems in focusing: presbyopia,
myopia, hyperopia. Be sure you understand the difference between the near point and the far point, and
can describe the various solutions to focusing problems, including corrective lenses and surgery.
3. Describe the structure of a rod receptor. What is the
structure of a visual pigment molecule, and where
are visual pigments located in the receptor? What
must happen in order for the visual pigment to be
isomerized?
4. Describe the psychophysical experiment that showed
that it takes 7 photons to see and 1 photon to excite
a rod, and the physiological mechanism that explains
how this is possible.
5. Where on the retina does a researcher need to present
a stimulus to test dark adaptation of the cones?
How can adaptation of the rods be measured without
any interference from the cones?
6. Describe how rod and cone sensitivity changes
starting when the lights are turned off and how this
change in sensitivity continues for 20–30 minutes in
the dark.
7. What happens to visual pigment molecules when
they (a) absorb light and (b) regenerate? What is the
connection between visual pigment regeneration and
dark adaptation?
8. What is spectral sensitivity? How is a cone spectral
sensitivity curve determined? A rod spectral sensitivity curve?
9. What is an absorption spectrum? How do rod and
cone pigment absorption spectra compare, and
what is their relationship to rod and cone spectral
sensitivity?
Neural Convergence
and Perception
We’ve seen how perception can be shaped by properties of
the visual pigments in the receptors. We now move past the
receptors to show how perception is also shaped by neural
circuits in the retina.
Figure 3.25a is a cross section of the retina that has been
stained to reveal the retina’s layered structure. Figure 3.25b
shows the five types of neurons that make up these layers.
Signals generated in the receptors (R) travel to the bipolar
cells (B) and then to the ganglion cells (G). The receptors
and bipolar cells do not have long axons, but the ganglion
cells have axons like the neurons in Figure 2.4. These axons
transmit signals out of the retina in the optic
nerve (see Figure 3.14). [VL 5, 6]
In addition to the receptors, bipolars, and ganglion
cells, there are two other types of neurons, the horizontal
cells and amacrine cells, which connect neurons across
the retina. Signals can travel between receptors through
the horizontal cells and between bipolar cells and between
ganglion cells through the amacrine cells. We will return to
the horizontal and amacrine cells later in the chapter. For
now we will focus on the direct pathway from the receptors
to the ganglion cells. We focus specifically on the property
of neural convergence (or just convergence for short) that
occurs when one neuron receives signals from many other
neurons. We introduced convergence in Chapter 2 (page 33).
Now let’s see how it applies to the neurons in the retina.
In Figure 3.25b the ganglion cell on the right receives signals from three receptors (indicated by light color). A great
deal of convergence occurs in the retina because there are 126
million receptors, but only 1 million ganglion cells. Thus, on
the average, each ganglion cell receives signals from 126 receptors. We can show how convergence can affect perception
by continuing our comparison of the rods and cones. An important difference between rods and cones is that the signals
from the rods converge more than do the signals from the
cones. We can appreciate this difference by noting that there
are 120 million rods in the retina, but only 6 million cones.
On the average, about 120 rods pool their signals to one
ganglion cell, but only about 6 cones send signals to a single
ganglion cell.
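The average convergence figures quoted above follow from simple division of the receptor counts by the ganglion cell count. The worked arithmetic below treats them as plain averages over all 1 million ganglion cells, which is a simplifying assumption; the symbols N are introduced here only for illustration.

```latex
\frac{N_{\text{receptors}}}{N_{\text{ganglion}}} = \frac{126 \times 10^{6}}{1 \times 10^{6}} = 126,
\qquad
\frac{N_{\text{rods}}}{N_{\text{ganglion}}} = \frac{120 \times 10^{6}}{1 \times 10^{6}} = 120,
\qquad
\frac{N_{\text{cones}}}{N_{\text{ganglion}}} = \frac{6 \times 10^{6}}{1 \times 10^{6}} = 6
```

The rod and cone averages sum to the overall figure: 120 + 6 = 126 receptors per ganglion cell.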
This difference between rod and cone convergence becomes even greater when we consider the foveal cones. (Remember that the fovea is the small area that contains only
cones.) Many of the foveal cones have “private lines” to ganglion cells, so that each ganglion cell receives signals from
only one cone, with no convergence. The greater convergence of the rods compared to the cones translates into two
differences in perception: (1) the rods result in better sensitivity than the cones, and (2) the cones result in better detail
vision than the rods.
Why Rods Result in Greater
Sensitivity Than Cones
One reason rod vision is more sensitive than cone vision is
that it takes less light to generate a response from an individual rod receptor than from an individual cone receptor
(Barlow & Mollon, 1982; Baylor, 1992). But there is another
reason as well: The rods have greater convergence than the
cones.
We can understand why the amount of convergence is
important for determining sensitivity by expanding our discussion of neurotransmitters from Chapter 2 (see page 31).
We saw that the release of excitatory transmitter at the synapse increases the chances that the receiving neuron will fire.
This means that if a neuron receives excitatory transmitter
from a number of neurons it will be more likely to fire.
Keeping this basic principle in mind, we can see how the
difference in rod and cone convergence translates into differences in the maximum sensitivities of the cones and the
rods. In the two circuits in Figure 3.26, five rod receptors
converge onto one ganglion cell and five cone receptors each
send signals onto their own ganglion cells. We have left out
the bipolar, horizontal, and amacrine cells in these circuits
for simplicity, but our conclusions will not be affected by
these omissions.
For the purposes of our discussion, we will assume
that we can present small spots of light to individual rods
and cones. We will also make the following additional
assumptions:
1. One unit of light intensity causes the release of one
unit of excitatory transmitter, which causes one unit
of excitation in the ganglion cell.
2. The threshold for ganglion cell firing is 10 units of
excitation. That is, the ganglion cell must receive
10 units of excitation to fire.
3. The ganglion cell must fire before perception of the
light can occur.
Image not available due to copyright restrictions
When we present spots of light with an intensity of 1
to each receptor, the rod ganglion cell receives 5 units of
excitation, 1 from each of the 5 rod receptors. Each of the
cone ganglion cells receives 1 unit of excitation, 1 from each
cone receptor. Thus, when intensity = 1, neither the rod
nor the cone ganglion cells fire. If, however, we increase the
intensity to 2, as shown in the figure, the rod ganglion cell
receives 2 units of excitation from each of its 5 receptors,
for a total of 10 units of excitation. This total reaches the
threshold for the rods’ ganglion cell, it fires, and we see the
light. Meanwhile, at the same intensity, the cones’ ganglion
cells are still below threshold, each receiving only 2 units
of excitation. For the cones’ ganglion cells to fire, we
must increase the intensity to 10. [VL 7]
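The three assumptions and the worked intensities above can be checked with a small simulation of the two circuits in Figure 3.26. This is a minimal sketch of the simplified circuit described in the text, not a model of real retinal wiring; the function names are hypothetical.

```python
THRESHOLD = 10  # units of excitation a ganglion cell needs in order to fire


def rod_ganglion_fires(intensity, n_rods=5):
    """All rods converge on one ganglion cell, so their excitation adds up.
    One unit of intensity produces one unit of excitation per receptor."""
    return n_rods * intensity >= THRESHOLD


def cone_ganglion_fires(intensity):
    """Each cone has its own ganglion cell, so that cell receives only the
    excitation produced by a single receptor."""
    return intensity >= THRESHOLD


def lowest_detectable_intensity(fires, max_intensity=20):
    """Smallest whole-number intensity at which the ganglion cell fires."""
    for intensity in range(1, max_intensity + 1):
        if fires(intensity):
            return intensity
    return None


if __name__ == "__main__":
    for intensity in (1, 2, 10):
        print(intensity,
              "rod cell fires:", rod_ganglion_fires(intensity),
              "| cone cells fire:", cone_ganglion_fires(intensity))
```

Under these assumptions the rod circuit first responds at an intensity of 2 and the cone circuit at an intensity of 10, matching the values worked through in the text.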
The operation of these circuits demonstrates that one
reason for the rods’ high sensitivity compared to the cones’
is the rods’ greater convergence. Many rods summate their
responses by feeding into the same ganglion cell, but only
one or a few cones send their responses to a single ganglion
cell. The fact that rod and cone sensitivity is determined
not by individual receptors but by groups of receptors converging onto other neurons means that when we describe
“rod vision” and “cone vision” we are actually referring to
the way groups of rods and cones participate in determining
our perceptions.
Figure 3.26 ❚ The wiring of the rods (left) and the cones
(right). The dot and arrow above each receptor represent
a “spot” of light that stimulates the receptor. The numbers
represent the number of response units generated by the
rods and the cones in response to a spot intensity of 2.
[Diagram labels: 2 at each receptor; +10, Response (rods); No response (cones).]
Why We Use Our Cones to See Details
While rod vision is more sensitive than cone vision because
the rods have more convergence, the cones have better visual
acuity (detail vision) because they have less convergence.
One way to appreciate the high acuity of the cones is to
think about the last time you were looking for one thing
that was hidden among many other things. This could be
searching for an eraser on the clutter of your desk or locating your friend’s face in a crowd. To find what you are looking for, you usually need to move your eyes from one place
to another. When you move your eyes to look at different
things in this way, what you are doing is scanning with your
cone-rich fovea (remember that when you look directly at
something, its image falls on the fovea). This is necessary
because your visual acuity is highest in the fovea; objects
that are imaged on the peripheral retina are not seen as
clearly.
DEMONSTRATION
Foveal Versus Peripheral Acuity
DIHCNRLAZIFWNSMQPZKDX
You can demonstrate that foveal vision is superior to
peripheral vision for seeing details by looking at the X on
the right and, without moving your eyes, seeing how many
letters you can identify to the left. If you do this without
cheating (resist the urge to look to the left!), you will find
that although you can read the letters right next to the X,
which are imaged on or near the fovea, you can read only a
few of the letters that are off to the side, which are imaged on
the peripheral retina. ❚
Visual acuity can be measured in a number of ways,
one of which is to determine how far apart two dots have to
be before a space can be seen between them. We make this
measurement by presenting a pair of closely spaced dots
and asking whether the person sees one or two dots. We can
also measure acuity by determining how large the elements
of a checkerboard or a pattern of alternating black and
white bars must be for the pattern to be detected. Perhaps
the most familiar way of measuring acuity involves the eye
chart in an optometrist’s or ophthalmologist’s office.
In the demonstration above, we showed that acuity is
better in the fovea than in the periphery. Because you were
light adapted, the comparison in this demonstration was
between the foveal cones, which are tightly packed, and the
peripheral cones, which are more widely spaced. Comparing
the foveal cones to the rods results in even greater differences in acuity. We can make this comparison by measuring
how acuity changes during dark adaptation.
The picture of the bookcase in Figure 3.27 simulates
the change in acuity that occurs during dark adaptation.
The books on the top shelf represent the details we see when
viewing the books in the light, when our cones are controlling vision. The books on the middle shelf represent how we
might perceive the details midway through the process of
dark adaptation, when the rods are beginning to determine
our vision, and the books on the bottom shelf represent the
poor detail vision of the rods. (Also note that color has disappeared. We will describe why this occurs in Chapter 9.)
The poor detail vision of the rods is why it is difficult to
read in dim illumination.
Figure 3.27 ❚ Simulation of the change from colorful
sharp perception to colorless fuzzy perception that occurs
during the shift from cone vision to rod vision during dark
adaptation. (Bruce Goldstein)
We can understand how differences in rod and cone wiring explain the cones’ greater acuity by returning to our rod
and cone neural circuits. As we stimulate the receptors in
the circuits in Figure 3.28 with two spots of light, each with
an intensity of 10, we will ask the following question: Under
what conditions can we tell, by monitoring the output of the
ganglion cells, that there are two separate spots of light? We
begin by presenting the two spots next to each other, as in
Figure 3.28a. When we do this, the rod ganglion cell fires,
and the two adjacent cone ganglion cells fire. The firing of
the single rod ganglion cell provides no hint that two separate spots were presented, and the firing of the two adjacent
cone ganglion cells could have been caused by a single large
spot. However, when we spread the two spots apart, as in
Figure 3.28b, the output of the cones signals two separate
spots, because there is a silent ganglion cell between the two
that are firing, but the output of the rods’ single ganglion
cell still provides no information that would enable us to say
that there are two spots. Thus, the rods’ convergence
decreases their ability to resolve details (Teller, 1990). VL 8
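The two-spot comparison above can be sketched in a few lines of Python. This is only an illustrative toy model, not the circuit in Figure 3.28 itself: the number of receptors (five), the firing threshold of 10, and the function names are assumptions chosen for the example.

```python
# Toy model of the two-spot test. All names and numbers are illustrative assumptions.
THRESHOLD = 10  # assumed input a ganglion cell needs before it fires


def rod_output(stimulus):
    """All rods converge on one ganglion cell, so their inputs are summed."""
    return [sum(stimulus) >= THRESHOLD]


def cone_output(stimulus):
    """Each cone has a private line to its own ganglion cell."""
    return [intensity >= THRESHOLD for intensity in stimulus]


def signals_two_spots(firing):
    """True only if a silent ganglion cell lies between two firing cells."""
    active = [i for i, fires in enumerate(firing) if fires]
    return any(b - a > 1 for a, b in zip(active, active[1:]))


adjacent = [0, 10, 10, 0, 0]   # two spots side by side
separated = [0, 10, 0, 10, 0]  # two spots spread apart

for name, stimulus in (("adjacent", adjacent), ("separated", separated)):
    print(name,
          "rods:", signals_two_spots(rod_output(stimulus)),
          "cones:", signals_two_spots(cone_output(stimulus)))
```

In this sketch the single rod ganglion cell never signals two spots, whereas the cone ganglion cells do so once the spots are separated by an unstimulated receptor.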
We have seen that the large amount of convergence
that occurs in the rods results in high sensitivity, and the
low amount of convergence of the cones results in high acuity. This is an example of how what we see depends both on
what’s out there in the environment and on the physiological workings of our visual system. When we are looking directly at something under high illumination, cone vision,
aided by low neural convergence, enables us to see details, as
in the top shelf of the bookcase in Figure 3.27. When we are
looking at something under low illumination, rod vision,
aided by high neural convergence, enables us to make out
things that are dimly illuminated, but we see few details, as
in the bottom shelf of the bookcase. In the next section we
will consider how another physiological mechanism—the
decrease in the rate of nerve firing caused by inhibition—can
also influence what we perceive.
Lateral Inhibition and Perception
The neural circuit in Figure 3.29 may look familiar because it is the circuit from Chapter 2 that introduced the idea that neural processing is achieved by convergence and inhibition. We saw that the convergence and inhibition in this circuit caused neuron B to respond best to stimulation by a small spot of light on receptors 3, 4, and 5. We are now going to look at some perceptual effects of inhibition by focusing on lateral inhibition: inhibition that is transmitted across the retina. Examples of lateral inhibition are the connections from neuron A to neuron B and from neuron C to neuron B in Figure 3.29. Notice that activation of neurons A or C results in the release of inhibitory transmitter onto neuron B.
Figure 3.28 ❚ Neural circuits for the rods (left) and the cones (right). The receptors are being stimulated by two spots of light.
Figure 3.29 ❚ Circuit with convergence and inhibition from Figure 2.16. Lateral inhibition arrives at neuron B from A and from C. (The accompanying graph plots the firing rate of “B” against the receptors stimulated: 4, 3–5, 2–6, and 1–7.)
To understand how lateral inhibition can cause perceptual effects, we will look at an experiment using a primitive
animal called the Limulus, more familiarly known as the
horseshoe crab (Figure 3.30).
What the Horseshoe Crab Teaches Us About Inhibition
In an experiment that is now considered a classic, Keffer Hartline, Henry Wagner, and Floyd Ratliff (1956) used the Limulus to demonstrate how lateral inhibition can affect the response of neurons in a circuit. They chose the Limulus because the structure of its eye makes it possible to stimulate individual receptors. The Limulus eye is made up of hundreds of tiny structures called ommatidia, and each ommatidium has a small lens on the eye’s surface that is located directly over a single receptor. Each lens and receptor is roughly the diameter of a pencil point (very large compared to human receptors), so it is possible to illuminate and record from a single receptor without illuminating its neighboring receptors.
When Hartline and coworkers recorded from the nerve fiber of receptor A, as shown in Figure 3.31a, they found that illumination of that receptor caused a large response. But when they added illumination to the three nearby receptors at B, the response of receptor A decreased (Figure 3.31b). They also found that increasing the illumination of B decreased A’s response even more (Figure 3.31c). Thus, illumination of the neighboring receptors inhibited the firing of receptor A. This decrease in the firing of receptor A is caused by lateral inhibition that is transmitted across the Limulus’s eye by the fibers of the lateral plexus, shown in Figure 3.31. VL 9
Just as the lateral plexus transmits signals laterally in the Limulus, the horizontal and amacrine cells (see Figure 3.25) transmit signals across the human retina. We will now see how lateral inhibition may influence how humans perceive light and dark.
Figure 3.30 ❚ A Limulus, or horseshoe crab. Its large eyes are made up of hundreds of ommatidia, each containing a single receptor. (Photo: Bruce Goldstein.)
Figure 3.31 ❚ A demonstration of lateral inhibition in the Limulus. The records show the response recorded by the electrode in the nerve fiber of receptor A: (a) when only receptor A is stimulated; (b) when receptor A and the receptors at B are stimulated together; (c) when A and B are stimulated, with B at an increased intensity. (Adapted from Ratliff, 1965.)
Lateral Inhibition and Lightness Perception
We will now describe three perceptual phenomena that have been explained by lateral inhibition. Each of these phenomena involves the perception of lightness: the perception of shades ranging from white to gray to black.
The Hermann Grid: Seeing Spots at Intersections
Notice the ghostlike gray images at the intersections of the white “corridors” in the display in Figure 3.32, which is called the Hermann grid. You can prove that this grayness is not physically present by noticing that it is reduced or vanishes when you look directly at an intersection or, better yet, when you cover two rows of black squares with white paper.
Figure 3.32 ❚ The Hermann grid. Notice the gray “ghost images” at the intersections of the white areas, which decrease or vanish when you look directly at an intersection.
Figure 3.33 shows how the dark spots at the intersections can be explained by lateral inhibition. Figure 3.33a
shows four squares of the grid and five receptors that are
stimulated by different parts of the white corridors. Receptor A is stimulated by light at the intersection of the two
white corridors, where the gray spot is perceived, and the
surrounding receptors B, C, D, and E are located in the corridors. It is important to note that all five of these receptors
receive the same stimulation, because they are all receiving
illumination from the white areas.
Figure 3.33b shows a three-dimensional view of the
grid and the receptors. This view shows each receptor sending signals to a bipolar cell. It also shows that each of the bipolar cells sends lateral inhibition, indicated by the arrows,
to receptor A’s bipolar cell. We are interested in determining
the output of the bipolar cell that receives signals from receptor A. We are assuming, for the purposes of this example,
that our perception of the lightness at A is determined by
the response of its bipolar cell. (It would be more accurate
to use ganglion cells because they are the neurons that send
signals out of the retina, but to simplify things for the purposes of this example, we will focus on the bipolar cells.)
The size of the bipolar cell response depends on how
much stimulation it receives from its receptor and on the
amount that this response is decreased by the lateral inhibition it receives from its neighboring cells. Let’s assume that
light falling on A generates a response of 100 units in its
bipolar cell. This would be the response of the bipolar cell
if no inhibition were present. We determine the amount of
inhibition by making the following assumption: The lateral
inhibition sent by each receptor’s bipolar cell is one-tenth of
each receptor’s response. Because receptors B, C, D, and E
receive the same illumination as receptor A, their response
is also 100. Taking one-tenth of this, we determine that each
of these receptors is responsible for 10 units of lateral inhibition. To calculate the response of A’s bipolar cell, we start
with A’s initial response of 100 and subtract the inhibition
sent from the other four bipolar cells, as follows:
100 − 10 − 10 − 10 − 10 = 60 (Figure 3.33c). VL 10, 11
Now that we have calculated the response of the bipolar cell stimulated by A, we repeat the same calculation for receptor D, which is in the corridor between two black areas (Figure 3.34). The calculation is the same as the one we just did, but with one important difference. Two of the surrounding receptors, F and H, are illuminated dimly because they fall under black squares. If we assume their response is only 20, this means the inhibition associated with each of these receptors will be 2, and the output of the bipolar cell receiving signals from D will be 100 − 10 − 2 − 10 − 2 = 76 (Figure 3.34c).
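The two Hermann grid calculations can be reproduced with a short Python sketch. The function name is hypothetical; the numbers (responses of 100 and 20, and inhibition equal to one-tenth of each neighbor's response) are the assumptions stated in the text.

```python
def bipolar_response(own_response, neighbor_responses):
    """Initial response minus one-tenth of each neighboring receptor's response."""
    inhibition = sum(neighbor_responses) / 10
    return own_response - inhibition


# Receptor A sits at the intersection; neighbors B, C, D, and E are all in white corridors.
response_a = bipolar_response(100, [100, 100, 100, 100])

# Receptor D sits in a corridor; neighbors A and G are in white, F and H are under black squares.
response_d = bipolar_response(100, [100, 20, 100, 20])

print("A:", response_a)  # 60.0
print("D:", response_d)  # 76.0
```

Changing the assumed response under the black squares (20) or the inhibition fraction changes the exact numbers, but as long as the black squares produce a smaller response than the white corridors, A ends up with the smaller output.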
These outputs make a prediction about perception: Because the response associated with receptor A (at the intersection) is smaller than the response associated with receptor D (in the corridor between the black squares), the intersection should appear darker than the corridor. This is exactly what happens: we perceive gray images at the intersections. Lateral inhibition therefore explains the dark images at the intersection. (The fact that these images disappear when we look at the intersection directly must, however, be explained by some other mechanism.)
Figure 3.33 ❚ (a) Four squares of the Hermann grid, showing five of the receptors under the pattern. Receptor A is located at the intersection, and B, C, D, and E have a black square on either side. (b) Perspective view of the grid and five receptors, showing how the receptors connect to bipolar cells. Receptor A’s bipolar cell receives lateral inhibition from the bipolar cells associated with receptors B, C, D, and E. (c) The calculation of the final response of receptor A’s bipolar cell starts with A’s initial response (100) and subtracts the inhibition associated with each of the other receptors.
Figure 3.34 ❚ (a) Four squares of the Hermann grid, as in Figure 3.33, but now focusing on receptor D, which is flanked by two black squares. Receptor D is surrounded by receptors A, F, G, and H. Notice that receptors F and H are located under the two black squares, so they receive less light than the other receptors. (b) Perspective view showing the inhibition received by the bipolar cells associated with receptor D. Notice that D receives less inhibition than A did in the previous example, because two of the bipolar cells that are sending lateral inhibition (F and H) are associated with receptors that are illuminated more dimly. (c) The calculation of the final response of receptor D indicates that it responds more than A in the previous example.
Mach Bands: Seeing Borders More Sharply
Another perceptual effect that can be explained by lateral inhibition is Mach bands, illusory light and dark bands near a light–dark border. Mach bands were named after the Austrian physicist and philosopher Ernst Mach, who also lent his name to the “Mach number” that indicates speed compared to the speed of sound (Mach 2 = twice the speed of sound). You can see Mach bands in Figure 3.35 by looking just to the left of the light–dark border for a faint light band (at B) and just to the right of the border for a faint dark band (at C). (There are also bands at the other two borders in this figure.) VL 12, 13, 14
DEMONSTRATION
Creating Mach Bands in Shadows
Mach bands can be demonstrated using gray stripes, as in
Figure 3.35, or by casting a shadow, as shown in Figure 3.36.
When you do this, you will see a dark Mach band near the
border of the shadow and a light Mach band on the other
side of the border. The light Mach band is often harder to see
than the dark band. ❚
The reason Mach bands are interesting is that, like the
spots in the Hermann grid, they are an illusion—they are
not actually present in the pattern of light. If we determine
the intensity across the stripes in Figure 3.35a by measuring
the amount of light reflected from this pattern as we move
along the line between A and D, we obtain the result shown
in Figure 3.35b. The light intensity remains the same across
the entire distance between A and B, then drops to a lower
level and remains the same between C and D.
Because the intensities remain constant across the light
stripe on the left and the dark stripe on the right, the small
bands we perceive on either side of the border must be illusions. Our perception of these illusory bands is represented
graphically in Figure 3.35c, which indicates what we perceive
across the two stripes. The upward bump at B represents
the slight increase in lightness we see to the left of the border, and the downward bump at C represents the slight decrease
in lightness we see to the right of the border.
By using the circuit in Figure 3.37 and doing a calculation like the one we did for the Hermann grid, we can show
that Mach bands can be explained by lateral inhibition.
Each of the six receptors in this circuit sends signals to bipolar cells, and each bipolar cell sends lateral inhibition to
its neighbors on both sides. Receptors A, B, and C fall on
the light side of the border and so receive intense illumination; receptors D, E, and F fall on the darker side and receive
dim illumination.
Let’s assume that receptors A, B, and C generate responses of 100, whereas D, E, and F generate responses of
20, as shown in Figure 3.37. Without inhibition, A, B, and C
send the same responses to their bipolar cells, and D, E, and
F send the same responses to their bipolar cells. If perception were determined only by these responses, we would see
a bright bar on the left with equal intensity across its width
and a darker bar on the right with equal intensity across its
width. But to determine what we perceive, we need to take
lateral inhibition into account. We do this with the following calculation:
1. Start with the response received by each bipolar cell: 100 for A, B, and C, and 20 for D, E, and F.
2. Determine the amount of inhibition that each bipolar cell receives from its neighbor on each side. As with the Hermann grid, we will assume that each cell sends inhibition to the cells on either side, equal to one-tenth of that cell’s initial output. Thus, cells A, B, and C will send 100 × 0.1 = 10 units of inhibition to their neighbors, and cells D, E, and F will send 20 × 0.1 = 2 units of inhibition to their neighbors.
3. Determine the final response of each cell by subtracting the amount of inhibition received from the initial response. Remember that each cell receives inhibition from its neighbor on either side. (We assume here that cell A receives 10 units of inhibition from an unseen cell on its left and that F receives 2 units of inhibition from an unseen cell on its right.) Here is the calculation for each cell:
Cell A: Final response = 100 − 10 − 10 = 80
Cell B: Final response = 100 − 10 − 10 = 80
Cell C: Final response = 100 − 10 − 2 = 88
Cell D: Final response = 20 − 10 − 2 = 8
Cell E: Final response = 20 − 2 − 2 = 16
Cell F: Final response = 20 − 2 − 2 = 16
Figure 3.35 ❚ Mach bands at a contour between light and dark. (a) Just to the left of the contour, near B, a faint light band can be perceived, and just to the right at C, a faint dark band can be perceived. (b) The physical intensity distribution of the light, as measured with a light meter. (c) A plot showing the perceptual effect described in (a). The bump in the curve at B indicates the light Mach band, and the dip in the curve at C indicates the dark Mach band. The bumps that represent our perception of the bands are not present in the physical intensity distribution.
Figure 3.36 ❚ Shadow-casting technique for observing Mach bands. Illuminate a light-colored surface with your desk lamp and cast a shadow with a piece of paper. (Photo: Bruce Goldstein.)
Figure 3.37 ❚ Circuit to explain the Mach band effect based on lateral inhibition. The circuit works like the one for the Hermann grid in Figure 3.34, with each bipolar cell sending inhibition to its neighbors. If we know the initial output of each receptor and the amount of lateral inhibition, we can calculate the final output of the receptors. (See text for a description of the calculation.)
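The three steps above can be sketched in Python as a check on the arithmetic. The function and parameter names are hypothetical; the responses (100 and 20), the one-tenth inhibition rule, and the unseen cells at each end of the row are the assumptions stated in the text.

```python
def mach_band_responses(initial, unseen_left, unseen_right):
    """Subtract one-tenth of each immediate neighbor's initial response from every cell.

    unseen_left and unseen_right stand in for the unseen cells beyond the ends of the row.
    """
    padded = [unseen_left] + list(initial) + [unseen_right]
    return [
        padded[i] - padded[i - 1] / 10 - padded[i + 1] / 10
        for i in range(1, len(padded) - 1)
    ]


initial = [100, 100, 100, 20, 20, 20]  # cells A through F
final = mach_band_responses(initial, unseen_left=100, unseen_right=20)

for cell, response in zip("ABCDEF", final):
    print(cell, response)
```

The printed values match the table: 80, 80, 88, 8, 16, and 16. The largest response falls on the last light cell (C) and the smallest on the first dark cell (D), the two cells that flank the border.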
The graph of these neural responses, shown in Figure 3.38, has an increase on the light side of the border at cell C and a decrease on the dark side at cell D, and so looks similar to the graph of perceived lightness in Figure 3.35c. The lateral inhibition in our circuit has therefore created a neural pattern that looks like the Mach bands we perceive. A circuit similar to this one, but of much greater complexity, is probably responsible for the Mach bands that we see.
Figure 3.38 ❚ A plot showing the final receptor output calculated for the circuit in Figure 3.37 (cells A through F: 80, 80, 88, 8, 16, 16). The bump at cell C and the dip at cell D correspond to the light and dark Mach bands, respectively (B and C in Figure 3.35).
Lateral Inhibition and Simultaneous Contrast
Simultaneous contrast occurs when our perception of the brightness or color of one area is affected by the presence of an adjacent or surrounding area. VL 15, 16, 17
DEMONSTRATION
Simultaneous Contrast
When you look at the two small squares in Figure 3.39, the one on the left appears much darker than the one on the right. Now, punch two holes 2 inches apart in a card or a piece of paper, position the two holes over the squares so the background is masked off, and compare your perception of the small squares. ❚
You may have been surprised to see that the small squares look the same when viewed through the holes. Your perception occurs because the two small squares are actually identical shades of gray. The illusion that they are different, which is created by the differences in the areas surrounding each square, is the simultaneous contrast effect.
An explanation for simultaneous contrast that is based on lateral inhibition is diagrammed in Figure 3.40, which shows an array of receptors that are stimulated by a pattern like the one in Figure 3.39. The receptors under the two small squares receive the same illumination. However, the receptors under the light area surrounding the square on the left are intensely stimulated, so they send a large amount of inhibition to the receptors under the left square (indicated by the large arrows). The receptors under the dark area surrounding the square on the right are less intensely stimulated, so they send less inhibition to the receptors under the right square (small arrows). Because the cells under the left square receive more inhibition than the cells under the right square, their response is decreased more, they fire less than the cells under the right square, and the left square therefore looks darker.
Figure 3.39 ❚ Simultaneous contrast. The two center squares reflect the same amount of light into your eyes but look different because of simultaneous contrast.
Figure 3.40 ❚ How lateral inhibition has been used to explain the simultaneous contrast effect. The size of the arrows indicates the amount of lateral inhibition. Because the square on the left receives more inhibition, it appears darker.
The above explanation based on lateral inhibition makes sense and is still accepted by some researchers, although it is difficult for lateral inhibition to explain the following perception: If we start at the edge of the center square on the left and move toward the middle of the square, the lightness appears to be the same all across the square. However, because lateral inhibition would affect the square more strongly near the edge, we would expect that the square would look darker near the border and lighter in the center. The fact that this does not occur suggests that lateral inhibition cannot be the whole story behind simultaneous contrast. In fact, psychologists have created other displays that result in perceptions that can’t be explained by the spread of lateral inhibition.
A Display That Can’t Be Explained by Lateral Inhibition
Look at the two rectangles in Figure 3.41, which is called White’s illusion (White, 1981). Rectangle A, on the left, looks much darker than rectangle B, on the right. However, rectangles A and B reflect the same amount of light. This is hard to believe, because the two rectangles look so different, but you can prove this to yourself by using white paper to mask off part of the display and comparing parts of rectangles A and B, as in Figure 3.42. VL 18
Figure 3.41 ❚ White’s illusion. The rectangles at A and B appear different, even though they are printed from the same ink and reflect the same amount of light. (From White, 1981.)
Figure 3.42 ❚ When you mask off part of the White’s illusion display, as shown here, you can see that rectangles A and B are actually the same. (Try it!)
Figure 3.43 ❚ The arrows indicate the amount of lateral inhibition received by parts of rectangles A and B. Because the part of rectangle B is surrounded by more white, it receives more lateral inhibition. This would predict that B should appear darker than A (as in the simultaneous contrast display in Figure 3.39), but the opposite happens. This means that lateral inhibition cannot explain our perception of White’s illusion.
What causes the rectangles on the left and right to look so different, even though they are reflecting the same amount of light? Figure 3.43 shows part of rectangle A, on the left, and part of rectangle B, on the right. The amount of lateral inhibition that affects each area is indicated by the arrows, with larger arrows indicating more inhibition, just as in Figure 3.40. It is clear that area B receives more lateral inhibition, because more of its border is surrounded by white. Because area B receives more lateral inhibition than area A, an explanation based on lateral inhibition would predict that area B should appear darker, like the left square in the simultaneous contrast display in Figure 3.40. But the opposite happens: rectangle B appears lighter! Clearly, White’s illusion can’t be explained by lateral inhibition.
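The lateral-inhibition predictions for both displays can be made concrete with a rough Python sketch. Everything numerical here is a hypothetical illustration rather than a measurement: the gray areas are given a response of 50, white and black regions responses of 100 and 20, each area is assumed to have four neighboring regions, and inhibition is assumed to be one-tenth of each neighbor's response, as in the Hermann grid example.

```python
WHITE, BLACK, GRAY = 100, 20, 50  # hypothetical responses, chosen only for illustration


def predicted_response(center, surround):
    """Center response minus one-tenth of each surrounding region's response."""
    return center - sum(surround) / 10


# Simultaneous contrast: identical gray squares on a light and a dark background.
left_square = predicted_response(GRAY, [WHITE] * 4)
right_square = predicted_response(GRAY, [BLACK] * 4)

# White's illusion: assume most of B's border is white and most of A's border is black.
rectangle_a = predicted_response(GRAY, [WHITE, BLACK, BLACK, BLACK])
rectangle_b = predicted_response(GRAY, [WHITE, WHITE, WHITE, BLACK])

print("Simultaneous contrast:", left_square, right_square)
print("White's illusion:", rectangle_a, rectangle_b)
```

Under these assumptions the model gives the left square a smaller response than the right square, in line with what we perceive. It also gives rectangle B a smaller response than rectangle A, which is the opposite of what we perceive in White's illusion.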
What’s happening here, according to Alan Gilchrist
and coworkers (1999), is that our perception of lightness
is influenced by a principle called belongingness, which
states that an area’s appearance is influenced by the part of
the surroundings to which the area appears to belong. According to this idea, our perception of rectangle A would be
affected by the light background, because it appears to be
resting on it. Similarly, our perception of rectangle B would
be affected by the dark bars, because it appears to be resting on them. According to this idea, the light area makes
area A appear darker and the dark bars make area B appear
lighter.
Whether or not this idea of belongingness turns out to
be the correct explanation, there is no question that some
mechanism other than lateral inhibition is involved in our
perception of White’s illusion and many other displays (see
Adelson, 1993; Benary, 1924; Knill & Kersten, 1991; Williams, McCoy, & Purves, 1998). It isn’t surprising that there
are perceptions we can’t explain based just on what is happening in the retina. There is still much more processing
to be done before perception occurs, and this processing
happens later in the visual system, in the visual receiving
area of the cortex and beyond, as we will see
in Chapter 4. VL 19–26
Something to Consider: Perception Is Indirect
The experience of perception connects us with our environment. But perception does more than that. It gives us the
feeling that we are in direct contact with the environment. As
I look up from my writing, I can tell that there is a coffee
cup sitting on the table directly in front of me. I know where
the cup is, so I can easily reach for it, and as I pick it up I feel
the smooth texture of the cup’s ceramic finish. As I drink
the coffee, I sense heat, and also the coffee’s taste and smell.
But as much as I feel that all of these experiences are due to
my direct contact with the coffee cup and the liquid in it,
I know that this feeling of directness is largely an illusion.
Perception, as we will see throughout this text, is an indirect process.
We have already demonstrated the indirectness of perception by considering the mechanisms responsible for vision. I see the cup not because of any direct contact with
it, but because the light that is reflected from the cup is
focused onto my retina and then changed into electricity,
which is then processed by mechanisms such as convergence, excitation, and inhibition.
“Well, vision may be indirect,” you might say, “but how
about the perceptions of heat and texture that occur from
picking up the cup? Weren’t your fingers in direct contact with the cup?” The answer to this question is, yes, it
is true that my fingers were in direct physical contact with
the cup, but my perception of the heat of the coffee and
the texture of the cup’s finish was due to the stimulation
of temperature-sensitive and pressure-sensitive receptors in
my fingers, which translated the temperature and pressure
into electrical impulses, just as the light energy that causes
vision is translated into electrical impulses.
Smell and taste are also indirect because these experiences occur when chemicals travel through the air to receptor sites in the nose and tongue. Stimulation of these
receptor sites causes electrical signals that are processed by
the nervous system to create the experiences of smell and
taste. Hearing is the same. Air pressure changes transmitted through the air cause vibrations of receptors inside the
ear, and these vibrations generate the electrical signals our
auditory system uses to create the experience of sound.
The amazing thing about perception is that despite
this indirectness, it seems so real. And it is real, in the sense
that our perceptions usually provide us with accurate information about what’s out there in the distance or what’s up
close under our noses or beneath our fingers. But in all of
these cases, this information is created through the actions
of receptors that change environmental stimulation into
electrical signals and by the actions of convergence, excitation, and inhibition that transform electrical signals as they
travel through the nervous system.
TEST YOURSELF 3.2
1. What is convergence, and how can the differences in the convergence of rods and cones explain (a) the rods’ greater sensitivity in the dark and (b) the cones’ better detail vision?
2. Describe the experiment that demonstrated the effect of lateral inhibition in the Limulus.
3. How can lateral inhibition explain the “spots” that are perceived at the intersections of the Hermann grid?
4. What are Mach bands, and how can lateral inhibition explain our perception of them? Be sure to understand the calculations used in conjunction with the circuit in Figure 3.37.
5. What is simultaneous contrast? How has it been explained by lateral inhibition? What are some problems with this explanation?
6. How does White’s illusion demonstrate that there are some perceptual “lightness” effects that lateral inhibition cannot explain? What does this mean about the location of the mechanism that determines lightness perception?
7. What does it mean to say that all perception is indirect?
THINK ABOUT IT
1. In the demonstration “Becoming Aware of What Is in Focus” on page 45, you saw that we see things clearly only when we are looking directly at them so that their image falls on the cone-rich fovea. But consider the common observation that the things we aren’t looking at do not appear “fuzzy,” that the entire scene appears “sharp” or “in focus.” How can this be, in light of the results of the demonstration? (p. 45)
2. Here’s an exercise you can do to get more in touch with the process of dark adaptation: Find a dark place where you will make some observations as you adapt to the dark. A closet is a good place to do this because it is possible to regulate the intensity of light inside the closet by opening or closing the door. The idea is to create an environment in which there is dim light (no light at all, as in a darkroom with the safelight out, is too dark). Take this book into the closet, opened to this page. Close the closet door all the way so it is very dark, and then open the door slowly until you can just barely make out the white circle on the far left in Figure 3.44, but can’t see the others or can see them only as being very dim. As you sit in the dark, become aware that your sensitivity is increasing by noting how the circles to the right in the figure slowly become visible over a period of about 20 minutes. Also note that once a circle becomes visible, it gets easier to see as time passes. If you stare directly at the circles, they may fade, so move your eyes around every so often. Also, the circles will be easier to see if you look slightly above them. (p. 52)
Figure 3.44 ❚ Dark adaptation test circles.
3. Ralph, who is skeptical about the function of lateral inhibition, says, “OK, so lateral inhibition causes us to see Mach bands and the spots at the intersections of the Hermann grid. Even though these perceptual effects are interesting, they don’t seem very important to me. If they didn’t exist, we would see the world in just about the same way as we do with them.” (a) How would you respond to Ralph if you wanted to make an argument for the importance of lateral inhibition? (b) What is the possibility that Ralph could be right? (p. 61)
4. Look for shadows, both inside and outside, and see if you can see Mach bands at the borders of the shadows. Remember that Mach bands are easier to see when the border of a shadow is slightly fuzzy. Mach bands are not actually present in the pattern of light and dark, so you need to be sure that the bands are not really in the light but are created by your nervous system. How can you accomplish this? (p. 64)
IF YOU WANT TO KNOW MORE
1. Disorders of focusing. Many people wear glasses to compensate for the fact that their optical system does not
focus a sharp image on their retinas. The three most
common problems are farsightedness, nearsightedness, and astigmatism. (p. 46)
Goldstein, E. B. (2002). Sensation and perception (6th
ed.). Belmont, CA: Wadsworth. (See Chapter 16,
“Clinical Aspects of Vision and Hearing.”)
2. LASIK eye surgery. For more information about
LASIK, see the following U.S. Food and Drug Administration website (p. 46):
http://www.fda.gov/cdrh/lasik
3. Transduction. The molecular basis of transduction,
in which light is changed into electrical energy, is a
process that involves sequences of chemical reactions.
(p. 47)
Burns, M., & Lamb, T. D. (2004). Visual transduction by rod and cone photoreceptors. In L. M. Chalupa & J. S. Werner (Eds.), The visual neurosciences.
Cambridge, MA: MIT Press.
4. A disorder of dark adaptation. There is a rare clinical
condition called Oguchi’s disease, in which adaptation of the rods is slowed so that it takes 3 or 4 hours
for the rods to reach their maximum sensitivity in the
dark. What makes this condition particularly interesting is that the rate of rod visual pigment regeneration is normal, so there must be a problem somewhere
between the visual pigments and the mechanism that
determines sensitivity to light. (p. 54)
Carr, R. E., & Ripps, H. (1967). Rhodopsin kinetics
and rod adaptation in Oguchi’s disease. Investigative Ophthalmology, 6, 426–436.
KEY TERMS
Absorption spectrum (p. 57)
Accommodation (p. 45)
Amacrine cells (p. 58)
Axial myopia (p. 46)
Belongingness (p. 67)
Bipolar cells (p. 58)
Blind spot (p. 51)
Cone (p. 44)
Cornea (p. 44)
Dark adaptation (p. 53)
Dark adaptation curve (p. 53)
Dark-adapted sensitivity (p. 53)
Detached retina (p. 55)
Electromagnetic spectrum (p. 44)
Enzyme cascade (p. 50)
Eye (p. 44)
Far point (p. 46)
Farsightedness (p. 46)
Fovea (p. 50)
Ganglion cells (p. 58)
Hermann grid (p. 63)
Horizontal cells (p. 58)
Hyperopia (p. 46)
Isomerization (p. 47)
Laser-assisted in situ keratomileusis
(LASIK) (p. 46)
Lateral inhibition (p. 62)
Lens (p. 44)
Light-adapted sensitivity (p. 53)
Lightness (p. 62)
Limulus (p. 62)
Mach bands (p. 64)
Macular degeneration (p. 51)
Monochromatic light (p. 56)
Myopia (p. 46)
Near point (p. 46)
Nearsightedness (p. 46)
Neural convergence (p. 58)
Ommatidia (p. 62)
Opsin (p. 47)
Optic nerve (p. 44)
Outer segment (p. 47)
Peripheral retina (p. 50)
Presbyopia (p. 46)
Pupil (p. 44)
Purkinje shift (p. 57)
Refractive myopia (p. 46)
Retina (p. 44)
Retinal (p. 47)
Retinitis pigmentosa (p. 51)
Rod (p. 44)
Rod monochromat (p. 54)
Rod–cone break (p. 54)
Simultaneous contrast (p. 66)
Spectral sensitivity (p. 56)
Spectral sensitivity curve (p. 56)
Visible light (p. 44)
Visual acuity (p. 60)
Visual pigment (p. 44)
MEDIA RESOURCES
The Sensation and Perception
Book Companion Website
CengageNOW
www.cengage.com/cengagenow
Go to this site for the link to CengageNOW, your one-stop
shop. Take a pre-test for this chapter, and CengageNOW
will generate a personalized study plan based on your test
results. The study plan will identify the topics you need to
review and direct you to online resources to help you master those topics. You can then take a post-test to help you
determine the concepts you have mastered and what you
will still need to work on.
VL
Your Virtual Lab is designed to help you get the most out
of this course. The Virtual Lab icons direct you to specific
media demonstrations and experiments designed to help
you visualize what you are reading about. The number
beside each icon indicates the number of the media element
you can access through your CD-ROM, CengageNOW, or
WebTutor resource.
The following lab exercises are related to material
in this chapter:
1. A Day Without Sight A segment from Good Morning
America in which Diane Sawyer talks with people
who have lost their sight about the experience of
being blind.
2. The Human Eye A drag-and-drop exercise to test your
knowledge of parts of the eye.
3. Filling In A demonstration of how the visual system can
fill in empty areas to complete a pattern.
4. Types of Cones Absorption spectra showing that each
cone absorbs light in a different region of the spectrum.
5. Cross Section of the Retina A drag-and-drop exercise to
test your knowledge of the neurons in the retina.
6. Visual Path Within the Eyeball How electrical signals that
start in the rods and cones are transmitted through the
retina and out the back of the eye in the optic nerve.
7. Receptor Wiring and Sensitivity When light is presented
to the receptors, rod ganglion cells fire at lower light intensities than cone ganglion cells.
8. Receptor Wiring and Acuity When spots of light are
presented to rod and cone receptors, detail information
is present in the cone ganglion cells but not the rod
ganglion cells.
9. Lateral Inhibition How lateral inhibition affects
the firing of one neuron when adjacent neurons are
stimulated.
10. Lateral Inhibition in the Hermann Grid How lateral
inhibition can explain the firing of neurons that cause the
“spots” in the Hermann grid.
11. Receptive Fields of Retinal Ganglion Cells A classic 1972
film in which vision research pioneer Colin Blakemore describes the neurons in the retina, and how center-surround
receptive fields of ganglion cells are recorded from the cat’s
retina.
12. Intensity and Brightness Mapping the physical intensity
across a display that produces Mach bands and comparing
this intensity to perceived brightness across the display.
13. Vasarely Illusion A demonstration of how lateral inhibition can affect our perception of a picture. (Courtesy of
Edward Adelson.)
14. Pyramid Illusion Another demonstration of the Vasarely illusion. (Courtesy of Michael Bach.)
15. Simultaneous Contrast How varying the intensity of
the surround can influence perception of the brightness of
squares in the center.
16. Simultaneous Contrast: Dynamic How perception of a
gray dot changes as it moves across a background that is
graded from white to black.
www.cengage.com/psychology/goldstein
See the companion website for flashcards, practice quiz
questions, Internet links, updates, critical thinking exercises, discussion forums, games, and more!
Virtual Lab
Visual pigment bleaching (p. 55)
Visual pigment molecule (p. 47)
Visual pigment regeneration (p. 55)
Wavelength (p. 44)
White’s illusion (p. 67)
17. Simultaneous Contrast 2 Animation illustrating simultaneous contrast. (Courtesy of Edward Adelson.)
18. White’s Illusion An animation of White’s illusion.
(Courtesy of Edward Adelson.)
19. Craik-O’Brien-Cornsweet Effect A perceptual effect
caused by the fact that the visual system responds
best to sharp changes of intensity. (Courtesy of Edward
Adelson.)
20. Criss-Cross Illusion A contrast illusion based on the
idea that the visual system takes illumination into account
in determining the perception of lightness. (Courtesy of
Edward Adelson.)
21. Haze Illusion An illustration of how lightness cues
can affect an area’s appearance. (Courtesy of Edward
Adelson.)
22. Knill and Kersten’s Illusion An illustration of how our
perception of shading caused by curvature can affect lightness perception. (Courtesy of Edward Adelson.)
23. The Corrugated Plaid A demonstration showing how
the orientation of a surface can affect lightness perception.
(Courtesy of Edward Adelson.)
24. Koffka Ring A demonstration showing how the spatial
configuration of a pattern can affect lightness perception.
(Courtesy of Edward Adelson.)
25. Snake Illusion Another contrast demonstration that
can’t be explained by lateral inhibition. (Courtesy of
Edward Adelson.)
26. Hermann Grid, Curving A version of the Hermann grid
that can’t be explained by lateral inhibition. (Courtesy of
Michael Bach.)
CHAPTER 4
The Visual Cortex and Beyond
Chapter Contents
FOLLOWING THE SIGNALS FROM
RETINA TO CORTEX
The Visual System
Processing in the Lateral Geniculate
Nucleus
METHOD: Determining Retinotopic Maps
by Recording From Neurons
Receptive Fields of Neurons in the Striate
Cortex
DO FEATURE DETECTORS PLAY
A ROLE IN PERCEPTION?
Selective Adaptation and Feature Detectors
METHOD: Selective Adaptation to
Orientation
Selective Rearing and Feature Detectors
MAPS AND COLUMNS IN THE
STRIATE CORTEX
Maps in the Striate Cortex
METHOD: Brain Imaging
Columns in the Striate Cortex
How Is an Object Represented in the Striate
Cortex?
❚ TEST YOURSELF 4.1
STREAMS: PATHWAYS FOR WHAT,
WHERE, AND HOW
Streams for Information About What and
Where
METHOD: Brain Ablation
Streams for Information About What and
How
METHOD: Dissociations in
Neuropsychology
MODULARITY: STRUCTURES FOR
FACES, PLACES, AND BODIES
Face Neurons in the Monkey’s IT Cortex
Areas for Faces, Places, and Bodies in the
Human Brain
SOMETHING TO CONSIDER: HOW DO
NEURONS BECOME SPECIALIZED?
Is Neural Selectivity Shaped by Evolution?
How Neurons Can Be Shaped by Experience
❚ TEST YOURSELF 4.2
Think About It
If You Want to Know More
Key Terms
Media Resources
VL VIRTUAL LAB
OPPOSITE PAGE Brain imaging technology has made it possible to
visualize both the structure and functioning of different areas
of the brain.
© Barry Blackman/Taxi/Getty Images.
VL The Virtual Lab icons direct you to specific animations and videos
designed to help you visualize what you are reading about. The number beside
each icon indicates the number of the clip you can access through your
CD-ROM or your student website.
Some Questions We Will Consider:
❚ How can brain damage affect a person’s perception?
(p. 88)
❚ Are there separate brain areas that determine our
perception of different qualities? (p. 91)
❚ How has the operation of our visual system
been shaped by evolution and by our day-to-day
experiences? (p. 94)
In Chapters 2 and 3 we described the steps of the perceptual process that occur in the retina. We can summarize
this process as follows for vision: Light is reflected from an
object into the eye. This light is focused to form an image
of that object on the retina. Light, in a pattern that illuminates some receptors intensely and some dimly, is absorbed
by the visual pigment molecules that pack the rod and cone
outer segments. Chemical reactions in the outer segments
transduce the light into electrical signals. As these electrical signals travel through the retina, they interact, excite,
and inhibit, eventually reaching the ganglion cells, which
because of this processing have center-surround receptive
fields on the retina. After being processed by the retina these
electrical signals are sent out the back of the eye in fibers of
the optic nerve. It is here that we pick up our story.
Following the Signals From
Retina to Cortex
In this chapter we follow the signals from the retina to the
visual receiving area of the cortex, and then to other areas
beyond the visual receiving area. Our quest, in following
these signals to the visual cortex and beyond, is to determine the connection between these signals and what we
perceive. One way researchers have approached this problem
is by determining how neurons at various places in the visual system respond to stimuli presented to the retina. The
first step in describing this research is to look at the overall
layout of the visual system.
The Visual System
Figure 4.1a, which is an overview of the visual system, pictures the pathway that the neural signals follow once they
leave the retina. Most of the signals from the retina travel
out of the eye in the optic nerve to the lateral geniculate
nucleus (LGN) in the thalamus. From here, signals travel
to the primary visual receiving area in the occipital lobe of
the cortex. The visual receiving area is also called the striate
cortex because of the white stripes (striate = striped) that
are created within this area of cortex by nerve fibers that
run through it (Glickstein, 1988). From the striate cortex,
signals are transmitted along two pathways, one to the temporal lobe and the other to the parietal lobe (blue arrows).
Visual signals also reach areas in the frontal lobe of
the brain. (VL 1)
Figure 4.1 ❚ (a) Side view of the visual
system, showing the three major sites along
the primary visual pathway where processing
takes place: the eye, the lateral geniculate
nucleus, and the visual receiving area of
the cortex. (b) Visual system seen from
underneath the brain showing how some of the
nerve fibers from the retina cross over to the
opposite side of the brain at the optic chiasm.
[Labels in (a): light energy, eye, optic nerve, lateral geniculate nucleus in thalamus, visual receiving area (striate cortex). Labels in (b): optic nerve, optic chiasm, lateral geniculate nucleus, superior colliculus, visual cortex.]
Figure 4.1b shows the visual system as seen from the
underside of the brain. In addition to showing the pathway
from eye to LGN to cortex, this view also indicates the location of the superior colliculus, an area involved in controlling eye movements and other visual behaviors that receives
about 10 percent of the fibers from the optic nerve. This
view also shows how signals from half of each retina cross
over to the opposite side of the brain.
From the pictures of the visual system in Figure 4.1 it is
clear that many areas of the brain are involved in vision. We
begin considering these visual areas by following signals in
the optic nerve to the first major area where visual signals
are received—the lateral geniculate nucleus.
Processing in the Lateral
Geniculate Nucleus
What happens to the information that arrives at the lateral
geniculate nucleus? One way to answer this question is to
record from neurons in the LGN to determine what their
receptive fields look like.
Receptive Fields of LGN Neurons Recording
from neurons in the LGN shows that LGN neurons have
the same center-surround configuration as retinal ganglion
cells (see Figure 2.18). Thus, neurons in the LGN, like neurons in the optic nerve, respond best to small spots of light
on the retina. If we just consider the receptive fields of LGN
neurons, we might be tempted to conclude that nothing is
happening there. But further investigation reveals that a
major function of the LGN is apparently not to create new
receptive field properties, but to regulate neural information
as it flows from the retina to the visual cortex (Casagrande &
Norton, 1991; Humphrey & Saul, 1994).
Information Flow in the Lateral Geniculate
Nucleus The LGN does not simply receive signals from
the retina and then transmit them to the cortex. Figure 4.2a
shows that it is much more complex than that. Ninety percent of the fibers in the optic nerve arrive at the LGN. (The
other 10 percent travel to the superior colliculus.) But these
signals are not the only ones that arrive at the LGN. The
LGN also receives signals from the cortex, from the brain
stem, from other neurons in the thalamus (T), and from
other neurons in the LGN (L). Thus, the LGN receives information from many sources, including the cortex, and then
sends its output to the cortex.
Figure 4.2b indicates the amount of flow between the
retina, LGN, and cortex. Notice that (1) the LGN receives
more input back from the cortex than it receives from the
retina (Sherman & Koch, 1986; Wilson, Friedlander, &
Sherman, 1984); and (2) the smallest signal of all is from
the LGN to the cortex. For every 10 nerve impulses the LGN
receives from the retina, it sends only 4 to the cortex. This
decrease in firing that occurs at the LGN is one reason for
the suggestion that one of the purposes of the LGN is to
regulate neural information as it flows from the retina to
the cortex.
Figure 4.2 ❚ (a) Inputs and outputs of an LGN neuron. The
neuron receives signals from the retina and also receives
signals from the cortex, from elsewhere in the thalamus
(T), from other LGN neurons (L), and from the brain stem.
Excitatory synapses are indicated by Y’s and inhibitory ones
by T’s. (b) Information flow into and out of the LGN. The sizes
of the arrows indicate the sizes of the signals. (Part a adapted
from Kaplan, Mukherjee, & Shapley, 1993.)
But the LGN not only regulates information flowing
through it; it also organizes the information. Organizing
information is important. It is the basis of finding a document in a filing system or locating a book in the library and,
as we will see in this chapter, in the filing of information
that is received by structures in the visual system. The LGN
is a good place to begin discussing the idea of organization,
because although this organization begins in the retina, it
becomes more obvious in the LGN. We will see that the signals arriving at the LGN are sorted and organized based on
the eye they came from, the receptors that generated them,
and the type of environmental information that is represented in them.
Organization by Left and Right Eyes The
lateral geniculate nucleus (LGN) is a bilateral structure,
which means there is one LGN in the left hemisphere and
one in the right hemisphere. Viewing one of these nuclei in
cross section reveals six layers (Figure 4.3). Each layer receives signals from only one eye. Layers 2, 3, and 5 (red layers) receive signals from the ipsilateral eye, the eye on the
same side of the body as the LGN. Layers 1, 4, and 6 (blue
layers) receive signals from the contralateral eye, the eye
on the opposite side of the body from the LGN. Thus, each
eye sends half of its neurons to the LGN that is located in
the left hemisphere of the brain and half to the LGN that
is located in the right hemisphere. Because the signals from
each eye are sorted into different layers, the information
from the left and right eyes is kept separated in the LGN.
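The layer-by-eye sorting just described can be written down as a small lookup. The following is only an illustrative sketch: the function name eye_for_layer and the 'left'/'right' labels are hypothetical conveniences, and only the layer numbers come from the text.

```python
# Layer-to-eye assignment for one LGN, as described in the text.
IPSILATERAL_LAYERS = {2, 3, 5}     # driven by the eye on the same side as the LGN
CONTRALATERAL_LAYERS = {1, 4, 6}   # driven by the eye on the opposite side

def eye_for_layer(lgn_side: str, layer: int) -> str:
    """Return which eye ('left' or 'right') drives a layer of the given LGN."""
    if lgn_side not in ("left", "right"):
        raise ValueError("lgn_side must be 'left' or 'right'")
    if layer in IPSILATERAL_LAYERS:
        return lgn_side
    if layer in CONTRALATERAL_LAYERS:
        return "right" if lgn_side == "left" else "left"
    raise ValueError("the LGN has layers 1 through 6")

# List the driving eye for each layer of the left-hemisphere LGN.
for layer in range(1, 7):
    print(layer, eye_for_layer("left", layer))
```

Because every layer maps to exactly one eye, signals from the two eyes never share a layer, which is the sense in which they are kept separated.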
Organization as a Spatial Map To introduce
the idea of spatial maps, we first consider Figure 4.4. When
the man looks at the cup, points A, B, and C on the cup are
imaged on points A, B, and C of the retina, and each place
on the retina corresponds to a specific place on the lateral
geniculate nucleus (LGN). This correspondence between
points on the LGN and points on the retina creates a retinotopic map on the LGN—a map in which each point on
the LGN corresponds to a point on the retina. We can determine what this map looks like by recording from neurons
in the LGN.
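One way to picture a retinotopic map is as a mapping that preserves neighbor order. The sketch below assumes a one-dimensional strip of retina, made-up coordinates for points A, B, and C, and an arbitrary scale factor; none of these values come from the text, and real maps are two-dimensional and not uniformly scaled.

```python
# Hypothetical 1-D positions (arbitrary units) of three points on the retina.
retina_points = {"A": 1.0, "B": 2.0, "C": 3.0}

SCALE = 0.5  # assumed retina-to-LGN scale factor, chosen only for illustration

def retina_to_lgn(x: float) -> float:
    """Map a retinal position to an LGN position; monotonic, so order is kept."""
    return SCALE * x

lgn_points = {name: retina_to_lgn(x) for name, x in retina_points.items()}

def order(points: dict) -> list:
    """Names sorted by position, i.e., the neighbor order along the structure."""
    return sorted(points, key=points.get)

print(lgn_points)
print(order(retina_points) == order(lgn_points))
```

The comparison on the last line checks the defining property of the map: points that are neighbors on the retina remain neighbors on the LGN.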
M E T H O D ❚ Determining Retinotopic Maps
by Recording From Neurons
The retinotopic map on the LGN has been determined
by recording from neurons in the LGN with an electrode
that penetrates the LGN obliquely (at a small angle to
the surface), as shown in Figure 4.5. In this example, we
are recording from the neurons at A, B, and C in layer 6
of the LGN.
Figure 4.3 ❚ Cross section of the LGN showing layers 1 through 6. Red
layers receive signals from the ipsilateral (same side of the
body) eye. Blue layers receive signals from the contralateral
(opposite side) eye.
Figure 4.4 ❚ Points A, B, and C on the cup create images
at A, B, and C on the retina and cause activation at points
A, B, and C on the lateral geniculate nucleus (LGN). The
correspondence between points on the LGN and retina
indicates that there is a retinotopic map on the LGN.
Figure 4.5 ❚ Retinotopic mapping of neurons in the
LGN. The neurons at A, B, and C in layer 6 of the LGN have
receptive fields located at positions A′, B′, and C′ on the
retina. This mapping can be determined by recording from
neurons encountered along an oblique electrode track. Also,
neurons along a perpendicular electrode track all have their
receptive fields at about the same place on the retina.
Recording from the neuron at A, we determine the
location of the neuron’s receptive field on the retina by
stimulating different places on the retina with spots of
light until the neuron responds. The location of the neuron’s receptive field is indicated by A′ on the retina. When
we repeat this procedure with an electrode at B and then
at C, we find that B’s receptive field is at B′ on the retina,
and C’s receptive field is at C′ on the retina. Results such
as those in Figure 4.5 show that stimulating a sequence
of points on the retina results in activity in a corresponding sequence of neurons in the LGN. This is the retinotopic map on the LGN.
The correspondence between locations on the retina
and locations on the LGN means that fibers entering the
LGN are arranged so that those carrying signals from the
same area of the retina end up in the same area of the LGN.
Each location on the LGN corresponds to a location on the
retina, and neighboring locations on the LGN correspond
to neighboring locations on the retina. Thus, the receptive
fields of neurons that are near each other in the LGN, such
as neurons A, B, and C, in layer 6 (Figure 4.5), are adjacent
to each other at A′, B′, and C′ on the retina.
Retinotopic maps occur not only in layer 6, but in each
of the other layers as well, and the maps of each of the layers line up with one another. Thus, if we lower an electrode
perpendicularly, as shown in Figure 4.5, all of the neurons
we encounter along the electrode track will have receptive
fields at the same location on the retina. This is an amazing
feat of organization: One million ganglion cell fibers travel
to each LGN, and on arriving there, each fiber goes to the
correct LGN layer (remember that fibers from each eye go to
different layers) and finds its way to a location next to other
fibers that left from the same place on the retina. Meanwhile, all of the other fibers are doing the same thing in the
other layers! The result is aligned, overlapping retinotopic
maps in each of the LGN’s six layers.
Receptive Fields of Neurons
in the Striate Cortex
We are now ready to move from the LGN to the visual cortex. As we saw in Figure 4.1, a large area of the cortex is
involved in vision. In fact, more than 80 percent of the cortex responds to visual stimuli (Felleman & Van Essen, 1991).
The idea that most of the cortex responds when the retina
is stimulated is the result of research that began in the
late 1950s. In the early 1950s, we knew little about visual
cortical function; a 63-page chapter on the physiology of
vision that appeared in the 1951 Handbook of Experimental Psychology devoted less than a page to the visual cortex
(Bartley, 1951). But by the end of that decade, David Hubel
and Torsten Wiesel (1959) had published a series of papers
in which they described both receptive field properties and
organization of neurons in the striate cortex. For this research and other research on the visual system, Hubel and
Wiesel received the Nobel Prize in Physiology or Medicine
in 1981. We will see later in this chapter how other researchers pushed our knowledge of visual physiology to areas beyond the striate cortex, but first let’s consider Hubel and
Wiesel’s research.
Using the procedure described in Chapter 2 (page 34)
in which receptive fields are determined by flashing spots
of light on the retina, Hubel and Wiesel found cells in the
striate cortex with receptive fields that, like center-surround receptive fields of neurons in the retina and LGN,
have excitatory and inhibitory areas. However, these areas
are arranged side by side rather than in the center-surround
configuration (Figure 4.6a). Cells with these side-by-side
receptive fields are called simple cortical cells. (VL 2)
We can tell from the layout of the excitatory and inhibitory areas of the simple cell shown in Figure 4.6a that a cell
with this receptive field would respond best to vertical bars.
As shown in Figure 4.6b, a vertical bar that illuminates only
the excitatory area causes high firing, but as the bar is tilted
so the inhibitory area is illuminated, firing decreases.
Figure 4.6 ❚ (a) The receptive field of a simple cortical cell
(+ = excitatory area; – = inhibitory area). (b) This cell responds
best to a vertical bar of light that covers the excitatory area of the
receptive field. The response decreases as the bar is tilted so
that it also covers the inhibitory area. (c) Orientation tuning
curve of a simple cortical cell for a neuron that responds best
to a vertical bar (orientation = 0). The curve plots firing rate
(impulses/sec) against orientation (degrees from vertical). (From
Hubel & Wiesel, 1959.)
The relationship between orientation and firing is indicated by a neuron’s orientation tuning curve, which is
determined by measuring the responses of a simple cortical
cell to bars with different orientations. The tuning curve
in Figure 4.6c shows that the cell responds with 25 nerve
impulses per second to a vertically oriented bar and that the
cell’s response decreases as the bar is tilted away from the
vertical and begins stimulating inhibitory areas of the neuron’s receptive field. Notice that a bar tilted 20 degrees from
the vertical elicits only a small response. This particular
simple cell responds best to a bar with a vertical orientation,
but there are other simple cells that respond to other orientations, so there are neurons that respond to all of
the orientations that exist in the environment. (VL 3)
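The way a side-by-side receptive field produces orientation tuning can be sketched with a toy model. Everything numerical here is an assumption made for illustration: the 11 x 11 grid, the three-column excitatory strip, the bar dimensions, equal weights of +1 and –1, and the rule that the response cannot fall below zero. The output is in arbitrary units, not impulses per second, and is not meant to reproduce the measured curve in Figure 4.6c.

```python
import math

GRID = range(-5, 6)  # assumed 11 x 11 patch of retina, arbitrary units

def weight(x: int) -> int:
    """+1 in the central vertical excitatory strip, -1 in the inhibitory flanks."""
    return 1 if abs(x) <= 1 else -1

def response(tilt_deg: float, half_width: float = 1.0, half_length: float = 5.0) -> int:
    """Summed weight under a bar tilted tilt_deg from vertical, floored at zero."""
    t = math.radians(tilt_deg)
    total = 0
    for x in GRID:
        for y in GRID:
            across = abs(x * math.cos(t) - y * math.sin(t))  # distance from the bar's axis
            along = abs(x * math.sin(t) + y * math.cos(t))   # position along the bar
            if across <= half_width and along <= half_length:
                total += weight(x)
    return max(0, total)

# Print a crude tuning curve for three tilts away from vertical.
for tilt in (0, 20, 40):
    print(tilt, response(tilt))
```

In this sketch a vertical bar covers only excitatory locations, so the response is largest at 0 degrees. Tilting the bar moves its ends into the inhibitory flanks, and the summed response falls, which is the same logic the text uses to explain the tuning curve.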
Figure 4.7 ❚ When Hubel and Wiesel dropped a slide into
their slide projector, the image of the edge of the slide moving
down unexpectedly triggered activity in a cortical neuron.
Although Hubel and Wiesel were able to use small spots
of light to map the receptive fields of simple cortical cells
like the one in Figure 4.6, they found that many of the cells
they encountered in the cortex refused to respond to small
spots of light. In his Nobel lecture, Hubel describes how he
and Wiesel were becoming increasingly frustrated in their
attempts to get these cortical neurons to fire, when something startling happened: As they inserted a glass slide containing a spot stimulus into their slide projector, a cortical
neuron “went off like a machine gun” (Hubel, 1982). The
neuron, as it turned out, was responding not to the spot at
the center of the slide that Hubel and Wiesel had planned
to use as a stimulus, but to the image of the slide’s edge
moving downward on the screen as the slide dropped into
the projector (Figure 4.7). Upon realizing this, Hubel and
Wiesel changed their stimuli from small spots to moving
lines and were then able to find cells that responded to oriented moving bars. As with simple cells, a particular
neuron had a preferred orientation. (VL 4)
Hubel and Wiesel discovered that many cortical neurons respond best to moving barlike stimuli with specific
orientations. Complex cells, like simple cells, respond best
to bars of a particular orientation. However, unlike simple
cells, which respond to small spots of light or to stationary
stimuli, most complex cells respond only when a correctly
oriented bar of light moves across the entire receptive field.
Further, many complex cells respond best to a particular direction of movement (Figure 4.8a). Because these neurons
don’t respond to stationary flashes of light, their receptive
fields are not indicated by pluses and minuses, but by indicating the area which, when stimulated, elicits a response in
the neuron.
Figure 4.8 ❚ (a) Response of a complex cell recorded from the visual cortex of the cat. The
stimulus bar is moved back and forth across the receptive field. The cell fires best when the
bar is positioned with a specific orientation and is moved in a specific direction (*). (From
Hubel & Wiesel, 1959.) (b) Response of an end-stopped cell recorded from the visual cortex
of the cat. The stimulus is indicated by the light area on the left. This cell responds best to a
medium-sized corner that is moving up (*). (From “Receptive fields and functional architecture
in two non-striate visual areas (18 and 19) of the cat,” by D. H. Hubel and T. N. Wiesel, 1965,
Journal of Neurophysiology, 28, 229–289.)
Cells of another type, called end-stopped cells, fire to moving lines of a specific length or to moving corners or angles.
Figure 4.8b shows a light corner stimulus that is being moved
up and down across the retina. The records to the right indicate that the neuron responds when the corner moves upward. The neuron’s response increases as the corner-shaped
stimulus gets longer, but the neuron stops responding when the
corner becomes too long (Hubel & Wiesel, 1965).
Hubel and Wiesel’s finding that some neurons in the
cortex respond only to oriented lines was an extremely important discovery because it indicates that neurons in the
cortex do not simply respond to “light”; they respond to
some patterns of light and not to others. This makes sense
because the purpose of the visual system is to enable us to
perceive objects in the environment, and many objects can
be at least crudely represented by lines of various orientations. Thus, Hubel and Wiesel’s discovery that neurons respond selectively to stationary and moving lines was an important step toward determining how neurons respond to
more complex objects.
Because simple, complex, and end-stopped cells fire in
response to specific features of the stimulus, such as orientation or direction of movement, they are sometimes called
feature detectors. Table 4.1, which summarizes the properties of the five types of neurons we have described so far,
illustrates an important fact about neurons in the visual
system: As we travel farther from the retina, neurons fire to
more complex stimuli. Retinal ganglion cells respond best
to spots of light, whereas cortical end-stopped cells respond
best to bars of a certain length that are moving in a particular direction.
TABLE 4.1 ❚ Properties of Neurons in the Optic Nerve, LGN, and Cortex
TYPE OF CELL: CHARACTERISTICS OF RECEPTIVE FIELD
Optic nerve fiber (ganglion cell): Center-surround receptive field. Responds best to small spots, but will also respond to other stimuli.
Lateral geniculate: Center-surround receptive fields very similar to the receptive field of a ganglion cell.
Simple cortical: Excitatory and inhibitory areas arranged side by side. Responds best to bars of a particular orientation.
Complex cortical: Responds best to movement of a correctly oriented bar across the receptive field. Many cells respond best to a particular direction of movement.
End-stopped cortical: Responds to corners, angles, or bars of a particular length moving in a particular direction.
Do Feature Detectors Play
a Role in Perception?
Neural processing endows neurons with properties that
make them feature detectors, which respond best to a specific type of stimulus. But just showing that neurons respond
to specific stimuli doesn’t prove that they have anything to
do with the perception of these stimuli. One way to establish a link between the firing of these neurons and perception is by using a psychophysical procedure called selective
adaptation.
Selective Adaptation and
Feature Detectors
When we view a stimulus with a specific property, neurons
tuned to that property fire. The idea behind selective adaptation is that if the neurons fire for long enough, they become fatigued, or adapt. This adaptation causes two physiological effects: (1) the neuron’s firing rate decreases, and
(2) the neuron fires less when that stimulus is immediately
presented again. According to this idea, presenting a vertical line causes neurons that respond to vertical lines to respond, but as these presentations continue, these neurons
eventually begin to fire less to vertical lines. Adaptation is
selective because only the neurons that respond to verticals
or near-verticals adapt, and other neurons do not.
The basic assumption behind a psychophysical selective
adaptation experiment is that if these adapted neurons have
anything to do with perception, then adaptation of neurons
that respond to verticals should result in the perceptual effect
of becoming selectively less sensitive to verticals, but not to
other orientations. Many selective adaptation experiments
have used a stimulus called a grating stimulus and a behavioral measure called the contrast threshold.
Grating Stimuli and the Contrast Threshold Grating stimuli are patterns of alternating light and dark bars. Figure 4.9a
shows gratings with black and white bars at a number of
different orientations. Figure
4.9b shows gratings with a number of different contrasts.
High-contrast gratings are on the left, and lower-contrast
gratings are on the right. A grating’s contrast threshold is
the difference in intensity at which the bars can just barely
be seen. The difference between the bars in the grating on
the far right of Figure 4.9b is close to the contrast threshold,
because further decreases in the difference between the light
and dark bars would make it difficult to see the bars. The following method describes the measurement of contrast
thresholds in a selective adaptation experiment. (VL 5)
Do Feature Detectors Play a Role in Perception?
Figure 4.9 ❚ (a) Grating stimuli showing gratings with
different orientations. (b) A vertical grating. The contrast is
high for the grating on the left, and becomes lower for the
ones on the right.
METHOD ❚ Selective Adaptation to Orientation
Selective adaptation to orientation involves the following three steps:
1. Measure a person’s contrast threshold to stimuli with a number of different orientations (Figure 4.10a).
2. Adapt the person to one orientation by having the person view a high-contrast adapting stimulus for a minute or two. In this example, the adapting stimulus is a vertical grating (Figure 4.10b).
3. Remeasure the contrast threshold of all of the test stimuli presented in step 1 (Figure 4.10c). (Virtual Lab 6, 7)
Figure 4.10 ❚ Procedure for a selective adaptation experiment. (a) Measure contrast threshold at a number of orientations. (b) Adapt to a high-contrast grating. (c) Remeasure contrast thresholds for the same orientations as above.
CHAPTER 4
The Visual Cortex and Beyond
Figure 4.11a shows the results of a selective adaptation
experiment in which the adapting stimulus was a vertically oriented grating. This graph indicates that adapting
with the vertical grating caused a large increase in contrast
threshold for the vertically oriented test grating. That is, the
contrast of a vertical grating had to be increased for the person to see the bars. This is what we would expect if the vertical adapting stimulus selectively affects neurons that were
tuned to respond best to verticals.
The important result of this experiment is that our psychophysical curve shows that adaptation selectively affects
only some orientations, just as neurons selectively respond
to only some orientations. In fact, comparing the psychophysically determined selective adaptation curve (4.11a) to
the orientation tuning curve for a simple cortical neuron
(4.11b) reveals that they are very similar. The psychophysical
curve is slightly wider because the adapting grating affects
not only neurons that respond best to verticals, but also
more weakly affects some neurons that respond to nearby
orientations. The near-match between the orientation selectivity of neurons and the perceptual effect of selective adaptation supports the idea that orientation detectors play a
role in perception.
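The logic of the selective adaptation experiment can be illustrated with a toy simulation. This is a minimal sketch and not the procedure or model used in the experiments described above: the Gaussian tuning curves, the 15-degree tuning width, the 5-degree spacing of neurons, the adaptation strength, and the rule that threshold is the inverse of the strongest response are all assumptions chosen only for illustration, and every function name is hypothetical.

```python
import math

def tuning(preferred_deg, stimulus_deg, width_deg=15.0):
    """Assumed Gaussian orientation tuning: response (0 to 1) of one neuron."""
    diff = stimulus_deg - preferred_deg
    return math.exp(-0.5 * (diff / width_deg) ** 2)

# Hypothetical population: one neuron every 5 degrees, -40 to +40 (0 = vertical).
PREFERRED = list(range(-40, 45, 5))

def adapted_gains(adapt_deg, strength=0.6):
    """Each neuron loses gain in proportion to how strongly the adapter drove it."""
    return [1.0 - strength * tuning(p, adapt_deg) for p in PREFERRED]

def contrast_threshold(test_deg, gains):
    """Assumed rule: threshold is the inverse of the strongest gain-weighted response."""
    peak = max(g * tuning(p, test_deg) for p, g in zip(PREFERRED, gains))
    return 1.0 / peak

def threshold_elevation(adapt_deg=0.0):
    """Ratio of the threshold after adaptation to the threshold before, per test orientation."""
    before = [1.0] * len(PREFERRED)
    after = adapted_gains(adapt_deg)
    return {t: contrast_threshold(t, after) / contrast_threshold(t, before)
            for t in (-40, -20, 0, 20, 40)}

if __name__ == "__main__":
    for orientation, ratio in threshold_elevation(0.0).items():
        print(f"{orientation:+4d} deg: threshold x {ratio:.2f}")
```

Under these assumptions the threshold rises most at the adapting orientation and progressively less at orientations tilted away from it, which is the qualitative shape of the psychophysical curve in Figure 4.11a.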
Selective Rearing and Feature Detectors
Further evidence that feature detectors are involved in perception is provided by selective rearing experiments. The
idea behind selective rearing is that if an animal is reared
in an environment that contains only certain types of stimuli, then neurons that respond to these stimuli will become
more prevalent. This follows from a phenomenon called
neural plasticity or experience-dependent plasticity—the
idea that the response properties of neurons can be shaped
by perceptual experience. According to this idea, rearing an
animal in an environment that contains only vertical lines
should result in the animal’s visual system having neurons
that respond predominantly to verticals.
This result may seem to contradict the results of the selective adaptation experiment just described, in which exposure to verticals decreases the response to verticals. However,
the selective rearing effect occurs over a longer timescale
and is strongest in young animals, whose visual systems
are still developing. Thus, when a kitten is exposed only to
verticals, some adaptation to vertical orientations may take
place (causing the response to verticals to decrease), but as
the animal develops, vertically responding neurons become
the only neurons that respond at all.
One way to describe the results of selective rearing experiments is “Use it or lose it.” This effect was demonstrated
in a classic experiment by Colin Blakemore and Grahame
Cooper (1970) in which they placed kittens in striped tubes
like the one in Figure 4.12a, so that each kitten was exposed
to only one orientation, either vertical or horizontal. The
kittens were kept in the dark from birth to 2 weeks of age, at
which time they were placed in the tube for 5 hours a day; the rest of the time they remained in the dark. Because the kittens sat on a Plexiglas platform, and the tube extended both above and below them, there were no visible corners or edges in their environment other than the stripes on the sides of the tube. The kittens wore cones around their heads to prevent them from seeing vertical stripes as oblique or horizontal stripes by tilting their heads; however, according to Blakemore and Cooper, “The kittens did not seem upset by the monotony of their surroundings and they sat for long periods inspecting the walls of the tube” (p. 477). (Virtual Lab 8)

Figure 4.11 ❚ (a) Results of a psychophysical selective adaptation experiment. This graph shows that the participant’s adaptation to the vertical grating causes a large decrease in her ability to detect the vertical grating when it is presented again, but has less effect on gratings that are tilted to either side of the vertical. (b) Orientation tuning curve of the simple cortical neuron from Figure 4.6. [Axes: (a) increase in contrast threshold (small to large) against orientation of grating, from 40° on either side of vertical (0), with the adapting orientation marked; (b) impulses/sec against orientation of grating.]

Figure 4.12 ❚ (a) Striped tube used in Blakemore and Cooper’s (1970) selective rearing experiments. (b) Distribution of optimal orientations for 52 cells from a cat reared in an environment of horizontal stripes, on the left, and for 72 cells from a cat reared in an environment of vertical stripes, on the right. (Reprinted by permission from Macmillan Publishers Ltd., Copyright 1970. From Blakemore, C., & Cooper, G. G. (1970). Development of the brain depends on the visual environment. Nature, 228, 477–478.)
When the kittens’ behavior was tested after 5 months
of selective rearing, they seemed blind to the orientations
that they hadn’t seen in the tube. For example, a kitten that
was reared in an environment of vertical stripes would pay
attention to a vertical rod but ignored a horizontal rod. Following behavioral testing, Blakemore and Cooper recorded
from cells in the visual cortex and determined the stimulus
orientation that caused the largest response from each cell.
Figure 4.12b shows the results of this experiment. Each
line indicates the orientation preferred by a single neuron
in the cat’s cortex. This cat, which was reared in a vertical
environment, has many neurons that respond best to vertical or near-vertical stimuli, but none that respond to horizontal stimuli. The horizontally responding neurons were
apparently lost because they hadn’t been used. The opposite
result occurred for the horizontally reared cats. The parallel
between the orientation selectivity of neurons in the cat’s
cortex and the cat’s behavioral response to the same orientation provides more evidence that feature detectors are
involved in the perception of orientation. This connection
between feature detectors and perception was one of the
major discoveries of vision research in the 1960s and 1970s.
Another advance was the description of how these neurons
were organized in the brain.
Maps and Columns in the Striate Cortex
We’ve seen that retinotopic maps exist on the LGN. This
organization, in which nearby points on a structure receive
signals from nearby locations on the retina, also occurs in
the striate cortex.
Maps in the Striate Cortex
Figure 4.13 shows the results of an experiment by Hubel and Wiesel (1965), in which they recorded from a series of neurons along an oblique electrode track in the cat’s visual cortex. As for the LGN experiment in Figure 4.5, recordings were made from neurons encountered as the electrode was inserted into the cortex, first neuron 1, then 2, and so on. Hubel and Wiesel found that the receptive field of each neuron was displaced slightly on the retina, as indicated by the squares in Figure 4.13b, but that neurons close to each other along the electrode track had receptive fields that were close to each other on the retina. Thus, nearby points on the cortex receive signals from nearby locations in the retina. (Virtual Lab 9, 10)

Retinotopic mapping indicates that information about objects near each other in the environment is processed by neurons near each other in the cortex. This makes sense in terms of efficiency of functioning. Adjacent areas in the environment can affect one another, as evidenced by the simultaneous contrast effect shown in Figure 3.39, so processing would be more efficient if areas that are adjacent in the environment were also adjacent in the visual system.

Another example of physiology serving functionality is that the area representing the cone-rich fovea is much larger than one would expect from the fovea’s small size. Even though the fovea accounts for only 0.01 percent of the retina’s area, signals from the fovea account for 8 to 10 percent of the retinotopic map on the cortex (Van Essen & Anderson, 1995). This apportioning of a large area on the cortex to the small fovea is called the cortical magnification factor (Figure 4.14).

The cortical magnification factor in the human cortex has been determined using a technique called brain imaging, which makes it possible to create pictures of the brain’s activity (Figure 4.15). We will describe the procedure of brain imaging and how this procedure has been used to measure the cortical magnification factor in humans.

Figure 4.13 ❚ Retinotopic mapping of neurons in the cortex. (a) Side view of cortex. (b) Receptive field locations on retina. When the electrode penetrates the cortex obliquely, the receptive fields of neurons recorded from the numbered positions along the track are displaced, as indicated by the numbered receptive fields; neurons near each other in the cortex have receptive fields near each other on the retina.

Figure 4.14 ❚ The magnification factor in the visual system. The small area of the fovea is represented by a large area on the visual cortex. (Fovea: 0.01% of retinal area; 8–10% of cortical map’s area.)

Figure 4.15 ❚ A person in a brain scanning apparatus. (jupiterimages)

Figure 4.16 panel titles: (a) Initial condition. (b) Test condition. (c) Activity due to stimulation.

METHOD ❚ Brain Imaging
Brain imaging refers to a number of techniques that result in images that show which areas of the brain are active. One of these techniques, positron emission tomography (PET), was introduced in 1976 (Hoffman et al., 1976; Ter-Pogossian et al., 1975). In the PET procedure, a person is injected with a low dose of a radioactive
tracer that is not harmful. The tracer enters the bloodstream and indicates the volume of blood flow. The basic
principle behind the PET scan is that changes in the activity of the brain are accompanied by changes in blood
flow, and monitoring the radioactivity of the injected
tracer provides a measure of this blood flow.
PET enabled researchers to track changes in blood
flow to determine which brain areas were being activated. To use this tool, researchers developed the subtraction technique, in which brain activity is measured in two conditions: (1) an initial condition, before the
stimulus of interest is presented; and (2) a test condition,
in which the stimulus of interest is presented. For example, if we were interested in determining which areas of
the brain are activated by manipulating an object with
the hand, the initial condition would be when the person is holding the object in his or her hand (Figure 4.16a)
and the test condition would be when the person is manipulating the object (Figure 4.16b). Subtracting the
activity record in the initial condition from the activity
in the test condition indicates the brain activation connected with manipulating the object (Figure 4.16c).
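The arithmetic of the subtraction technique can be sketched in a few lines. This is only an illustration under stated assumptions: the region names, the activity values, and the cutoff used to call a region "activated" are all invented for the example, and real imaging analyses work on images and statistics rather than on three numbers.

```python
def subtract_activity(initial, test):
    """Test-minus-initial activity for every region measured in both conditions."""
    return {region: test[region] - initial[region]
            for region in initial if region in test}

def activated_regions(initial, test, min_increase=1.0):
    """Regions whose activity rose by at least min_increase (an assumed cutoff)."""
    diff = subtract_activity(initial, test)
    return sorted(region for region, change in diff.items() if change >= min_increase)

# Made-up activity values (arbitrary units) for three hypothetical regions.
holding = {"region_A": 5.0, "region_B": 3.0, "region_C": 4.0}       # initial condition
manipulating = {"region_A": 5.2, "region_B": 6.5, "region_C": 4.1}  # test condition

if __name__ == "__main__":
    print(subtract_activity(holding, manipulating))
    print(activated_regions(holding, manipulating))
```

With these invented numbers, only the hypothetical region_B shows a sizable difference, so it is the region that would be attributed to manipulating the object rather than merely holding it.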
Another neuroimaging technique is functional
magnetic resonance imaging (fMRI). Like PET, fMRI
is based on the measurement of blood flow. Because hemoglobin, which carries oxygen in the blood, contains a
ferrous molecule and therefore has magnetic properties,
presenting a magnetic field to the brain causes the hemoglobin molecules to line up like tiny magnets.
fMRI indicates the presence of brain activity because the hemoglobin molecules in areas of high brain
activity lose some of the oxygen they are transporting.
This makes the hemoglobin more magnetic, so these
molecules respond more strongly to the magnetic field.
The fMRI apparatus determines the relative activity of
various areas of the brain by detecting changes in the
magnetic response of the hemoglobin that occurs when
a person perceives a stimulus or engages in a specific behavior. The subtraction technique described above for PET is also used for fMRI. Because fMRI doesn’t require radioactive tracers and because it is more accurate, this technique has become the main method for localizing brain activity in humans.

Figure 4.16 ❚ The subtraction technique that is used to interpret the results of brain imaging experiments. See text for explanation.
Robert Dougherty and coworkers (2003) used brain
imaging to determine the magnification factor in the human visual cortex. Figure 4.17a shows the stimulus display
viewed by the observer, who was in an fMRI scanner. The
observer looked directly at the center of the screen, so the
dot at the center fell on the fovea. During the experiment
stimulus light was presented in two places: (1) near the center (red area), which illuminated a small area near the fovea;
and (2) farther from the center (blue area), which illuminated an area in the peripheral retina. The areas of the brain
activated by these two stimuli are indicated in Figure 4.17b.
This activation illustrates the magnification factor because stimulation of the small area near the fovea activated
a greater area on the cortex (red) than stimulation of the
larger area in the periphery (blue).
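The size of the foveal magnification can be made concrete using the percentages quoted earlier in this section (the fovea is 0.01 percent of the retina’s area but 8 to 10 percent of the cortical map). The sketch below simply divides one share by the other; the function name is hypothetical, and this ratio of shares is an illustration, not the formal definition of the magnification factor used in the imaging studies.

```python
def relative_magnification(retina_share_pct, cortex_share_pct):
    """How many times larger a region's share of the cortical map is
    than its share of the retina."""
    return cortex_share_pct / retina_share_pct

FOVEA_RETINA_PCT = 0.01         # share of retinal area, from the text
FOVEA_CORTEX_PCT = (8.0, 10.0)  # share of cortical map, from the text

low, high = (relative_magnification(FOVEA_RETINA_PCT, c) for c in FOVEA_CORTEX_PCT)

if __name__ == "__main__":
    print(f"Foveal share is roughly {low:.0f} to {high:.0f} times larger on the cortex")
```

On these figures the fovea’s share of the cortical map is several hundred times its share of the retina, which is why a small stimulus near the fovea can activate more cortex than a larger one in the periphery.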
The large representation of the fovea in the cortex
is also illustrated in Figure 4.18, which shows the space
that would be allotted to words on a page (Wandell et al.,
2007a, 2007b). Notice that the letter “a,” which is near where
the person is looking (red arrow), is represented by a much
larger area in the cortex than letters that are far from where
the person is looking. The extra cortical space allotted to
letters and words at which the person is looking provides
the extra neural processing needed to accomplish tasks
such as reading that require high visual acuity (Azzopardi &
Cowey, 1993).
The connection between cortical area and acuity has
been confirmed by Robert Duncan and Geoffrey Boynton
(2003). They measured brain activation with the fMRI and
visual acuity using a psychophysical task. The fMRI indicated that the magnification factor was not the same for
all of their observers. Some people had more cortical space
allotted to their foveas than other people, and those with
more cortical space also had better acuity. Apparently, good
acuity is associated not only with sharp focusing of images
on the retina, and the small amount of convergence of the
cones, but also with the relatively large amount of brain
area devoted to the all-cone fovea.
Columns in the Striate Cortex
Determining the retinotopic map and the magnification
factor has kept us near the surface of the cortex. We are now
going to consider what is happening below the surface by
looking at the results of experiments in which recording
electrodes were inserted perpendicular to the surface of the
cortex. Doing this has revealed that the cortex is organized
into a number of different kinds of columns.
Location Columns Hubel and Wiesel (1965) recorded from neurons along a perpendicular electrode track as shown in Figure 4.19a, which shows a side view of the cortex. The receptive fields of neurons 1, 2, 3, and 4, indicated by the squares in Figure 4.19b, are all located at about the same place on the retina. Hubel and Wiesel concluded from this result that the cortex is organized into location columns that are perpendicular to the surface of the cortex so that all of the neurons within a location column have their receptive fields at the same location on the retina.

Figure 4.17 ❚ (a) Red and blue areas show the extent of stimuli that were presented while a person was in an fMRI scanner. (b) Red and blue indicate areas of the brain activated by the stimulation in (a). (From Dougherty et al., 2003.)

Figure 4.18 ❚ Demonstration of the magnification factor. A person looks at the red spot on the text on the left. The area of brain activated by each letter of the text is shown on the right. The arrows point to the letter a in the text on the left, and the area in the brain activated by the a on the right. (From Wandell et al., 2007b.) (This figure was published in Neuron, 56, Wandell, B. A., Dumoulin, S. O., & Brewer, A. A., Visual field maps in human cortex, 366–383. Copyright Elsevier, 2007.)

Figure 4.19 ❚ (a) Side view of cortex. (b) Receptive field locations on retina. When an electrode penetrates the cortex perpendicularly, the receptive fields of the neurons encountered along this track overlap. The receptive field recorded at each numbered position along the electrode track is indicated by a correspondingly numbered square.
Orientation Columns As Hubel and Wiesel lowered
their electrodes along the perpendicular track, they noted
not only that the neurons along this track had receptive
fields with the same location on the retina, but that these
neurons all preferred stimuli with the same orientations.
Thus, all cells encountered along the electrode track at A in
Figure 4.20 fired the most to horizontal lines, whereas all
those along electrode track B fired the most to lines oriented
at about 45 degrees. Based on this result, Hubel and Wiesel
concluded that the cortex is organized into orientation
columns, with each column containing cells that respond
best to a particular orientation. (Also see “If You Want to
Know More #1,” at the end of the chapter, for another technique for revealing orientation columns.)
Hubel and Wiesel also showed that adjacent columns
have cells with slightly different preferred orientations.
When they moved an electrode through the cortex obliquely,
as was done for the LGN (Figure 4.5), so that the electrode
cut across orientation columns, they found that the neurons’ preferred orientations changed in an orderly fashion,
so a column of cells that respond best to 90 degrees is right
next to the column of cells that respond best to 85 degrees
(Figure 4.21). Hubel and Wiesel also found that as they
moved their electrode 1 millimeter across the cortex, their
electrode passed through orientation columns that represented the entire range of orientations.
Ocular Dominance Columns Neurons in the cortex are also organized with respect to the eye to which they respond best. About 80 percent of the neurons in the cortex respond to stimulation of both the left and right eyes. However, most neurons respond better to one eye than to the other. This preferential response to one eye is called ocular dominance, and neurons with the same ocular dominance are organized into ocular dominance columns in the cortex. This means that each neuron encountered along a perpendicular electrode track responds best to the same eye.
Ocular dominance columns can also be observed during oblique penetrations of the cortex. A given area of cortex usually contains cells that all respond best to one of the
eyes, but when the electrode is moved about 0.25 to 0.50 mm
across the cortex, the neurons respond best to the other eye.
Thus, the cortex consists of a series of columns that alternate in ocular dominance in a left-right-left-right pattern.
Hypercolumns Hubel and Wiesel proposed that all
three types of columns could be combined into one larger
unit called a hypercolumn. Figure 4.22 is a schematic diagram called the ice-cube model (because it is shaped like an
ice cube) that Hubel and Wiesel used to depict a hypercolumn. This diagram shows two side-by-side hypercolumns.
Each hypercolumn contains a single location column (since
it responds to stimuli presented to a particular place on
the retina), left and right ocular dominance columns, and
a complete set of orientation columns that cover all possible
stimulus orientations from 0 to 180 degrees.
Hubel and Wiesel thought of a hypercolumn as a “processing module” that processes information about any
stimulus that falls within the location on the retina served
by the hypercolumn. They based this proposal on the fact
that each hypercolumn contains a full set of orientation
columns, so that when a stimulus of any orientation is presented to the area of retina served by the hypercolumn, neurons within the hypercolumn that respond to that orientation will be activated.
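The idea of a hypercolumn as a processing module can be sketched as a small data structure. This is a simplified illustration of the ice-cube description above, not a model of real cortex: the class name, the 5-degree spacing between orientation columns, and the rule of picking the nearest preferred orientation are assumptions made for the example.

```python
from dataclasses import dataclass

STEP_DEG = 5  # assumed spacing between neighboring orientation columns

@dataclass(frozen=True)
class Hypercolumn:
    """One location column with left/right ocular dominance columns
    and a full set of orientation columns (hypothetical structure)."""
    location: tuple  # retinal location served, in arbitrary units

    def orientation_columns(self):
        """Preferred orientations covered; 180 degrees is the same as 0."""
        return list(range(0, 180, STEP_DEG))

    def active_column(self, stimulus_deg, eye):
        """Return (eye, preferred orientation) of the best-matching column."""
        if eye not in ("left", "right"):
            raise ValueError("eye must be 'left' or 'right'")
        wrapped = stimulus_deg % 180  # orientation repeats every 180 degrees
        nearest = (round(wrapped / STEP_DEG) * STEP_DEG) % 180
        return (eye, nearest)

if __name__ == "__main__":
    module = Hypercolumn(location=(0, 0))
    print(len(module.orientation_columns()), "orientation columns")
    print(module.active_column(92, "left"))
```

Because the set of orientation columns is complete, any stimulus orientation presented at the location served by the hypercolumn finds a matching column, which is the property the proposal relies on.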
Research done since Hubel and Wiesel’s proposal of the ice-cube model has shown that the actual organization of the three kinds of columns is far more complex than the picture in Figure 4.22. Figure 4.23a shows the results of an experiment that determined the layout of orientation columns using brain imaging. In some cases columns that prefer different orientations are lined up, as in Figure 4.23b (the arrow on the left of Figure 4.23a locates one of these areas), and in some cases orientations are arranged in a “pinwheel” as in Figure 4.23c, so all orientations are represented by traveling in a circle around a center point (see the small square in Figure 4.23a).

Both Hubel and Wiesel’s ice-cube model and the more complex arrangement of orientations shown in Figure 4.23 indicate that an oriented stimulus activates neurons located in orientation columns in the cortex.

Figure 4.20 ❚ All of the cortical neurons encountered along track A respond best to horizontal bars (indicated by the red lines cutting across the electrode track). All of the neurons along track B respond best to bars oriented at 45 degrees.

Figure 4.21 ❚ If an electrode is inserted obliquely into the cortex, it crosses a sequence of orientation columns. The preferred orientation of neurons in each column, indicated by the bars, changes in an orderly way as the electrode crosses the columns. The distance the electrode is advanced is exaggerated in this picture.

Figure 4.22 ❚ Schematic diagram of a hypercolumn as pictured in Hubel and Wiesel’s ice-cube model. The light area on the left is one hypercolumn, and the darkened area on the right is another hypercolumn. The darkened area is labeled to show that it consists of one location column, right and left ocular dominance columns, and a complete set of orientation columns (0 to 180 degrees).

Figure 4.23 ❚ (a) Picture of the arrangement of columns that respond to different orientations, determined in the tree shrew cortex by a brain-scanning technique called optical imaging. Each color represents a different orientation. Colors correspond to the orientations indicated by the bars at the bottom; for example, an electrode inserted into a light blue area will record from neurons that prefer vertical orientations. (b) In some places, orientations are lined up, so moving across the cortex in a straight line encounters all of the orientations in order (dashed line); see the arrow on the left in Figure 4.23a. (c) In other places, orientations are arranged in a “pinwheel,” so preferred orientation changes in an orderly way as we start with vertical (blue) and move across the brain in a small circle, indicated by the arrow; see the square in Figure 4.23a. (Adapted from Bosking et al., 1997, Journal of Neuroscience, 17, 2112–2127, © 1997 by the Society for Neuroscience. All rights reserved. Reproduced by permission.)

How Is an Object Represented in the Striate Cortex?
How is an object represented in the striate cortex? That is, how does the electrical activity in the cortex stand for the object in the environment? To begin, we will consider the situation in Figure 4.24, in which an observer is looking at a tree. Looking at the tree results in an image on the retina, which then results in a pattern of activation on the striate cortex that looks something like the tree because of the retinotopic map in the cortex. Notice, however, that the activation is distorted compared to the actual object. More space is allotted to the top of the tree, where the observer is looking, because the magnification factor allots more space on the cortex to the parts of the image that fall on the observer’s fovea.
But the pattern of activation on the surface of the cortex doesn’t tell the whole story. To appreciate how the tree is
represented by activity that is occurring under the surface
of the cortex, we will focus just on the trunk, which is essentially a long oriented bar. To determine which neurons
in the cortex will be activated by a long oriented bar, let’s
return to the idea of a hypercolumn. Remember that a hypercolumn processes information from a specific area of
the retina. This area is fairly small, however, so a long bar
will stimulate a number of hypercolumns. Since our trunk
is oriented vertically, it will activate neurons within the vertical (90-degree) orientation column within each hypercolumn, as shown in Figure 4.25.
Thus, a large stimulus, which stretches across the retina, will stimulate a number of different orientation columns, each in a location in the cortex that is separated from
the other orientation columns. Therefore, our tree trunk
has been translated into activity in a number of separated
orientation columns, and this activity looks quite different
from the shape of the stimulus, which is a single continuous bar.
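How one continuous bar turns into activity in several separate columns can be sketched as follows. The layout is an assumption made for illustration only: hypercolumns are treated as serving equal, adjacent strips of the retina stacked along one dimension, and the function name, strip size, and number of strips are hypothetical.

```python
def activated_columns(bar_bottom, bar_top, bar_orientation_deg,
                      patch_size=1.0, n_patches=8):
    """List (hypercolumn index, orientation column) pairs activated by a bar.

    Hypercolumn i is assumed to serve the retinal strip
    [i * patch_size, (i + 1) * patch_size) along the bar's length.
    """
    active = []
    for i in range(n_patches):
        lo, hi = i * patch_size, (i + 1) * patch_size
        if bar_bottom < hi and bar_top > lo:  # the bar overlaps this strip
            active.append((i, bar_orientation_deg % 180))
    return active

if __name__ == "__main__":
    # A vertical (90-degree) "trunk" spanning positions 1.5 to 4.2.
    print(activated_columns(1.5, 4.2, 90))
```

In this sketch the single trunk is represented by the 90-degree column in each of four different hypercolumns, so the pattern of activity is a set of separated columns rather than anything shaped like a bar.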
Although it may be surprising that the tree is represented in a number of separate columns in the cortex, it
simply confirms a basic property of our perceptual system:
the cortical representation of a stimulus does not have to
resemble the stimulus; it just has to contain information that
represents the stimulus. The representation of the tree in the visual cortex is contained in the firings of neurons in separate cortical columns. Of course, this representation in the striate cortex is only the first step in representing the tree. As we will now see, signals from the striate cortex travel to a number of other places in the cortex for further processing.

Figure 4.24 ❚ Looking at the tree creates an image on the observer’s retina, and this image on the retina causes a pattern of activation on the visual cortex. This pattern is distorted because of the magnification factor (more space is allotted to the top of the tree, where the observer is looking).

Figure 4.25 ❚ How the trunk of the tree pictured in Figure 4.24 would activate a number of different orientation columns in the cortex.

TEST YOURSELF 4.1
1. Describe receptive fields of neurons in the LGN. What is the evidence that the LGN is involved in regulating information flow in the visual system?
2. Describe how the LGN is organized in layers, and describe retinotopic mapping in the LGN.
3. Describe the characteristics of simple, complex, and end-stopped cells in the cortex. Why have these cells been called feature detectors?
4. How has the psychophysical procedure of selective adaptation been used to demonstrate a link between feature detectors and the perception of orientation? Be sure you understand the rationale behind a selective adaptation experiment and also how we can draw conclusions about physiology from the results of this psychophysical procedure.
5. How has the procedure of selective rearing been used to demonstrate a link between feature detectors and perception? Be sure you understand the concept of neural plasticity.
6. How is the retina mapped onto the striate cortex? What is the cortical magnification factor, and what function does it serve?
7. How was neural recording used to determine the existence of location, orientation, and ocular dominance columns in the striate cortex?
8. Describe (a) the ice-cube model of organization and (b) the pinwheel arrangement of orientation columns.
9. How is a simple object, such as a tree, represented by electrical activity in the cortex?

Streams: Pathways for What, Where, and How
So far, as we have been looking at types of neurons in the
cortex, and how the cortex is organized into maps and columns, we have been describing research primarily from the
1960s and 1970s. Most of the research during this time was
concerned with the striate cortex or areas near the striate
cortex. Although a few pioneers had looked at visual functioning outside the striate cortex (Gross, Bender, & Rocha-Miranda, 1969), it wasn’t until the 1980s that a large number of researchers began investigating how stimulation
of the retina causes activity in areas far beyond the striate
cortex.
One of the most influential ideas to come out of this
research is that there are pathways, or “streams,” that transmit information from the striate cortex to other areas in the
brain. This idea was introduced in 1982, when Leslie Ungerleider and Mortimer Mishkin described experiments that
distinguished two streams that served different functions.
Streams for Information About What and Where
Ungerleider and Mishkin (1982) used a technique called ablation (also called lesioning). Ablation refers to the destruction or removal of tissue in the nervous system.
METHOD ❚ Brain Ablation
The goal of a brain ablation experiment is to determine
the function of a particular area of the brain. This is accomplished by first determining an animal’s capacity by
testing it behaviorally. Most ablation experiments have
used monkeys because of the similarity of their visual
system to that of humans and because monkeys can be
trained to determine perceptual capacities such as acuity, color vision, depth perception, and object perception.
Figure 4.26 ❚ The two types of discrimination tasks used
by Ungerleider and Mishkin. (a) Object discrimination: Pick
the correct shape. Lesioning the temporal lobe (shaded area)
makes this task difficult. (b) Landmark discrimination: Pick
the food well closer to the cylinder. Lesioning the parietal
lobe makes this task difficult. (From Mishkin, Ungerleider, &
Macko, 1983.)
Once the animal’s perception has been measured,
a particular area of the brain is ablated (removed or destroyed), either by surgery or by injecting a chemical at
the area to be removed. Ideally, one particular area is removed and the rest of the brain remains intact. After ablation, the monkey is retrained to determine which perceptual capacities remain and which have been affected
by the ablation.
Ungerleider and Mishkin presented monkeys with two
tasks: (1) an object discrimination problem and (2) a landmark discrimination problem. In the object discrimination problem, a monkey was shown one object, such as a
rectangular solid, and was then presented with a two-choice
task like the one shown in Figure 4.26a, which included the
“target” object (the rectangular solid) and another stimulus,
such as the triangular shape. If the monkey pushed aside
the target object, it received the food reward that was hidden in a well under the object. The landmark discrimination problem is shown in Figure 4.26b. Here, the monkey’s
task was to remove the food well cover that was closest to
the tall cylinder.
In the ablation part of the experiment, part of the temporal lobe was removed in some monkeys. After ablation,
Figure 4.27 ❚ The monkey cortex, showing the what, or
ventral, pathway from the occipital lobe to the temporal lobe,
and the where, or dorsal, pathway from the occipital lobe to
the parietal lobe. The where pathway is also called the how
pathway. (From Mishkin, Ungerleider, & Macko, 1983.)
behavioral testing showed that the object discrimination
problem was very difficult for these monkeys. This result indicates that the pathway that reaches the temporal lobes is
responsible for determining an object’s identity. Ungerleider
and Mishkin therefore called the pathway leading from
the striate cortex to the temporal lobe the what pathway
(Figure 4.27).
Other monkeys, which had their parietal lobes removed,
had difficulty solving the landmark discrimination problem. This result indicates that the pathway that leads to the
parietal lobe is responsible for determining an object’s location. Ungerleider and Mishkin therefore called the pathway
leading from the striate cortex to the parietal lobe the where
pathway.
The what and where pathways are also called the ventral
pathway (what) and the dorsal pathway (where), because
the lower part of the brain, where the temporal lobe is located, is the ventral part of the brain, and the upper part
of the brain, where the parietal lobe is located, is the dorsal
part of the brain. The term dorsal refers to the back or the
upper surface of an organism; thus, the dorsal fin of a shark
or dolphin is the fin on the back that sticks out of the water.
Figure 4.28 shows that for upright, walking animals such as
humans, the dorsal part of the brain is the top of the brain.
(Picture a person with a dorsal fin sticking out of the top of
his or her head!) Ventral is the opposite of dorsal, hence it
refers to the lower part of the brain.
The discovery of two pathways in the cortex—one for
identifying objects (what) and one for locating objects
(where)—led some researchers to look back at the retina and
LGN. Using both recordings from neurons and ablation,
they found that properties of the ventral and dorsal streams
are established by two different types of ganglion cells
in the retina, which transmit signals to different layers of
the LGN. Thus, the cortical ventral and dorsal streams can
actually be traced back to the retina and LGN. (For more
about research on the origins of processing streams in the
retina and LGN, see “If You Want to Know More #2”
at the end of the chapter.)
Although there is good evidence that the ventral and
dorsal pathways serve different functions, it is important
to note that (1) the pathways are not totally separated, but
have connections between them; and (2) signals flow not
only “up” the pathway toward the parietal and temporal
lobes, but “back” as well (Merigan & Maunsell, 1993; Ungerleider & Haxby, 1994). It makes sense that there would
be communication between the pathways because in our
everyday behavior we need to both identify and locate objects, and we routinely coordinate these two activities every
time we identify something (for example, a pencil) and take
action with regard to it (picking up the pencil and writing
with it). Thus, there are two distinct pathways, but some
information is shared between them. The “backward” flow
of information, called feedback, provides information from
higher centers that can influence the signals flowing into
the system. This feedback is one of the mechanisms behind
top-down processing, introduced in Chapter 1 (page 10).
Streams for Information
About What and How
Although the idea of ventral and dorsal streams has been
generally accepted, David Milner and Melvyn Goodale
(1995; see also Goodale & Humphrey, 1998, 2001) have
suggested that rather than being called the what and where
streams, the ventral and dorsal streams should be called
the what and how streams. The ventral stream, they argue,
is for perceiving objects, an idea that fits with the idea of
what. However, they propose that the dorsal stream is for
taking action, such as picking up an object. Taking this action would involve knowing the location of the object, consistent with the idea of where, but it also involves a physical interaction with the object. Thus, reaching to pick up a
pencil involves information about the pencil’s location plus
movement of the hand toward the pencil. According to this
idea, the dorsal stream provides information about how to
direct action with regard to a stimulus.
Evidence supporting the idea that the dorsal stream is
involved in how to direct action is provided by the discovery of neurons in the parietal cortex that respond (1) when
a monkey looks at an object and (2) when it reaches toward
the object (Sakata et al., 1992; also see Taira et al., 1990). But
the most dramatic evidence supporting the idea of a dorsal
“action,” or how, stream comes from neuropsychology—the
study of the behavioral effects of brain damage in humans.
Figure 4.28 ❚ Dorsal refers to the back surface of an
organism. In upright standing animals such as humans,
dorsal refers to the back of the body and to the top of the
head, as indicated by the arrows and the curved dashed line.
Ventral is the opposite of dorsal.
METHOD ❚ Dissociations in Neuropsychology
One of the basic principles of neuropsychology is that
we can understand the effects of brain damage by studying dissociations—situations in which one function is
absent while another function is present. There are two
kinds of dissociations: single dissociations, which can
be studied in a single person, and double dissociations,
which require two or more people.
To illustrate a single dissociation, let’s consider a
woman, Alice, who has suffered damage to her temporal lobe. She has difficulty naming objects but has no
trouble indicating where they are located (Table 4.2a).
Alice demonstrates a single dissociation: one function is
present (locating objects) and another is absent (naming
objects). From a single dissociation such as this, in which
one function is lost while another function remains, we
can conclude that two functions (in this example, locating and naming objects) involve different mechanisms,
although they may not operate totally independently of
one another.
TABLE 4.2 ❚ A Double Dissociation
                                                    NAMING OBJECTS    DETERMINING OBJECTS’ LOCATIONS
(a) ALICE: Temporal lobe damage (ventral stream)    NO                YES
(b) BERT: Parietal lobe damage (dorsal stream)      YES               NO
We can illustrate a double dissociation by finding
another person who has one function present and another absent, but in a way opposite to Alice. For example,
Bert, who has parietal lobe damage, can identify objects
but can’t tell exactly where they are located (Table 4.2b).
The cases of Alice and Bert, taken together, represent a
double dissociation. Establishing a double dissociation
enables us to conclude that two functions are served by
different mechanisms and that these mechanisms operate independently of one another.
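The logic of single and double dissociations can be written out as a small sketch. The Python below is only an illustration of the reasoning in this Method section, not a clinical procedure: the dictionary encoding of Table 4.2 and the function names `single_dissociation` and `double_dissociation` are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical encoding of Table 4.2: True = function present, False = absent.
patients = {
    "Alice": {"naming": False, "locating": True},   # temporal lobe damage
    "Bert": {"naming": True, "locating": False},    # parietal lobe damage
}


def single_dissociation(profile, f, g):
    """A person shows a single dissociation when exactly one of f and g is present."""
    return profile[f] != profile[g]


def double_dissociation(profiles, f, g):
    """Return pairs of people whose single dissociations run in opposite directions."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(profiles.items(), 2):
        both_single = single_dissociation(a, f, g) and single_dissociation(b, f, g)
        # Opposite pattern: the function one person has lost, the other has kept.
        if both_single and a[f] != b[f]:
            pairs.append((name_a, name_b))
    return pairs


print(single_dissociation(patients["Alice"], "naming", "locating"))
print(double_dissociation(patients, "naming", "locating"))
```

In this sketch Alice alone yields only a single dissociation; the double dissociation appears only when her profile is paired with Bert's opposite profile, which is why it requires two or more people.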
The Behavior of Patient D.F. The method of determining dissociations was used by Milner and Goodale
(1995) to study D.F., a 34-year-old woman who suffered
damage to her ventral pathway from carbon monoxide poisoning caused by a gas leak in her home. One result of the
brain damage was that D.F. was not able to match the orientation of a card held in her hand to different orientations of
a slot. This is shown in the left circle in Figure 4.29a. Each
line in the circle indicates the orientation to which D.F. adjusted the card. Perfect matching performance would be indicated by a vertical line for each trial, but D.F.’s responses
are widely scattered. The right circle shows the accurate performance of the normal controls.
Because D.F. had trouble orienting a card to match
the orientation of the slot, it would seem reasonable that
she would also have trouble placing the card through the
slot because to do this she would have to turn the card so
that it was lined up with the slot. But when D.F. was asked
to “mail” the card through the slot, she could do it! Even
though D.F. could not turn the card to match the slot’s orientation, once she started moving the card toward the slot,
Figure 4.29 ❚ Performance of D.F. and a person without
brain damage on two tasks: (a) judging the orientation of
a slot; and (b) placing a card through the slot. See text for
details. (From Milner & Goodale, 1995.)
she was able to rotate it to match the orientation of the slot
(Figure 4.29b). Thus, D.F. performed poorly in the static
orientation-matching task but did well as soon as action was
involved (Murphy, Racicot, & Goodale, 1996). Milner and
Goodale interpreted D.F.’s behavior as showing that there
is one mechanism for judging orientation and another for
coordinating vision and action.
These results for D.F. demonstrate a single dissociation, which indicates that judging orientation and coordinating vision and action involve different mechanisms. To
show that these two functions are not only served by different mechanisms but are also independent of one another,
we have to demonstrate a double dissociation. As we saw in
the example of Alice and Bert, this involves finding a person
whose symptoms are the opposite of D.F.’s, and such people
do, in fact, exist. These people can judge visual orientation,
but they can’t accomplish the task that combines vision and
action. As we would expect, whereas D.F.’s ventral stream is
damaged, these other people have damage to their dorsal
streams.
Based on these results, Milner and Goodale suggested
that the ventral pathway should still be called the what
pathway, as Ungerleider and Mishkin suggested, but that a
better description of the dorsal pathway would be the how
pathway, or the action pathway, because it determines how a
person carries out an action. As sometimes occurs in science,
not everyone uses the same terms. Thus, some researchers
call the dorsal stream the where pathway and some call it
the how or action pathway.
The Behavior of People Without Brain
Damage In our normal daily behavior we aren’t aware of
two visual processing streams, one for what and the other
for how, because they work together seamlessly as we perceive objects and take actions toward them. Cases like that
of D.F., in which one stream is damaged, reveal the existence of these two streams. But what about people without
damaged brains? Psychophysical experiments that measure
how people perceive and react to visual illusions have demonstrated the dissociation between perception and action
that was evident for D.F.
Figure 4.30a shows a stimulus called the rod and frame
illusion, which was used in one of these experiments. In this
illusion, the two small lines inside the tilted squares appear
slightly tilted in opposite directions, even though they are
parallel vertical lines.
Richard Dyde and David Milner (2002) presented their
observers with two tasks: a matching task and a grasping
task. In the matching task, observers adjusted the matching
Figure 4.30 ❚ (a) Rod and frame illusion. Both small
lines are oriented vertically. (b) Matching task and results.
(c) Grasping task and results. See text for details.
[In panels (b) and (c), the vertical axis is the magnitude of the illusion in degrees.]
stimulus, a rod located in an upright square (on the right)
until it appeared to match the orientation of the vertical
rod in the tilted square (on the left) (Figure 4.30b). This provided a measure of how much the tilted square made the
vertical rod on the left appear tilted. The results, shown on
the right, indicate that observers had to adjust the matching stimulus to 5 degrees from vertical in order to make it
match their perception of the rod in the tilted square.
In the grasping task, observers grasped a rod in the tilted
square between their thumb and forefinger (Figure 4.30c).
The positioning of the thumb and forefinger was measured
using a special position-sensing device attached to the observers’ fingers. The result, shown on the right, indicates
that observers positioned their fingers appropriately for the
rod’s orientation. Thus the tilted square did not affect the
accuracy of grasping.
The rationale behind this experiment is that because
these two tasks involve different processing streams (matching task = ventral, or what, stream; grasping task = dorsal,
or how, stream), they may be affected differently by the presence of the surrounding frames. In other words, conditions
that created a perceptual visual illusion (matching task) had no
effect on the person’s ability to take action with regard to the
stimulus (grasping task). These results support the idea that
perception and action are served by different mechanisms.
Thus, an idea that originated with observations of patients
with brain damage is supported by the performance of observers without brain damage.
Modularity: Structures for
Faces, Places, and Bodies
We have seen how the study of the visual system has progressed from Hubel and Wiesel’s discovery of neurons in the
striate cortex that respond to oriented bars, to discovery of
the ventral and dorsal streams. We now return to where we
left off with Hubel and Wiesel to consider more research on
the types of stimuli to which individual neurons respond.
As researchers moved outside the striate cortex, they
found neurons that responded best to more complex stimuli. For example, Keiji Tanaka and his coworkers (Ito et al.,
1995; Kobatake & Tanaka, 1994; Tanaka, 1993; Tanaka
et al., 1991) recorded from cells in the temporal cortex that
responded best to complex stimuli, such as the disc with a
thin bar shown in Figure 4.31a. This cell, which responds
best to a circular disc with a thin bar, responds poorly to
the bar alone (Figure 4.31b) or the disc alone (Figure 4.31c).
The cell does respond to the square shape with the bar
(Figure 4.31d), but not as well as to the circle and bar.
In addition to discovering neurons that respond to complex stimuli, researchers also found evidence that neurons
that respond to similar stimuli are often grouped together
in one area of the brain. A structure that is specialized to
process information about a particular type of stimulus is
called a module. There is a great deal of evidence that there
are specific areas in the temporal lobe that respond best to
particular types of stimuli.
Face Neurons in the Monkey’s
IT Cortex
Edmund Rolls and Martin Tovee (1995) measured the response of neurons in the monkey’s inferotemporal (IT) cortex (Figure 4.32a). When they presented pictures of faces and
pictures of nonface stimuli (mostly landscapes and food),
they found many neurons that responded best to faces.
Figure 4.33 shows the results for a neuron that responded
briskly to faces but hardly at all to other types of stimuli.
You may wonder how there could be neurons that respond best to complex stimuli such as faces. We have seen
how neural processing that involves the mechanisms of
convergence, excitation, and inhibition can create neurons
that respond best to small spots of light (Figure 2.16). The
same mechanisms are presumably involved in creating neurons that respond to more complex stimuli. Of course, the
neural circuits involved in creating a “face-detecting” neuron must be extremely complex. However, the potential for
this complexity is there. Each neuron in the cortex receives
inputs from an average of 1,000 other neurons, so the number of potential connections between neurons in the cortex
is astronomical. When we consider the vast complexity of
the neural interconnections that must be involved in creating a neuron that responds best to faces, it is easy to agree
with William James’s (1890/1981) description of the brain
as “the most mysterious thing in the world.”
Areas for Faces, Places, and Bodies
in the Human Brain
Brain imaging (see Method, page 82) has been used to identify areas of the human brain that contain neurons that
respond best to faces, and also to pictures of scenes and human bodies. In one of these experiments, Nancy Kanwisher
and coworkers (1997) first used fMRI to determine brain
activity in response to pictures of faces and other objects,
such as scrambled faces, household objects, houses, and
hands. When they subtracted the response to the other objects from the response to the faces, Kanwisher and coworkers found that activity remained in an area they called the
fusiform face area (FFA), which is located in the fusiform
gyrus on the underside of the brain directly below the IT
cortex (Figure 4.32b). They interpreted this result to mean
that the FFA is specialized to respond to faces.
Figure 4.31 ❚ How a neuron in a monkey’s
temporal lobe responds to a few stimuli. This
neuron responds best to a circular disc with
a thin bar. (Adapted from Tanaka et al., 1991.)
Figure 4.32 ❚ (a) Monkey brain showing the location of
the inferotemporal (IT) cortex. (b) Human brain showing the
location of the fusiform face area (FFA), which is located just
under the temporal lobe.
Figure 4.33 ❚ Size of response of a neuron in the monkey’s
IT cortex that responds to face stimuli but not to nonface
stimuli. (Based on data from Rolls & Tovee, 1995.)
Additional evidence of an area specialized for the perception of faces is that damage to the temporal lobe causes
prosopagnosia (difficulty recognizing the faces of familiar
people). Even very familiar faces are affected, so people with
prosopagnosia may not be able to recognize close friends
or family members, or even their own reflection in the
mirror, although they can easily identify people as soon
as they hear them speak (Burton et al., 1991; Hecaen &
Angelerques, 1962; Parkin, 1996).
In addition to the FFA, which contains neurons that are
activated by faces, two other specialized areas in the temporal cortex have been identified. The parahippocampal place
area (PPA) is activated by pictures depicting indoor and
outdoor scenes like those shown in Figure 4.34a (Aguirre
et al., 1998; R. Epstein et al., 1999; R. Epstein & Kanwisher,
1998). Apparently what is important for this area is information about spatial layout, because activation occurs both
to empty rooms and to rooms that are completely furnished
(Kanwisher, 2003). The other specialized area, the extrastriate body area (EBA), is activated by pictures of bodies and
parts of bodies (but not by faces), as shown in Figure 4.34b
(Downing et al., 2001).
Figure 4.34 ❚ (a) The parahippocampal place area is
activated by places (top row) but not by other stimuli (bottom
row). (b) The extrastriate body area is activated by bodies
(top), but not by other stimuli (bottom). (From Kanwisher, N.,
The ventral visual object pathway in humans: Evidence from
fMRI. In The Visual Neurosciences, 2003, pp. 1179–1189.
Edited by Chalupa, L., & Werner, J., MIT Press.)
We have come a long way from Hubel and Wiesel’s simple and complex cells in the striate cortex that respond best
to oriented lines. The existence of neurons that are specialized to respond to faces, places, and bodies brings us closer
to being able to explain how perception is based on the firing of neurons. It is likely that our perception of faces, landmarks, and people’s bodies depends on specifically tuned
neurons in areas such as the FFA, PPA, and EBA.
Figure 4.35 ❚ fMRI responses of the human brain to various types of stimuli: (a) areas that were most strongly activated
by houses, faces, and chairs; (b) all areas activated by each type of stimulus. (From Alumit Ishai, Leslie G. Ungerleider, Alex
Martin, James V. Haxby, “The representation of objects in the human occipital and temporal cortex,” Journal of Cognitive
Neuroscience, 12:2 (2000), 35–51. © 2000 by the Massachusetts Institute of Technology.)
But it is also important to recognize that even though
stimuli like faces and buildings activate specific areas of the
brain, these stimuli also activate other areas of the brain as
well. This is illustrated in Figure 4.35, which shows the results of an fMRI experiment on humans. Figure 4.35a shows
that pictures of houses, faces, and chairs cause maximum
activation in three separate areas in the IT cortex. However, each type of stimulus also causes substantial activity
within the other areas, as shown in the three panels limited
to just these areas (Figure 4.35b; Ishai et al., 2000; Ishai et al.,
1999). Thus, the idea of specialized modules is correct, but
shouldn’t be carried too far. Objects may cause a focus of
activity in a particular area, but they are represented in the
cortex by activity that is distributed over a wide area (J. D.
Cohen & Tong, 2001; Riesenhuber & Poggio, 2000, 2002).
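Two ideas from this section, the subtraction logic used to identify the FFA and the finding that activity is nonetheless distributed, can be illustrated with a toy calculation. The sketch below is not the analysis used in any of the cited studies: the region names, the numbers (imagined as percent activation), and the functions `subtraction`, `preferred_category`, and `regions_activated_by` are all invented for illustration. Real fMRI analyses compare conditions statistically across many small brain volumes.

```python
# Hypothetical activation values, invented purely for illustration.
responses = {
    "face_area": {"faces": 2.0, "houses": 0.6, "chairs": 0.5},
    "house_area": {"faces": 0.7, "houses": 1.8, "chairs": 0.6},
    "chair_area": {"faces": 0.5, "houses": 0.7, "chairs": 1.5},
}


def subtraction(region, target):
    """Response to the target category minus the mean response to the others."""
    r = responses[region]
    others = [value for category, value in r.items() if category != target]
    return r[target] - sum(others) / len(others)


def preferred_category(region):
    """Category that produces the largest response in a region."""
    r = responses[region]
    return max(r, key=r.get)


def regions_activated_by(category, threshold=0.0):
    """All regions whose response to the category exceeds the threshold."""
    return [name for name, r in responses.items() if r[category] > threshold]


print(preferred_category("face_area"))
print(subtraction("face_area", "faces"))
print(regions_activated_by("faces"))
```

With these made-up values, each region has a clear preferred category and a positive difference after subtraction, yet every region still responds to every category. That is the sense in which an object can cause a focus of activity in one area while being represented by activity spread over a wider area.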
Something to Consider:
How Do Neurons Become
Specialized?
When researchers began describing neurons that were specialized to fire to specific stimuli, such as faces, places, and
bodies, they naturally wondered how this specialization
might have occurred. One possibility is that these neurons
have become specialized by a process of biological evolution,
so that people are born with selective neurons. Another possibility is that these neurons become specialized by a process involving people’s experience as they perceive common
objects in their environment.
Is Neural Selectivity Shaped
by Evolution?
According to the theory of natural selection, genetically
based characteristics that enhance an animal’s ability to
survive, and therefore reproduce, will be passed on to future
generations. Thus, a person whose visual system contains
neurons that fire to important things in the environment
(such as faces) will be more likely to survive and pass on his
or her characteristics than will a person whose visual system does not contain these specialized neurons. Through
this evolutionary process, the visual system may have been
shaped to contain neurons that respond to faces and other
important perceptual information.
There is no question that evolution has shaped the
functioning of the senses, just as it has shaped all the other
physical and mental characteristics that have enabled us to
survive as a species. We know that the visual system is not
a “blank slate” at birth. Newborn monkeys have neurons
that respond to the direction of movement and the relative depths of objects (Chino et al., 1997), and 3½-week-old
monkeys possess orientation columns that are organized
like the adult columns in Figure 4.20 (Hübener et al., 1995).
Although we have less information about the neural structure of infant humans than of infant monkeys, we do know
that babies prefer looking at pictures in which the parts are
arranged to resemble a face compared to pictures in which
the same parts are scrambled (Johnson et al., 1991; also see
Turati et al., 2002). It is likely that this behavior is caused by
neurons that respond best to facelike patterns.
Although there is no question that the basic layout and
functioning of all of the senses is the result of evolution, it is
difficult to prove whether a particular capacity is “built in”
by evolution or is the result of learning (Kanwisher, 2003).
There is, however, a great deal of evidence that learning can
shape the response properties of neurons that respond best
to complex visual features.
How Neurons Can Be Shaped
by Experience
Although it may be important for the visual system to have
some specialized neurons at birth, it is also important that
the visual system be able to adapt to the specific environment in which a person or animal lives. The nervous system
can achieve this adaptation through a process that causes
neurons to develop so that they respond best to the types
of stimulation to which the person has been exposed. This
is the process of experience-dependent plasticity introduced
earlier in this chapter.
The idea of experience-dependent plasticity was first
suggested by experiments with animals, such as the one in
which kittens were raised in an environment that contained
only verticals (Figure 4.12). The fact that most of the neurons in the kittens’ cortex responded only to verticals after this experience is an example of experience-dependent
plasticity. There is also evidence that experience causes
changes in how neurons are tuned in the human cortex. For
example, brain-imaging experiments have shown that there
are regions in the human cortex specialized to respond to
visual letters and word forms (Nobre et al., 1994). Because
humans have been reading for only a few thousand years,
this specialized responding could not have evolved but
must have developed as people learned to read (Ungerleider
& Pasternak, 2003).
Brain-imaging experiments have also demonstrated a
shift in responding of neurons in the FFA due to training.
Isabel Gauthier and coworkers (1999) used fMRI to determine the level of activity in the fusiform face area (FFA) in
response to faces and to objects called Greebles—families of
computer-generated “beings” that all have the same basic
configuration but differ in the shapes of their parts (Figure
4.36a). Initially, the observers were shown both human faces
and Greebles. The results for this part of the experiment,
shown by the left pair of bars in Figure 4.36b, indicate that
the FFA neurons responded poorly to the Greebles but well
to the faces.
The participants were then trained in “Greeble recognition” for 7 hours over a 4-day period. After the training
sessions, participants had become “Greeble experts,” as indicated by their ability to rapidly identify many different
Greebles by the names they had learned during the training. The right pair of bars in Figure 4.36b shows how becoming a Greeble expert affected the neural response in the
participants’ FFA. After the training, the FFA neurons responded about as well to Greebles as to faces.
Figure 4.36 ❚ (a) Greeble stimuli used by Gauthier.
Participants were trained to name each different Greeble.
(b) Brain responses to Greebles and faces before and after
Greeble training. (Reprinted by permission from Macmillan
Publishers Ltd, Copyright 1999: Nature Neuroscience, 2,
568–573. From Figure 1a, p. 569, from Gauthier, I., Tarr, M. J.,
Anderson, A. W., Skudlarski, P. L., & Gore, J. C., “Activation
of the middle fusiform ‘face area’ increases with experience
in recognizing novel objects,” 1999.)
Apparently, the FFA area of the cortex is an area that
responds not just to faces but to other complex objects as
well. The objects that the neurons respond to are established by experience with those objects. In fact, Gauthier
has also shown that neurons in the FFA of people who are
experts in recognizing cars or birds respond well not only
to human faces but to cars (for the car experts) and to birds
(for the bird experts; Gauthier et al., 2000). It is important
to note that the function of the FFA is controversial: Some
researchers agree with Gauthier’s idea that the FFA is specialized to respond to complex objects that have become familiar through experience, and others believe that the FFA
is specialized to respond specifically to faces. (See “If You
Want to Know More #6” at the end of the chapter.)
Let’s return to the question we posed at the beginning
of this section: How do neurons become specialized? It
seems that specialized tuning is at least partially the result
of experience-dependent plasticity. This makes it possible
for neurons to adapt their tuning to objects that are seen
often and that are behaviorally important. Thus, evolution has apparently achieved exactly what it is supposed to
achieve: it has created an area that is able to adapt to the
specific environment in which an animal or human lives.
According to this idea, if we moved to a new planet inhabited by Greebles or other strange-looking creatures, a place
that contained landscapes and objects quite different from
Earth’s, our neurons that now respond well to Earth creatures and objects would eventually change to respond best
to the creatures and environment of this new, and previously strange, environment (Gauthier et al., 1999).
TEST YOURSELF 4.2
1. How has ablation been used to demonstrate the
existence of the ventral and dorsal processing
streams? What is the function of these streams?
2. How has neuropsychology been used to show
that one of the functions of the dorsal stream is to
process information about coordinating vision and
action? How do the results of a behavioral experiment involving the rod and frame illusion support
this conclusion?
3. What is the evidence that there are modules for
faces, places, and bodies? What is the evidence that
stimuli like faces and places also activate a wide
area of the cortex?
4. What is the evidence that the properties of selective neurons are determined by evolution? By
experience?
THINK ABOUT IT
1.
Cell A responds best to vertical lines moving to the
right. Cell B responds best to 45-degree lines moving to
the right. Both of these cells have an excitatory synapse
with cell C. How will cell C fire to vertical lines? To 45-degree lines? (p. 78)
2.
We have seen that the neural firing associated with an
object in the environment does not necessarily look like,
or resemble, the object. Can you think of situations that
you encounter in everyday life in which objects or ideas
are represented by things that do not exactly resemble
those objects or ideas? (p. 86)
3.
Ralph is hiking along a trail in the woods. The trail is
bumpy in places, and Ralph has to avoid tripping on occasional rocks, tree roots, or ruts in the trail. Nonetheless, he is able to walk along the trail without constantly
looking down to see exactly where he is placing his feet.
That’s a good thing because Ralph enjoys looking out
at the woods to see whether he can spot interesting
birds or animals. How can you relate this description
of Ralph’s behavior to the operation of the dorsal and
ventral streams in the visual system? (p. 88)
4.
Although most neurons in the striate cortex respond to
stimulation of small areas of the retina, many neurons
in the temporal lobe respond to areas that represent
as much as half of the visual field (see “If You Want to
Know More #4,” below). What do you think the function of such neurons is? (p. 96)
5.
We have seen that there are neurons that respond to
complex shapes and also to environmental stimuli such
as faces, bodies, and places. Which types of neurons do
Schiller, P. H., Logothetis, N. K., & Charles, E. R.
(1990). Functions of the colour-opponent and
broad-band channels of the visual system. Nature,
343, 68–70.
3.
Another kind of specialized neuron. Neurons called bimodal neurons respond to a visual stimulus presented
near a place on a monkey’s body, such as the face or
the hand, and also to touching that part of the body.
(p. 93)
Graziano, M. S. A., & Gross, C. G. (1995). The representation of extrapersonal space: A possible role for
bimodal, visual-tactile neurons. In M. S. Gazzaniga (Ed.), The cognitive neurosciences (pp. 1021–1034).
Cambridge, MA: MIT Press.
4.
Wide-angle neurons. There are maps of the retina in
the striate cortex, and neurons in the striate cortex
respond to stimulation of a small area of the retina.
However, recording from neurons further “upstream”
in the visual system, in places such as the temporal
cortex, reveals that “maps” like those in the striate
cortex no longer exist because neurons in these structures respond to stimulation of very large areas of the
retina. (p. 95)
Rolls, E. T. (1992). Neurophysiological mechanisms
underlying face processing within and beyond the
temporal cortical areas. Philosophical Transactions of
the Royal Society of London, 335B, 11–21.
5.
Invariant neurons. There are neurons at the far end of
the ventral stream that continue to respond to objects
even when these objects appear in different orientations or their size is changed. (p. 88)
Perrett, D. I., & Oram, M. W. (1993). Neurophysiology of shape processing. Image and Visual Computing, 11, 317–333.
6.
Fusiform face area. There is a controversy over the role
of the fusiform face area: Some researchers believe it
is specialized to respond to faces. Others believe it is
specialized to respond to complex objects that we have
had experience with; according to this view, the FFA
responds to faces because we see lots of faces. (p. 95)
Kanwisher, N. (2000). Domain specificity in face
perception. Nature, 3, 759–763.
Tarr, M. J., & Gauthier, I. (2000). FFA: A flexible fusiform area for subordinate-level visual processing
automatized by expertise. Nature, 3, 764–769.
Figure 4.37 ❚ “Howdy, pardner.”
you think would fire to the stimulus in Figure 4.37?
How would your answer to this question be affected
if this stimulus were interpreted as a human figure?
(“Howdy, pardner!”) What role would top-down processing play in determining the response to a cactus-asperson stimulus? (p. 92)
IF YOU WANT TO KNOW MORE
1.
2.
Seeing columns. Location columns can be revealed by
using a technique called autoradiography, in which a
monkey injected with radioactive tracer views grating
with a particular orientation. This makes it possible
to see columns that were discovered using single-unit
recording. (p. 85)
Hubel, D. H., Wiesel, T. N., & Stryker, M. P.
(1978). Anatomical demonstration of orientation
columns in macaque monkey. Journal of Comparative Neurolog y, 177, 361–379.
The origins of processing streams in the retina and LGN. Experiments that determined how ablating specific
areas of the LGN affected monkeys’ behavior have
shown that the dorsal and ventral streams can be
traced back to the LGN and retina. (p. 89)
KEY TERMS
Ablation (p. 88)
Action pathway (p. 90)
Brain imaging (p. 82)
Complex cells (p. 78)
Contralateral eye (p. 76)
Contrast threshold (p. 79)
Cortical magnification factor (p. 82)
Dissociation (p. 89)
Dorsal pathway (p. 89)
Double dissociation (p. 89)
End-stopped cells (p. 79)
Experience-dependent plasticity (p. 80)
Extrastriate body area (EBA) (p. 93)
Feature detectors (p. 79)
Functional magnetic resonance imaging (fMRI) (p. 83)
Fusiform face area (FFA) (p. 93)
Grating stimuli (p. 79)
How pathway (p. 90)
Hypercolumn (p. 85)
Ipsilateral eye (p. 76)
Landmark discrimination problem (p. 88)
Lateral geniculate nucleus (LGN) (p. 74)
Location column (p. 85)
Module (p. 92)
Neural plasticity (p. 80)
Neuropsychology (p. 89)
Object discrimination problem (p. 88)
Ocular dominance (p. 85)
Ocular dominance column (p. 85)
Orientation column (p. 85)
Orientation tuning curve (p. 77)
Parahippocampal place area (PPA) (p. 93)
Positron emission tomography (PET) (p. 82)
Primary visual receiving area (p. 74)
Prosopagnosia (p. 93)
Retinotopic map (p. 76)
Rod and frame illusion (p. 91)
Selective adaptation (p. 79)
Selective rearing (p. 80)
Simple cortical cell (p. 77)
Single dissociation (p. 89)
Striate cortex (p. 74)
Subtraction technique (p. 83)
Superior colliculus (p. 75)
Theory of natural selection (p. 94)
Ventral pathway (p. 89)
What pathway (p. 89)
Where pathway (p. 89)
MEDIA RESOURCES
The Sensation and Perception Book Companion Website
www.cengage.com/psychology/goldstein
See the companion website for flashcards, practice quiz questions, Internet links, updates, critical thinking exercises, discussion forums, games, and more!
CengageNOW
www.cengage.com/cengagenow
Go to this site for the link to CengageNOW, your one-stop shop. Take a pre-test for this chapter, and CengageNOW will generate a personalized study plan based on your test results. The study plan will identify the topics you need to review and direct you to online resources to help you master those topics. You can then take a post-test to help you determine the concepts you have mastered and what you will still need to work on.
Virtual Lab
Your Virtual Lab is designed to help you get the most out of this course. The Virtual Lab icons direct you to specific media demonstrations and experiments designed to help you visualize what you are reading about. The number beside each icon indicates the number of the media element you can access through your CD-ROM, CengageNOW, or WebTutor resource.
The following lab exercises are related to material in this chapter:
1. The Visual Pathways A drag-and-drop exercise that tests your knowledge of visual structures.
2. Visual Cortex of the Cat A classic 1972 film in which vision research pioneer Colin Blakemore demonstrates mapping of receptive fields of neurons in the cortex of the cat.
3. Simple Cells in the Cortex How the firing rate of a simple cortical cell depends on orientation of a stimulus.
4. Complex Cells in the Cortex How the firing rate of a complex cortical cell changes with orientation and direction of movement of a stimulus.
5. Contrast Sensitivity An experiment in which you measure your contrast sensitivity to grating patterns.
6. Orientation Aftereffect How adaptation to an oriented grating can affect the perception of orientation.
7. Size Aftereffect How adaptation to a grating can affect size perception.
8. Development in the Visual Cortex A classic 1973 film in which vision research pioneer Colin Blakemore describes his pioneering experiments that demonstrated how the properties of neurons in the kitten’s cortex can be affected by the environment in which it is reared.
9. Retinotopy Movie: Ring How the cortex is activated as a ring shape expands. (Courtesy of Geoffrey Boynton.)
10. Retinotopy Movie: Wedge Record from an experiment demonstrating how the cortex is activated as a wedge rotates to different positions. (Courtesy of Geoffrey Boynton.)
11. What and Where Streams Drag-and-drop exercise to test your knowledge of the what and where pathways.
CHAPTER 5
Perceiving Objects and Scenes
Chapter Contents
WHY IS IT SO DIFFICULT TO DESIGN A PERCEIVING MACHINE?
The Stimulus on the Receptors Is Ambiguous
Objects Can Be Hidden or Blurred
Objects Look Different From Different Viewpoints
THE GESTALT APPROACH TO OBJECT PERCEPTION
DEMONSTRATION: Making Illusory Contours Vanish
The Gestalt Laws of Perceptual Organization
DEMONSTRATION: Finding Faces in a Landscape
Perceptual Segregation: How Objects Are Separated From the Background
The Gestalt “Laws” as Heuristics
RECOGNITION-BY-COMPONENTS THEORY
DEMONSTRATION: Non-Accidental Properties
❚ TEST YOURSELF 5.1
PERCEIVING SCENES AND OBJECTS IN SCENES
Perceiving the Gist of a Scene
METHOD: Using a Mask to Achieve Brief Stimulus Presentations
Regularities in the Environment: Information for Perceiving
DEMONSTRATION: Shape From Shading
DEMONSTRATION: Visualizing Scenes and Objects
The Role of Inference in Perception
Revisiting the Science Project: Designing a Perceiving Machine
THE PHYSIOLOGY OF OBJECT AND SCENE PERCEPTION
Neurons That Respond to Perceptual Grouping and Figure–Ground
How Does the Brain Respond to Objects?
Connecting Neural Activity and Perception
METHOD: Region-of-Interest Approach
SOMETHING TO CONSIDER: MODELS OF BRAIN ACTIVITY THAT CAN PREDICT WHAT A PERSON IS LOOKING AT
❚ TEST YOURSELF 5.2
Think About It
If You Want to Know More
Key Terms
Media Resources
VL VIRTUAL LAB
The Virtual Lab icons direct you to specific animations and videos designed to help you visualize what you are reading about. The number beside each icon indicates the number of the clip you can access through your CD-ROM or your student website.
OPPOSITE PAGE This painting by Robert Indiana, titled The Great Love, provides examples of how different areas of a picture can be perceived as figure and ground. At first you may see the red areas, spelling the word “Love,” standing out as the figure. It is also possible, however, to see small green areas as arrows on a red background, or the blue shapes in the center as three figures on a red background.
© 2010 Morgan Art Foundation Ltd./Artists Rights Society (ARS), New York. Carnegie Museum of Art, Pittsburgh/Gift of the Women’s Committee.
Some Questions We Will Consider:
❚ Why do some perceptual psychologists say “The whole
differs from the sum of its parts”? (p. 104)
❚ How do we distinguish objects from their background?
(p. 108)
❚ How do “rules of thumb” help us in arriving at a perception of the environment? (p. 109)
❚ Why are even the most sophisticated computers
unable to match a person’s ability to perceive objects?
(p. 119)
Sitting in the upper deck in PNC Park in Pittsburgh, Roger looks out over the city (Figure 5.1). On the left, he sees a group of about 10 buildings and can tell one building from another, even though they overlap. Looking straight ahead, he sees a small building in front of a larger one, and has no trouble telling that they are two separate buildings. Looking down toward the river, he notices a horizontal yellow band above the right field bleachers. It is obvious to him that this is not part of the ballpark but is located across the river.
All of Roger’s perceptions come naturally to him and require little effort. However, what Roger achieves so easily is actually the end result of complex processes. We can gain some perspective on the idea that perception is complex and potentially difficult, by returning to the “science project” that we described at the beginning of Chapter 1 (review page 4).
This project posed the problem of designing a machine that can locate, describe, and identify all objects in the environment and, in addition, can travel from one point to another, avoiding obstacles along the way. This problem has attracted the interest of computer scientists for more than half a century. When computers became available in the 1950s and ’60s, it was predicted that devices with capacities approaching human vision would be available within 10 or 15 years. As it turned out, the task of designing a computer that could equal human vision was much more difficult than the computer scientists imagined; even now, the problem has still not been solved (Sinha et al., 2006).
Figure 5.1 ❚ It is easy to tell that there are a number of different buildings on the left and that straight ahead there is a low rectangular building in front of a taller building. It is also possible to tell that the horizontal yellow band above the bleachers is across the river. These perceptions are easy for humans, but would be difficult for a computer vision system.
Bruce Goldstein
One way to illustrate the complexity of the science project is to consider recent attempts to solve it. Consider, for
example, the vehicles that were designed to compete in the
“Urban Challenge” race that occurred on November 3, 2007,
in Victorville, California. This race, which was sponsored by
the Defense Advanced Research Projects Agency (DARPA),
required that vehicles drive for 55 miles through a course
that resembled city streets, with other moving vehicles, traffic signals, and signs. The vehicles had to accomplish this
feat on their own, with no human involvement other than
entering global positioning coordinates of the course’s layout into the vehicle’s guidance system. Vehicles had to stay
on course and avoid unpredictable traffic without any human intervention, based only on the operation of onboard
computer systems.
The winner of the race is shown in Figure 5.2. “Boss,”
from Carnegie Mellon University, succeeded in staying on
course and avoiding other cars while maintaining an average speed of 14 miles per hour. The vehicle from Stanford
came in second, and the one from Virginia Tech came in
third. Teams from MIT, Cornell, and the University of Pennsylvania also successfully completed the course out
of a total of 11 teams that qualified for the final race. VL 1
The feat of navigating through the environment, especially one that contains moving obstacles, is extremely impressive. However, even though these robotic vehicles can
avoid obstacles along a defined pathway, they can’t identify
most of the objects they are avoiding. For example, even
though “Boss” might be able to avoid an obstacle in the
middle of the road, it can’t tell whether the obstacle is “a
pile of rocks” or “a bush.”
Other computer-based machines have been designed
specifically to recognize objects (as opposed to navigating
a course). These machines can recognize some objects, but
only after training on a limited set of objects. The machines
can recognize faces, but only if the lighting is just right and
the faces are viewed from a specific angle. The difficulty of
computer face recognition is illustrated by the fact that systems designed to recognize faces at airport security checkpoints can accurately identify less than half of a group of
specially selected faces (Sinha, 2002; also see Chella et al.,
2000, and “If You Want to Know More,” page 128, for more
on computer perception).
Figure 5.2 ❚ The “Boss” robotic vehicle on a test run on a track at Robot City in Pittsburgh. Notice that there is no human driver. Navigation is accomplished by onboard computers that receive information from numerous sensors on the vehicle, each of which has a specialized task. Sensors mounted on the back of the roof are laser range scanners that point down to detect lane markings. Sensors on the roof rack point down crossroads to detect and track vehicles when attempting to merge with traffic. The black sensors on the hood at the front of the vehicle are multiplane, long-range laser scanners used for tracking vehicles. The two white sensors on the corners of the front bumper are short-range laser scanners used to detect and track nearby vehicles. The four rectangles in the grill are radar sensors. The white sensors are short-range, used for detecting obstacles near the vehicle. The black sensors are long-range, for tracking vehicles when Boss is moving quickly or considering turning across traffic.
Tartan Racing and Carnegie Mellon University
Why Is It So Difficult to Design a Perceiving Machine?
We will now describe a few of the difficulties involved in designing a perceiving machine. Remember that the point of these descriptions is that although they pose difficulties for computers, our human “perceiving machine” solves these problems easily.
The Stimulus on the Receptors Is Ambiguous
When you look at the page of this book, the image cast by
the page on your retina is ambiguous. It may seem strange
to say that, because it is obvious that the page is rectangular, but consider Figure 5.3, which shows how the page is
imaged on your retina. Viewed from straight on, the rectangular page creates a rectangular image on the retina. However, other objects, such as the tilted rectangle or slanted
trapezoid, can also create the same image.
The fact that a particular image on the retina (or a computer vision machine’s sensors) can be created by many different objects is called the inverse projection problem. Another way to state this problem is as follows: If we know an
object’s shape, distance, and orientation, we can determine
the shape of the object’s image on the retina. However, a
particular image on the retina can be created by an infinite
number of objects.
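The inverse projection problem can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions: it treats the eye as an idealized pinhole at the origin, and the page dimensions, the depth factors, and the helper names `project` and `push_along_ray` are all hypothetical. It shows that a rectangle facing the eye and a larger trapezoid slanted in depth can produce exactly the same image.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3-D point (x, y, z) onto the image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def push_along_ray(point, depth_scale):
    """Move a point along its line of sight from the eye (the origin)."""
    x, y, z = point
    return (x * depth_scale, y * depth_scale, z * depth_scale)


# A rectangular page facing the eye, 1 unit away (hypothetical dimensions).
page = [(-0.5, -0.7, 1.0), (0.5, -0.7, 1.0), (0.5, 0.7, 1.0), (-0.5, 0.7, 1.0)]

# A different object: the two right-hand corners are pushed twice as far away,
# which makes the 3-D shape a larger trapezoid slanted in depth.
slanted = [push_along_ray(p, s) for p, s in zip(page, [1.0, 2.0, 2.0, 1.0])]

page_image = [project(p) for p in page]
slanted_image = [project(p) for p in slanted]

# Going forward (object -> image) is a simple calculation; going backward
# (image -> object) is ambiguous because both objects give the same image.
print(page_image == slanted_image)
```

Any other choice of depth factors would yield yet another object with the same image, which is why a single image is consistent with an unlimited number of objects.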
The ambiguity of the image on the retina is also illustrated by Figure 5.4a, which appears to be a circle of rocks. However, looking at these rocks from another viewpoint reveals that they aren’t arranged in a circle after all (Figure 5.4b). Thus, just as a rectangular image on the retina can be created by trapezoids and other nonrectangular objects, a circular image on the retina can be created by objects that aren’t circular. Although the example in Figure 5.4a leads human perceivers to the wrong conclusion about the rocks, this kind of confusion rarely occurs, because moving to another viewpoint reveals that the rocks aren’t arranged in a circle.
Figure 5.3 ❚ The principle behind the inverse projection problem. The page of the book that is near the eye creates a rectangular image on the retina. However, this image could also have been created by the tilted square, by the trapezoid and by many other stimuli. This is why we say that the image on the retina is ambiguous. (Figure labels: Image on retina; Objects that create the same image on the retina.)
Figure 5.4 ❚ An environmental sculpture by Thomas Macaulay. (a) When viewed from exactly the right vantage point (the second-floor balcony of the Blackhawk Mountain School of Art, Black Hawk, Colorado), the stones appear to be arranged in a circle. (b) Viewing the stones from the ground floor reveals a truer indication of their configuration.
Courtesy of Thomas Macaulay, Blackhawk Mountain School of Art, Blackhawk, CO
These examples show that the information from a single view of an object can be ambiguous. Humans solve this
problem by moving to different viewpoints, and by making
use of knowledge they have gained from past experiences in
perceiving objects.
Objects Can Be Hidden or Blurred
Sometimes objects are hidden or blurred. Can you find the
pencil and eyeglasses in Figure 5.5? Although it might take
a little searching, people can find the pencil in the foreground, and the glasses frame sticking out from behind the
computer next to the scissors, even though only a small portion of these objects is visible. People also easily perceive the
book, scissors, and paper as single objects, even though they
are partially hidden by other objects.
This problem of hidden objects occurs any time one object obscures part of another object. This occurs frequently
in the environment, but people easily understand that the
part of an object that is covered continues to exist, and they
are able to use their knowledge of the environment to determine what is likely to be present.
People are also able to recognize objects that are not in
sharp focus, such as the faces in Figure 5.6. See how many of
these people you can identify, and then consult the answers
on page 130. Despite the degraded nature of these images,
people can often identify most of them, whereas computers
perform poorly on this task (Sinha, 2002).
Figure 5.5 ❚ A portion of the mess on the author’s desk. Can you locate the hidden pencil (easy) and the author’s glasses (hard)?
Bruce Goldstein
Figure 5.6 ❚ Who are these people? See bottom of page 130 for the answers. (From Sinha, P. (2002). Recognizing complex patterns. Nature Neuroscience, 5, 1093–1097. Reprinted by permission from Macmillan Publishers Ltd. Copyright 2002.)
Figure 5.8 ❚ Which photographs are of the same person? (From Sinha, P. (2002). Recognizing complex patterns. Nature Neuroscience, 5, 1093–1097. Reprinted by permission from Macmillan Publishers Ltd. Copyright 2002.)
Objects Look Different From Different Viewpoints
Another problem facing any perception machine is that objects are often viewed from different angles. This means that the images of objects are continually changing, depending on the angle from which they are viewed. Although humans continue to perceive the object in Figure 5.7 as the same chair viewed from different angles, this isn’t so obvious to a computer. The ability to recognize an object seen from different viewpoints is called viewpoint invariance. People’s ability to achieve viewpoint invariance enables them to identify the images in Figure 5.8a and c as being the same person, but a computer face recognition system would rate faces a and b as being more similar (Sinha, 2002).
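A toy example suggests why a naive image comparison can rate two different objects in the same pose as more alike than one object seen from two angles. The sketch below is an assumption-laden illustration, not a description of any actual face recognition system: the point outlines are invented, only rotation within the image plane is considered (rotation in depth is much harder), and the "signature" based on distances between pairs of points is just one simple measure that happens to be unaffected by such rotation.

```python
import math


def rotate(points, degrees):
    """Rotate 2-D points about the origin, as if the view angle had changed."""
    t = math.radians(degrees)
    return [(x * math.cos(t) - y * math.sin(t),
             x * math.sin(t) + y * math.cos(t)) for x, y in points]


def raw_difference(a, b):
    """Naive comparison: total distance between corresponding points."""
    return sum(math.dist(p, q) for p, q in zip(a, b))


def shape_signature(points):
    """Sorted distances between all pairs of points; unchanged by rotation."""
    return sorted(math.dist(p, q)
                  for i, p in enumerate(points) for q in points[i + 1:])


def signature_difference(a, b):
    """Compare two objects by their rotation-invariant signatures."""
    return sum(abs(s - t)
               for s, t in zip(shape_signature(a), shape_signature(b)))


# Hypothetical outlines: the same "chair" in two poses, and a different object.
chair = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
chair_rotated = rotate(chair, 60)
stool = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.2)]

# The naive measure says the stool is closer to the chair than the rotated
# chair is; the invariant signature recognizes the rotated chair as the same.
print(raw_difference(chair, stool) < raw_difference(chair, chair_rotated))
print(signature_difference(chair, chair_rotated) < 1e-9)
```

The contrast between the two measures mirrors the point in the text: what matters is not how similar two images are point by point, but whether the comparison ignores changes caused by viewpoint.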
The difficulties facing any perceiving machine illustrate
that perception is more complex than it seems. But how do
humans overcome these complexities? Early answers to this
question were provided in the early 1900s by a group of psychologists who called themselves Gestalt psychologists—
where Gestalt, roughly translated, means a whole configuration that cannot be described merely as the sum of its parts.
We can appreciate the meaning of this definition by considering how Gestalt psychology began.
Figure 5.7 ❚ Your ability to recognize each of these views as being of the same chair is an example of viewpoint invariance.
The Gestalt Approach to
Object Perception
Figure 5.10a diagram labels: Flash line on left; 50 ms of darkness; Flash line on right; Perception: movement from left to right (the two line positions are labeled A and B).
Bruce Goldstein
We can understand the Gestalt approach by first considering an early attempt to explain perception that was proposed by Wilhelm Wundt, who established the first laboratory of scientific psychology at the University of Leipzig
in 1879. Wundt’s approach to psychology was called structuralism. One of the basic ideas behind structuralism was
that perceptions are created by combining elements called
sensations, just as the dots in the face in Figure 5.9
add together to create our perception of a face.
The idea that perception is the result of “adding up”
sensations was disputed by the Gestalt psychologists, who
offered, instead, the idea that the whole differs from the sum
of its parts. This principle had its beginnings, according to
a well-known story, in a train ride taken by psychologist
Max Wertheimer in 1911 (Boring, 1942). Wertheimer got
off the train to stretch his legs in Frankfurt and bought
a toy stroboscope from a vendor who was selling toys on
the train platform. The stroboscope, a mechanical device
that created an illusion of movement by rapidly alternating two slightly different pictures, caused Wertheimer to
wonder how the structuralist idea that experience is created from sensations could explain the illusion of movement he observed. We can understand why this question arose by looking at Figure 5.10a, which diagrams the
principle behind the illusion of movement created by the
stroboscope.
Figure 5.10 ❚ (a) Wertheimer’s demonstration of apparent movement. (b) Moving electronic signs such as this one, in which the words are scrolling to the left, create the perception of movement by applying the principles of apparent movement studied by Wertheimer.
When two stimuli that are in slightly different positions are flashed one after another with the correct timing, movement is perceived between the two stimuli. This is an illusion called apparent movement because there is actually no movement in the display, just two stationary stimuli flashing on and off. How, wondered Wertheimer, can the movement that appears to occur between the two flashing stimuli be caused by sensations? After all, there is no stimulation in the space between the two stimuli, and therefore there are no sensations to provide an explanation for the movement. (A modern example of apparent movement is provided by electronic signs like the one in Figure 5.10b, which display moving advertisements or news headlines. The perception of movement in these displays is so compelling that it is difficult to imagine that they are made up of stationary lights flashing on and off.) VL 2
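The logic of Wertheimer’s question can be spelled out by listing what the display physically contains at each moment. The sketch below is a minimal illustration under assumptions: the 50-ms dark interval comes from Figure 5.10a, but the 100-ms flash duration, the 10-ms time step, and the function name `two_flash_display` are hypothetical choices made only for this example.

```python
def two_flash_display(flash_ms=100, gap_ms=50, step_ms=10):
    """Return (time_ms, lit_position) frames; lit_position is None when dark."""
    frames = []
    t = 0
    # Line at position A, then darkness, then line at position B.
    for position, duration in (("A", flash_ms), (None, gap_ms), ("B", flash_ms)):
        for _ in range(duration // step_ms):
            frames.append((t, position))
            t += step_ms
    return frames


frames = two_flash_display()

# Every position that is ever stimulated, and the times when nothing is lit.
lit_positions = {position for _, position in frames if position is not None}
dark_frames = [t for t, position in frames if position is None]

print(lit_positions)   # only the two end positions are ever stimulated
print(dark_frames)     # during the gap, no position is stimulated at all
```

Nothing in the list of frames corresponds to a line located between A and B, yet observers report seeing the line move through that space. This is the gap between stimulation and experience that the structuralist account could not fill.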
With his question about apparent movement as his inspiration, Wertheimer and two colleagues, Kurt Koffka and
Wolfgang Köhler, set up a laboratory at the University of Frankfurt, called themselves Gestalt psychologists, and proceeded
to do research and publish papers that posed serious problems for the structuralist idea that perceptions are created
from sensations (Wertheimer, 1912). The following demonstration illustrates another phenomenon that is difficult to
explain on the basis of sensations.
Figure 5.9 ❚ According to structuralism, a number of sensations (represented by the dots) add up to create our perception of the face.
DEMONSTRATION
Making Illusory Contours Vanish
Consider the picture in Figure 5.11. If you see this as a cube
like the one in Figure 5.11b floating in space in front of black
circles, you probably perceive faint illusory contours that
represent the edges of the cube (Bradley & Petry, 1977).
These contours are called illusory because they aren’t actually present in the physical stimulus. You can prove this to
yourself by (1) placing your finger over the two black circles
at the bottom or (2) imagining that the black circles are holes
and that you are looking at the cube through these holes.
Covering the circles or seeing the cube through the holes
causes the illusory contours to either vanish or become more
difficult to see. ❚
Figure 5.11 ❚ (a) This can be seen as a cube floating in front of eight discs or as a cube seen through eight holes. In the first case, the edges of the cube appear as illusory contours. (b) The cube without the black circles. (Based on “Organizational Determinants of Subjective Contour: The Subjective Necker Cube,” by D. R. Bradley and H. M. Petry, 1977, American Journal of Psychology, 90, 253–262. American Psychological Association.)
When you made the contours vanish by placing your finger over the black circles, you showed that the contour was illusory and that our perception of one part of the display (the contours) is affected by the presence of another part (the black circles). The structuralists would have a hard time explaining illusory contours because there is no actual contour, so there can’t be any sensations where the contour is perceived. VL 3, 4
Additional displays that are difficult to explain in terms of sensations are bistable figures, like the cube in Figure 5.11b, which switch back and forth as they are viewed, and illusions, in which perceptions of one part of a display are affected by another part. (See Virtual Labs 5–7.) Making the contours vanish by imagining that you are looking through black holes poses a similar problem for the structuralists. It is difficult to explain a perception that is present one moment and gone the next in terms of sensations, especially since the stimulus on your retina never changes. VL 5–7
Having rejected the idea that perception is built up of sensations, the Gestalt psychologists proposed a number of principles, which they called laws of perceptual organization.
The Gestalt Laws of Perceptual Organization
Perceptual organization involves the grouping of elements in an image to create larger objects. For example, some of the dark areas in Figure 5.12 become grouped to form a Dalmatian and others are seen as shadows in the background. Here are six of the laws of organization that the Gestalt psychologists proposed to explain how perceptual grouping such as this occurs. VL 8
Image not available due to copyright restrictions
Pragnanz Pragnanz, roughly translated from the German, means “good figure.” The law of pragnanz, also called the law of good figure or the law of simplicity, is the central law of Gestalt psychology: Every stimulus pattern is seen in such a way that the resulting structure is as simple as possible. The familiar Olympic symbol in Figure 5.13a is an example of the law of simplicity at work. We see this display as five circles and not as a larger number of more complicated shapes such as the ones in Figure 5.13b. VL 9
Figure 5.13 ❚ (a) This is usually perceived as five circles, not as the nine shapes in (b).
Similarity Most people perceive Figure 5.14a as either horizontal rows of circles, vertical columns of circles, or both. But when we change the color of some of the columns, as in Figure 5.14b, most people perceive vertical columns of circles. This perception illustrates the law of similarity: Similar things appear to be grouped together. This law causes circles of the same color to be grouped together. Grouping can also occur because of similarity of shape, size, or orientation (Figure 5.15). VL 10
Figure 5.14 ❚ (a) Perceived as horizontal rows or vertical columns or both. (b) Perceived as vertical columns.
Grouping also occurs for auditory stimuli. For example, notes that have similar pitches and that follow each other closely in time can become perceptually grouped to form a melody. We will consider this and other auditory grouping effects when we describe organizational processes in hearing in Chapter 12.
Figure 5.16 ❚ Good continuation helps us perceive two separate wires, even though they overlap.
Bruce Goldstein
Good Continuation In Figure 5.16 we see the wire
starting at A as flowing smoothly to B. It does not go to C or
D because those paths would involve making sharp turns
and would violate the law of good continuation: Points that,
when connected, result in straight or smoothly curving lines are seen
as belonging together, and the lines tend to be seen in such a way as to
follow the smoothest path. Another effect of good continuation
is shown in the Celtic knot pattern in Figure 5.17. In this
case, good continuation ensures that we see a continuous
interwoven pattern that does not appear to be broken into
little pieces every time one strand overlaps another strand.
Good continuation also helped us to perceive the smoothly curving circles in Figure 5.13a. VL 11, 12
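One way to picture good continuation is as a preference for the branch that requires the smallest change of direction. The following sketch is only a rough analogy, not a model proposed by the Gestalt psychologists: the direction vectors assigned to the wires at the crossing point are invented for illustration, and the function names are hypothetical.

```python
import math


def turn_angle(incoming, outgoing):
    """Angle in degrees between two 2-D direction vectors."""
    dot = incoming[0] * outgoing[0] + incoming[1] * outgoing[1]
    norm = math.hypot(*incoming) * math.hypot(*outgoing)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def smoothest_path(incoming, candidates):
    """Pick the candidate branch that requires the smallest turn."""
    return min(candidates, key=lambda name: turn_angle(incoming, candidates[name]))


# Hypothetical directions at the point where the two wires cross.
incoming_from_a = (1.0, 0.2)
branches = {"B": (1.0, 0.1), "C": (0.3, 1.0), "D": (0.2, -1.0)}

chosen = smoothest_path(incoming_from_a, branches)
print(chosen)
```

With these made-up directions, continuing toward B involves a turn of only a few degrees, whereas C and D would each require a sharp turn, so B is selected, in line with the perception described for Figure 5.16.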
Figure 5.17 ❚ Because of good continuation, we perceive this pattern as continuous interwoven strands.
Proximity (Nearness) Our perception of Figure 5.18a as two pairs of circles illustrates the law of proximity, or nearness: Things that are near each other appear to be grouped together. VL 13
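Grouping by proximity can be approximated with a very simple rule: elements separated by a small gap go together, and a large gap starts a new group. The sketch below assumes circles laid out along a single line; the positions and the gap threshold are hypothetical values chosen to resemble the two pairs of circles described for Figure 5.18a.

```python
def group_by_proximity(positions, max_gap):
    """Split 1-D positions into groups wherever the gap exceeds max_gap."""
    ordered = sorted(positions)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous <= max_gap:
            groups[-1].append(current)   # close enough: same group
        else:
            groups.append([current])     # large gap: start a new group
    return groups


# Hypothetical spacing: two circles close together, a wide gap, two more.
circle_positions = [0.0, 1.0, 4.0, 5.0]
pairs = group_by_proximity(circle_positions, max_gap=1.5)
print(pairs)
```

The threshold is the arbitrary part of this sketch; human grouping depends on relative rather than absolute spacing, so the rule should be read as an illustration of the principle rather than as a measurement.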
Common Region Figure 5.18b illustrates the principle of common region: Elements that are within the same region
of space appear to be grouped together. Even though the circles
of space appear to be grouped together. Even though the circles
inside the ovals are farther apart than the circles that are
next to each other in neighboring ovals, we see the circles
inside the ovals as belonging together. This occurs because
each oval is seen as a separate region of space (Palmer, 1992;
Palmer & Rock, 1994). Notice that in this example common
region overpowers proximity. Because the circles are in different regions, they do not group with each other, as they
did in Figure 5.18a, but with circles in the same region.
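The conflict between common region and proximity can be shown with a small example in which the two principles give different answers. This is a sketch under assumptions: the positions, the region labels "oval 1" and "oval 2", and both function names are hypothetical, chosen so that circles inside an oval are farther apart than neighboring circles in different ovals.

```python
from collections import defaultdict


def group_by_region(circles):
    """Group circle positions by the region (oval) that encloses them."""
    groups = defaultdict(list)
    for position, region in circles:
        groups[region].append(position)
    return [sorted(members) for _, members in sorted(groups.items())]


def nearest_neighbour(position, circles):
    """Position of the closest other circle, ignoring regions."""
    others = [p for p, _ in circles if p != position]
    return min(others, key=lambda p: abs(p - position))


# Hypothetical layout: circles in the same oval are 2 units apart,
# but neighbours in different ovals are only 1 unit apart.
circles = [(0.0, "oval 1"), (2.0, "oval 1"), (3.0, "oval 2"), (5.0, "oval 2")]

by_region = group_by_region(circles)
closest_to_second = nearest_neighbour(2.0, circles)

print(by_region)
print(closest_to_second)
```

Proximity alone would link the circle at 2.0 with its nearest neighbour at 3.0, but grouping by region pairs it with the more distant circle at 0.0, which matches the observation that common region overpowers proximity in Figure 5.18b.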
Uniform Connectedness The principle of uniform connectedness states: A connected region of visual properties, such as lightness, color, texture, or motion, is perceived as a single unit. For example, in Figure 5.18c, the connected circles are perceived as grouped together, just as they were when they were in the same region in Figure 5.18b.
Synchrony The principle of synchrony states: Visual events that occur at the same time are perceived as belonging together. For example, the lights in Figure 5.18d that blink together are seen as belonging together.
Figure 5.15 ❚ What are they looking at? Whatever it is, Tiger Woods and Phil Mickelson have become perceptually linked because of the similar orientations of their arms, golf clubs, and bodies.
Donald Miralie/Getty Images
Common Fate The law of common fate states: Things
that are moving in the same direction appear to be grouped together.
Thus, when you see a flock of hundreds of birds all flying
together, you tend to see the flock as a unit, and if some birds
start flying in another direction, this creates a new unit (Figure 5.19). Notice that common fate is like synchrony in that
both principles are dynamic, but synchrony can occur without movement, and the elements don’t have to change
in the same direction as they do in common fate. (VL 14)
Figure 5.18 ❚ Grouping by (a) proximity; (b) common region;
(c) connectedness; and (d) synchrony. Synchrony occurs
when the yellow lights blink on and off together.
Meaningfulness or Familiarity According to the
law of familiarity, things that form patterns that are familiar or
meaningful are likely to become grouped together (Helson, 1933;
Hochberg, 1971). You can appreciate how meaningfulness
influences perceptual organization by doing the following
demonstration.
Figure 5.19 ❚ A flock of birds that are moving in the same
direction are seen as grouped together. When a portion of the
flock changes direction, their movement creates a new group.
This illustrates the law of common fate.
DEMONSTRATION
Finding Faces in a Landscape
Consider the picture in Figure 5.20. At first glance this scene
appears to contain mainly trees, rocks, and water. But on
closer inspection you can see some faces in the trees in the
background, and if you look more closely, you can see that a
number of faces are formed by various groups of rocks. See
if you can find all 13 faces hidden in this picture. ❚
Figure 5.20 ❚ The Forest Has Eyes by Bev Doolittle (1984). Can you find 13 hidden faces in this picture? E-mail the author at
bruceg@email.arizona.edu for the solution.
Some people find it difficult to perceive the faces at
first, but then suddenly they succeed. The change in perception from “rocks in a stream” or “trees in a forest” to “faces”
is a change in the perceptual organization of the rocks and
the trees. The two shapes that you at first perceive as two
separate rocks in the stream become perceptually grouped
together when they become the left and right eyes of a face.
In fact, once you perceive a particular grouping of rocks as
a face, it is often difficult not to perceive them in this way—
they have become permanently organized into a face. This is
similar to the process we observed for the Dalmatian. Once
we see the Dalmatian, it is difficult not to perceive it.
Figure 5.21 ❚ A version of Rubin’s reversible face–vase
figure.
Perceptual Segregation: How Objects
Are Separated From the Background
The Gestalt psychologists were also interested in explaining
perceptual segregation, the perceptual separation of one
object from another, as Roger did when he perceived each
of the buildings in Figure 5.1 as separate from one another.
The question of what causes perceptual segregation is often
referred to as the problem of figure–ground segregation.
When we see a separate object, it is usually seen as a figure
that stands out from its background, which is called the
ground. For example, you would probably see a book or
papers on your desk as figure and the surface of your desk
as ground. The Gestalt psychologists were interested in determining the properties of the figure and the ground and
what causes us to perceive one area as figure and the other
as ground.
What Are the Properties of Figure and
Ground? One way the Gestalt psychologists studied
the properties of figure and ground was by considering patterns like the one in Figure 5.21, which was introduced by
Danish psychologist Edgar Rubin in 1915. This pattern is
an example of reversible figure–ground because it can be
perceived alternately either as two blue faces looking at each
other, in front of a white background, or as a white vase on
a blue background. Some of the properties of the figure and
ground are:
■ The figure is more “thinglike” and more memorable
than the ground. Thus, when you see the vase as figure, it appears as an object that can be remembered
later. However, when you see the same white area as
ground, it does not appear to be an object and
is therefore not particularly memorable. (VL 15)
■ The figure is seen as being in front of the ground.
Thus, when the vase is seen as figure, it appears to be
in front of the dark background (Figure 5.22a), and
when the faces are seen as figure, they are on top of
the light background (Figure 5.22b).
■ The ground is seen as unformed material and seems
to extend behind the figure.
■ The contour separating the figure from the ground
appears to belong to the figure. This property of figure, which is called border ownership, means that, although figure and ground share a contour, the border
is associated with the figure. Figure 5.23 illustrates
border ownership for another display that can be
perceived in two ways. If you perceive the display in
Figure 5.23a as a light gray square (the figure) sitting
on a dark background (the ground), then the border
belongs to the gray square, as indicated by the dot
in Figure 5.23b. But if you perceive the display as a
black rectangle with a hole in it (the figure) through
which you are viewing a gray surface (the ground), the
border would be on the black rectangle, as shown in
Figure 5.23c.
Figure 5.22 ❚ (a) When the vase is perceived as figure,
it is seen in front of a homogeneous dark background.
(b) When the faces are seen as figure, they are seen in front
of a homogeneous light background.
What Factors Determine Which Area Is
Figure? What factors determine whether an area is perceived as figure or ground? Shaun Vecera and coworkers
(2002) used the phenomenological method (see page 13)
to show that regions in the lower part of a display are more
likely to be perceived as figure than regions in the upper
part. They flashed stimuli like the ones in Figure 5.24a
for 150 milliseconds (ms) and asked observers to indicate
which area they saw as figure, the red area or the green area.
The results, shown in Figure 5.24b, indicate that for the
upper–lower displays, observers were more likely to perceive
the lower area as figure, but for the left–right displays, they
showed only a small preference for the left region. From this
result, Vecera concluded that there is no left–right preference for determining figure, but there is a definite preference for seeing objects lower in the display as figure. The
conclusion from this experiment is that the lower region of
a display tends to be seen as figure.
Figure 5.25 illustrates four other factors that help determine which area will be seen as figure. In Figure 5.25a
(symmetry), the symmetrical red areas on the left are seen
as figure, as are the symmetrical yellow areas on the right.
In Figure 5.25b (smaller area), the smaller plus-shaped area
is more likely to be seen as figure. In Figure 5.25c (vertical or
horizontal areas), the vertical–horizontal cross tends to be
seen as figure. In Figure 5.25d (meaningfulness), the fact that
the dark areas look like waves increases the chances
that this area will be seen as figure. (VL 16)
Figure 5.23 ❚ (a) This display can be perceived in two
ways. (b) When it is perceived as a small square sitting on
top of a dark background, the border belongs to the small
square, as indicated by the dot. (c) When it is perceived as a
large dark square with a hole in it, the border belongs to the
dark square.
Figure 5.24 ❚ (a) Stimuli from Vecera et al. (2002).
(b) Percentage of trials on which lower or left areas were
seen as figure. [Bar graph; vertical axis: percent of trials; bars: lower seen as figure, left seen as figure.]
Figure 5.25 ❚ Examples of how (a) symmetry, (b) size,
(c) orientation, and (d) meaning contribute to perceiving an
area as figure. [Panels: (a) Symmetry: red or yellow? (b) Smaller area: cross or plus? (c) Vertical or horizontal orientation: vertical-horizontal or tilted? (d) Meaningful (waves): dark or light?]
The Gestalt “Laws” as Heuristics
Although the Gestalt psychologists called their principles
“laws of perceptual organization,” most perceptual psychologists call them the Gestalt “principles” or “heuristics.” The
reason for rejecting the term laws is that the rules of perceptual organization and segregation proposed by the Gestalt psychologists don’t make strong enough predictions to
qualify as laws. Instead, the Gestalt principles are more accurately described as heuristics—rules of thumb that provide a best-guess solution to a problem. We can understand
what heuristics are by comparing them to another way of
solving a problem, called algorithms.
An algorithm is a procedure that is guaranteed to solve
a problem. An example of an algorithm is the procedures
we learn for addition, subtraction, and long division. If we
apply these procedures correctly, we get the right answer every time. In contrast, a heuristic may not result in a correct
solution every time. For example, suppose that you want to
find a cat that is hiding somewhere in the house. An algorithm for doing this would be to systematically search every room in the house (being careful not to let the cat sneak
past you!). If you do this, you will eventually find the cat,
although it may take a while. A heuristic for finding the cat
would be to first look in the places where the cat likes to
hide. So you check under the bed and in the hall closet. This
may not always lead to finding the cat, but if it does, it has
the advantage of usually being faster than the algorithm.
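The contrast between the two search strategies can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the list of places, the cat's favorite spots, and the function names are hypothetical. The exhaustive search stands in for the algorithm (it always succeeds if the cat is in the house), and the short list of favorite spots stands in for the heuristic (fewer checks, but it can fail).

```python
def search(places, cat_location):
    """Check places in order; return (found, number_of_places_checked)."""
    for checked, place in enumerate(places, start=1):
        if place == cat_location:
            return True, checked
    return False, len(places)


# Hypothetical house layout and hiding preferences.
ALL_PLACES = ["kitchen", "living room", "bathroom", "attic",
              "hall closet", "under the bed"]
FAVORITE_SPOTS = ["under the bed", "hall closet"]


def algorithm(cat_location):
    """Systematic search of every place: slow, but guaranteed to succeed."""
    return search(ALL_PLACES, cat_location)


def heuristic(cat_location):
    """Check only the favorite spots: fast, but not guaranteed to succeed."""
    return search(FAVORITE_SPOTS, cat_location)


print(algorithm("under the bed"))  # (True, 6)
print(heuristic("under the bed"))  # (True, 1)
print(heuristic("attic"))          # (False, 2)
```

When the cat is in a favorite spot, the heuristic finds it after one check where the exhaustive search needs six. When the cat is in the attic, the heuristic fails and only the exhaustive search finds it.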
We say the Gestalt principles are heuristics because,
like heuristics, they are best-guess rules that work most of
the time, but not necessarily all of the time. For example,
consider the following situation in which the Gestalt laws
might cause an incorrect perception: As you are hiking in
the woods, you stop cold in your tracks because not too far
ahead, you see what appears to be an animal lurking behind
a tree (Figure 5.26a). The Gestalt laws of organization play
a role in creating this perception. You see the two shapes
to the left and right of the tree as a single object because of
the Gestalt law of similarity (because both shapes are the
same color, it is likely that they are part of the same object).
Also, good continuation links these two parts into one because the line along the top of the object extends smoothly
from one side of the tree to the other. Finally, the image resembles animals you’ve seen before. For all of these reasons,
it is not surprising that you perceive the two objects as part
of one animal.
Because you fear that the animal might be dangerous,
you take a different path. As your detour takes you around
the tree, you notice that the dark shapes aren’t an animal after all, but are two oddly shaped tree stumps (Figure 5.26b).
So in this case, the Gestalt laws have misled you.
The fact that heuristics are usually faster than algorithms helps explain why the perceptual system is designed
to operate in a way that sometimes produces errors. Consider, for example, what the algorithm would be for determining what the shape in Figure 5.26a really is. It would
involve walking around the tree, so you can see it from different angles and perhaps taking a closer look at the objects
behind the tree. Although this may result in an accurate
perception, it is slow and potentially risky (what if the shape
actually is a dangerous animal?).
The advantage of our Gestalt-based rules of thumb is
that they are fast, and correct most of the time. The reason
they work most of the time is that they reflect properties of
the environment. For example, in everyday life, objects that
are partially hidden often “come out the other side” (good
continuation), and objects often have similar large areas of
the same color (similarity). We will return to the idea that
perception depends on what we know about properties of
the environment later in the chapter.
Although the Gestalt approach dates back to the early
1900s, it is still considered an important way to think about
perception. Modern researchers have done experiments
like Vecera’s (Figure 5.24) to study some of the principles of
perceptual organization and segregation proposed by the
Gestalt psychologists, and they have also considered issues
in addition to organization and segregation. We will now
describe a more recent approach to object perception called
recognition by components that is designed to explain how we
recognize objects.
Recognition-by-Components
Theory
How do we recognize objects in the environment based
on the image on the retina? Recognition-by-components
(RBC) theory, which was proposed by Irving Biederman
(1987), answers this question by proposing that our recognition of objects is based on features called geons, a term
that stands for “geometric ions,” because just as ions are
basic units of molecules (see page 29), these geons are basic units of objects. Figure 5.27a shows a number of geons,
which are shapes such as cylinders, rectangular solids, and
pyramids. Biederman proposed 36 different geons and suggested that this number of geons is enough to enable us to
mentally represent a large proportion of the objects that we
can easily recognize. Figure 5.27b shows a few objects that
have been constructed from geons.
To understand geons, we need to introduce the concept of non-accidental properties (NAPs). NAPs are properties of edges in the retinal image that correspond to the
properties of edges in the three-dimensional environment.
The following demonstration illustrates this characteristic
of NAPs.
Figure 5.26 ❚ (a) What lurks behind the tree?
(b) It is two strangely shaped tree stumps, not an
animal!
Figure 5.27 ❚ (a) Some geons. (b) Some objects created from these geons. The numbers on
the objects indicate which geons are present. Note that recognizable objects can be formed by
combining just two or three geons. Also note that the relations between the geons matter, as
illustrated by the cup and the pail. (Reprinted from “Recognition-by-Components: A Theory of
Human Image Understanding,” by I. Biederman, 1985, Computer Vision, Graphics and Image
Processing, 32, 29–73. Copyright © 1985, with permission from Elsevier.)
DEMONSTRATION
Non-Accidental Properties
Close one eye and look at a coin, such as a quarter, straight
on, so your line of sight is perpendicular to the quarter, as
shown in Figure 5.28a. When you do this, the edge of the
quarter creates a curved image on the retina. Now tilt the
quarter, as in Figure 5.28b. The edge of this tilted quarter
still creates an image of a curved edge on the retina. Now tilt
the quarter so you are viewing it edge-on, as in Figure 5.28c.
When viewed in this way, the edge of the quarter creates an
image of a straight edge on the retina. ❚
In this demonstration, the property of curvature is
called a non-accidental property, because the only time it
doesn’t occur is when you view the quarter edge-on. Because this edge-on viewpoint occurs only rarely, it is called
an accidental viewpoint. Thus, the vast majority of your
views of circular objects result in a curved image on the
retina. According to RBC, the image of a curved edge on
the retina indicates the presence of a curved edge in the
environment.
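The geometry behind this demonstration can be sketched numerically. The example below is a simplified illustration that assumes orthographic projection (it ignores the optics of the eye): a disc tilted by an angle projects to an ellipse whose minor axis shrinks with the cosine of the tilt. The function names and the `min_ratio` cutoff for calling an image "curved" are assumptions made for this sketch.

```python
import math


def image_axes(diameter, tilt_deg):
    """Major and minor axes of a disc's image under orthographic projection.

    tilt_deg = 0 means the disc is viewed straight on; 90 means edge-on.
    """
    minor = diameter * abs(math.cos(math.radians(tilt_deg)))
    return diameter, minor


def image_is_curved(tilt_deg, min_ratio=0.01):
    """Treat the image as curved unless the ellipse collapses to (nearly) a line.

    min_ratio is an assumed cutoff, chosen only for illustration.
    """
    major, minor = image_axes(1.0, tilt_deg)
    return minor / major > min_ratio


# Sample tilts from straight-on (0 degrees) to edge-on (90 degrees).
curved_views = sum(image_is_curved(t) for t in range(0, 91))
print(curved_views, "of 91 sampled tilts give a curved image")
```

Under these assumptions, only the edge-on tilt collapses the image to a straight line, which is why that view counts as an accidental viewpoint.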
RBC proposes that a key property of geons is that each
type of geon has a unique set of NAPs. For example, consider
the rectangular-solid geon in Figure 5.29a. The NAP for this
geon is three parallel straight edges. You can demonstrate
the fact that these edges are NAPs by viewing a rectangular solid (such as a book) from different angles, as shown
in Figure 5.30. When you do this, you will notice that most
of the time you can see three parallel straight edges, as in
Figures 5.30a and b. Figure 5.30c shows what happens when
you view the book from an accidental viewpoint. The three
parallel edges are not visible from this viewpoint, just as the
quarter’s curvature was not visible when it was viewed from
an accidental viewpoint.
The NAP for the cylinder geon in Figure 5.29b is two
parallel straight edges, which you see as you view a cylindrical object such as a pencil or pen from different angles. Like
the rectangular geon, the cylindrical geon has an accidental
viewpoint from which the NAP is not visible (what is the accidental viewpoint for the cylinder?).
Figure 5.28 ❚ What happens to a quarter’s image on the retina as it is tilted. Most views, such as (a) and (b), create a curved
image on the retina. The rare accidental viewpoint shown in (c) creates an image of a straight line on the retina. [Panels show the object and its image on the retina.]
Figure 5.29 ❚ (a) Rectangular-solid geon. The highlighted
three parallel edges are the non-accidental property for this
geon. (b) Cylindrical geon. The highlighted two parallel edges
are the non-accidental property of this geon.
Bruce Goldstein
Figure 5.30 ❚ This book’s non-accidental property (NAP) of
three parallel edges is seen even when the book is viewed
from different angles, as in (a) and (b). When viewed from an
accidental viewpoint, as in (c), this NAP is not perceived.
The fact that each geon has a unique set of NAPs results
in a property of geons called discriminability—each geon
can be discriminated from other geons. The fact that NAPs
are visible from most viewpoints results in another property
of geons, viewpoint invariance (see page 103)—the geon can be
identified when viewed from most viewpoints.
The main principle of recognition-by-components
theory is that if we can perceive an object’s geons, we can
identify the object (also see Biederman & Cooper, 1991; Biederman, 1995). The ability to identify an object if we can
identify its geons is called the principle of componential
recovery. This principle is what is behind our ability to identify objects in the natural environment even when parts of
the objects are hidden by other objects. Figure 5.31a shows
a situation in which componential recovery can’t occur because the visual noise is arranged so that the object’s geons
cannot be identified. Luckily, parts of objects are rarely obscured in this way in the natural environment, so, as we see
in Figure 5.31b, we can usually identify geons and, therefore, are able to identify the object.
Another illustration of the fact that our ability to identify objects depends on our ability to identify the object’s
geons is shown by the tea kettle in Figure 5.32a. When we view
it from the unusual perspective shown in Figure 5.32b, we
can’t identify some of its basic geons, and it is therefore more
difficult to identify in Figure 5.32b than in Figure 5.32a.
RBC theory also states that we can recognize objects
based on a relatively small number of geons. Biederman
(1987) did an experiment to demonstrate this, by briefly
presenting line drawings of objects with all of their geons
and with some geons missing. For example, the airplane in
Figure 5.33a, which has a total of 9 geons, is shown with
only 3 of its geons in Figure 5.33b. Biederman found that
9-geon objects such as the airplane were recognized correctly about 78 percent of the time based on 3 geons and
96 percent of the time based on 6 geons. Objects with 6
geons were recognized correctly 92 percent of the time even
when they were missing half their geons.
RBC theory explains many observations about shape-based object perception, but the idea that our perception of
a complex object begins with the perception of features like
geons is one that some students find difficult to accept. For
example, one of my students, who, having read the first four
chapters of the book, was apparently convinced that perception is a complex process, wrote, in reaction to reading
about RBC theory, that “our vision is far too complex to be
determined by a few geons.”
This student’s concern can be addressed in a few ways.
First, there are factors in addition to geons that help us
identify objects. For example, we might distinguish between
two birds with the same shape on the basis of the texture of
their feathers or markings on their wings. Similarly, there
are some things in the environment, such as clouds, that
are difficult to create using geons (although even clouds are
sometimes arranged so that geons are visible, leading us to
see “objects” in the sky).
The fact that there are things that RBC can’t explain is
not surprising because the theory was not meant to explain
everything about object perception. For example, although
edges play an important role in RBC, the theory is not concerned with the rapid processes that enable us to perceive
these edges. It also doesn’t deal with the processes involved
in grouping objects (which the Gestalt approach does) or
with how we learn to recognize different types of objects.
RBC does, however, provide explanations for some important phenomena, such as viewpoint invariance and the minimum information needed to identify objects. Some of the
most elegant scientific theories are simple and provide partial explanations, leaving other theories to complete the picture. This is the case for RBC.
Figure 5.31 ❚ (a) It is difficult to identify the object behind the mask, because its geons have been obscured. (b) Now that it is
possible to identify geons, the object can be identified as a flashlight. (Reprinted from “Recognition-by-Components: A Theory
of Human Image Understanding,” by I. Biederman, 1985, Computer Vision, Graphics and Image Processing, 32, 29–73.
Copyright © 1985, with permission from Elsevier.)
Bruce Goldstein
Figure 5.32 ❚ (a) A familiar object.
(b) The same object seen from a
viewpoint that obscures most of
its geons. This makes it harder to
recognize the object.
Figure 5.33 ❚ An airplane, as represented (a) by 9 geons
and (b) by 3 geons. (Reprinted from “Recognition-by-Components: A Theory of Human Image Understanding,” by
I. Biederman, 1985, Computer Vision, Graphics and Image
Processing, 32, 29–73. Copyright © 1985, with permission
from Elsevier.)
TEST YOURSELF 5.1
1. What are some of the problems that make object perception difficult for computers but not for humans?
2. What is structuralism, and why did the Gestalt
psychologists propose an alternative to this way of
looking at perception?
3. How did the Gestalt psychologists explain perceptual organization?
4. How did the Gestalt psychologists describe figure–
ground segregation?
5. What properties of a stimulus tend to favor perceiving an area as “figure”? Be sure you understand
Vecera’s experiment that showed that the lower
region of a display tends to be perceived as figure.
6. How does RBC theory explain how we recognize
objects? What are the properties of geons, and how
do these properties enable us to identify objects
from different viewpoints and identify objects that
are partially hidden?
Perceiving Scenes and
Objects in Scenes
So far we have been focusing on individual objects. But we
rarely see objects in isolation. Just as we usually see actors in
a play on a stage, we usually see objects within a scene (Epstein, 2005). A scene is a view of a real-world environment
that contains (1) background elements and (2) multiple objects that are organized in a meaningful way relative to each
other and the background (Epstein, 2005; Henderson &
Hollingworth, 1999).
One way of distinguishing between objects and scenes is
that objects are compact and are acted upon, whereas scenes
are extended in space and are acted within. For example, if
we are walking down the street and mail a letter, we would
be acting upon the mailbox (an object) and acting within the
street (the scene).
Perceiving the Gist of a Scene
Perceiving scenes presents a paradox. On one hand, scenes
are often large and complex. However, despite this size and
complexity, you can identify most scenes after viewing them
for only a fraction of a second. This general description of
the type of scene is called the gist of a scene. An example
of your ability to rapidly perceive the gist of a scene is the
way you can rapidly flip from one TV channel to another,
yet still grasp the meaning of each picture as it flashes
by—a car chase, quiz contestants, or an outdoor scene with
mountains—even though you may be seeing each picture
for a second or less. When you do this, you are perceiving
the gist of each scene (Oliva & Torralba, 2006).
Research has shown that it is possible to perceive the
gist of a scene within a fraction of a second. Mary Potter
(1976) showed observers a target picture and then asked
them to indicate whether they saw that picture as they
viewed a sequence of 16 rapidly presented pictures. Her
observers could do this with almost 100-percent accuracy
even when the pictures were flashed for only 250 ms (milliseconds; 1/4 second). Even when the target picture was only
specified by a description, such as “girl clapping,” observers
achieved an accuracy of almost 90 percent (Figure 5.34).
Another approach to determining how rapidly people
can perceive scenes was used by Li Fei-Fei and coworkers
(2007), who presented pictures of scenes for times ranging from 27 ms to 500 ms and asked observers to write a
description of what they saw. This method of determining
the observer’s response is a nice example of the phenomenological method, described on page 13. Fei-Fei used a procedure called masking to be sure the observers saw the
pictures for exactly the desired duration. (VL 17)
METHOD ❚ Using a Mask to Achieve Brief
Stimulus Presentations
To present a stimulus, such as a picture, for just 27 ms,
we need to do more than just flash the picture for 27 ms,
because the perception of any stimulus persists for about
250 ms after the stimulus is extinguished—a phenomenon called persistence of vision. Thus, a picture that
is presented for 27 ms will be perceived as lasting about
275 ms. To eliminate the persistence of vision it is therefore necessary to flash a masking stimulus, usually a
pattern of randomly oriented lines, immediately after
presentation of the picture. This stops the persistence of
vision and limits the time that the picture is perceived.
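The arithmetic behind this method can be written out as a short sketch. It is a deliberately simplified model: it assumes that persistence adds a fixed 250 ms (the approximate figure given above) and that a mask removes persistence entirely. The function and constant names are hypothetical.

```python
PERSISTENCE_MS = 250  # approximate persistence of vision, as stated in the text


def perceived_duration_ms(stimulus_ms, masked):
    """Rough perceived duration of a flashed picture.

    Simplifying assumptions: persistence adds a fixed amount, and a mask
    presented immediately afterward cuts persistence off completely.
    """
    return stimulus_ms if masked else stimulus_ms + PERSISTENCE_MS


print(perceived_duration_ms(27, masked=False))  # 277, i.e., about 275 ms
print(perceived_duration_ms(27, masked=True))   # 27
```

Without the mask, a 27-ms picture would be perceived for roughly ten times its physical duration, which is why the mask is needed to study very brief exposures.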
Typical results of Fei-Fei’s experiment are shown in
Figure 5.35. At brief durations, observers saw only light and
dark areas of the pictures. By 67 ms they could identify some
large objects (a person, a table), and when the duration was
increased to 500 ms they were able to identify smaller objects and details (the boy, the laptop). For another picture,
of an ornate 1800s living room, observers were able to identify the picture as a room in a house at 67 ms and to identify
details, such as chairs and portraits, at 500 ms. Thus, the
overall gist of the scene is perceived first, followed by perception of details and smaller objects within the scene.
Figure 5.34 ❚ Procedure for Potter’s (1976) experiment. She first presented either a target photograph or, as shown here, a
description, and then rapidly presented 16 pictures for 250 ms each. The observer’s task was to indicate whether the target
picture had been presented. In this example, only 3 of the 16 pictures are shown, with the target picture being the second one
presented. On other trials, the target picture is not included in the series of 16 pictures.
Image not available due to copyright restrictions
What enables observers to perceive the gist of a scene
so rapidly? Aude Oliva and Antonio Torralba (2001, 2006)
propose that observers use information called global image
features, which can be perceived rapidly and are associated
with specific types of scenes. Some of the global image features proposed by Oliva and Torralba are:
■ Degree of naturalness. Natural scenes, such as the beach
and forest in Figure 5.36, have textured zones and
undulating contours. Man-made scenes, such as the
street, are dominated by straight lines and horizontals
and verticals.
■ Degree of openness. Open scenes, such as the beach,
often have a visible horizon line and contain few objects. The street scene is also open, although not as
much as the beach. The forest is an example of a scene
with a low degree of openness.
■ Degree of roughness. Smooth scenes (low roughness) like
the beach contain fewer small elements. Scenes with
high roughness like the forest contain many small elements and are more complex.
■ Degree of expansion. The convergence of parallel lines,
like what you see when you look down railroad tracks
that appear to vanish in the distance, or in the street
scene in Figure 5.36, indicates a high degree of expansion. This feature is especially dependent on the
observer’s viewpoint. For example, in the street scene,
looking directly at the side of a building would result
in low expansion.
■ Color. Some scenes have characteristic colors, like the
beach scene (blue) and the forest (green and brown)
(Goffaux et al., 2005).
Figure 5.36 ❚ Three
scenes that have different
global image features. See
text for description.
Courtesy of Aude Oliva
Global image features are holistic and rapidly perceived.
They are properties of the scene as a whole and do not depend on time-consuming processes such as perceiving
small details, recognizing individual objects, or separating
one object from another. Another property of global image features is that they contain information that results
in perception of a scene’s structure and spatial layout. For
example, the degree of openness and the degree of expansion refer directly to characteristics of a scene’s layout, and
naturalness also provides layout information that comes
from knowing whether a scene is “from nature” or contains
“human-made structures.”
Global image properties not only help explain how we
can perceive the gist of scenes based on features that can
be seen in brief exposures, but also illustrate the following
general property of perception: Our past experiences in perceiving properties of the environment play a role in determining our perceptions. We learn, for example, that blue is
associated with open sky, that landscapes are often green
and smooth, and that verticals and horizontals are associated with buildings. Characteristics of the environment
such as these, which occur frequently, are called regularities
in the environment. We will now describe these regularities
in more detail.
Regularities in the Environment:
Information for Perceiving
Although observers make use of regularities in the environment to help them perceive, they are often unaware of
the specific information they are using. This aspect of perception is similar to what occurs when we use language.
Even though people easily string words together to create
sentences in conversations, they may not know the rules of
grammar that specify how these words are being combined.
Similarly, we easily use our knowledge of regularities in the
environment to help us perceive, even though we may not
be able to identify the specific information we are using. We
can distinguish two types of regularities, physical regularities
and semantic regularities.
Physical Regularities Physical regularities are regularly occurring physical properties of the environment. For
example, there are more vertical and horizontal orientations
in the environment than oblique (angled) orientations. This
occurs in human-made environments (for example, buildings contain lots of horizontals and verticals) and also in
natural environments (trees and plants are more likely to be
vertical or horizontal than slanted) (Coppola et al., 1998). It
is, therefore, no coincidence that people can perceive horizontals and verticals more easily than other orientations,
an effect called the oblique effect (Appelle, 1972; Campbell
et al., 1966; Orban et al., 1984).
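The claim that cardinal orientations dominate can be made concrete with a small tally. The following is a minimal sketch, assuming a grayscale image stored as a list of rows; the function name `edge_orientation_counts`, the 22.5-degree cutoff, and the toy "building" image are illustrative assumptions, and this is not the procedure used by Coppola et al. (1998).

```python
import math

def edge_orientation_counts(image, threshold=0.25):
    """Classify each strong edge pixel as 'cardinal' or 'oblique'.

    `image` is a list of rows of brightness values. Cardinal means the edge
    lies within 22.5 degrees of horizontal or vertical (an assumed cutoff).
    """
    counts = {"cardinal": 0, "oblique": 0}
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            # Central-difference brightness gradient at this pixel.
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            if math.hypot(gx, gy) < threshold:
                continue  # no clear edge here
            # An edge runs perpendicular to the gradient; fold into [0, 180).
            angle = (math.degrees(math.atan2(gy, gx)) + 90.0) % 180.0
            # Angular distance to the nearest cardinal axis (0 or 90 degrees).
            offset = min(angle % 90.0, 90.0 - angle % 90.0)
            counts["cardinal" if offset <= 22.5 else "oblique"] += 1
    return counts

# Toy scene (hypothetical): a bright square "building" on a dark background.
toy_scene = [[1.0 if 2 <= r <= 5 and 2 <= c <= 5 else 0.0 for c in range(8)]
             for r in range(8)]
counts = edge_orientation_counts(toy_scene)
print(counts)
```

In this toy scene only the corners of the square produce oblique edge estimates, so cardinal edges far outnumber oblique ones. A real measurement would apply the same kind of tally to many photographs of natural and human-made scenes.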
Why should being exposed to more verticals and horizontals make it easier to see them? One answer to this
question is that experience-dependent plasticity, introduced
in Chapter 4 (see page 80), causes the visual system to have
more neurons that respond best to these orientations. The
fact that the visual system has a greater proportion of
neurons that respond to verticals and horizontals has
been demonstrated in experiments that have recorded
from large numbers of neurons in the visual cortex of the
monkey (R. L. De Valois et al., 1982; also see Furmanski &
Engel, 2000, for evidence that the visual cortex in humans
responds better to verticals and horizontals than to other
orientations).
Another physical characteristic of the environment is
that when one object partially covers another, the contour of
the partially covered object “comes out the other side.” If this
sounds familiar, it is because it is an example of the Gestalt
law of good continuation, which we introduced on page 106
and discussed in conjunction with our “creature” behind the
tree on page 110 (Figure 5.26). Other Gestalt laws (or “heuristics”) reflect regularities in the environment as well.
Consider, for example, the idea of uniform connectedness. Objects are often defined by areas of the same color
or texture, so when an area of the image on the retina has
the property of uniform connectedness, it is likely that this
area arises from a single environmental shape (Palmer &
Rock, 1994). Thus, uniformly connected regions are regularities in the environment, and the perceptual system is
designed to interpret these regions so that the environment
will be perceived correctly. The Gestalt heuristics are therefore based on the kinds of things that occur so often that
we take them for granted. Another physical regularity is illustrated by the following demonstration.
DEMONSTRATION
Shape From Shading
What do you perceive in Figure 5.37a? Do some of the discs
look as though they are sticking out, like parts of three-dimensional spheres, and others appear to be indentations?
If you do see the discs in this way, notice that the ones that
appear to be sticking out are arranged in a square. After
observing this, turn the page over so the small dot is on the
bottom. Does this change your perception? ❚
Figures 5.37b and c show that if we assume that light
is coming from above (which is usually the case in the environment), then patterns like the circles that are light on
the top would be created by an object that bulges out (Figure 5.37b), but a pattern like the circles that are light on
the bottom would be created by an indentation in a surface
(Figure 5.37c). The assumption that light is coming from
above has been called the light-from-above heuristic (Kleffner & Ramachandran, 1992). Apparently, people make the
light-from-above assumption because most light in our environment comes from above. This includes the sun, as well
as most artificial light sources.
Another example of the light-from-above heuristic at
work is provided by the two pictures in Figure 5.38. Figure 5.38a shows indentations created by people walking in
the sand. But when we turn this picture upside down, as
shown in Figure 5.38b, then the indentations in
the sand become rounded mounds. (Virtual Lab 18–20)
Figure 5.37 ❚ (a) Some of these discs are perceived as jutting
out, and some are perceived as indentations. Why? Light coming
from above would illuminate (b) the top of a shape that is jutting out and
(c) the bottom of an indentation. [Panel labels: Light; Front view; Side view.]
Photo credit: Bruce Goldstein
Figure 5.38 ❚ Why does (a) look like indentations in
the sand and (b) look like mounds of sand? See text for
explanation.
It is clear from these examples of physical regularities
in the environment that one of the reasons humans are able
to perceive and recognize objects and scenes so much better
than computer-guided robots is that our system is customized to respond to the physical characteristics of our environment. But this customization goes beyond physical characteristics. It also occurs because we have learned about what
types of objects typically occur in specific types of scenes.
Semantic Regularities In language, semantics refers
to the meanings of words or sentences. Applied to perceiving scenes, semantics refers to the meaning of a scene. This
meaning is often related to the function of a scene—what
happens within it. For example, food preparation, cooking,
and perhaps eating occur in a kitchen; waiting around, buying tickets, checking luggage, and going through security
checkpoints happen in airports. Semantic regularities are
the characteristics associated with the functions carried out
in different types of scenes.
One way to demonstrate that people are aware of semantic regularities is simply to ask them to imagine a particular
type of scene or object, as in the following demonstration.
DEMONSTRATION
Visualizing Scenes and Objects
Your task in this demonstration is simple: visualize or simply
think about the following scenes and objects:
1. An office
2. The clothing section of a department store
3. A microscope
4. A lion ❚
Most people who have grown up in modern society have
little trouble visualizing an office or the clothing section of
a department store. What is important about this ability,
for our purposes, is that part of this visualization involves
details within these scenes. Most people see an office as having a desk with a computer on it, bookshelves, and a chair.
The department store scene may contain racks of clothes, a
changing room, and perhaps a cash register.
What did you see when you visualized the microscope
or the lion? Many people report seeing not just a single object, but an object within a setting. Perhaps you perceived
the microscope sitting on a lab bench or in a laboratory, and
the lion in a forest or on a savannah or in a zoo.
Figure 5.39 ❚ Hollingworth’s (2005) observers saw scenes
like this one (but without the circles). In this scene the target
object is the barbell, although observers do not know this
when they are viewing the scene. “Non-target” scenes are the
same but do not include the target. The circles indicate the
average error of observers’ judgments of the position of the
target object for trials in which they had seen the object in
the scene (small circle, labeled “Present”) and trials in which the object had not
appeared in the scene (larger circle, labeled “Absent”). (From A. Hollingworth,
2005, Memory for object position in natural scenes. Visual
Cognition, 12, 1003–1016. Reprinted by permission of the
publisher, Taylor & Francis Ltd., http://www.tandf.co.uk/
journals.)
An example of the knowledge we have of things that
typically belong in certain scenes is provided by an experiment in which Andrew Hollingworth (2005) had observers
study a scene, such as the picture of the gym in Figure 5.39
(but without the circles), that contained a target object, such
as the barbell on the mat, or the same scene but without the
target object, for 20 seconds. Observers then saw a picture
of a target object followed by a blank screen, and were asked
to indicate where the target object was in the scene (if they
had seen the picture containing the target object) or where
they would expect to see the target object in the scene (if they
had seen the same picture but without the target object).
The results are indicated by the circles, which show the
averaged error of observers’ judgments for many different
objects and scenes. The small circle shows that observers
who saw the target objects accurately located their positions
in the scene. The large circle shows that observers who had
not seen the target objects were not quite as accurate but
were still able to predict where the target objects would be.
What this means for the gym scene is that observers were
apparently able to predict where the barbell would appear
based on their prior experience in seeing objects in gyms.
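The "averaged error" summarized by each circle can be read as a mean distance between judged and actual positions. The sketch below shows that arithmetic under that assumption; the coordinates are made-up placeholders, not Hollingworth's data, and `mean_position_error` is a hypothetical helper name.

```python
import math

def mean_position_error(judged_positions, actual_position):
    """Average straight-line distance between judged and actual positions."""
    errors = [math.dist(judged, actual_position) for judged in judged_positions]
    return sum(errors) / len(errors)

# Placeholder coordinates (arbitrary units), purely for illustration.
actual = (0.0, 0.0)
judged_after_seeing_target = [(0.0, 1.0), (1.0, 0.0)]
judged_without_seeing_target = [(3.0, 4.0), (6.0, 8.0)]

print(mean_position_error(judged_after_seeing_target, actual))
print(mean_position_error(judged_without_seeing_target, actual))
```

A smaller mean error corresponds to a smaller circle in the figure. The finding of interest is that the error for observers who never saw the target, although larger, is still small enough to indicate that they could predict where the object belongs.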
This effect of semantic knowledge on our ability to perceive was illustrated in an experiment by Stephen Palmer
(1975), using stimuli like the picture in Figure 5.40. Palmer
first presented a context scene such as the one on the left
and then briefly flashed one of the target pictures on the
right. When Palmer asked observers to identify the object
in the target picture, they correctly identified an object like
the loaf of bread (which is appropriate to the kitchen scene)
80 percent of the time, but correctly identified the mailbox
or the drum (two objects that don’t fit into the scene) only
40 percent of the time. Apparently Palmer’s observers were
using their knowledge about kitchens to help them perceive
the briefly flashed loaf of bread.
The effect of semantic regularities is also illustrated in
Figure 5.41, which is called “the multiple personalities of a
blob” (Oliva & Torralba, 2007). The blob is perceived as different objects depending on its orientation and the context
within which it is seen. It appears to be an object on a table
in (b), a shoe on a person bending down in (c), and a car and
a person crossing the street in (d), even though it is the same
shape in all of the pictures.
Figure 5.41 ❚ What we expect to see in different contexts
influences our interpretation of the identity of the “blob”
inside the circles. (Part (d) adapted from Trends in Cognitive
Sciences, Vol. 11, 12, Oliva, A., and Torralba, A., The role of
context in object recognition. Copyright 2007, with permission
from Elsevier.)
The Role of Inference in Perception
People use their knowledge of physical and semantic regularities such as the ones we have been describing to infer
what is present in a scene. The idea that perception involves
inference is nothing new; it was proposed in the 19th century by Hermann von Helmholtz (1866/1911), who was one
of the preeminent physiologists and physicists of his day.
Helmholtz made many discoveries in physiology and
physics, developed the ophthalmoscope (the device that an
optometrist or ophthalmologist uses to look into your eye),
and proposed theories of object perception, color vision, and
hearing. One of his proposals about perception is a principle
called the theory of unconscious inference, which states
that some of our perceptions are the result of unconscious
assumptions we make about the environment.
Figure 5.40 ❚ Stimuli used in Palmer’s
(1975) experiment. The scene at the left (the context scene) is
presented first, and the observer is then
asked to identify one of the target objects (A, B, or C) on
the right.
Figure 5.42 ❚ The display in (a) is usually interpreted
as being (b) a blue rectangle in front of a red rectangle. It
could, however, be (c) a blue rectangle and an appropriately
positioned six-sided red figure.
The theory of unconscious inference was proposed to
account for our ability to create perceptions from stimulus
information that can be seen in more than one way. For example, what do you see in the display in Figure 5.42a? Most
people perceive a blue rectangle in front of a red rectangle,
as shown in Figure 5.42b. But as Figure 5.42c indicates, this
display could have been caused by a six-sided red shape
positioned either in front of or behind the blue rectangle.
According to the theory of unconscious inference, we infer
that Figure 5.42a is a rectangle covering another rectangle
because of experiences we have had with similar situations
in the past. A corollary of the theory of unconscious inference is the likelihood principle, which states that we perceive the object that is most likely to have caused the pattern
of stimuli we have received.
One reason that Helmholtz proposed the likelihood
principle is to deal with the ambiguity of the perceptual
stimulus that we described at the beginning of the chapter.
Helmholtz viewed the process of perception as being similar
to the process involved in solving a problem. For perception,
the task is to determine which object caused a particular
pattern of stimulation, and this problem is solved by a process in which the observer brings his or her knowledge of the
environment to bear in order to infer what the object might
be. This process is unconscious, hence the term unconscious
inference. (See Rock, 1983, for a modern version of this idea.)
Modern psychologists have quantified Helmholtz’s idea
of perception as inference by using a statistical technique
called Bayesian inference that takes probabilities into account (Kersten et al., 2004; Yuille & Kersten, 2006). For example, let’s say we want to determine how likely it is that it will
rain tomorrow. If we know it rained today, then this increases
the chances that it will rain tomorrow, because if it rains one
day it is more likely to rain the next day. Applying reasoning
like this to perception, we can ask, for example, whether a
given object in a kitchen is a loaf of bread or a mailbox. Since
it is more likely that a loaf of bread will be in a kitchen, the
perceptual system concludes that bread is present. Bayesian
statistics involves this type of reasoning, expressed in mathematical formulas that we won’t describe here.
Revisiting the Science Project:
Designing a Perceiving Machine
We are now ready to return to the science project (see pages 4
and 100) and to apply what we know about perception to the
problem of designing a device that can identify objects in the
environment. We can now see that one way to make our device more effective would be to program in knowledge about
regularities in the environment. In other words, an effective
“object perceiving machine” would be able to go beyond processing information about light, dark, shape, and colors that
it might pick up with its sensors. It would also be “tuned” to
respond best to regularities of the environment that are most
likely to occur, and would be programmed to use this information to make inferences about what is out there.
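One way such a machine might combine knowledge of regularities with sensor evidence is Bayes' rule, in which the probability of each candidate object is proportional to its prior probability in the scene multiplied by how well it fits the image. The sketch below applies this to the earlier question of whether an object in a kitchen is a loaf of bread or a mailbox. All of the numbers are invented for illustration, and the function name `posterior` is an assumption, not part of any published model.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a fixed set of candidate objects.

    priors[h]      -- assumed probability of object h in this kind of scene
    likelihoods[h] -- assumed probability of the observed image if h is present
    """
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}

# Hypothetical prior knowledge: bread is far more common in kitchens.
priors_in_kitchen = {"loaf of bread": 0.9, "mailbox": 0.1}
# Hypothetical evidence: the briefly seen shape fits both objects equally well.
shape_likelihoods = {"loaf of bread": 0.5, "mailbox": 0.5}

beliefs = posterior(priors_in_kitchen, shape_likelihoods)
best_guess = max(beliefs, key=beliefs.get)
print(best_guess, beliefs)
```

Because the image evidence in this example does not favor either object, the decision is driven entirely by the prior, which is the formal counterpart of using knowledge about kitchens to settle an ambiguous stimulus.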
Will robotic vision devices ever equal the human ability to perceive? Based on our knowledge of the complexities
of perception, it is easy to say “no,” but given the rapid advances that are occurring in the field of computer vision, it
is not unreasonable to predict that machines will eventually be developed that approach human perceptual abilities.
One reason to think that machines are gaining on humans
is that present-day computers have begun incorporating
humanlike inference processes into their programs. For
example, consider CMU’s vehicle “Boss,” the winner of the
“Urban Challenge” race (see page 101). One reason for Boss’s
success was that it was programmed to take into account
common events that occur when driving on city streets.
Consider, for example, what happens when a human
driver (like you) approaches an intersection. You probably
check to see if you have a stop sign, then determine if other
cars are approaching from the left or right. If they are approaching, you notice whether they have a stop sign. If they
do, you might check to be sure they are slowing down in
preparation for stopping. If you decide they might ignore
their stop sign, you might slow down and prepare to take
appropriate action. If you see that there are no cars coming,
you proceed across the intersection. In other words, as you
drive, you are constantly noticing what is happening and are
taking into account your knowledge of traffic regulations
and situations you have experienced in the past to make decisions about what to do.
The Boss vehicle is programmed to carry out a similar
type of decision-making process to determine what to do
when it reaches an intersection. It determines if another car
is approaching by using its sensors to detect objects off to
the side. It then decides whether an object is a car by taking
its size into account and by using the rule “If it is moving, it
is likely to be a car.” Boss is also programmed to know that
other cars should stop if they have a stop sign. Thus, the
computer was designed both to sense what was out there
and to go beyond simply sensing by taking knowledge into
account to decide what to do at the intersection.
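The intersection reasoning described above can be written as a handful of explicit rules. The following is only a sketch of that style of rule-based decision making. It is not Boss's actual software, and every name and threshold in it (`DetectedObject`, `likely_car`, the length limits) is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DetectedObject:
    """One object reported by the sensors (all fields are assumed inputs)."""
    is_moving: bool
    length_m: float
    has_stop_sign: bool
    is_slowing: bool

def likely_car(obj, min_length_m=2.5, max_length_m=7.0):
    """Rule from the text: a moving object of car-like size is probably a car."""
    return obj.is_moving and min_length_m <= obj.length_m <= max_length_m

def decide_at_intersection(own_stop_sign, detected_objects):
    """Return 'stop', 'slow down', or 'proceed' using simple traffic knowledge."""
    if own_stop_sign:
        return "stop"
    for obj in detected_objects:
        if not likely_car(obj):
            continue  # ignore things that are probably not cars
        if obj.has_stop_sign and obj.is_slowing:
            continue  # this car is expected to stop, as the rules require
        return "slow down"  # a car may enter the intersection
    return "proceed"

careful_car = DetectedObject(is_moving=True, length_m=4.5,
                             has_stop_sign=True, is_slowing=True)
careless_car = DetectedObject(is_moving=True, length_m=4.5,
                              has_stop_sign=True, is_slowing=False)
print(decide_at_intersection(False, [careful_car]))
print(decide_at_intersection(False, [careless_car]))
```

The sensing step only supplies the fields of each detected object. The decision itself depends on stored knowledge, such as what size a car is and what a stop sign obliges a driver to do, which is the point the passage makes about going beyond sensing.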
The problem for computer vision systems is that before
they can compete with humans they have to acquire a great
deal more knowledge. Present systems are programmed with
just enough knowledge to accomplish specialized tasks like
driving the course in the Urban Challenge. While Boss is programmed to determine where the street is and to always stay
on the street, Boss can’t always make good decisions about
when it is safe to drive off-road. For example, Boss can’t
tell the difference between tall grass (which wouldn’t pose
much of a threat for off-road driving) and a field full of vertical spikes (which would be very unfriendly to Boss’s tires)
(C. Urmson, personal communication, 2007).1
To program the computer to recognize grass, it would
be necessary to provide it with knowledge about grass such
as “Grass is green,” “Grass moves if it is windy,” “Grass is
flat and comes to a point.” Once Boss has enough knowledge about grass to accurately identify it, then it can be
programmed not to avoid it, and to drive off-road onto it
if necessary. What all of this means is that while it is helpful to have lots of computing power, it is also nice to have
knowledge about the environment. The human model of
the perceiving machine has this knowledge, and uses it to
perceive with impressive accuracy.
1 Chris Urmson is Director of Technology, Tartan Racing Team, Carnegie
Mellon University.
The Physiology of Object
and Scene Perception
Thousands of experiments have been done to answer the
question “What is the neural basis of object perception?”
We have seen that object perception has many aspects, including perceptual organization, grouping, recognizing
objects, and perceiving scenes and details within scenes. We
first consider neurons that respond to perceptual grouping
and figure–ground.
Neurons That Respond to Perceptual
Grouping and Figure–Ground
Many years after the Gestalt psychologists proposed the
laws of good continuation and similarity, researchers discovered neurons in the visual cortex that respond best to
displays that reflect these principles of grouping. For example, Figure 5.43a shows a vertical line in the receptive field
(indicated by the square) of a neuron in a monkey’s striate
cortex. The neuron’s response to this single line is indicated
by the left bar in Figure 5.43d. No firing occurs when lines
are presented outside the square (Zapadia et al., 1995).
But something interesting happens when we add a field
of randomly oriented lines, as in Figure 5.43b. These lines,
which fall outside the neuron’s receptive field, cause a decrease in how rapidly the neuron fires to the single vertical
line. This effect of the stimuli that fall outside of the neuron’s receptive field (which normally would not affect the
neuron’s firing rate), is called contextual modulation, because the context within which the bar appears affects the
neuron’s response to the bar.
Figure 5.43 ❚ How a neuron in the striate cortex (V1)
responds to (a) an oriented bar inside the neuron’s receptive
field (the small square); (b) the same bar surrounded by
randomly oriented bars; (c) the bar when it becomes part of
a group of vertical bars, due to the principles of similarity and
good continuation. [Panel (d): bar graph of firing rate for a, b, and c.] (Adapted from Zapadia, M. K., Ito, M.,
Gilbert, C. G., & Westheimer, G. (1995). Improvement in visual
sensitivity by changes in local context: Parallel studies in
human observers and in V1 of alert monkeys. Neuron, 15,
843–856. Copyright © 1995, with permission from Elsevier.)
Figure 5.43c shows that we can increase the neuron’s
response to the bar by arranging a few of the lines that are
outside the receptive field so that they are lined up with
the line that is in the receptive field. When good continuation and similarity cause our receptive-field line to become
perceptually grouped with these other lines, the neuron’s
response increases. This neuron is therefore affected by Gestalt organization even though this organization involves
areas outside its receptive field.
Another example of how an area outside the receptive
field can affect responding is shown in Figure 5.44. This
neuron, in the visual cortex, responds well when leftward-slanted lines are positioned over the neuron’s receptive field
(indicated by the green bar in Figure 5.44a; Lamme, 1995).
Notice that in this case we perceive the leftward slanting
bars as a square on a background of right-slanted lines.
However, when we replace the right-slanted “background”
lines with left-slanted lines, as in Figure 5.44b, the neuron
no longer fires.
Notice that when we replaced the right-slanted background lines with left-slanted lines the stimulus on the receptive field (left-slanted lines) did not change, but our perception of these lines changed from being part of a figure (in
Figure 5.44a) to being part of the background (Figure 5.44b).
This neuron therefore responds to left-slanted lines only
when they are seen as being part of the figure. (Also see
Qiu & von der Heydt, 2005.)
Figure 5.44 ❚ How a neuron in V1 responds to oriented
lines presented to the neuron’s receptive field (green
rectangle). (a) The neuron responds when the bars on the
receptive field are part of a figure, but there is no response
when (b) the same pattern is not part of a figure. [Panels (a) and (b) each plot firing rate against time (msec).] Adapted
from Lamme, V. A. F. (1995). The neurophysiology of figure–
ground segregation in primary visual cortex. Journal of
Neuroscience, 15, 1605–1615.
Figure 5.45 ❚ The human brain, showing some of the
areas involved in perceiving faces. Some of the perceptual
functions of these areas are: OC = initial processing; FG =
identification; A = emotional reaction; STS = gaze direction;
FC = attractiveness. The amygdala is located deep inside
the cortex, approximately under the ellipse. [Labels: frontal cortex (FC); superior temporal sulcus (STS); occipital cortex (OC); amygdala (A); fusiform gyrus (FG), on the underside of the brain.]
How Does the Brain Respond
to Objects?
How are objects represented by the firing of neurons in the
brain? To begin answering this question, let’s review the basic principles of sensory coding we introduced in Chapters 2
and 4.
Review of Sensory Coding In Chapter 2 we described specificity coding, which occurs if an object is represented by the firing of a neuron that fires only to that
object, and distributed coding, which occurs if an object is
represented by the pattern of firing of a number of neurons.
In Chapter 4 we introduced the idea that certain areas are
specialized to process information about specific types of
objects. We called these specialized areas modules. Three of
these areas are the fusiform face area (FFA), for faces; the
extrastriate body area (EBA), for bodies; and the parahippocampal place area (PPA), for buildings and places. Although
neurons in these areas respond to specific types of stimuli,
they aren’t totally specialized, so a particular neuron that
responds only to faces responds to a number of different
faces (Tsao et al., 2006). Objects, according to this idea, are
represented by distributed coding, so a specific face would
be represented by the pattern of firing of a number of neurons that respond to faces.
We also noted that even though modules are specialized to process information about specific types of stimuli
such as faces, places, and bodies, objects typically cause activity not only in a number of neurons within a module, but
also in a number of different areas of the brain. Thus, a face
might cause a large amount of activity in the FFA, but also
cause activity in other areas as well. Firing is, therefore, distributed in two ways: (1) across groups of neurons within a
specific area, and (2) across different areas in the brain.
More Evidence for Distributed Activity
Across the Brain We begin our discussion where we
left off in Chapter 4—with the idea that objects cause activity in a number of different brain areas. Faces provide one of
the best examples of distributed representation across the
brain. We know that the fusiform face area (FFA) is specialized to process information about faces, because the FFA responds to pictures of faces but not to pictures of other types
of stimuli.
But perceiving a face involves much more than just looking at a face and identifying it as “a face,” or even as “Bill’s
face.” After you have identified a face as, say, your friend Bill,
you may have an emotional reaction to Bill based on the expression on his face or on your past experience with him.
You may notice whether he is looking straight at you or off
to the side. You may even be thinking about how attractive
(or unattractive) he is. Each of these reactions to faces has
been linked to activity in different areas of the brain.
Figure 5.45 shows some of the areas involved in face perception. Initial processing of the face occurs in the occipital
cortex, which sends signals to the fusiform gyrus, where visual information concerned with identification of the face
is processed (Grill-Spector et al., 2004). Emotional aspects
of the face, including facial expression and the observer’s
emotional reaction to the face, are reflected in activation of
the amygdala, which is located within the brain (Gobbini &
Haxby, 2007; Ishai et al., 2004).
Evaluation of where a person is looking is linked to activity in the superior temporal sulcus; this area is also involved in perceiving movements of a person’s mouth as the
person speaks (Calder et al., 2007; Puce et al., 1998). Evaluation of a face’s attractiveness is linked to activity in the
frontal area of the brain.
The fact that all, or most, of these factors come into play
when we perceive a face has led to the conclusion that there is
a distributed system in the cortex for perceiving faces (Haxby
et al., 2000; Ishai, 2008). The activation caused by other objects is also distributed, with most objects activating a number of different areas in the brain (Shinkareva et al., 2008).
50 ms (see Method: Using a Mask to Achieve Brief Stimulus
Presentations, page 114).
The observer’s task in this experiment was to indicate,
after presentation of the mask, whether the picture was
“Harrison Ford,” “another object,” or “nothing.” This is
the “observer’s response” in Figure 5.46. The results, based
on presentation of 60 different pictures of Harrison Ford,
60 pictures of other faces, and 60 random textures, are
shown in Figure 5.47. This figure shows the course of brain
activation for the trials in which Harrison Ford’s face was
presented. The top curve (red) shows that activation was
greatest when observers correctly identified the stimulus as
Harrison Ford’s face. The next curve shows that activation
was less when they responded “other object” to Harrison
Ford’s face. In this case they detected the stimulus as a face
but were not able to identify it as Harrison Ford’s face. The
lowest curve indicates that there was little activation when
observers could not even tell that a face was presented.
Remember that all of the curves in Figure 5.47 represent
the brain activity that occurred not when observers were responding verbally, but during presentation of Harrison Ford’s
face. These results therefore show that neural activity that
occurs as a person is looking at a stimulus is determined not only
by the stimulus that is presented, but also by how a person is
processing the stimulus. A large neural response is associated
with processing that results in the ability to identify the stimulus; a smaller response, with detecting the stimulus; and the
absence of a response with missing the stimulus altogether.
Connections between neural responses and perception
have also been determined by using a perceptual phenomenon called binocular rivalry: If one image is presented to the
left eye and a different image is presented to the right eye, perception alternates back and forth between the two eyes. For
example, if the sunburst pattern in Figure 5.48 is presented
only to the left eye, and the butterfly is presented only to the
right eye, a person would see the sunburst part of the time
and the butterfly part of the time, but never both together.
D. L. Sheinberg and Nikos Logothetis (1997) presented
a sunburst pattern to a monkey’s left eye and a picture such
as the butterfly or another animal or object to the monkey’s
right eye. To determine what the monkey was perceiving, they
trained the monkey to pull one lever when it perceived the
sunburst pattern and another lever when it perceived the butterfly. As the monkey was reporting what it was perceiving,
Connecting Neural Activity
and Perception
The results we have been describing involved experiments
in which a stimulus was presented and brain activity was
measured. Other experiments have gone beyond simply
observing which stimulus causes firing in specific areas to
studying connections between brain activity and what a
person or animal perceives.
One of these experiments, by Kalanit Grill-Spector and
coworkers (2004), studied the question of how activation of
the brain is related to whether a person recognizes an object
by measuring brain activation as human observers identified pictures of the face of a well-known person—Harrison
Ford. They focused on the fusiform face area (FFA). To locate the FFA in each person, they used a method called the
region-of-interest (ROI) approach.
One of the challenges of brain imaging research is that
although maps have been published indicating the location of different areas of the brain, there is a great deal
of variation from person to person in the exact location of a particular area. The region-of-interest (ROI)
approach deals with this problem by pretesting people on
the stimuli to be studied before running an experiment.
For example, in the study we are going to describe, GrillSpector located the FFA in each observer by presenting
pictures of faces and nonfaces and noting the area that
was preferentially activated by faces. Locating this ROI
before doing the experiment enabled researchers to focus
on the exact area of the brain that, for each individual person, was specialized to process information about faces.
Once Grill-Spector determined the location of the FFA
for each observer, she presented stimuli as shown in Figure 5.46. On each trial, observers saw either (a) a picture of
Harrison Ford, (b) a picture of another person’s face, or (c) a
random texture. Each of these stimuli was presented briefly
(about 50 ms) followed immediately by a random-pattern
mask, which limited the visibility of each stimulus to just
122
CHAPTER 5
Perceiving Objects and Scenes
© Stephane Cardinale/People Avenue/Corbis
M E T H O D ❚ Region-of-Interest Approach
Observer’s
response
Stimulus
See either
(a) Harrison Ford
(b) Another person’s face
(c) A random texture
Mask
Indicate either
(a) “Harrison Ford”
(b) “Another object”
(c) “Nothing”
Figure 5.46 ❚ Procedure for the Grill-Spector et al. (2004)
experiment. See text for details.
Consider what happened in this experiment. The images on the monkey’s retinas remained the same throughout the experiment—the sunburst was always positioned on
the left retina, and the butterfly was always positioned on
the right retina. The change in perception from “sunburst”
to “butterfly” must therefore have been happening in the
monkey’s brain, and this experiment showed that these
changes in perception were linked to changes in the firing
of a neuron in the brain.
This binocular rivalry procedure has also been used to
connect perception and neural responding in humans by using fMRI. Frank Tong and coworkers (1998) presented a picture of a person’s face to one eye and a picture of a house to
the other eye, by having observers view the pictures through
colored glasses, as shown in Figure 5.49. The images are
shown as overlapping in this figure, but because each eye
Figure 5.47 ❚ Results of Grill-Spector et al. (2004)
experiment for trials in which Harrison Ford’s face was
presented. The graph plots the fMRI signal against time (sec.) for three kinds of trials: correct identification, saw face but incorrect identification, and did not see face. Activity was measured in the initial part of the
experiment, when Harrison Ford’s face was presented. (From
Grill-Spector, K., Knouf, N., & Kanwisher, N., The fusiform
face area subserves face perception, not generic within-category identification, Nature Neuroscience, 7, 555–562.
Reprinted by permission from Macmillan Publisher Ltd.
Copyright 2004.)
Courtesy of Frank Tong
they simultaneously recorded the activity of a neuron in the
inferotemporal (IT) cortex that had previously been shown
to respond to the butterfly but not to the sunburst. The
result of this experiment was straightforward: The cell fired
vigorously when the monkey was perceiving the butterfly and
ceased firing when the monkey was perceiving the sunburst.
Image not available due to copyright restrictions
Figure 5.49 ❚ Observers in the Tong et al. (1998)
experiment viewed the overlapping red house and green
face through red-green glasses, so the house image was
presented to the right eye and the face image to the left
eye. Because of binocular rivalry, the observers’ perception
alternated back and forth between the face and the house.
When the observers perceived the house, activity occurred
in the parahippocampal place area (PPA), in the left and right
hemispheres (red ellipses). When observers perceived the
face, activity occurred in the fusiform face area (FFA) in the
left hemisphere (green ellipse). (From Tong, F., Nakayama, K.,
Vaughan, J. T., & Kanwisher, N., 1998, Binocular rivalry and
visual awareness in human extrastriate cortex. Neuron, 21,
753–759.)
received only one of the images, binocular rivalry occurred.
Observers perceived either the face alone or the house alone,
and these perceptions alternated back and forth every few
seconds.
Tong determined what the observers were perceiving by
having them push a button when perceiving the house and
another button when perceiving the face. As the observer’s
perception was flipping back and forth between the house
and the face, Tong measured the fMRI response in the parahippocampal place area (PPA) and the fusiform face area
(FFA). When observers were perceiving the house, activity
increased in the PPA (and decreased in the FFA); when they
were perceiving the face, activity increased in the FFA (and
decreased in the PPA). This result is therefore similar to
what Sheinberg and Logothetis found in single neurons in
the monkey. Even though the image on the retina remained
the same throughout the experiment, activity in the brain
changed, depending on what the person was experiencing.
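The logic of this analysis, sorting brain activity by what the observer reported rather than by what was on the retina, can be illustrated with a small Python sketch. The numbers are invented, the function name `mean_signal_by_percept` is hypothetical, and the sketch ignores the several-second lag of the fMRI response, so it should be read as a simplified illustration rather than Tong's procedure.

```python
from statistics import mean


def mean_signal_by_percept(reports, signal):
    """Average an area's fMRI signal separately for each reported percept."""
    grouped = {}
    for percept, value in zip(reports, signal):
        grouped.setdefault(percept, []).append(value)
    return {percept: mean(values) for percept, values in grouped.items()}


# Invented data: one button-press report and one signal sample per time point.
# The stimulus (house to one eye, face to the other) never changes.
reports = ["house", "house", "face", "face", "house", "face"]
ffa_signal = [0.1, 0.0, 0.8, 0.9, 0.2, 0.7]
ppa_signal = [0.9, 0.8, 0.1, 0.2, 0.7, 0.0]

ffa = mean_signal_by_percept(reports, ffa_signal)
ppa = mean_signal_by_percept(reports, ppa_signal)
print("FFA:", ffa)
print("PPA:", ppa)
```

In this toy data set the FFA average is higher during "face" reports and the PPA average is higher during "house" reports, which is the pattern the text describes.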
Something to Consider: Models of Brain Activity That Can Predict What a Person Is Looking At
When you look at a scene, a pattern of activity occurs in your
brain that represents the scene. When you look somewhere
else, a new pattern occurs that represents the new scene. Is
it possible to tell what scene a person is looking at by monitoring his or her brain activity? Some recent research has
brought us closer to achieving this feat and has furthered
our understanding of the connection between brain activity
and perception.
Yukiyasu Kamitani and Frank Tong (2005) took a step
toward being able to “decode” brain activity by measuring
observers’ fMRI response to grating stimuli—alternating
black and white bars like the one in Figure 5.50a. They presented gratings with a number of different orientations
(the one in Figure 5.50a slants 45 degrees to the right, for
example) and determined the response to these gratings
in a number of fMRI voxels. A voxel is a small cube-shaped
area of the brain about 2 or 3 mm on each side. (The size of a
voxel depends on the resolution of the fMRI scanner. Scanners are being developed that will be able to resolve areas
smaller than 2 or 3 mm on a side.)
One of the properties of fMRI voxels is that there is
some variability in how different voxels respond. For example, the small cubes representing voxels in Figure 5.50a show
that the 45-degree grating causes slight differences in the
responses of different voxels. A grating with a different orientation would cause a different pattern of activity in these
voxels. By using the information provided by the responses
of many voxels, Kamitani and Tong were able to create an
“orientation decoder,” which was able to determine what
orientation a person was looking at based on the person’s
Figure 5.50 ❚ (a) Observers in Kamitani and Tong’s (2005)
experiment viewed oriented gratings like the one on the left.
The cubes in the brain represent the response of 8 voxels.
The activity of 400 voxels was monitored in the experiment.
(b) Results for two orientations. The gratings are the stimuli
presented to the observer. The line on the right is the
orientation predicted by the orientation decoder. The
decoder was able to accurately predict when each of the
8 orientations was presented. (From Kamitani, Y., & Tong, F.,
Decoding the visual and subjective contents of the human
brain, Nature Neuroscience, 8, 679–685. Reprinted by
permission of Macmillan Publishers Ltd. Copyright 2005.)
brain activity. They created this decoder by measuring the
response of 400 voxels in the primary visual cortex (V1) and
a neighboring area called V2 to gratings with eight different
orientations. They then carried out a statistical analysis on
the patterns of voxel activity for each orientation to create
an orientation decoder designed to analyze the pattern of
activity recorded from a person’s brain and predict which
orientation the person was looking at.
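A toy version of such a decoder can be written in Python to show the general idea. The text says only that a "statistical analysis" was used, so the nearest-template rule below is a stand-in chosen for simplicity, not Kamitani and Tong's method. The simulated voxel biases, the noise level, the particular eight orientation values, and all function names are assumptions made for illustration.

```python
import random

# Eight assumed orientations in degrees; the text gives only the count.
ORIENTATIONS = [0.0, 22.5, 45.0, 67.5, 90.0, 112.5, 135.0, 157.5]
N_VOXELS = 400  # number of voxels monitored, as stated in the text


def simulate_pattern(bias, rng, noise_sd=1.0):
    """Simulate one noisy voxel pattern around a fixed per-voxel bias."""
    return [b + rng.gauss(0.0, noise_sd) for b in bias]


def train_decoder(training_patterns):
    """Average the training patterns for each orientation into one template."""
    templates = {}
    for orientation, patterns in training_patterns.items():
        n = len(patterns)
        templates[orientation] = [sum(column) / n for column in zip(*patterns)]
    return templates


def decode(templates, pattern):
    """Return the orientation whose template is closest to the pattern."""
    def squared_distance(template):
        return sum((t - p) ** 2 for t, p in zip(template, pattern))
    return min(templates, key=lambda o: squared_distance(templates[o]))


def run_demo(seed=0, n_train=20, n_test=10):
    """Train on simulated data; return the fraction of test trials decoded correctly."""
    rng = random.Random(seed)
    # Hypothetical weak orientation bias in every voxel.
    biases = {o: [rng.gauss(0.0, 0.5) for _ in range(N_VOXELS)] for o in ORIENTATIONS}
    training = {o: [simulate_pattern(biases[o], rng) for _ in range(n_train)]
                for o in ORIENTATIONS}
    templates = train_decoder(training)
    correct = 0
    for o in ORIENTATIONS:
        for _ in range(n_test):
            if decode(templates, simulate_pattern(biases[o], rng)) == o:
                correct += 1
    return correct / (n_test * len(ORIENTATIONS))


accuracy = run_demo()
print(f"decoding accuracy: {accuracy:.2f} (chance = {1 / len(ORIENTATIONS):.3f})")
```

The sketch makes one point from the text concrete: no single voxel identifies the orientation, but the combined pattern across many weakly biased voxels does.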
Kamitani and Tong demonstrated the predictive power
of their orientation decoder by presenting oriented gratings
to an observer and feeding the resulting fMRI response into
the decoder, which predicted which orientation had been
presented. The results, shown in Figure 5.50b, show that
the decoder accurately predicted the orientations that were
presented.
In another test of the decoder, Kamitani and Tong presented two overlapping gratings, creating a lattice like the
one in Figure 5.51, and asked their observers to pay attention
to one of the orientations. Because attending to each orientation resulted in different patterns of brain activity, the decoder was able to predict which of the orientations the person was paying attention to. Think about what this means.
If you were in Kamitani and Tong’s laboratory looking over
their observer’s shoulder as he or she was observing the overlapping gratings, you would have no way of knowing exactly
what the person was perceiving. But by consulting the orientation decoder, you could find out which orientation the observer was focusing on. The orientation decoder essentially
provides a window into the person’s mind.
Figure 5.51 ❚ The overlapping grating stimulus used for
Kamitani and Tong’s (2005) experiment, in which observers
were told to pay attention to one of the orientations at a
time. (From Kamitani, Y., & Tong, F., Decoding the visual and
subjective contents of the human brain, Nature Neuroscience,
8, 679–685. Reprinted by permission of Macmillan Publishers
Ltd. Copyright 2005.)
But what about stimuli that are more complex than
oriented gratings? Kendrick Kay and coworkers (2008) have
created a new decoder that can determine which photograph
of a natural scene has been presented to an observer. In the
first part of their experiment, they presented 1,750 black
and white photographs of a variety of natural scenes to an
observer and measured the activity in 500 voxels in the primary visual cortex (Figure 5.52). The goal of this part of the
experiment was to determine how each voxel responds to
specific features of the scene, such as the position of the image, the image’s orientation, and the degree of detail in the
image, ranging from fine details (like the two top images in
Figure 5.52) to images with little detail (like the bottom image). Based on an analysis of the responses of the 500 voxels
to the 1,750 images, Kay and coworkers created a scene decoder that was able to predict the voxel activity patterns that
would occur in the brain in response to images of scenes.
Figure 5.52 ❚ The first part of the Kay et al. (2008)
experiment, in which the scene decoder was created. They
determined the response properties of 500 voxels in the
striate cortex by measuring the response of each voxel as
they presented 1,750 images to an observer. Three images
like the ones Kay used are shown here. The cube represents
one of the 500 voxels. The scene decoder was created by
determining how each of the 500 voxels responded to an
image’s position in space, orientation, and level of detail.
(Photos by Bruce Goldstein.)
To test the decoder, Kay and coworkers did the following (Figure 5.53): (1) They measured the brain activity pattern to a test image that had never been presented before
(the lion in this example). (2) They presented this test image
and 119 other new images to the decoder, which calculated
the predicted voxel activity patterns (shown on the right) for
each image. (3) They selected the pattern that most closely
matched the actual brain activity elicited by the test image.
When they checked to see if the image that went with this
pattern was the same as the test image, they found that the
decoder identified 92 percent of the images correctly for one
observer, and 72 percent correctly for another observer. This
is impressive because chance performance for 120 images is
less than 1 percent. It is also impressive because the images
Figure 5.53 ❚ To test their scene decoder, Kay and coworkers (2008) first (a) measured an observer’s brain activity caused by
the presentation of a test image that the observer had never seen, and then (b) used the decoder to predict the pattern of voxel
activity for this test image and 119 other images. The highlighted pattern of voxel activity indicates that the decoder has correctly
matched the predicted response to the test image with the actual brain activity generated by the test image that was measured
in (a). In other words, the decoder was able to pick the correct image out of a group of 120 images as being the one that had
been presented to the observer. (Based on Kay, K. N., Naselaris, T., Prenger, R. J., & Gallant, J. L., Identifying natural images from
human brain activity, Nature, 452, 352–355, Fig 1, top. Reprinted by permission from Macmillan Publisher Ltd. Copyright 2008. Photos by Bruce Goldstein.)
Steps shown in the figure: (1) Measure brain activity to test image, giving the measured voxel activity pattern (response plotted against voxel number). (2) Present test image and 119 other images to the decoder, giving the predicted voxel activity patterns. (3) Select the predicted voxel pattern that most closely matches the pattern for the test image.
perceptual or from some other area—that help you solve
problems quickly using “best guess” rules. (p. 109)
presented were new ones, which the decoder had never been
exposed to before.
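The identification step, along with the chance level quoted above, can be sketched in Python. The text says only that the pattern that "most closely matched" was selected, so using the Pearson correlation as the measure of closeness is an assumption here. The tiny four-voxel patterns and the function names `pearson` and `identify` are hypothetical stand-ins for the 500-voxel patterns in the experiment.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equally long lists of voxel responses."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def identify(measured, predicted_patterns):
    """Return the index of the predicted pattern that best matches the measured one."""
    scores = [pearson(measured, predicted) for predicted in predicted_patterns]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example: predicted patterns for three candidate images, four voxels each.
predicted_patterns = [
    [1.0, 2.0, 3.0, 4.0],  # prediction for the image that was actually shown
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 2.0, 4.0],
]
measured = [1.1, 2.0, 2.9, 4.2]  # noisy measured response to the test image

best = identify(measured, predicted_patterns)
print("identified image index:", best)

# Guessing among 120 images is correct 1 time in 120.
chance = 1 / 120
print(f"chance performance: {chance:.2%}")  # below 1 percent
```

Note that the decoder never reconstructs the picture; it only ranks a fixed set of candidate images by how well their predicted patterns fit the measured activity.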
Do these results mean that we can now use brain activity to “read minds,” as suggested by some reports of
this research that appeared in the popular press? These
experiments do show that it is possible to identify information in the activity of the primary visual cortex that can predict which image out of a group of images a person is looking
at. However, we are still not able to create, from a person’s
brain activity, a picture that corresponds to what the person
is seeing. Nonetheless, this research represents an impressive step toward understanding how neural activity represents objects and scenes.
2. Consider this situation: We saw in Chapter 1 that top-down processing occurs when perception is affected by
the observer’s knowledge and expectations. Of course,
this knowledge is stored in neurons and groups of neurons in the brain. In this chapter, we saw that there are
neurons that have become tuned to respond to specific
characteristics of the environment. We could therefore
say that some knowledge of the environment is built
into these neurons. Thus, if a particular perception occurs because of the firing of these tuned neurons, does
this qualify as top-down processing? (p. 116)
3. Reacting to the results of the recent DARPA race, Harry
says, “Well, we’ve finally shown that computers can perceive as well as people.” How would you respond to this
statement? (p. 119)
4. Biological evolution caused our perceptual system to
be tuned to the Stone Age world in which we evolved.
Given this fact, how well do we handle activities like
downhill skiing or driving, which are very recent additions to our behavioral repertoire? (p. 115)
5. Vecera showed that regions in the lower part of a stimulus are more likely to be perceived as figure. How does
this result relate to the idea that our visual system is
tuned to regularities in the environment? (p. 108)
6. We are able to perceptually organize objects in the environment even when objects are similar, as in Figure 5.54.
What perceptual principles are involved in perceiving
two separate zebras? Consider both the Gestalt laws of
organization and the geons of RBC theory. What happens when you cover the zebras’ heads, so you see just
the bodies? Do these principles still work? Is there information in addition to what is proposed by the Gestalt
laws and RBC theory that helps you perceptually organize the two zebras? (p. 105)
7. How did you perceive the picture in Figure 5.55 when
you first looked at it? What perceptual assumptions in-
TEST YOURSELF 5.2
1. What is a “scene,” and how is it different from an
“object”?
2. What is the evidence that we can perceive the gist
of a scene very rapidly? What information helps us
identify the gist?
3. What are regularities in the environment? Give
examples of physical regularities, and discuss how
these regularities are related to the Gestalt laws of
organization.
4. What are semantic regularities? How do semantic
regularities affect our perception of objects within
scenes? What is the relation between semantic
regularities and the idea that perception involves
inference? What did Helmholtz have to say
about inference and perception? What is Bayesian
inference, and how is it related to Helmholtz’s ideas
about inference?
5. What is a way to make a robotic vision device more
effective? Why is there reason to think that machines are gaining on humans? What do computer
vision systems have to do before they can compete
with humans?
6. Describe research on (a) neurons that respond to
perceptual grouping and to figure–ground; (b) the
distributed nature of the representation of faces in
the brain; and (c) connections between brain activity
and perception (be sure you understand the
“Harrison Ford” experiment and the two binocular
rivalry experiments).
7. Describe how fMRI has been used to create “orientation decoders” and “scene decoders” that can
predict how the brain will respond to (a) oriented
gratings and (b) complex scenes.
THINK ABOUT IT
1. This chapter describes a number of perceptual heuristics, including the Gestalt “laws” and the light-from-above heuristic. Think of some other heuristics—either
Barbara Goldstein
Figure 5.54 ❚ Which principles of organization enable us to
tell the two zebras apart?
3. When does figure separate from ground? The Gestalt psychologists proposed that figure must be separated
from ground before it can be recognized. There is
evidence, however, that the meaning of an area can be
recognized before it has become separated from the
ground. This means that recognition must be occurring either before or at the same time as the figure is
being separated from ground. (p. 108)
Peterson, M. A. (1994). Object recognition processes
can and do operate before figure–ground organization. Current Directions in Psychological Science, 3,
105–111.
4. Global precedence. When a display consists of a large
object that is made up of smaller elements, what does
the nervous system process first, the large object or
the smaller elements? An effect called the global precedence effect suggests that the larger object is
processed first. (VL 21)
Navon, D. (1977). Forest before trees: The precedence
of global features in visual perception. Cognitive
Psychology, 9, 353–383.
5. Experience-dependent plasticity and object recognition. A
person’s experience can shape both neural responding and behavioral performance related to the recognition of objects. (p. 116)
Kourtzi, Z., & DiCarlo, J. J. (2006). Learning and
neural plasticity in visual object recognition. Current Opinion in Neurobiology, 16, 152–158.
6. Boundary extension effect. When people are asked to remember a photograph of a scene, they tend to remember a wider-angle view than was shown in the original
photograph. This suggests that visual mechanisms
infer the existence of visual layout that occurs beyond the boundaries of a given view. There is evidence
that the parahippocampal place area may be involved in
boundary extension. (p. 118)
Intraub, H. (1997). The representation of visual
scenes. Trends in Cognitive Sciences, 1, 217–222.
Park, S., Intraub, H., Yi, D.-J., Widders, D., & Chun,
M. M. (2007). Beyond the edges of view: Boundary
extension in human scene-selective cortex. Neuron,
54, 335–342.
7. Identifying cognitive states associated with perceptions. Research similar to that described in the Something to
Consider section has used fMRI to identify different
patterns of brain activation for tools and dwellings.
(p. 124)
Shinkareva, S. V., Mason, R. A., Malave, V. L., Wang,
W., Mitchell, T. M., & Just, M. (2008). Using fMRI
brain activation to identify cognitive states associated with perception of tools and dwellings. PLoS
ONE, 3(1), e1394.
Figure 5.55 ❚ The Scarf, a drawing by Rita Ludden.
fluenced your response to this picture? (For example,
did you make an assumption about how flowers are
usually oriented in the environment?) (p. 118)
IF YOU WANT TO KNOW MORE
1. Robotic vehicles. To find out more about the DARPA
race, go to www.grandchallenge.org or search for
DARPA on the Internet. (p. 101)
2. Perceiving figure and ground. When you look at the
vase–face pattern in Figure 5.21, you can perceive two
blue faces on a white background or a white vase on
a blue background, but it is difficult to see the faces
and the vase simultaneously. It has been suggested
that this occurs because of a heuristic built into the
visual system that takes into account the unlikelihood that two adjacent objects would have the same
contours and would line up perfectly. (p. 108)
Baylis, G. C., & Driver, J. (1995). One-sided edge assignment in vision: I. Figure–ground segmentation
and attention to objects. Current Directions in Psychological Science, 4, 140–146.
KEY TERMS
Accidental viewpoint (p. 111)
Algorithm (p. 109)
Apparent movement (p. 104)
Bayesian inference (p. 119)
Binocular rivalry (p. 122)
Border ownership (p. 108)
Contextual modulation (p. 120)
Discriminability (p. 112)
Figure (p. 108)
Figure–ground segregation (p. 108)
Geons (p. 110)
Gestalt psychologist (p. 103)
Gist of a scene (p. 114)
Global image features (pp. 114–115)
Ground (p. 108)
Heuristic (p. 109)
Illusory contour (p. 104)
Inverse projection problem (p. 101)
Law of common fate (p. 106)
Law of familiarity (p. 107)
Law of good continuation (p. 106)
Law of good figure (p. 105)
Law of pragnanz (p. 105)
Law of proximity (nearness) (p. 106)
Law of similarity (p. 105)
Law of simplicity (p. 105)
Laws of perceptual organization (p. 105)
Light-from-above heuristic (p. 116)
Likelihood principle (p. 119)
Masking stimulus (p. 114)
Non-accidental properties (NAPs) (p. 110)
Oblique effect (p. 116)
Perceptual organization (p. 105)
Perceptual segregation (p. 108)
Persistence of vision (p. 114)
Physical regularities (p. 116)
Principle of common region (p. 106)
MEDIA RESOURCES
The Sensation and Perception
Book Companion Website
www.cengage.com/psychology/goldstein
See the companion website for flashcards, practice quiz
questions, Internet links, updates, critical thinking exercises, discussion forums, games, and more!
CengageNow
www.cengage.com/cengagenow
Go to this site for the link to CengageNOW, your one-stop
shop. Take a pre-test for this chapter, and CengageNOW
will generate a personalized study plan based on your test
results. The study plan will identify the topics you need to
review and direct you to online resources to help you
master those topics. You can then take a post-test to help
you determine the concepts you have mastered and what
you will still need to work on.
Virtual Lab
VL
Your Virtual Lab is designed to help you get the most out
of this course. The Virtual Lab icons direct you to specific
media demonstrations and experiments designed to help
you visualize what you are reading about. The number
beside each icon indicates the number of the media element
you can access through your CD-ROM, CengageNOW, or
WebTutor resource.
Principle of componential recovery (p. 112)
Principle of synchrony (p. 106)
Principle of uniform connectedness (p. 106)
Recognition-by-components (RBC) theory (p. 110)
Region-of-interest (ROI) approach (p. 122)
Regularities in the environment (p. 115)
Reversible figure–ground (p. 108)
Scene (p. 114)
Semantic regularities (p. 117)
Sensations (p. 104)
Structuralism (p. 104)
Theory of unconscious inference (p. 119)
Viewpoint invariance (p. 103)
The following lab exercises are related to material in
this chapter:
1. Robotic Vehicle Navigation: DARPA Urban Challenge A video showing the robotic car “Boss” as it navigates a course in California. (Courtesy of Tartan Racing, Carnegie Mellon University.)
2. Apparent Movement How the illusion of movement can be created between two flashing dots.
3. Linear and Curved Illusory Contours Examples of how characteristics of an illusory contour display affect contours.
4. Enhancing Illusory Contours How adding components to a display can enhance illusory contours.
5. Context and Perception: The Hering Illusion How background lines can make straight parallel lines appear to curve outward.
6. Context and Perception: The Poggendorff Illusion How interrupting a straight line makes the segments of the line look as though they don’t line up. (Courtesy of Michael Bach.)
7. Ambiguous Reversible Cube A stimulus that can be perceived in a number of different ways, and does strange things when it moves. (Courtesy of Michael Bach.)
8. Perceptual Organization: The Dalmatian Dog How a black-and-white pattern can be perceived as a Dalmatian. (Courtesy of Michael Bach.)
9. Law of Simplicity or Good Figure A situation in which the law of good figure results in an error of perception.
10. Law of Similarity How characteristics of a display cause grouping due to similarity.
11. Law of Good Continuation How good continuation influences perceptual organization.
12. Law of Closure The effect of adding small gaps to an object.
13. Law of Proximity How varying the distance between elements influences grouping.
14. Law of Common Fate Grouping that occurs due to common movement of stimulus elements.
15. Real-World Figure–Ground Ambiguity A reversible figure–ground display using a picture of a real vase.
16. Figure–Ground Ambiguity How changing the contrast of a painting influences figure–ground segregation.
17. Perceiving Rapidly Flashed Stimuli Some rapidly flashed stimuli like those used in the Fei-Fei experiment that investigated what people perceive when viewing rapidly flashed pictures. (Courtesy of Li Fei-Fei.)
18. Rotating Mask 1 How our assumption about the three-dimensional shape of a face can create an error of perception. (Courtesy of Michael Bach.)
19. Rotating Mask 2 Another example of a rotating mask, this one with a Charlie Chaplin mask. (Courtesy of Michael Bach.)
20. Rotating Mask 3 Another rotating mask, this one with a nose ring! (Courtesy of Thomas Papathomas.)
21. Global Precedence An experiment to determine reaction times in response to large patterns and smaller elements that make up the larger patterns.
Answers for Figure 5.6. Faces from left to right: Prince
Charles, Woody Allen, Bill Clinton, Saddam Hussein, Richard Nixon, Princess Diana.
CHAPTER 6
Visual Attention
Chapter Contents
ATTENTION AND PERCEIVING THE ENVIRONMENT
Why Is Selective Attention Necessary?
How Is Selective Attention Achieved?
What Determines How We Scan a Scene?
HOW DOES ATTENTION AFFECT OUR ABILITY TO PERCEIVE?
Perception Can Occur Without Focused Attention
Perception Can Be Affected by a Lack of Focused Attention
DEMONSTRATION: Change Detection
❚ TEST YOURSELF 6.1
DOES ATTENTION ENHANCE PERCEPTION?
Effects of Attention on Information Processing
Effects of Attention on Perception
ATTENTION AND EXPERIENCING A COHERENT WORLD
Why Is Binding Necessary?
Feature Integration Theory
DEMONSTRATION: Searching for Conjunctions
The Physiological Approach to Binding
THE PHYSIOLOGY OF ATTENTION
SOMETHING TO CONSIDER: ATTENTION IN AUTISM
❚ TEST YOURSELF 6.2
Think About It
If You Want to Know More
Key Terms
Media Resources
VL VIRTUAL LAB
OPPOSITE PAGE This photo of PNC Park shows a Pittsburgh Pirates
game in progress and the city in the background. The yellow fixation dots and red lines indicate eye movements that show where one
person looked in the first 3 seconds of viewing this picture. The eye
movement record indicates that this person first looked just above the
right field bleachers and then scanned the ball game. Another person
might have looked somewhere else, depending on his or her interests
and what attracted his or her attention.
Eye movement record courtesy of John Henderson. Photo by Bruce Goldstein.
VL The Virtual Lab icons direct you to specific animations and videos
designed to help you visualize what you are reading about. The number
beside each icon indicates the number of the clip you can access through
your CD-ROM or your student website.
attention provides the “glue” that enables us to perceive a
coherent, meaningful visual world. Finally, we will describe
the connection between attention and neural firing.
Some Questions We Will Consider:
❚ Why do we pay attention to some parts of a scene but not to others? (p. 135)
❚ Do we have to pay attention to something to perceive it? (p. 137)
❚ Does paying attention to an object make the object “stand out”? (p. 142)
Attention and Perceiving the Environment
In everyday life we often have to pay attention to a number
of things at once, a situation called divided attention. For
example, when driving down the road, you need to simultaneously attend to the other cars around you, traffic signals,
and perhaps what the person in the passenger seat is saying,
while occasionally glancing up at the rearview mirror. But
there are limits to our ability to divide our attention. For
example, reading your textbook while driving would most
likely end in disaster. Although divided attention is something that does occur in our everyday experience, our main
interest in this chapter will be selective attention—focusing
on specific objects and ignoring others.
Look at the picture on the left, below (Figure 6.1) without looking to the right. Count the number of trees,
and then immediately read the caption below the picture.
It is likely that you could describe the picture on the left
much more accurately and in greater detail than the one on
the right. This isn’t surprising because you were looking directly at the trees on the left, and not at the hikers on the
right. The point of this exercise is that as we shift our gaze
from one place to another in our everyday perception of
the environment, we are doing more than just “looking”;
we are directing our attention to specific features of the
environment in a way that causes these features to become
more visible and deeply processed than those features that
are not receiving our attention.
To understand perception as it happens in the real
world, we need to go beyond just considering how we perceive isolated objects. We need to consider how observers
seek out stimuli in scenes, how they perceive some things
and not others, and how these active processes shape their
perception of these objects and things around them.
As we describe the processes involved in attention in
this chapter, we will continue our quest to understand perception as it occurs within the richness of the natural environment. We begin by considering why we pay attention
to specific things in the environment. We consider some of
the ways attention can affect perception and the idea that
Why Is Selective Attention Necessary?
Bruce Goldstein
Bruce Goldstein
Why do we selectively focus on some things and ignore others? One possible answer is that we look at things that are
interesting. Although that may be true, there is another,
more basic, answer. You selectively focus on certain things
in your environment because your visual system has been
constructed to operate that way.
We can appreciate why attending to only a portion of the environment is determined by the way our visual system is constructed by returning to Ellen as she is
walking in the woods (Figure 1.2). As she looks out at the
scene before her, millions of her receptors are stimulated,
and these receptors send signals out of the optic nerve and
Figure 6.1 ❚ How many trees are there? After counting the
trees, and without moving your eyes from the picture, indicate
how many of the first four hikers in the picture on the right
(Figure 6.2) are males.
Figure 6.2 ❚ Although you may have noticed that this is an
outdoor scene with people walking on a road, it is necessary
to focus your attention on the lead hikers to determine if they
are males or females.
toward the lateral geniculate nucleus (LGN) and visual cortex. The problem the visual system faces is that there is so
much information being sent from Ellen’s retina toward her
brain that if the visual system had to deal with all of it, it
would rapidly become overloaded. To deal with this problem, the visual system is designed to select only a small part
of this information to process and analyze.
One of the mechanisms that help achieve this selection
is the structure of the retina, which contains the all-cone
fovea (see page 50). This area supports detail vision, so we
must aim the fovea directly at objects we want to see clearly.
In addition, remember that information imaged on the fovea receives a disproportionate amount of processing compared to information that falls outside of the fovea because
of the magnification factor in the cortex (see page 82).
How Is Selective Attention Achieved?
One mechanism of selective attention is eye movements—
scanning a scene to aim the fovea at places we want to process more deeply. As we will see in the following section, the
eye is moving constantly to take in information from different parts of a scene. But even though eye movements are an
important mechanism of selective attention, it is also important to acknowledge that there is more to attention than
just moving the eyes to look at objects. We can pay attention to things that are not directly in our line of vision, as
evidenced by the basketball player who dribbles down court
while paying attention to a teammate off to the side, just before she throws a dead-on pass without looking. In addition,
we can look directly at something without paying attention to it. You may have had this experience: While reading
a book, you become aware that although you were moving
your eyes across the page and “reading” the words, you have
no idea what you just read. Even though you were looking at
the words, you apparently were not paying attention.
What the examples of the basketball player and reader
are telling us is that there is a mental aspect of attention
that occurs in addition to eye movements. This connection between attention and what is happening in the mind
was described more than 100 years ago by William James
(1890/1981), in his textbook Principles of Psychology:
Millions of items . . . are present to my senses
which never properly enter my experience. Why?
Because they have no interest for me. My experience is what I agree to attend to. . . . Everyone
knows what attention is. It is the taking possession by the mind, in clear and vivid form, of one
out of what seem several simultaneously possible objects or trains of thought. . . . It implies
withdrawal from some things in order to deal
effectively with others.
Thus, according to James, we focus on some things to
the exclusion of others. As you walk down the street, the
things you pay attention to—a classmate that you recognize, the “Don’t Walk” sign at a busy intersection, and the
fact that just about everyone except you seems to be carrying an umbrella—stand out more than many other things in
the environment. One of our concerns in this chapter is to
explain why attention causes some things to stand out more
than others. The first step in doing this is to describe the eye
movements that guide our eyes to different parts of a scene.
What Determines How We Scan a Scene?
The first task in the study of eye movements is to devise a
way to measure them. Early researchers measured eye movements using devices such as small mirrors and lenses that
were attached to the eyes, so the cornea had to be anesthetized (Yarbus, 1967). However, modern researchers use
camera-based eye trackers, like the one in Figure 6.3. An
eye tracker determines the position of the eye by taking pictures of the eye and noting the position of a reference point
such as a reflection that moves as the eye moves (Henderson,
2003; Morimoto & Mimica, 2005).
Figure 6.3 ❚ A person looking at a stimulus picture in
a camera-based eye tracker. (Reprinted from Trends in
Cognitive Sciences, 7, Henderson, John M., 498–503, (2003),
with permission from Elsevier.)
Figure 6.4 shows eye movements that occurred when
an observer viewed a picture of a fountain. Dots indicate
fixations—places where the eye pauses to take in information about specific parts of the scene. The lines connecting
the dots are eye movements called saccades. A person who
is asked to simply view a scene typically makes about
three fixations per second.
What determines where we fixate in a scene? The answer
to this question is complicated because our looking behavior
depends on a number of factors, including characteristics of
the scene and the knowledge and goals of the observer.
Stimulus Salience
Stimulus salience refers to
characteristics of the environment that stand out because
of physical properties such as color, brightness, contrast, or
orientation. Areas with high stimulus salience are conspicuous, such as a brightly colored red ribbon on a green Christmas tree.
Figure 6.4 ❚ Scan path of a viewer while freely viewing
a picture of a fountain in Bordeaux, France. Fixations are
indicated by the yellow dots and eye movements by the red
lines. Notice that this person looked preferentially at high-interest areas of the picture such as the statues and lights but
ignored areas such as the fence and the sky. (Reproduced with
permission from John Henderson, University of Edinburgh.)
Capturing attention by stimulus salience is a bottom-up process—it depends solely on the pattern of stimulation
falling on the receptors. By taking into account three characteristics of the display in Figure 6.5a—color, contrast, and
orientation—Derrick Parkhurst and coworkers (2002) created the saliency map in Figure 6.5b. To determine whether
observers’ fixations were controlled by stimulus saliency as
indicated by the map, Parkhurst measured where people fixated when presented with various pictures. He found that
the initial fixations were closely associated with the saliency
map, with fixations being more likely on high-saliency
areas.
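The idea of a saliency map can be made concrete with a small sketch. The code below is not the model used by Parkhurst and coworkers; it is a simplified illustration that assumes each feature (for example, redness or orientation) is already available as a small grid of numbers. The function names (`center_surround`, `normalize`, `saliency_map`, `most_salient`), the 3 × 3 neighborhood, and the example grids are all assumptions made for this illustration. Each location is scored by how much it differs from its immediate surroundings, and the scores for the separate features are averaged.

```python
def center_surround(feature):
    """Score each cell by its absolute difference from the mean of its neighbors."""
    rows, cols = len(feature), len(feature[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                feature[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            out[r][c] = abs(feature[r][c] - sum(neighbors) / len(neighbors))
    return out


def normalize(grid):
    """Scale a grid so its largest value is 1 (an all-zero grid stays all zero)."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0.0 for _ in row] for row in grid]
    return [[value / peak for value in row] for row in grid]


def saliency_map(feature_maps):
    """Average the normalized local-contrast maps of all features."""
    contrast_maps = [normalize(center_surround(f)) for f in feature_maps]
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [sum(m[r][c] for m in contrast_maps) / len(contrast_maps) for c in range(cols)]
        for r in range(rows)
    ]


def most_salient(saliency):
    """Return the (row, column) of the highest saliency value."""
    cells = [(r, c) for r in range(len(saliency)) for c in range(len(saliency[0]))]
    return max(cells, key=lambda rc: saliency[rc[0]][rc[1]])


# Hypothetical 5 x 5 scene: one red patch (the "ribbon") on a uniform background,
# with the same orientation everywhere.
redness = [[0.0] * 5 for _ in range(5)]
redness[1][3] = 1.0
orientation = [[0.3] * 5 for _ in range(5)]

scene_saliency = saliency_map([redness, orientation])
predicted_first_fixation = most_salient(scene_saliency)
print(predicted_first_fixation)
```

In this toy scene the red patch is the only location that differs from its surroundings, so it receives the highest saliency. A purely bottom-up account would predict that early fixations land there.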
But attention is not just based on what is bright or
stands out. Cognitive factors are important as well. A number of cognitively based factors have been identified as important for determining where a person looks.
Figure 6.5 ❚ (a) A visual scene. (b) Salience map of the
scene determined by analyzing the color, contrast, and
orientations in the scene. Lighter areas indicate greater
salience. (Reprinted from Vision Research, 42, Parkhurst, D.,
Law, K., and Niebur, E., 107–123, (2002), with permission from
Elsevier.)
Knowledge About Scenes
The knowledge we
have about the things that are often found in certain types
of scenes and what things are found together within a scene
can help determine where we look. For example, consider
how the observer scanned the ballpark in the chapter-opening picture facing page 133. Although we don’t know
the background of the particular person whose scanning
records are shown, we can guess that this person may have
used his or her knowledge of baseball to direct his or her
gaze to the base runner leading off of first base and then
to the shortstop and the runner leading off of second base.
We can also guess that someone with no knowledge of baseball might scan the scene differently, perhaps even ignoring
the players completely and looking at the city in the background instead.
You can probably think of other situations in which
your knowledge about specific types of scenes might influence where you look. You probably know a lot, for example,
about kitchens, college campuses, automobile instrument
panels, and shopping malls, and your knowledge about
where things are usually found in these scenes can help
guide your attention through each scene (Bar, 2004).
Nature of the Observer’s Task
Recently, lightweight, head-mounted eye trackers have been developed
that make it possible to track a person’s eye movements as
he or she performs tasks in the environment. These devices have
enabled researchers to show that when a person is carrying
out a task, the demands of the task override factors such as
stimulus saliency. Figure 6.6 shows the fixations and eye
movements that occurred as a person was making a peanut
butter sandwich. The process of making the sandwich begins with the movement of a slice of bread from the bag to
the plate. Notice that this operation is accompanied by an
eye movement from the bag to the plate. The peanut butter
jar is then fixated, then lifted and moved to the front as its
lid is removed. The knife is then fixated, picked up, and used
to scoop the peanut butter, which is then spread on
the bread (Land & Hayhoe, 2001).
Figure 6.6 ❚ Sequence of fixations of a person making a
peanut butter sandwich. The first fixation is on the loaf of
bread. (From Land, M. F., & Hayhoe, M. (2001). In what ways
do eye movements contribute to everyday activities? Vision
Research, 41, 3559–3565.)
The key finding of these measurements, and also of another experiment in which eye movements were measured
as a person prepared tea (Land et al., 1999), was that the
person fixated on few objects or areas that were irrelevant to
the task and that eye movements and fixations were closely
linked to the action the person was about to take. For example, the person fixated the peanut butter jar just before
reaching for it (Hayhoe & Ballard, 2005).
Learning From Past Experience
If a person
has learned the key components of making a peanut butter sandwich, this learning helps direct attention to objects,
such as the jar, the knife, and the bread, that are relevant to
the task. Another example of a task that involves learning is
driving. Hiroyuki Shinoda and coworkers (2001) measured
observers’ fixations and tested their ability to detect traffic
signs as they drove through a computer-generated environment in a driving simulator. They found that the observers
were more likely to detect stop signs positioned at intersections than those positioned in the middle of a block, and
that 45 percent of the observers’ fixations occurred close to
intersections. In this example, the observer is using learning about regularities in the environment (stop signs are
usually at corners) to determine when and where to look for
stop signs.
It is clear that a number of factors determine how a
person scans a scene. Salient characteristics may capture
a person’s initial attention, but cognitive factors become
more important as the observer’s knowledge of the meaning of the scene begins determining where he or she fixates.
Even more important than what a scene is, is what the person is doing within the scene. Specific tasks, such as making
a peanut butter sandwich or driving, exert strong control
over where we look.
How Does Attention Affect Our Ability to Perceive?
Although there is no question that attention is a major
mechanism of perception, there is evidence that we can take
in some information even from places where we are not focusing our attention.
Perception Can Occur Without Focused Attention
A recent demonstration of perception without focused attention has been provided by Leila Reddy and coworkers
(2007), who showed that we can take in information from a
rapidly presented photograph of a face that is located off to
the side from where we are attending. The procedure for
Reddy’s experiment is diagramed in Figure 6.7. Observers
looked at the + on the fixation screen (Figure 6.7a) and then
saw the central stimulus—an array of five letters (Figure 6.7b).
On some trials, all of the letters were the same; on other trials, one of the letters was different from the other four. Observers were instructed to keep looking at the center
of the array of letters.
The letters were followed immediately by the peripheral
stimulus—either a picture of a face or a disc that was half
green and half red, flashed at a random position on the edge
of the screen (Figure 6.7c). The face or disc was then followed by a mask, to limit the time it was visible (see Method:
Using a Mask, page 114), and then the central letter stimulus and mask were turned off.
There were three conditions in this experiment. In all
three conditions, the observers were instructed to look
steadily at the middle of the letter display, where the + had
appeared. The face or red–green disc stimulus was presented
off to the side for about 150 ms, so there was no time to
make eye movements. The three conditions were as follows:
1. Central task condition. The letters are flashed in the center of the screen, where the observer is looking. The
observer’s task is to indicate whether all of the letters
are the same. A face or a red–green disc is presented
off to the side, but these stimuli are not relevant in
this condition.
2. Peripheral task condition. The letters are flashed, as
in the central task condition, and observers are instructed to look at the center of the letters, but the
letters are not relevant in this condition. The observer’s task is to indicate whether a face flashed off to the
side is male or female, or if a disc flashed off to the
side is red–green or green–red.
3. Dual task condition. As in the other conditions, observers are always looking at the center of the letter display,
but they are asked to indicate both (1) if all the letters
in the middle are the same and (2) for the face stimulus, whether the face is a male or a female, or for the
disc stimulus, whether it is red–green or green–red.
Figure 6.7 ❚ Procedure for the Reddy et al. experiment.
See text for details. In (c) the peripheral stimulus was either
the face or the red-green disc. (Adapted from Reddy, L.,
Moradi, F., & Koch, C., 2007, Top-down biases win against
focal attention in the fusiform face area, Neuroimage 38,
730–739. Copyright 2007, with permission from Elsevier.)
One result of this experiment, which wasn’t surprising,
is that when observers only had to do one task at a time,
they performed well. In the central task condition and in
the peripheral task condition, performance was 80–90 percent on the letter task, the face task, or the disc task.
A result that was surprising is that in the dual task condition, in which observers had to do two tasks at once, performance on the faces was near 90 percent—just as high as it was
for the peripheral task condition (Figure 6.8, left bar). These
results indicate that it is possible to take in information
about faces even when attention is not focused on the faces.
You could argue that it might be possible to pay some
attention to the faces, even when images are presented
briefly off to the side. But remember that in the dual task
condition observers needed to focus on the letters to perform the letter task. Also, because they did not know exactly
where the pictures would be flashed, they were not able to
focus their attention on the discs or faces. Remember, also,
that the stimuli were flashed for only 150 ms, so the observers were not able to make eye movements.
The observers’ ability to tell whether the faces were male
or female shows that some perception is possible even in the
absence of focused attention. But although Reddy’s observers performed with 80–90 percent accuracy for the faces in
the dual task condition, performance on the red–green disc
task dropped to 54 percent (chance performance would be
50 percent) in the dual task condition (Figure 6.8, right bar).
Figure 6.8 ❚ Results from the dual task condition of the
Reddy and coworkers (2007) experiment. Observers were
able to accurately indicate whether faces were male or female
(left bar), but their performance dropped to near chance
accuracy when asked to indicate whether a disc was
red–green or green–red (right bar). (Based on data from
Reddy, L., Moradi, F., & Koch, C., 2007, Top-down biases win
against focal attention in the fusiform face area, Neuroimage
38, 730–739. Copyright 2007, with permission from Elsevier.)
Why is it that the gender of a face can be detected without focused attention, but the layout of a red–green disc
cannot? Reddy’s experiment doesn’t provide an answer to
this question, but a place to start is to consider differences
between the faces and the discs. Faces are meaningful, and
we have had a great deal of experience perceiving them.
There is also evidence that we initially process faces as a
whole, without having to perceive individual features (Goffaux & Rossion, 2006). All of these factors—meaningfulness,
experience, and perceiving as a whole—could make it possible to categorize faces as male or female without focusing
attention directly on the face. Whatever mechanism is responsible for the difference in performance between faces
and the red–green discs, there is no question that some
types of information can be taken in without focused attention and some cannot. We will now look at some further
demonstrations of situations in which perception depends
on focused attention.
Perception Can Be Affected by a Lack of Focused Attention
Evidence that attention is necessary for perception is provided by a phenomenon called inattentional blindness—
failure to perceive a stimulus that isn’t attended, even if it
is in full view.
Inattentional Blindness
Arien Mack and Irvin
Rock (1998) demonstrated inattentional blindness using
the procedure shown in Figure 6.9. The observer’s task is to
indicate which arm of a briefly flashed cross is longer, the
horizontal or the vertical. Then, on the inattention trial of
the series, a small test object is flashed close to where the observer is looking, along with the cross. When observers were
then given a recognition test in which they were asked to pick
out the object from four alternatives, they were unable to
indicate which shape had been presented. Just as paying attention to the letters in Reddy’s (2007) experiment affected
observers’ ability to perceive the red–green disc, paying attention to the vertical and horizontal arms in Mack and
Rock’s experiment apparently made observers “blind”
to the unattended geometric objects.
Figure 6.9 ❚ Inattentional blindness experiment.
(a) Participants judge whether the horizontal or
vertical arm is larger on each trial. (b) After a few
trials, a geometric shape is flashed, along with
the arms. (c) Then the participant is asked to pick
which geometric stimulus was presented.
Mack and Rock demonstrated inattentional blindness
using rapidly flashed geometric test stimuli. But other research has shown that similar effects can be achieved using
more naturalistic stimuli that are presented for longer periods of time. Imagine looking at a display in a department
store window. When you focus your attention on the display,
you probably fail to notice the reflections on the surface of
the window. Shift your attention to the reflections, and you
become unaware of the display inside the window.
Daniel Simons and Christopher Chabris (1999) created
a situation in which one part of a scene is attended and the
other is not. They created a 75-second fi lm that showed two
teams of three players each. One team was passing a basketball around, and the other was “guarding” that team by following them around and putting their arms up as in a basketball game. Observers were told to count the number of passes,
a task that focused their attention on one of the teams. After
about 45 seconds, one of two events occurred. Either a woman
carrying an umbrella or a person in a gorilla suit walked
through the “game,” an event that took 5 seconds.
After seeing the video, observers were asked whether
they saw anything unusual happen or whether they saw
anything other than the six players. Nearly half—46 percent—of the observers failed to report that they saw the
woman or the gorilla. In another experiment, when the
gorilla stopped in the middle of the action, turned to face
the camera, and thumped its chest, half of the observers
still failed to notice the gorilla (Figure 6.10). These experiments demonstrate that when observers are attending to
one sequence of events, they can fail to notice another event,
even when it is right in front of them (also see Goldstein &
Fink, 1981; Neisser & Becklen, 1975). If you would like to
experience this demonstration for yourself (or perhaps try
it on someone else), go to http://viscog.beckman.uiuc.edu/
media/goldstein.html or Google “gorilla experiment.”
Change Detection
Following in the footsteps of
the superimposed image experiments, researchers developed another way to demonstrate how a lack of focused attention can affect perception. Instead of presenting several
stimuli at the same time, they first presented one picture,
then another slightly different picture. To appreciate how
this works, try the following demonstration.
DEMONSTRATION
Change Detection
When you are finished reading these instructions, look at the
picture in Figure 6.11 for just a moment, and then turn the
page and see whether you can determine what is different in
Figure 6.12. Do this now. ❚
Figure 6.11 ❚ Stimulus for change blindness demonstration.
See text.
Figure 6.12 ❚ Stimulus for change blindness demonstration.
Were you able to see what was different in the second
picture? People often have trouble detecting the change
even though it is obvious when you know where to look.
(Try again, paying attention to the sign near the lower left
portion of the picture.) Ronald Rensink and coworkers
(1997) did a similar experiment in which they presented one
picture, followed by a blank field, followed by the same picture but with an item missing, followed by the blank field,
and so on. The pictures were alternated in this way until observers were able to determine what was different about the
two pictures. Rensink found that the pictures had to be alternated back and forth a number of times before
the difference was detected.
This difficulty in detecting changes in scenes is called
change blindness (Rensink, 2002). The importance of attention (or lack of it) in determining change blindness is
demonstrated by the fact that when Rensink added a cue
indicating which part of a scene had been changed, participants detected the changes much more quickly (also see
Henderson & Hollingworth, 2003).
The change blindness effect also occurs when the scene
changes in different shots of a film. Figure 6.13 shows successive frames from a video of a brief conversation between
two women. The noteworthy aspect of this video is that
changes take place in each new shot. In Shot (b), the woman’s scarf has disappeared; in Shot (c), the other woman’s
hand is on her chin, although moments later, in Shot (d),
both arms are on the table. Also, the plates change color
from red in the initial views to white in Shot (d).
Although participants who viewed this video were told
to pay close attention, only 1 of 10 participants claimed
to notice any changes. Even when the participants were
shown the video again and were warned that there would
be changes in “objects, body position, or clothing,” they
noticed fewer than a quarter of the changes that occurred
(Levin & Simons, 1997).
Figure 6.13 ❚ Frames from a video that demonstrates
change blindness. The woman on the right is wearing a
scarf around her neck in shots (a), (c), and (d), but not in
shot (b). Also, the color of the plates changes from red in
the first three frames to white in frame (d), and the hand
position of the woman on the left changes between shots
(c) and (d). (From “Failure to Detect Changes to Attended
Objects in Motion Pictures,” by D. Levin and D. Simons, 1997,
Psychonomic Bulletin and Review, 4, 501–506.)
This blindness to change in films is not just a laboratory
phenomenon. It occurs regularly in popular films, in which
some aspect of a scene, which should remain the same,
changes from one shot to the next, just as objects changed
in the film shots in Figure 6.13. These changes in films,
which are called continuity errors, are spotted by viewers who
are looking for them, usually by viewing the film multiple
times, but are usually missed by viewers in theaters who are
not looking for these errors. You can find sources of continuity errors in popular films by Googling “continuity errors.”
Change blindness is interesting not only because it illustrates the importance of attention for perception, but
also because it is a counterintuitive result. When Daniel
Levin and coworkers (2000) told a group of observers about
the changes that occurred in film sequences like the ones in
Figure 6.13, and also showed them still shots from the film,
83 percent of the observers predicted that they would notice the changes. However, in experiments in which observers did not know which changes were going to occur, only
11 percent noticed the changes. Thus, even though people
believe that they would detect such obvious changes, they
fail to do so when actually tested.
One reason people think they would see the changes
may be that they know from past experience that changes
that occur in real life are usually easy to see. But there is
an important difference between changes that occur in real
life and those that occur in change detection experiments.
Changes that occur in real life are often accompanied by
motion, which provides a cue that indicates a change is occurring. For example, when a friend walks into a room, the
person’s motion attracts your attention. However, the appearance of a new object in a change detection experiment
is not signaled by motion, so your attention is not attracted
to the place where the object appears. The change detection experiments therefore show that when attention is disrupted, we miss changes.
To summarize this section, the answer to the question
“How does attention affect our ability to perceive?” is that
we can perceive some things, such as the gender of a face,
without focused attention, but that focused attention is
necessary for detecting many of the details within a scene
and for detecting the details of specific objects in the scene.
TEST YOURSELF 6.1
1. What are two reasons that we focus on some
things and ignore others? Relate your answer to the
structure and function of the visual system.
2. What is selective attention? Divided attention?
3. What are the general characteristics of eye movements and fixations?
4. Describe the factors that influence how we direct
our attention in a scene.
5. What does it mean to say that perception can occur
without focused attention?
6. Describe the following two situations that illustrate
how attention affects our ability to perceive: (1) inattentional blindness; (2) change detection.
7. What is the reasoning behind the idea that change
blindness occurs because of a lack of attention? In
your answer, indicate how the situation in change
blindness experiments differs from the situation in
which change occurs in real life.
Does Attention Enhance Perception?
William James, whose statement at the beginning of this
chapter described attention as withdrawing from some
things in order to deal effectively with others, did no experiments. Thus, many of the statements he made in his book
Principles of Psychology were based purely on James’s psychological insights. What is amazing about these insights
is that many of them were correct. Consider, for example,
James’s idea that attending to a stimulus makes it more
“clear and vivid.” Although this idea may seem reasonable,
it has only recently been confirmed experimentally. We will
consider this evidence by first describing some experiments
showing that paying attention increases our ability to react
rapidly to a stimulus.
Effects of Attention on Information Processing
Michael Posner and coworkers (1978) were interested in
answering the following question: Does attention to a specific location improve our ability to respond rapidly to a
stimulus presented at that location? To answer this question, Posner used a procedure called precueing, as shown in
Figure 6.14.
Posner’s observers kept their eyes stationary throughout
the experiment, always looking at the +. They first saw an arrow cue indicating the side on which a target stimulus
was likely to appear. In Figure 6.14a the cue indicates that
they should focus their attention to the right. (Remember,
they do this without moving their eyes.) The observer’s task
is to press a key as rapidly as possible when a target square is
presented off to the side. The trial shown in Figure 6.14a is
a valid trial because the square appears on the side indicated
by the cue arrow. The location indicated by the arrow was
valid 80 percent of the time. Figure 6.14b shows an invalid
trial. The cue arrow indicates that the observer should attend to the left, but the target is presented on the right.
Figure 6.14 ❚ Procedure for (a) the valid task and (b) the
invalid task in the Posner et al. (1978) precueing experiment;
see text for details. (c) Results of the experiment: Average
reaction time was 245 ms for valid trials but 305 ms for invalid
trials. (From Posner, M. I., Nissen, M. J., & Ogden, W. C.,
1978, Attended and unattended processing modes: The role
of set for spatial location. In H. L. Pick & I. J. Saltzman (Eds.),
Modes of perceiving and processing information. Hillsdale,
N.J.: Erlbaum.)
The results of this experiment, shown in Figure 6.14c,
indicate that observers react more rapidly on valid trials than on invalid trials. Posner interpreted this result as
showing that information processing is more effective at
the place where attention is directed.
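The size of this advantage can be worked out directly from the average reaction times reported in Figure 6.14c (245 ms on valid trials, 305 ms on invalid trials). The short sketch below does only that arithmetic. The function name `cueing_effect` and the overall average across trial types are illustrative additions, not quantities reported by Posner and coworkers.

```python
VALID_RT_MS = 245      # average reaction time on valid trials (Figure 6.14c)
INVALID_RT_MS = 305    # average reaction time on invalid trials (Figure 6.14c)
PERCENT_VALID = 80     # the cue pointed to the correct side on 80 percent of trials


def cueing_effect(valid_rt_ms, invalid_rt_ms):
    """Extra time needed to respond when attention was directed to the wrong side."""
    return invalid_rt_ms - valid_rt_ms


effect_ms = cueing_effect(VALID_RT_MS, INVALID_RT_MS)

# Illustrative only: the average across all trials, assuming the 80/20 split
# of valid and invalid trials and the two averages above.
overall_rt_ms = (
    PERCENT_VALID * VALID_RT_MS + (100 - PERCENT_VALID) * INVALID_RT_MS
) / 100

print(effect_ms, overall_rt_ms)
```

The difference of 60 ms is the cost of having attention directed to the wrong location when the target appears.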
There is also evidence that when attention is directed
to one place on an object, the enhancing effect of this
attention spreads throughout the object. This idea was
demonstrated in an experiment by Robert Egly and coworkers (1994), in which the observer first saw two side-by-side
rectangles, as shown in Figure 6.15a. As the observer looked
at the +, a cue signal was flashed at one location (A, B, C, or
D). After the cue signal, a target was presented at one of the
positions, and the observer responded as rapidly as possible
(Figure 6.15b). Reaction time was fastest when the target appeared where the cue signal had been presented (at A in this
example). Like Posner’s experiment, this shows that paying
attention to a location results in faster responding when a
target is presented at that location.
But the most important result of this experiment is
that observers responded faster when the target appeared at
B, which is in the same rectangle as A, than when the target
appeared at C, which is in the neighboring rectangle. Notice
that B’s advantage occurs even though B and C are the same
distance from A. Apparently the enhancing effect of attention had spread within the rectangle on the right, so when
the cue was at A, some enhancement occurred at B but not
at C, which was just as close but was in a different object.
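The reaction times shown in Figure 6.15 for a cue at position A (324 ms at A, 358 ms at B, and 374 ms at C) allow the two effects to be separated with simple subtraction. The sketch below is only that arithmetic; the variable names are labels chosen for this illustration.

```python
# Reaction times (ms) from Figure 6.15 when the cue appeared at position A.
reaction_time_ms = {
    "A": 324,  # target at the cued location
    "B": 358,  # uncued location in the same rectangle as A
    "C": 374,  # uncued location in the other rectangle, same distance from A
}

# Cost of the target appearing away from the cued location, within the same object.
location_cost_ms = reaction_time_ms["B"] - reaction_time_ms["A"]

# Advantage of being in the cued object, with distance from the cue held equal.
same_object_advantage_ms = reaction_time_ms["C"] - reaction_time_ms["B"]

print(location_cost_ms, same_object_advantage_ms)
```

Responses at B are 34 ms slower than at the cued location, but still 16 ms faster than at C. Because B and C are equally far from A, that 16 ms difference cannot be explained by distance and is attributed to B being part of the cued object.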
The same result occurs even when a horizontal bar is
added to the display, as shown in Figure 6.16a (Moore et al.,
1998). Even though the bar is covering the vertical rectangles, presenting the cue at A still results in enhancement
at B. What this means is that enhancement still spreads
throughout the object. This “spreading enhancement” may
help us perceive partially obscured objects, such as our “animal” lurking behind the tree from Chapter 5 (Figure 6.16b).
Figure 6.15 ❚ In Egly et al.’s (1994) experiment, (a) a cue
signal appears at one place on the display. Then the cue is
turned off and (b) a target is flashed at one of four possible
locations, A, B, C, or D. When the cue appeared at position A,
reaction times were 324 ms at A, 358 ms at B, and 374 ms at C.
Figure 6.16 ❚ (a) Stimulus in Figure 6.15, but with a
horizontal bar added (Moore et al., 1998). (b) Possible animal
lurking behind a tree (see Chapter 5, p. 110).
Because the effects of attention spread behind the tree, our
awareness spreads throughout the object, thereby enhancing
the chances we will interpret the interrupted shape as being
a single object. (Also see Baylis & Driver, 1993; Driver & Baylis, 1989, 1998; and Lavie & Driver, 1996, for more demonstrations of how attention spreads throughout objects.)
Does the finding that attention can result in faster reaction times show that attention can change the appearance
of an object, as William James suggested? Not necessarily. It
is possible that the target stimulus could appear identical in
the valid and invalid trials, but that attention was enhancing the observer’s ability to press the button quickly. Thus, to
answer the question of whether attention affects an object’s
appearance, we need to do an experiment that measures the
perceptual response to a stimulus rather than the speed of responding to the stimulus.
Effects of Attention on Perception
One possible way to measure the perceptual response to seeing
a stimulus is shown in Figure 6.17a. An observer views two
stimuli and is instructed to pay attention to one of them
and decide whether this attended stimulus is brighter than
the other, unattended, stimulus. The stimuli could be presented at different intensities from trial to trial, and the
goal would be to determine whether observers report that
the attended stimulus appears brighter when the two stimuli have the same intensity.
Figure 6.17 ❚ (a) Stimuli to measure how attention might
affect perception. (b) A better procedure was devised by
Carrasco et al. (2004), using grating stimuli.
This procedure is a step in the right direction because
it focuses on what the observer is seeing rather than on how
fast the observer is reacting to the stimulus. But can we be
sure that the observer is accurately reporting his or her perceptions? If the observer has a preconception that paying attention to a stimulus should make it stand out more, this
might influence the observer to report that the attended
stimulus appears brighter when, in reality, the two stimuli
appear equally bright (Luck, 2004).
A recent study by Marisa Carrasco and coworkers
(2004) was designed to reduce the possibility that bias could
occur because of observers’ preconceptions about how attention should affect their perception. Carrasco used grating stimuli with alternating light and dark bars, like the
one in Figure 6.17b. She was interested in determining
whether attention enhanced the perceived contrast between
the bars. Higher perceived contrast would mean that there
appeared to be an enhanced difference between the light and
dark bars. However, instead of asking observers to judge the
contrast of the stimuli, she instructed them to indicate the
orientation of the grating that had the higher contrast. For
the stimuli shown in the illustration, the correct response
would be the grating on the right, because it has a slightly
higher contrast than the one on the left. Thus, the observer
had to first decide which grating had higher contrast and
then indicate the orientation of that grating.
Notice that although the observer in this experiment had
to decide which grating had higher contrast, they were asked
to report the orientation of the grating. Having the observer
focus on responding to orientation rather than to contrast
reduced the chances that they would be influenced by their
expectation about how attention should affect contrast.
Carrasco’s observers kept their eyes fixed on the +. Just
before the gratings were presented, a small dot was briefly
flashed on the left or on the right to cause observers to shift
their attention to that side. Remember, however, that just as
in Posner’s studies, observers continued to look steadily at
the fixation cross. When the two gratings were presented,
the observer indicated the orientation of the one that appeared to have more contrast.
Carrasco found that when there was a large difference in contrast between the two gratings, the attention-capturing dot had no effect. However, when two gratings
were physically identical, observers were more likely to report the orientation of the one that was preceded by the
dot. Thus, when two gratings were actually the same, the
one that received attention appeared to have more contrast.
More than 100 years after William James suggested that attention makes an object “clear and vivid,” we can now say
that we have good experimental evidence that attention
does, in fact, enhance the appearance of an object. (Also see
Carrasco, in press; Carrasco et al., 2006.)
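The logic of the grating task can be sketched as a toy decision rule. This is only an illustration under stated assumptions, not the model used in the study: the function name, the contrast values, and the `attention_boost` parameter are all hypothetical, and the sketch returns the chosen side rather than the reported orientation.

```python
def reported_grating(contrast_left, contrast_right, cued_side, attention_boost=0.05):
    """Return the side ('left' or 'right') whose grating appears higher in contrast.

    attention_boost is a hypothetical amount added to the apparent contrast
    of the cued grating. It is an illustrative assumption, not a measured value.
    """
    apparent = {"left": contrast_left, "right": contrast_right}
    apparent[cued_side] += attention_boost
    # The observer reports on whichever grating has the higher apparent contrast.
    return max(apparent, key=apparent.get)


# Physically identical gratings: the cued side is chosen.
equal_case = reported_grating(0.2, 0.2, cued_side="left")

# Large physical difference: the cue does not change the choice.
large_difference_case = reported_grating(0.1, 0.5, cued_side="left")
```

In this toy rule a small boost decides the outcome only when the physical contrasts are close, which mirrors the pattern described for identical versus very different gratings.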
Attention and Experiencing a Coherent World
We have seen that attending to an object brings it to the
forefront of our consciousness and may even alter its appearance. Furthermore, not attending to an object can
cause us to miss it altogether. We now consider yet another
function of attention, one that is not obvious from our everyday experience. This function of attention is to help create binding, which is the process by which features—such as
color, form, motion, and location—are combined to create
our perception of a coherent object.
Why Is Binding Necessary?
We can appreciate why binding is necessary by remembering
our discussion of modularity in Chapter 4, when we learned
that separated areas of the brain are specialized for the perception of different qualities. In Chapter 4 we focused on
the inferotemporal (IT) cortex, which is associated with
perceiving forms. But there are also areas associated with
motion, location, and possibly color (the exact location of
a color area, if it exists, is still being researched) located at
different places in the cortex.
Thus, when you see a red ball roll by, cells sensitive to
the ball’s shape fire in the IT cortex, cells sensitive to movement fire in the middle temporal (MT) cortex, and cells
sensitive to color fire in other areas (Figure 6.18). But even
though the ball’s shape, movement, and color cause firing
in different areas of the cortex, you don’t perceive the ball as
separated shape, movement, and color perceptions. You experience an integrated perception of a ball, with all of these
components occurring together.
[Figure 6.18 labels: Depth, Location, Motion, Color, Form; stimulus: rolling ball]
Figure 6.18 ❚ Any stimulus, even one as simple as a rolling
ball, activates a number of different areas of the cortex.
Binding is the process by which these separated signals are
combined to create a unified percept.
This raises an important question: How do we combine
all of these physically separated neural signals to achieve
a unified perception of the ball? This question, which is
called the binding problem, has been answered at both the
behavioral and physiological levels. We begin at the behavioral level by describing feature integration theory, which
assigns a central role to attention in the solution of the
binding problem.
Feature Integration Theory
Feature integration theory, originally proposed by Anne
Treisman and Garry Gelade (1980; also see Treisman, 1988,
1993, 1999), describes the processing of an object by the visual system as occurring in two stages (Figure 6.19).1 The
first stage is called the preattentive stage because it does
not depend on attention. During this stage, which occurs so
rapidly that we’re not aware of it, an object is broken down
into features such as color, orientation, and location.
The second stage is called the focused attention stage
because it does depend on attention. In this stage, the features are recombined, so we perceive the whole object, not
individual features.
Treisman links the process of binding that occurs in the
focused attention stage to physiology by noting that an object causes activity in both the what and where streams of the
cortex (see page 88). Activity in the what stream would include information about features such as color and form. Activity in the where stream would include information about
location and motion. According to Treisman, attention is
the “glue” that combines the information from the what and
where streams and causes us to perceive all of the features of
an object as being combined at a specific location.
1 This is a simplified version of feature integration theory. For a more detailed
description of the model, which also includes “feature maps” that code the
location of each of an object’s features, see Treisman (1999).
Figure 6.19 ❚ Flow diagram of Treisman’s (1988) feature
integration theory: object, then preattentive stage (features separated),
then focused attention stage (features combined), then perception.
Let’s consider how this might work for the object in
Figure 6.20a. All of this object’s features are registered as
being located in the same area because this is the only object present. When we pay attention to the object, its features are all combined at that location, and we perceive the
object. This process is simple because we are dealing with
a single object at a fixed location. However, things become
more complicated when we introduce multiple objects, as
normally occurs in the environment.
When we consider multiple objects, numerous features
are involved, and these features exist at many different locations (Figure 6.20b). The perceptual system’s task is to
associate each of these features with the object to which it
belongs. Feature integration theory proposes that in order
for this to occur, we need to focus our attention on each object in turn. Once we attend to a particular location, the features at that location are bound together and are associated
with the object at that location.
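The two stages described above can be sketched in code. This is a loose illustration rather than Treisman's model: the data layout, function names, and example display are assumptions made for the sketch, with locations written as hypothetical grid coordinates.

```python
from collections import defaultdict


def preattentive_stage(objects):
    """Register each feature separately, tagged only with its location."""
    feature_maps = defaultdict(dict)  # feature type -> {location: value}
    for location, features in objects.items():
        for feature_type, value in features.items():
            feature_maps[feature_type][location] = value
    return feature_maps


def focused_attention_stage(feature_maps, attended_location):
    """Gather the features registered at the attended location into one object."""
    return {
        feature_type: by_location[attended_location]
        for feature_type, by_location in feature_maps.items()
        if attended_location in by_location
    }


# Hypothetical display: two objects at two locations.
display = {
    (0, 0): {"color": "red", "shape": "triangle"},
    (1, 0): {"color": "green", "shape": "circle"},
}

maps = preattentive_stage(display)
bound = focused_attention_stage(maps, (0, 0))
print(bound)
```

After the first stage, color and shape are stored in separate maps. Only the lookup by attended location puts "red" and "triangle" back together, which is the role the theory assigns to attention.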
What evidence supports the idea that focused attention is necessary for binding? One line of evidence, illusory
conjunctions, is based on the finding that under some conditions, features associated with one object can become incorrectly associated with another object.
Illusory Conjunctions
Illusory conjunctions were
first demonstrated in an experiment by Treisman and
Schmidt (1982), which used a stimulus display of four objects flanked by two black numbers, as shown in Figure 6.21.
They flashed this display onto a screen for one-fifth of a second, followed by a random-dot masking field designed to
eliminate any residual perception that might remain after
the stimuli were turned off. Observers were told to report
the black numbers first and then to report what they saw
at each of the four locations where the shapes had been.
Under these conditions, observers reported seeing illusory
conjunctions on 18 percent of the trials. For example, after
being presented with the display in Figure 6.21, in which
the small triangle was red and the small circle was green,
they might report seeing a small red circle and a small green
triangle.
Although illusory conjunctions may seem like a phenomenon that would occur only in the laboratory, Treisman (2005) relates a situation in which she perceived illusory conjunctions in the environment. After thinking she’d
seen a bald-headed man with a beard, she looked again and
realized that she had actually seen two men, one bald and
one with a beard, and had combined their features to create an illusory bald, bearded man.
Figure 6.20 ❚ (a) A single object.
Binding features is simple in this case
because all of the features are at one
location. (b) When multiple objects with
many features are present, binding
becomes more complicated. (Bruce Goldstein)
[Figure 6.20 feature labels: straight, curved, round, red, dark red, yellow, orange, rough texture]
Figure 6.21 ❚ Stimuli for Treisman and Schmidt’s (1982)
illusory conjunction experiment.
The reason illusory conjunctions occurred for the stimuli in Figure 6.21 is that these stimuli were presented rapidly, and the observers’ attention was distracted from the
target object by having them focus on the black numbers.
Treisman and Schmidt found, however, that asking their
observers to attend to the target objects eliminated the illusory conjunctions.
More evidence that supports the idea that illusory conjunctions are caused by a failure of attention is provided
by studies of patient R.M., who had parietal lobe damage
that resulted in a condition called Balint’s syndrome. The
crucial characteristic of this syndrome is an inability to focus attention on individual objects. According to feature
integration theory, lack of focused attention would make it
difficult for R.M. to combine features correctly, and this
is exactly what happened. When R.M. was presented with
two different letters of different colors, such as a red T and
a blue O, he reported illusory conjunctions such as “blue T”
on 23 percent of the trials, even when he was able to view
the letters for as long as 10 seconds (Friedman-Hill et al.,
1995; Reddy et al., 2006; Robertson et al., 1997).
Visual Search
Another approach to studying the role
of attention in binding has used a task called visual search.
Visual search is something we do anytime we look for an
object among a number of other objects, such as looking for
a friend in a crowd or trying to find Waldo in a “Where’s
Waldo?” picture (Handford, 1997). A type of visual search
called a conjunction search has been particularly useful in
studying binding.
DEMONSTRATION
Searching for Conjunctions
We can understand what a conjunction search is by first
describing another type of search called a feature search.
Before reading further, look at Figure 6.22, and find the
horizontal line in (a) and the green horizontal line in (b). The
search you carried out in Figure 6.22a was a feature search
because the target can be found by looking for a single
feature, “horizontal.” In contrast, the search you carried out
in Figure 6.22b was a conjunction search because it was
necessary to search for a combination (or conjunction) of
two or more features in the same stimulus: “horizontal” and
“green.” In Figure 6.22b, you couldn’t focus just on green
because there are vertical green lines, and you couldn’t focus
just on horizontal because there are horizontal red lines. You
had to look for the conjunction of horizontal and green. ❚
Figure 6.22 ❚ Find the horizontal line in (a) and then the
green horizontal line in (b).
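The difference between the two kinds of target can be expressed as a short sketch. The display below is a hypothetical stand-in for Figure 6.22b, and the function names are illustrative. The code captures only how the target is defined, not how attention finds it.

```python
# Hypothetical display: red and green lines, vertical and horizontal.
display = [
    {"color": "red", "orientation": "vertical"},
    {"color": "green", "orientation": "vertical"},
    {"color": "red", "orientation": "horizontal"},
    {"color": "green", "orientation": "horizontal"},
]


def feature_search(items, feature, value):
    """Indices of items that match on a single feature."""
    return [i for i, item in enumerate(items) if item[feature] == value]


def conjunction_search(items, **required):
    """Indices of items that match every required feature at once."""
    return [
        i for i, item in enumerate(items)
        if all(item[name] == value for name, value in required.items())
    ]


horizontal_items = feature_search(display, "orientation", "horizontal")
green_items = feature_search(display, "color", "green")
target = conjunction_search(display, color="green", orientation="horizontal")
print(horizontal_items, green_items, target)
```

Neither single-feature search isolates the target in this display, because each returns two items. Only the combined test on color and orientation picks out the green horizontal line.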
Conjunction searches are useful for studying binding because finding the target in a conjunction search involves focusing attention at a specific location. To test the
idea that attention to a location is required for a conjunction search, a number of researchers have tested the Balint’s
patient R.M. and have found that he cannot find the target when a conjunction search is required (Robertson et al.,
1997). This is what we would expect, because of R.M.’s difficulty in focusing attention. R.M. can, however, find targets
when only a feature search is required, as in Figure 6.22a,
because attention-at-a-location is not required for this kind
of search.
The link between the parietal lobe, which is damaged in
patients with Balint’s syndrome, and conjunction searches
is also supported by the fact that other patients with parietal lobe damage also have difficulty performing conjunction searches (Ashbridge et al., 1999). In addition, carrying
out a conjunction search activates the parietal lobe in people without brain damage (Shafritz et al., 2002). This connection between the parietal lobe and conjunction searches
makes sense when we remember that the parietal lobe is the
destination of the where stream, which is involved in determining the locations of objects.
In conclusion, behavioral evidence suggests that it is
necessary to focus attention at a location in order to achieve
binding. We will now consider how the binding problem has
been approached physiologically.
The Physiological Approach to Binding
To solve the binding problem, the brain must combine information contained in neurons that are located in different places. For example, in the case of our rolling red ball,
the brain must combine information from separate areas
that are activated by form, color, and motion. Anatomical
connections between these different areas enable neurons
in these areas to communicate with one another (Gilbert &
Wiesel, 1989; Lamme & Roelfsema, 2000). But what is it
that they communicate?
One physiological solution to the binding problem, the
synchrony hypothesis, states that when neurons in different parts of the cortex are firing to the same object, the pattern of nerve impulses in these neurons will be synchronized
with each other. For example, consider the two “objects”
in Figure 6.23—the woman and the dog. The image of the
woman on the retina activates neurons in a number of different places in the visual cortex. The activity in two of the
neurons activated by the woman is indicated by the blue firing records. The image of the dog activates other neurons,
which fire as indicated by the red records. Notice that the
neurons associated with the woman have the same pattern
of firing, and the neurons associated with the dog also have
a common pattern of firing (but one that differs from the
firing pattern associated with the woman). The similarity
in the patterns of firing in each group of neurons is called
synchrony. The fact that the two neurons activated by the
woman have this property of synchrony tells the brain that
these two neurons represent the woman; the same situation
occurs for the neurons representing the dog.
Figure 6.23 ❚ How synchrony can indicate which neurons
are firing to the same object. See text for explanation.
(Based on Engel, A. K., Fries, P., Konig, P., Brecht, M., &
Singer, W. (1999). Temporal binding, binocular rivalry, and
consciousness. Consciousness and Cognition, 8, 128–151.)
Although attention is not a central part of the synchrony
hypothesis, there is evidence that paying attention to a particular object may increase the synchrony among neurons
representing that object (Engel et al., 1999). Perhaps further
research will enable us to draw connections between the
behavioral explanation of binding, which emphasizes the
role of attention, and the physiological explanation, which
emphasizes synchrony of neural firing. Note, however, that
even though there is a great deal of physiological evidence
that synchrony does occur in neurons that are associated with the same object (Brosch et al., 1997; Engel et al.,
1999; Neuenschwander & Singer, 1996; Roskies, 1999), the
synchrony hypothesis is not accepted by all researchers.
More research is necessary to determine whether synchrony
is, in fact, the signal that causes binding to occur.
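The idea of a shared firing pattern can be made concrete with a small sketch. The spike trains below are invented for illustration, and the coincidence measure is one simple choice among many; neither comes from the studies cited above.

```python
def coincidence_fraction(train_a, train_b):
    """Of the time bins in which either neuron fires, the fraction in which both fire."""
    both = sum(1 for a, b in zip(train_a, train_b) if a and b)
    either = sum(1 for a, b in zip(train_a, train_b) if a or b)
    return both / either if either else 0.0


# Hypothetical spike trains: 1 = spike in that time bin, 0 = no spike.
woman_neuron_1 = [1, 0, 0, 1, 0, 1, 0, 0]
woman_neuron_2 = [1, 0, 0, 1, 0, 1, 0, 0]
dog_neuron_1 = [0, 1, 0, 0, 1, 0, 0, 1]

same_object = coincidence_fraction(woman_neuron_1, woman_neuron_2)
different_objects = coincidence_fraction(woman_neuron_1, dog_neuron_1)
print(same_object, different_objects)
```

In this made-up example the two neurons responding to the woman fire in the same bins, while the neuron responding to the dog fires in different bins, so a high coincidence value would mark neurons as belonging to the same object.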
The Physiology of Attention
How does attention affect neurons in the visual system?
This question has attracted a great deal of research. We
will focus here on one of the main conclusions from this
research—that attention enhances the firing of neurons.
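As a rough illustration of what enhancement could mean quantitatively, one common way to express it is the percent change in firing rate between an unattended and an attended condition. The function name and the firing rates below are hypothetical and are not taken from any experiment described here.

```python
def percent_enhancement(rate_unattended, rate_attended):
    """Percent increase in firing rate when the stimulus is attended."""
    if rate_unattended <= 0:
        raise ValueError("unattended firing rate must be positive")
    return 100.0 * (rate_attended - rate_unattended) / rate_unattended


# Hypothetical firing rates in spikes per second.
example = percent_enhancement(rate_unattended=20.0, rate_attended=30.0)
print(example)
```

A value of zero would mean attention made no difference to the neuron's response, and larger values indicate a stronger effect of attention.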
The results of a typical experiment are shown in
Figure 6.24. Carol Colby and coworkers (1995) trained
a monkey to continually look at the small fixation light
marked “Fix.” As the monkey looked at this light, a stimulus
light was flashed at a location off to the right. In the fixation
only condition (Figure 6.24a), the monkey’s task was to release its hand from a bar when the fixation light was dimmed.
In the fixation and attention condition (Figure 6.24b), the
monkey continued looking at the fixation light but had to
release the bar when the stimulus light was dimmed. Thus, in
the fixation and attention condition, the monkey was looking straight ahead, but had to pay attention to the stimulus
light located off to the side.
Figure 6.24 ❚ The results of Colby et al.’s (1995) experiment showing how attention affects the responding of
a neuron in a monkey’s parietal cortex. The monkey always looked at the dot marked “Fix.” A stimulus light was
flashed within the circle off to the side. (a) Nerve firing when monkey was not paying attention to the light.
(b) Nerve firing when monkey was paying attention to the light. (Reprinted from Colby, C. L., Duhamel, J.-R.,
& Goldberg, M. E. (1995). Oculocentric spatial representation in parietal cortex. Cerebral Cortex, 5, 470–481.
Copyright © 1995, with permission from Oxford University Press.)
As the monkey was performing these tasks, Colby recorded from a neuron in the parietal cortex that fired to the
stimulus light. The records in Figure 6.24 show that this
neuron responded poorly to the flashing of the stimulus
light in the fixation only condition, but responded well to
the light in the fixation and attention condition. Because the
monkey was always looking at the fixation light, the images
of the fixation and stimulus lights were always the same on
the monkey’s retina. Thus, the greater response when the
monkey was paying attention to the stimulus light must
have been caused not by any change of the stimulus on the
monkey’s retina, but by the monkey’s attention to the light.
This means that the firing of a neuron depends on more
than just the shape or size or orientation of a stimulus. It
also depends on whether the animal is paying attention to
the stimulus.
This enhancement of responding by attention has been
demonstrated in many single-unit recording experiments
on animals (Bisley & Goldberg, 2003; Moran & Desimone, 1985; Reynolds & Desimone, 2003) and also in brain
imaging experiments on humans (Behrmann et al., 2004;
Downar et al., 2001; Kastner et al., 1999). The single-unit
experiments show that although the enhancement effect
occurs as early in the visual system as the striate cortex, V1,
the effect becomes stronger at higher areas in the visual system (Figure 6.25). This makes sense because higher areas
are more likely to reflect an observer’s knowledge of characteristics of an object such as its meaning or behavioral significance (Gottlieb et al., 2002).
Figure 6.25 ❚ Enhancement of the rate of nerve firing
caused by attention for neurons in areas V1, MT, and
MST. Area MT is in the dorsal stream, and MST is further
“downstream.” (Maunsell, J. H. R. (2004). The role of attention
in visual cerebral cortex. In L. M. Chalupa & J. S. Werner
(Eds.), The visual neurosciences (pp. 1538–1545). Cambridge,
MA: MIT Press.)
[Figure 6.25 axes: vertical, response enhancement (%), 0 to 100; horizontal, areas V1, MT, and MST, ordered from lower to higher in the visual system]
We can appreciate the connection between the behavioral significance of an object and attention by considering
an experiment by Daniel Sheinberg and Nikos Logothetis
(2001), who recorded from neurons in a monkey’s inferotemporal (IT) cortex (Figure 4.29) as the monkey was scanning a scene.
In the first part of the experiment, the monkeys were
trained to move a lever to the left in response to pictures
of some objects and to the right to pictures of other objects. These objects included people, animals, and views of
human-made objects such as toys and drinking cups.
After the monkeys had learned the correct response to
each picture, Sheinberg and Logothetis found IT neurons
that responded to specific pictures. They found that if a
neuron responded to a picture when it was presented alone
on a blank field, it also responded to the picture when it was
placed in an environmental scene. For example, a neuron
that fired to a picture of an isolated parrot also fired when
the parrot appeared on the roof of a church, as shown in
Figure 6.26.
Having shown that the parrot on the roof causes an IT
neuron to fire when the parrot is flashed within the neuron’s
receptive field, the next task was to determine whether the
cell would fire when the monkey looked at the parrot while
freely scanning the picture. The data below the picture in
Figure 6.26 show the monkey’s eye movements and when
the monkey fixated on the parrot. Immediately after the
monkey fixated the parrot, the neuron fired, and shortly after the neuron fired, the monkey moved the lever, indicating
that it had identified the parrot. What’s important about
this result is that the neuron didn’t fire when the monkey’s
gaze came very close to the parrot. It only fired once the
monkey had noticed the parrot, as indicated by moving the
lever.
Think about what this tells us about the connection
between neural firing and perception. A particular scene
may contain many different objects, and the brain contains many neurons that respond to those objects. But even
though the retina is bombarded with stimuli that could,
potentially, cause these neurons to fire, some of these neurons do not fire until a stimulus is noticed. This is another
example of the fact that firing is determined not only by the
image on the retina, but also by how behaviorally significant the
object is to the observer.
Figure 6.26 ❚ Top: scan path as a monkey looked for a
target (the parrot on the roof). Just below picture: firing of an
IT neuron as the monkey was looking. Bottom: graph showing
how far the monkey’s gaze was from the parrot. Notice that
the neuron begins firing just after the monkey has fixated on
the parrot (arrow), and shortly after this the monkey pulls the
lever, indicating that it has identified the parrot (vertical line).
(From Sheinberg, D. L., & Logothetis, N. K. (2001). Noticing
familiar objects in real world scenes: The role of temporal
cortical neurons in natural vision. Journal of Neuroscience,
21, 1340–1350.)
[Figure 6.26 axes: vertical, gaze distance to target (deg), 0 to 10; horizontal, time from scene onset (ms), 0 to 2000; marked events: fixates on parrot, moves lever]
Something to Consider: Attention in Autism
Not only is attention important for detecting objects in the
environment, as we have described above; it is also a crucial component of social situations. People pay attention
not only to what others are saying, but also to their faces
(Gullberg & Holmqvist, 2006) and to where they are looking (Kuhn & Land, 2006; Tatler & Kuhn, 2007), because
these things provide information about the other person’s
thoughts, emotions, and feelings.
The link between attention and perceptions of social
interactions becomes especially evident when we consider a
situation in which that link is disturbed, as occurs in people with autism. Autism is a serious developmental disorder
in which one of the major symptoms is the withdrawal of
contact from other people. People with autism typically do
not make eye contact with others and have difficulty telling
what emotions others are experiencing in social situations.
Research has revealed many differences in both behavior and brain processes between autistic and nonautistic
people (Grelotti et al., 2002, 2005). Ami Klin and coworkers (2003) note the following paradox: Even though people
with autism can often solve reasoning problems that involve
social situations, they cannot function when placed in an
actual social situation. One possible explanation is differences in the way autistic people observe what is happening.
Klin and coworkers (2003) demonstrated this by comparing eye fixations of autistic and nonautistic people as they
watched the film Who’s Afraid of Virginia Woolf?
Figure 6.27 shows fixations on a shot of George Segal’s
and Sandy Dennis’s faces. The shot occurs just after the
character in the film played by Richard Burton has smashed
a bottle. The nonautistic observers fixated on Segal’s eyes
in order to access his emotional reaction, but the autistic
observers looked near Sandy Dennis’s mouth or off to the
side.
Figure 6.27 ❚ Where people look when viewing this image
from the film Who’s Afraid of Virginia Woolf? Nonautistic
viewers: white crosses; autistic viewers: black crosses. (From
“The Enactive Mind, or From Actions to Cognition: Lessons
From Autism,” by A. Klin, W. Jones, R. Schultz, & F. Volkmar,
Philosophical Transactions of the Royal Society of London B,
pp. 345–360. Copyright 2003. The Royal Society. Published
online.)
Another difference between how autistic and nonautistic observers direct their attention is related to the tendency
of nonautistic people to direct their eyes to the place where
a person is pointing. Figure 6.28 compares the fixations of
a nonautistic person (shown in white) and an autistic person (shown in black). In this scene, Segal’s character points
to the painting and asks Burton’s character, “Who did the
painting?” The nonautistic person follows the pointing
movement from Segal’s finger to the painting and then
looks at Burton’s face to await a reply. In contrast, the autistic observer looks elsewhere first, then back and forth between the pictures.
Figure 6.28 ❚ Scan paths for nonautistic viewers (white
path) and autistic viewers (black path) in response to the
picture and dialogue while viewing this shot from Who’s
Afraid of Virginia Woolf? (From “The Enactive Mind, or From
Actions to Cognition: Lessons From Autism,” by A. Klin,
W. Jones, R. Schultz, & F. Volkmar, Philosophical
Transactions of the Royal Society of London B, pp. 345–360.
Copyright 2003. The Royal Society. Published online.)
All of these results indicate that because of the way autistic people attend or don’t attend to events as they unfold
in a social situation, they may perceive the environment differently than nonautistic observers. Autistic people look more at
things, whereas nonautistic observers look at other people’s
actions and especially at their faces and eyes. Autistic observers therefore create a mental representation of a situation that does not include much of the information that
nonautistic observers usually use in interacting with others.
Some recent experiments provide clues to physiological
differences in attention between autistic and nonautistic
people. Kevin Pelphrey and coworkers (2005) measured
brain activity in the superior temporal sulcus (STS; see Figure 5.45), an area in the temporal lobe that has been shown
to be sensitive to how other people direct their gaze in social
situations. For example, the STS is strongly activated when a
passerby makes eye contact with a person, but is more weakly
activated if the passerby doesn’t make eye contact (Pelphrey
et al., 2004).
Pelphrey measured STS activity as autistic and nonautistic people watched an animated character’s eyes move 1 second after a flashing checkerboard appeared (Figure 6.29a).
The character either looked at the checkerboard (congruent condition) or in a direction away from the checkerboard
(incongruent condition). To determine whether the observers saw the eye movements, Pelphrey asked his observers to
press a button when they saw the character’s eyes move. Both
autistic and nonautistic observers performed this task with
99 percent accuracy.
But even though both groups of observers saw the
character’s eyes move, there was a large difference between
how the STS responded in the two groups. The STS of the
nonautistic observers was activated more for the incongruent situation, but the STS of the autistic observers was activated equally in the congruent and incongruent situations
(Figure 6.29b).
What does this result mean? Since both groups saw the
character’s eyes move, the difference may have to do with
how observers interpreted what the eye movements meant.
Pelphrey suggests that there is a difference in autistic and
nonautistic people’s ability to read other people’s intentions. The nonautistic observers expected that the character
would look at the checkerboard, and when that didn’t happen, this caused a large STS response. Autistic observers, on
the other hand, may not have expected the character to look
at the checkerboard, so the STS responded in the same way
to both the congruent and incongruent stimuli.
Figure 6.29 ❚ (a) Observers in Pelphrey’s (2005)
experiment saw either the congruent condition, in which
the animated character looked at the checkerboard 1
second after it appeared, or the incongruent condition, in
which the character looked somewhere else 1 second after
the checkerboard appeared. (b) Response of the STS in
autistic and nonautistic observers to the two conditions:
C = congruent; IC = incongruent. (From Pelphrey, K. A.,
Morris, J. P., & McCarthy, G. (2005). Neural basis of eye gaze
processing deficits in autism. Brain, 128, 1038–1048. By
permission of Oxford University Press.)
[Figure 6.29b axes: vertical, signal, 0 to .4; horizontal, C and IC bars for the autistic and nonautistic groups]
pose about the role of attention in perception and
binding?
5. What evidence links attention and binding? Describe
evidence that involves both illusory conjunctions
and conjunction search.
6. What is the synchrony explanation for the physiological basis of binding? Has any connection
been demonstrated between this explanation and
attention?
7. Describe physiological experiments that show that
attention enhances neural firing.
8. Describe the experiment that showed that neurons
fire when the monkey notices an object.
9. Describe the results of experiments that measured (a) eye movements in autistic and nonautistic
observers while they watched a film; (b) the response of the STS to “congruent” and “incongruent” conditions. What can we conclude from these
results?
THINK ABOUT IT
1.
If salience is determined by characteristics of a scene
such as contrast, color, and orientation, why might it be
correct to say that paying attention to an object can increase its salience? (p. 136)
2.
Art composition books often state that it is possible
to arrange elements in a painting in a way that controls both what a person looks at in a picture and the
order in which the person looks at things. An example of this would be the statement that when viewing
Kroll’s Morning on the Cape (Figure 6.30), the eye is
drawn first to the woman with the books in the foreground, and then to the pregnant woman. But measurements of eye movements show that there are individual differences in the way people look at pictures.
For example, E. H. Hess (1965) reported large differences between how men and women looked at the Kroll
picture. Try showing this picture, and others, to people as suggested in the figure caption to see if you can
observe these individual differences in picture viewing.
(p. 135)
3.
How is the idea of regularities of the environment that
we introduced in Chapter 5 (see page 115) related to the
cognitive factors that determine where people look?
(p. 136)
4.
Can you think of situations from your experience that
are similar to the change detection experiments in that
you missed seeing an object that became easy to see
once you knew it was there? What do you think was behind your initial failure to see this object? (p. 139)
5.
The “Something to Consider” section discussed differences between how autistic and nonautistic people
The idea that neural responding may reflect cognitive
factors, such as what people expect will happen in a particular situation, is something we will encounter again in the
next chapter when we consider the connection between perception and how people interact with the environment.
TEST YOURSELF 6.2
1. What evidence is there that attention enhances visual information processing? Why can’t we use this evidence to draw conclusions about the connection between attention and perception?
2. Describe the experiment that shows an object’s appearance can be changed by attention. What clever
feature of this experiment was used to avoid the
effect of bias?
3. What is the binding problem? Describe the physiological processes that create this problem.
4. What are the two stages in feature integration
theory? What does feature integration theory pro-
150
CHAPTER 6
Visual Attention
© Leon Kroll/Carnegie Mellon Museum of Art, Pittsburgh, Patrons Art Fund
Figure 6.30 ❚ Leon Kroll,
Morning on the Cape. Try
showing this picture to a number
of people for 1–2 seconds,
and ask them what they notice
first and what else they see.
You can’t determine eye scan
patterns using this method, but
you may gain some insight into
differences in the way different
people look at pictures.
direct their attention. Do you think differences in directing attention may also occur in nonautistic people?
Can you think of situations in which you and another
person perceived the same scene or event differently?
(p. 148)
3.
When does selection occur in selective attention? A classic
controversy in the field of attention is whether selective
attention involves “early selection” or “late selection.”
Researchers in the “early selection” camp hold that
when many messages are present, people select one to
attend to based on physical characteristics of the message, such as a person’s voice. Researchers in the “late
selection” camp state that people don’t select which
message to attend until they have analyzed the meaning of the various messages that are present. (p. 135)
Broadbent, D. E. (1958). Perception and communication. London: Pergamon.
Luck, S. J., & Vecera, S. P. (2002). Attention. In
H. Pashler & S. Yantis (Eds.), Stevens’ handbook of
experimental psychology (3rd ed., pp. 235–286). New
York: Wiley.
Treisman, A. M. (1964). Selective attention in man.
British Medical Bulletin, 20, 12–16.
4.
Eye movements and reward systems. The reward value of
an element in a scene may help determine where people look. This idea is supported by evidence that looking at certain objects activates reward areas in the
brain. (p. 135)
Yue, X., Vessel, E. A., & Biederman, I. (2007). The
neural basis of scene preferences. Neuroreport, 18,
525–529.
5.
Features and visual search. Visual search has been used
not only to study binding, as described in this chapter, but also to study how the match or mismatch between features in the target and the distractors can
influence the ability to find the target. When a target
has features that differ from those of the distractors,
the target “pops out” and so is perceived immediately.
However, when the target shares features with the
IF YOU WANT TO KNOW MORE
1. Dividing attention. Our ability to divide our attention among different tasks depends on the nature of the task and also on how well we have practiced specific tasks. The following two references describe (1) the idea that task difficulty determines our ability to divide our attention and (2) the finding that people who play video games may increase their ability to divide their attention among different tasks. (p. 134)
Green, C. S., & Bavelier, D. (2003). Action video game modifies visual selective attention. Nature, 423, 534–537.
Lavie, N. (1995). Perceptual load as a necessary condition for selective attention. Journal of Experimental Psychology: Human Perception and Performance, 21, 451–486.
2. Eye movements. The role of eye movements in determining attention is often studied by measuring the sequence of fixations that a person makes when freely viewing a picture. However, another important variable is how long a person looks at particular areas of a picture. Factors that determine the length of fixation may not be the same as those that determine the sequence of fixations. (p. 135)
Henderson, J. M. (2003). Human gaze control during real-world scene perception. Trends in Cognitive Sciences, 7, 498–503.
distractors, search takes longer. You can demonstrate
this for yourself in Virtual Lab 12: Feature Analysis. (p. 144)
Treisman, A. (1986). Features and objects in visual
processing. Scientific American, 255, 114B–125B.
Treisman, A. (1998). The perception of features
and objects. In R. D. Wright (Ed.), Visual attention
(pp. 26–54). New York: Oxford University Press.
6.
Emotion and attention. There is evidence that emotion
can affect attention in a number of ways, including
the ability to detect stimuli and the appearance of objects. (p. 142)
Phelps, E. A., Ling, S., & Carrasco, M. (2006). Emotion facilitates perception and potentiates the perceptual benefits of attention. Psychological Science,
17, 292–299.
KEY TERMS
Attention (p. 134)
Autism (p. 148)
Balint’s syndrome (p. 145)
Binding (p. 143)
Binding problem (p. 144)
Change blindness (p. 140)
Conjunction search (p. 146)
Divided attention (p. 134)
Feature integration theory (p. 144)
Feature search (p. 145)
Fixation (p. 135)
Focused attention stage (p. 144)
Illusory conjunction (p. 144)
Inattentional blindness (p. 138)
Preattentive stage (p. 144)
Precueing (p. 141)
MEDIA RESOURCES
The Sensation and Perception
Book Companion Website
CengageNOW
www.cengage.com/cengagenow
Go to this site for the link to CengageNOW, your one-stop
shop. Take a pre-test for this chapter, and CengageNOW
will generate a personalized study plan based on your test
results. The study plan will identify the topics you need to
review and direct you to online resources to help you master those topics. You can then take a post-test to help you
determine the concepts you have mastered and what you
will still need to work on.
VL
Your Virtual Lab is designed to help you get the most out
of this course. The Virtual Lab icons direct you to specific
media demonstrations and experiments designed to help
you visualize what you are reading about. The number
beside each icon indicates the number of the media element
you can access through your CD-ROM, CengageNOW, or
WebTutor resource.
The following lab exercises are related to material in
this chapter:
1. Eye Movements While Viewing a Scene Records of a person’s fixations while viewing a picture of a scene. (Courtesy
of John Henderson.)
2. Task-Driven Eye Movements Records of a head-mounted
eye movement camera that show eye movements as a person
makes a peanut butter and jelly sandwich. (Courtesy of
Mary Hayhoe.)
3. Perception Without Focused Attention Some stimuli from
Reddy’s (2007) experiment in which she tested observers’
ability to identify stimuli presented rapidly off to the side
of the focus of attention. (Courtesy of Leila Reddy.)
4. Inattentional Blindness Stimuli The sequence of stimuli
presented in an inattentional blindness experiment.
5. Change Detection: Gradual Changes Three images that
test your ability to detect changes that happen slowly.
6. Change Detection: Airplane A test of your ability to determine the difference between two images that are flashed
rapidly, separated by a blank field. (Courtesy of Ronald
Rensink.)
7. Change Detection: Farm (Courtesy of Ronald Rensink.)
8. Change Blindness: Harborside (Courtesy of Ronald
Rensink.)
9. Change Detection: Money (Courtesy of Ronald Rensink.)
10. Change Detection: Sailboats (Courtesy of Ronald
Rensink.)
11. Change Detection: Tourists (Courtesy of Ronald
Rensink.)
www.cengage.com/psychology/goldstein
See the companion website for flashcards, practice quiz
questions, Internet links, updates, critical thinking exercises, discussion forums, games, and more!
Virtual Lab
Saccade (p. 135)
Saliency map (p. 136)
Selective attention (p. 134)
Stimulus salience (p. 135)
Synchrony (p. 146)
Synchrony hypothesis (p. 146)
Visual search (p. 145)
12. Feature Analysis A number of visual search experiments
in which you can determine the function relating reaction
time and number of distractors for a number of different
types of targets and distractors.
CHAPTER 7
Taking Action
Chapter Contents
THE ECOLOGICAL APPROACH TO PERCEPTION
The Moving Observer and Information in the Environment
Self-Produced Information
The Senses Do Not Work in Isolation
DEMONSTRATION: Keeping Your Balance
NAVIGATING THROUGH THE ENVIRONMENT
Other Strategies for Navigating
The Physiology of Navigation
❚ TEST YOURSELF 7.1
ACTING ON OBJECTS: REACHING
AND GRASPING
Affordances: What Objects Are Used For
The Physiology of Reaching and Grasping
OBSERVING OTHER PEOPLE’S
ACTIONS
Mirroring Others’ Actions in the Brain
Predicting People’s Intentions
Mirror Neurons and Experience
SOMETHING TO CONSIDER:
CONTROLLING MOVEMENT WITH
THE MIND
❚ TEST YOURSELF 7.2
Think About It
If You Want to Know More
Key Terms
Media Resources
VL VIRTUAL LAB
OPPOSITE PAGE The mechanisms that enable Lance Armstrong to
negotiate this bend involve using both perceptual mechanisms that
enable him to see what is happening around him, and action
mechanisms that help him keep his bike upright and stay on course.
These mechanisms work in concert with one another, with perception
guiding action, and action, in turn, influencing perception.
(Steven E. Sutton/CORBIS)
VL The Virtual Lab icons direct you to specific animations and videos
designed to help you visualize what you are reading about. The number beside
each icon indicates the number of the clip you can access through your
CD-ROM or your student website.
Some Questions We Will Consider:
❚ What is the connection between perceiving and moving through the environment? (p. 156)
❚ What is the connection between somersaulting and vision? (p. 157)
❚ How do neurons in the brain respond when a person performs an action and when the person watches someone else perform the same action? (p. 168)
❚ Is it possible to control the position of a cursor on a computer screen by just thinking about where you want it to move? (p. 171)
Serena straps on her helmet for what she anticipates will be a fast, thrilling, and perhaps dangerous ride. As an employee of the Speedy Delivery Package Service, her mission is to deliver the two packages strapped to the back of her bicycle to an address 30 blocks uptown. Once on her bike, she weaves through traffic, staying alert to close calls with cars, trucks, pedestrians, and potholes. Seeing a break in traffic, she reaches down to grab her water bottle to take a quick drink before having to deal with the next obstacle. “Yes,” Serena thinks, “I can multitask!” As she replaces the water bottle, she downshifts and keeps a wary eye out for the pedestrian ahead who looks as though he might decide to step off the curb at any moment.
Serena faces a number of challenges that involve both perception (using her sight and hearing to monitor what is happening in her environment) and action (staying balanced on her bike, staying on course, shifting gears, reaching for her water bottle). We have discussed some of these things in the last two chapters: perceiving a scene and individual objects within it, scanning the scene to shift attention from one place to another, focusing on what is important and ignoring what is not, and relying on prior knowledge about characteristics of the environment. This chapter builds on what we know about perceiving objects and scenes, and about paying attention, to consider the processes involved in being physically active within a scene and interacting with objects. In other words, we will be asking how perception operates as a person steps out (or rides a bike) into the world.
You might think that taking action in the world is a different topic than perception because it involves moving the body, rather than seeing or hearing or smelling things in the environment. However, the reality is that motor activity and perception are closely linked. We observed this link when we described how the ventral stream (temporal lobe) is involved in identifying objects and the dorsal stream (parietal lobe) is involved in locating objects and taking action. (Remember D.F. from Chapter 4, page 90, who had difficulty perceiving orientations because of damage to her temporal lobe but could “mail” a letter because her parietal lobe was not damaged.)
Our bicyclist’s ability to balance, stay on course, grab her water bottle, and figure out what is going to happen ahead involves both perception and motor activity occurring together. How is this coordination achieved? Researchers have approached this question in a number of ways. An early and influential approach was proposed by J. J. Gibson, who founded the ecological approach to perception.
156
CHAPTER 7
Taking Action
The Ecological Approach
to Perception
The ecological approach to perception focuses on how
perception occurs in the environment by (1) emphasizing
the moving observer—how perception occurs as a person
is moving through the environment—and (2) identifying
information in the environment that the moving observer
uses for perception.
The Moving Observer and
Information in the Environment
The idea that we need to take the moving observer into account to fully understand perception does not seem very
revolutionary today. After all, a good deal of our perception
occurs as we are walking or driving through the environment. However, perception research in the 1950s and the decades that followed focused on testing stationary observers
as they observed stimuli flashed on a screen.
It was in this era of the fixed-in-place observer that Gibson began studying how pilots land airplanes. In his first
book, The Perception of the Visual World (1950), he reported
that what we know about perception from testing people
fixed in place in the laboratory cannot explain perception
in dynamic environments that usually occur in everyday experience. The correct approach, suggested Gibson, is to look
for information that moving observers use to help them
carry out actions such as traveling toward a destination.
Gibson’s approach to perception can be stated simply
as “Look for information in the environment that provides information for perception.” Information for perception, according to Gibson, is located not on the retina, but “out there”
in the environment. He thought about information in the
environment in terms of the optic array—the structure created by the surfaces, textures, and contours of the environment, and he focused on how movement of the observer
causes changes in the optic array. According to this idea,
when you look out from where you are right now, all of the
surfaces, contours, and textures you see make up the optic
array; if you get up and start walking, the changes that occur in the surfaces, contours, and textures provide information for perception.
One source of the information for perception that occurs as you move is optic flow—the movement of elements in
a scene relative to the observer. For example, imagine driving
through a straight tunnel. You see the opening of the tunnel
as a small rectangle of light in the distance, and as your car
hurtles forward, everything around you—the walls on the
left and right, the ceiling above, and the road below—moves
Barbara Goldstein
Figure 7.1 ❚ The flow of the environment as seen through
the front window of a car speeding across a bridge toward
the destination indicated by the white dot. (The red object in
the foreground is the hood of the car.) The flow is more rapid
closer to the car, as indicated by the increased blur and the
longer arrows. The flow occurs everywhere except at the
white dot, which is the focus of expansion located at the car’s
destination at the end of the bridge.
past you in a direction opposite to the direction you are moving. This movement of the surroundings is the optic flow.
Figure 7.1 shows the flow for a car driving across a bridge that
has girders to the left and right and above. The arrows
and the blur in the photograph indicate the flow. VL 1
Optic flow has two characteristics: (1) the flow is more
rapid near the moving observer, as indicated by the length
of the arrows in Figure 7.1, and (2) there is no flow at the
destination toward which the observer is moving, indicated
by the small white dot in Figure 7.1. The different speed
of flow—fast near the observer and slower farther away—is
called the gradient of flow. According to Gibson, the gradient of flow provides information about the observer’s speed.
The absence of flow at the destination point is called the
focus of expansion (FOE). Because the FOE is centered on
the observer’s destination, it indicates where the observer is
heading.
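The two characteristics of optic flow described above can be illustrated with a small numerical sketch. It assumes a simple pinhole-camera model of the eye and an observer moving straight ahead; the names `optic_flow`, `flow_speed`, `EYE_HEIGHT`, and `SPEED`, and all of the numbers, are hypothetical choices made for illustration and do not come from Gibson's work.

```python
import math

def optic_flow(point, speed, focal_length=1.0):
    """Image position and image velocity of a stationary point.

    Assumes a pinhole projection and an observer translating straight
    ahead along the Z axis at `speed` (an illustrative model only).
    """
    X, Y, Z = point
    x = focal_length * X / Z
    y = focal_length * Y / Z
    # Moving forward shrinks Z at rate `speed`, so every image point
    # streams away from the image centre (0, 0), the focus of expansion.
    u = x * speed / Z
    v = y * speed / Z
    return (x, y), (u, v)

def flow_speed(point, speed, focal_length=1.0):
    """Magnitude of the image velocity for one point."""
    _, (u, v) = optic_flow(point, speed, focal_length)
    return math.hypot(u, v)

EYE_HEIGHT = 1.2  # metres (hypothetical)
SPEED = 10.0      # metres per second (hypothetical)

# Points on the ground straight ahead, at increasing distances.
ground_points = [(0.0, -EYE_HEIGHT, z) for z in (2.0, 5.0, 20.0, 100.0)]
for p in ground_points:
    print(f"distance {p[2]:6.1f} m   flow speed {flow_speed(p, SPEED):.4f}")

# A point lying exactly in the direction of travel has no flow at all.
print("flow at the destination:", flow_speed((0.0, 0.0, 50.0), SPEED))
```

In this sketch the flow speed falls off with distance (the gradient of flow), scales with the observer's speed, and is zero at the point the observer is heading toward (the focus of expansion).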
Another characteristic of optic flow is that it produces
invariant information. We introduced the idea of invariant information in Chapter 5, when we described the
recognition-by-components approach to object perception
(see page 112). We defined an invariant as a property that
remains constant under different conditions. For Gibson,
the key invariants are the properties that remain constant
as an observer moves through the environment. Optic flow
provides invariant information because it occurs no matter
where the observer is, as long as he or she is moving. The
focus of expansion is also invariant because it is always centered on where the person is heading.
Self-Produced Information
Another basic idea behind the ecological approach is self-produced information: an observer’s movement provides
information that the observer uses to guide further movement. Another way to state this reciprocal relationship between movement and perception is that we need to perceive
to move, and we also need to move to perceive (Figure 7.2).
The optic flow that our observer produces by moving is an
example of self-produced information. Another example is
provided by somersaulting.
We can appreciate the problem facing a gymnast who
wants to execute an airborne backward somersault by realizing that, within 600 ms, the gymnast must execute
the somersault and then end in exactly the correct body
configuration precisely at the moment that he or she hits the
ground (Figure 7.3).
One way this could be accomplished is to learn to run
a predetermined sequence of motions within a specific period of time. In this case, performance should be the same
with eyes open or closed. However, Benoit Bardy and Michel
Laurent (1998) found that expert gymnasts performed somersaults more poorly with their eyes closed. Films showed
that when their eyes were open, the gymnasts appeared to
be making in-the-air corrections to their trajectory. For example, a gymnast who initiated the extension of his or her
body a little too late compensated by performing the rest of
the movement more rapidly.
Movement
Provides information
for more movement
Creates
flow
Flow
Figure 7.2 ❚ The relationship between movement and flow
is reciprocal, with movement causing flow and flow guiding
movement. This is the basic principle behind much of our
interaction with the environment.
Figure 7.3 ❚ “Snapshots” of a somersault, starting on
the left and finishing on the right. (From Bardy, B. G., &
Laurent, M. (1998). How is body orientation controlled during
somersaulting? Journal of Experimental Psychology: Human
Perception and Performance, 24, 963–977. Copyright ©
1998 by The American Physiological Society. Reprinted by
permission.)
Another interesting result was that closing the eyes did
not affect the performance of novice somersaulters as much
as it affected the performance of experts. Apparently, experts
learn to coordinate their movements with their perceptions,
but novices have not yet learned to do this. Therefore, when
the novices closed their eyes, the loss of visual information
had less effect than it did for the experts. Thus, somersaulting, like other forms of action, involves the regulation of action during the continuous flow of perceptual information.
The Senses Do Not Work in Isolation
Another of Gibson’s ideas was that the senses do not work in isolation: rather than considering vision, hearing, touch, smell, and taste in isolated categories, we should consider how each provides information for the same behaviors. One example of how a behavior originally thought to be the exclusive responsibility of one sense is also served by another one is provided by the sense of balance.
Your ability to stand up straight, and to keep your balance while standing still or walking, depends on systems that enable you to sense the position of your body. These systems include the vestibular canals of your inner ear and receptors in the joints and muscles. However, Gibson argued that information provided by vision also plays a role in keeping our balance. One way to illustrate the role of vision in balance is to see what happens when visual information isn’t available, as in the following demonstration.
DEMONSTRATION
Keeping Your Balance
Keeping your balance is something you probably take for granted. Stand up. Raise one foot from the ground and stay balanced on the other. Then close your eyes and observe what happens. ❚
Did staying balanced become more difficult when you closed your eyes? Vision provides a frame of reference that helps the muscles constantly make adjustments to help maintain balance.
Figure 7.4 ❚ Lee and Aronson’s swinging room. (a) Moving the room toward the observer creates an optic flow pattern associated with moving forward, so (b) the observer sways backward to compensate. (c) As the room moves away from the observer, flow corresponds to moving backward, so the person leans forward to compensate, and may even lose his or her balance. (Based on Lee, D. N., & Aronson, E. (1974). Visual proprioceptive control of standing in human infants. Perception and Psychophysics, 15, 529–532.)
[Figure 7.4 panel labels: (a) Room swings toward person, with flow when wall is moving toward person. (b) Person sways back to compensate. (c) When room swings away, person sways forward to compensate, with flow when wall is moving away from person.]
The importance of vision in maintaining balance was
demonstrated by David Lee and Eric Aronson (1974). Lee
and Aronson placed 13- to 16-month-old toddlers in a
“swinging room” (Figure 7.4). In this room, the floor was
stationary, but the walls and ceiling could swing toward and
away from the toddler. Figure 7.4a shows the room swaying
toward the toddler. This movement of the wall creates the
optic flow pattern on the right. Notice that this pattern is
similar to the optic flow that occurs when moving forward,
as when you are driving through a tunnel.
Because the flow is associated with moving forward, it
creates the impression in the observer that he or she is swaying forward. This causes the toddler to sway back to compensate (Figure 7.4b). When the room moves away, as in Figure 7.4c, the flow pattern creates the impression of swaying
backward, so the toddler sways forward to compensate. In
Lee and Aronson’s experiment, 26 percent of the toddlers
swayed, 23 percent staggered, and 33 percent fell down!
Adults were also affected by the swinging room. If they
braced themselves, “oscillating the experimental room
through as little as 6 mm caused adult subjects to sway approximately in phase with this movement. The subjects were
like puppets visually hooked to their surroundings and were
unaware of the real cause of their disturbance” (Lee, 1980,
p. 173). Adults who didn’t brace themselves could, like the
toddlers, be knocked over by their perception of the moving
room.
The swinging room experiments show that vision is
such a powerful determinant of balance that it can override the traditional sources of balance information provided by the inner ear and the receptors in the muscles and
joints (see also C. R. Fox, 1990). In a developmental study,
Bennett Bertenthal and coworkers (1997) showed that infants as young as 4 months old sway back and forth in response to movements of a room, and that the coupling of
the room’s movement and the swaying becomes closer with
age. (See also Stoffregen et al., 1999, for more evidence that
flow information can influence posture while standing still;
and Warren et al., 1996, for evidence that flow is involved in
maintaining posture while walking.)
Gibson’s emphasis on the moving observer, on identifying information in the environment that observers use
for perception, and on the importance of determining how
people perceive in the natural environment was taken up by
researchers who followed him. The next section describes
some research designed to test Gibson’s ideas about how
people navigate through the environment.
Navigating Through
the Environment
Gibson proposed that optic flow provides information
about where a moving observer is heading. But do observers actually use this information in everyday life? Research
on whether people use flow information has asked observers to make judgments regarding their heading based on
computer-generated displays of moving dots that create optic
flow stimuli. The observer’s task is to judge, based on optic
flow stimuli, where he or she would be heading relative to a
reference point such as the vertical line in Figures 7.5a and b.
The flow in Figure 7.5a indicates movement directly toward
the line, and the flow in Figure 7.5b indicates movement
to the right of the line. Observers viewing stimuli such as this
can judge where they are heading relative to the vertical line
to within about 0.5 to 1 degree (Warren, 1995, 2004;
also see Fortenbaugh et al., 2006; Li, 2006). VL 2
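To make concrete how a pattern of moving dots can specify heading, the following sketch estimates the focus of expansion as the point from which all flow vectors appear to radiate, using a least-squares line intersection. This is one possible computational approach offered purely as an illustration, not a description of how human observers make the judgment. The function names `estimate_foe` and `heading_angle_deg`, the dot positions, and the focal length are all assumptions.

```python
import math

def estimate_foe(points, flows):
    """Least-squares estimate of the focus of expansion (FOE).

    Each dot at image position p with flow vector f defines a line
    through p along f. The FOE is taken to be the point closest, in the
    squared perpendicular-distance sense, to all of those lines.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), (u, v) in zip(points, flows):
        norm = math.hypot(u, v)
        if norm == 0.0:
            continue  # a dot with no flow gives no directional constraint
        nx, ny = -v / norm, u / norm  # unit normal to the flow direction
        c = nx * px + ny * py
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * c
        b2 += ny * c
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("flow directions do not constrain a unique point")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

def heading_angle_deg(foe_x, focal_length=1.0):
    """Horizontal heading, in degrees, relative to the image centre."""
    return math.degrees(math.atan2(foe_x, focal_length))

# Hypothetical display: dots expanding away from a point right of centre.
dots = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0), (0.5, 0.8)]
true_foe = (0.3, -0.1)
dot_flows = [((x - true_foe[0]) * 2.0, (y - true_foe[1]) * 2.0) for x, y in dots]

foe = estimate_foe(dots, dot_flows)
print("estimated FOE:", foe)
print("heading relative to centre (deg):", heading_angle_deg(foe[0]))
```

With noise-free radial flow the estimate recovers the point the dots expand from; in a display like Figure 7.5, the horizontal offset of that point from the vertical reference line corresponds to the judged heading.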
Other Strategies for Navigating
Although research has shown that flow information can be
used to determine heading, there is also evidence that people use other information as well.
(a)
(b)
Figure 7.5 ❚ (a) Optic flow generated by a person moving
straight ahead toward the vertical line on the horizon. The
lengths of the lines indicate the person’s speed. (b) Optic
flow generated by a person moving in a curved path that is
headed to the right of the vertical line. (From Warren, W. H.
(1995). Self-motion: Visual perception and visual control. In
W. Epstein & S. Rogers (Eds.), Handbook of perception and
cognition: Perception of space and motion (pp. 263–323).
Copyright © 1995, with permission from Elsevier.)
Driving Experiments To study what information
people use to stay on course in an actual environmental
situation, Michael Land and David Lee (1994) fitted an automobile with instruments to record the angle of the steering wheel and the speed, and measured where the driver was
looking with a video eye tracker. According to Gibson, the
focus of expansion provides information about the place
toward which a moving observer is headed. However, Land
and Lee found that although drivers look straight ahead
while driving, they do not look directly at the focus of expansion (Figure 7.6a).
Land and Lee also studied where drivers look as they are
negotiating a curve. This task poses a problem for the idea
of focus of expansion because the driver’s destination keeps
changing as the car rounds the curve. Land and Lee found
that when going around a curve, drivers don’t look directly
at the road, but look at the tangent point of the curve on
the side of the road, as shown in Figure 7.6b. Because drivers don’t look at the focus of expansion, which would be
in the road directly ahead, Land and Lee suggested that
drivers probably use information in addition to optic flow
to determine their heading. An example of this additional
information would be noting the position of the car relative
to the lines in the center of the road or relative to the side
of the road. (See also Land & Horwood, 1995; Rushton &
Salvucci, 2001; Wann & Land, 2000; Wilkie & Wann, 2003,
for more research on the information drivers use to stay on
the road.)
Walking Experiments How do people navigate on foot? Apparently, an important strategy used by walkers (and perhaps drivers as well) that does not involve flow is the visual direction strategy, in which observers keep their body pointed toward a target. If they go off course, the target will drift to the left or right. When this happens, the walker can correct course to recenter the target (Fajen & Warren, 2003; Rushton et al., 1998).
Another indication that flow information is not always necessary for navigation is that we can find our way even when flow information is minimal, such as at night or in a snowstorm (Harris & Rogers, 1999). Jack Loomis and coworkers (Loomis et al., 1992; Philbeck, Loomis, & Beall, 1997) have demonstrated this by eliminating flow altogether, with a “blind walking” procedure in which people observe a target object located up to 12 meters away, then walk to the target with their eyes closed.
These experiments show that people are able to walk directly toward the target and stop within a fraction of a meter of it. In fact, people can do this even when they are asked to walk off to the side first and then make a turn and walk to the target, while keeping their eyes closed. Some records from these “angled” walks are shown in Figure 7.7, which depicts the paths taken when a person first walked
[Figure 7.7 labels: Start, Target, and turning points 1 and 2.]
Figure 7.7 ❚ The results of a “blind walking” experiment (Philbeck et al., 1997). Participants looked at the target, which was 6 meters from the starting point, then closed their eyes and began walking to the left. They turned either at point 1 or 2, keeping their eyes closed the whole time, and continued walking until they thought they had reached the target. (From Philbeck, J. W., Loomis, J. M., & Beall, A. C., 1997, Visually perceived location is an invariant in the control of action. Perception and Psychophysics, 59, 601–612. Adapted with permission.)
Figure 7.6 ❚ Results of Land and Lee’s (1994) experiment. The ellipses indicate where
drivers were most likely to look while driving on (a) a straight road and (b) a curve to the left.
(Adapted by permission from Macmillan Publishers Ltd., from Sinai, M., Ooi Leng, T., & He, Z,
Terrain influences the accurate judgment of distance, Nature, 395, 497–500. Copyright 1998.)
The Physiology of Navigation
The physiology of navigation has been studied both by recording from neurons in monkeys and by determining
brain activity in humans.
Optic Flow Neurons Neurons that respond to optic
flow patterns are found in the medial superior temporal area
(MST) (Figure 7.8). Figure 7.9 shows the response of a neuron
in MST that responded best to a pattern of dots that were expanding outward (Figure 7.9a) and another neuron that responded best to circular motions (Figure 7.9b; see also Duffy
& Wurtz, 1991; Orban et al., 1992; Raffi et al., 2002; Regan &
Cynader, 1979). What does the existence of these optic flow
Posterior parietal; Premotor (mirror area); Medial superior temporal area
Figure 7.8 ❚ Monkey brain, showing key areas for
movement perception and visual-motor interaction.
to the left from the “start” position and then was told to
turn either at turn point 1 or 2 and walk to a target that
was 6 meters away. The fact that the person stopped close
to the target shows that we are able to accurately navigate
short distances in the absence of any visual stimulation at
all (also see Sun et al., 2004).
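The angled walks imply that the walker keeps track of where the target is relative to the body while moving without vision. As a rough illustration of the geometry involved, the sketch below computes the turn and the remaining distance for the second leg of such a walk. It assumes, for simplicity, that the first leg runs exactly perpendicular to the line from the start to the target. The function name `second_leg` and the two turning-point distances are hypothetical and are not taken from Philbeck et al. (1997).

```python
import math


def second_leg(target_distance, sidestep):
    """Return (turn_deg, walk_distance) for the second leg of an angled walk.

    Assumed layout (an illustration, not the study's exact geometry):
    the walker starts at (0, 0) facing a target at (0, target_distance),
    walks `sidestep` meters directly to the left, then turns right and
    walks straight to the target.
    """
    # Straight-line distance from the turning point to the target.
    walk_distance = math.hypot(sidestep, target_distance)
    # Turn relative to the leftward heading of the first leg: 90 degrees
    # would face parallel to the original line of sight, and the extra
    # angle points the walker back toward the target.
    turn_deg = 90.0 + math.degrees(math.atan2(sidestep, target_distance))
    return turn_deg, walk_distance


# Target 6 meters away, as in Figure 7.7; turning points are hypothetical.
for label, sidestep in (("turn point 1", 1.5), ("turn point 2", 3.0)):
    turn_deg, distance = second_leg(6.0, sidestep)
    print(f"{label}: turn right {turn_deg:.1f} deg, walk {distance:.2f} m")
```

The point of the sketch is only that the required turn and distance change with the turning point, so stopping close to the target after either turn requires an updated sense of the target's location rather than a memorized movement.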
Gibson’s ideas about identifying information in the environment that is available for perception plus the research
we have described tell us something important about studying perception. One task is to determine what information
is available for perception. This is what Gibson accomplished
in identifying information such as optic flow and the focus
of expansion. Another task is to determine what information is actually used for perception. As we have seen, optic
flow can be used, but other sources of information are probably used as well. In fact, the information that is used may
depend on the specific situation. Thus, as Serena speeds
down the street on her bike, she may be using flow information provided by the parked cars “flowing by” on her right
while simultaneously using the visual direction strategy to
avoid potholes and point her bike toward her destination.
In addition, she also uses auditory information, taking into
account the sound of cars approaching from behind. Perception, as we have seen, involves multiple sources of information. This idea also extends to physiology. We will now
consider how neurons and different areas of the brain provide information for navigation.
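The visual direction strategy described earlier can be thought of as a simple feedback loop: note how far the target has drifted from straight ahead, turn part of the way back toward it, and take another step. The sketch below is a minimal simulation of that idea, not a model from the research cited above. The function name `walk_to_target` and the parameters `step` and `gain` are hypothetical choices made for illustration.

```python
import math


def walk_to_target(start, heading_deg, target, step=0.5, gain=0.5, max_steps=200):
    """Simulate a walker who keeps the body pointed toward a target.

    Hypothetical parameters: `step` is the step length in meters and
    `gain` is the fraction of the drift corrected on each step.
    Returns the list of (x, y) positions visited.
    """
    x, y = start
    heading = math.radians(heading_deg)
    path = [(x, y)]
    for _ in range(max_steps):
        dx, dy = target[0] - x, target[1] - y
        if math.hypot(dx, dy) <= step:
            break  # close enough to the target to stop
        # Drift of the target away from straight ahead (positive = left).
        drift = math.atan2(dy, dx) - heading
        drift = math.atan2(math.sin(drift), math.cos(drift))  # wrap to [-pi, pi]
        # Correct course part of the way, then take one step forward.
        heading += gain * drift
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        path.append((x, y))
    return path


# Walker initially faces "north" (90 degrees); the target lies ahead and to the right.
path = walk_to_target(start=(0.0, 0.0), heading_deg=90.0, target=(3.0, 10.0))
end_x, end_y = path[-1]
print(f"stopped at ({end_x:.2f}, {end_y:.2f}) after {len(path) - 1} steps")
```

Note that the loop uses only the target's direction relative to the body and no flow information, which is the sense in which this strategy "does not involve flow."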
(a) Neuron 1
(b) Neuron 2
Figure 7.9 ❚ (a) Response of a neuron in the monkey’s
MST that responds with a high rate of firing to an expanding
stimulus (top record) but that hardly fires to a stimulus that
moves with a circular motion (bottom record) or with other
types of motion (not shown). (b) Another neuron that responds
best to circular movement (top) but does not respond well to
an expanding pattern or other types of movement (bottom).
(From Graziano, M. S. A., Andersen, R. A., & Snowden, R. J.
(1994). Tuning of MST neurons to spiral motions. Journal of
Neuroscience, 14, 54–67.)
neurons mean? We know from previous discussions that
finding a neuron that responds to a specific stimulus is only
the first step in determining whether this neuron has anything to do with perceiving that stimulus (see Chapter 4,
p. 79; Chapter 5, p. 122). The next step is to demonstrate a
connection between the neuron’s response and behavior.
Kenneth Britten and Richard van Wezel (2002) demonstrated a connection between the response of neurons
in MST and behavior by first training monkeys to indicate
whether the flow of dots on a computer screen indicated
movement to the left or right of straight ahead (Figure 7.10).
Then, as monkeys were making that judgment, Britten and
van Wezel electrically stimulated MST neurons that were
tuned to respond to flow associated with a specific direction. When they did this, they found that the stimulation
shifted the monkey’s judgments toward the direction favored by the stimulated neuron.
For example, the blue bar in Figure 7.10b shows how a
monkey responded to a flow stimulus before the MST was
stimulated. The monkey judged this stimulus as moving to
the left on about 60 percent of the trials and to the right
on 40 percent of the trials. However, when Britten and van
Wezel stimulated MST neurons that were tuned to respond
to leftward movement, the monkey shifted its judgment so
it made “leftward” judgments on more than 80 percent of
the trials, as indicated by the red bar in Figure 7.10b. This
link between MST firing and perception supports the idea
that flow neurons do, in fact, help determine perception of
the direction of movement.
(b) “Moving to left” (% trials): Not stimulated vs. MST stimulated
Figure 7.10 ❚ (a) A monkey watches a display of moving
dots on a computer monitor. The dots indicate the flow
pattern for movement slightly to the left of straight ahead,
or slightly to the right. (b) Effect of microstimulation of
the monkey’s MST neurons that were tuned to respond
to leftward movement. Stimulation (red bar) increases the
monkey’s judgment of leftward movement. (Based on data
from Britten, K. H., & van Wezel, R. J. A. (2002). Area MST
and heading perception in macaque monkeys. Cerebral
Cortex, 12, 692–701.)
Figure 7.11 ❚ (a) Scene from the “virtual town” viewed by
Maguire et al.’s (1998) observers. (b) Plan of the town showing
three of the paths observers took between locations A and B.
Activity in the hippocampus and parietal lobe was greater for
the accurate path (1) than for the inaccurate paths (2 and 3).
(From Maguire, E. A., Burgess, N., Donnett, J. G., Frackowiak,
R. S. J., Frith, C. D., & O’Keefe, J., Knowing where, and
getting there: A human navigation network, Science, 280,
921–924, 1998. Copyright © 1998 by AAAS. Reprinted with
permission from AAAS.)
Brain Areas for Navigation There is more to navigating through the environment than perceiving the direction of movement. An essential part of navigation is knowing what path to take to reach your destination. People
often use landmarks to help them find their way. For example, although you may not remember the name of a specific
street, you may remember that you need to turn right at the
gas station on the corner.
In Chapter 4 (page 93; Figure 4.34a) we saw that there
are neurons in the parahippocampal place area (PPA) that
respond to buildings, the interiors of rooms, and other
things associated with locations. We now return to the
PPA, but instead of just describing neurons that respond to
pictures of houses or rooms, we will describe some experiments that have looked at the connection between activity
in the PPA and using landmarks to navigate through the
environment.
First, let’s consider an experiment by Eleanor Maguire
and coworkers (1998) in which observers viewed a computer
screen to see a tour through a “virtual town” (Figure 7.11a).
Observers first learned the town’s layout, and then, as they
were being scanned in a PET scanner, they were given the
task of navigating from one point to another in the town
(Figure 7.11b).
Maguire and coworkers found that navigating activated the right hippocampus and part of the parietal cortex.
They also found that activation was greater when navigation between two locations, A and B, was accurate (path 1
in Figure 7.11b) than when it was inaccurate (paths 2 and 3).
Based on these results, Maguire concluded that the hippocampus and portions of the parietal lobe form a “navigation
network” in the human cortex.
(a) Toy at decision point
(b) Toy at nondecision point
(c) Brain activation for remembered and forgotten objects at decision points and nondecision points
But what about landmarks that people use to find their
way through environments? Gabriele Janzen and Miranda
van Turennout (2004) investigated the role of landmarks in
navigation by having observers first study a film sequence
that moved through a “virtual museum” (Figure 7.12). Observers were told that they needed to learn their way around
the museum well enough to be able to guide a tour through
it. Objects (“exhibits”) were located along the hallway of this
museum. Decision-point objects, like the object at (a), marked
a place where it was necessary to make a turn. Non-decision-point objects, like the one at (b), were located at a place where
a decision was not required.
After studying the museum’s layout in the film, observers were given a recognition test while in an fMRI scanner.
They saw objects that had been in the hallway and some
objects they had never seen. Their brain activation was measured in the scanner as they indicated whether they remembered seeing each object. Figure 7.12c indicates activity in
the right parahippocampal gyrus for objects the observers
had seen as they learned their way through the museum. The
left pair of bars, for objects that the observers remembered,
indicates that activation was greater for decision-point objects than for non-decision-point objects. The right pair of
bars indicates that the advantage for decision-point objects
occurred even for objects that were not remembered during
the recognition test.
Figure 7.12 ❚ (a & b) Two locations in the “virtual
museum” viewed by Janzen and van Turennout’s
(2004) observers. (c) Brain activation during the
recognition test for objects that had been located at
decision points (red bars) and nondecision points
(blue bars). (Adapted by permission from Macmillan
Publishers Ltd., from Janzen, G., & van Turennout, M.,
Selective neural representation of objects relevant for
navigation, Nature Neuroscience, 7, 673–677. Copyright
2004.)
Janzen and van Turennout concluded that the brain
automatically distinguishes objects that are used as landmarks to guide navigation. The brain therefore responds
not just to the object but also to how relevant that object
is for guiding navigation. This means that the next time
you are trying to find your way along a route that you have
traveled before but aren’t totally confident about, activity in
your parahippocampal gyrus may automatically be “highlighting” landmarks that indicate where you should make
that right turn, even though you may not remember having
seen these landmarks before.
But what about situations in which a person is moving through a more realistic environment? Maguire and
colleagues (1998) had previously shown how the brain responded as a person navigated from one place to another
in a small “virtual town” (Figure 7.11). To increase both the
realism and complexity of the navigation task, Hugo Spiers
and Maguire (2006) used London taxi drivers as observers
and gave them a task that involved navigating through the
streets of central London. The taxi drivers operated an interactive computer game called “The Getaway” that accurately
depicted the streets of central London as seen through the
front window of a car, including all of the buildings and
landmarks along the road and some pedestrians as well.
The drivers were given instructions, such as “Please take
me to Big Ben,” and carried out these instructions by using
the computer game to drive toward the destination. In midroute the instructions were changed (“Sorry, take me to the
River Thames”), and later the drivers also heard an irrelevant statement that they might hear from a passenger in a
real taxi ride (“I want to remember to post that letter”).
The unique feature of this experiment is that the taxi
drivers’ brain activity was measured using fMRI during
their trip. Also, immediately after the trip was over, the taxi
drivers observed a playback of their trip and answered questions about what they were thinking at various points. This
experiment therefore generated information about how the
driver’s brain was activated during the trip and what the
driver was thinking about the driving task during the trip.
The result, depicted in Figure 7.13, identifies connections
between the drivers’ thoughts and patterns of activity in the
brain.
Figure 7.13 ❚ Patterns of brain activation in the taxi drivers in Spiers and Maguire’s (2006) experiment. The descriptions above
each picture indicate what event was happening at the time the brain was being scanned. For example, “customer-driven route
planning” shows brain activity right after the passenger indicated the initial destination. The “thought bubbles” indicate the
drivers’ reports of what they were thinking at various points during the trip. (Reprinted from Spiers, H. J., & Maguire, E. A.,
Thoughts, behaviour, and brain dynamics during navigation in the real world, NeuroImage, 31, 1831. Copyright 2006, with
permission from Elsevier.)
One example of such a link is that the drivers’ hippocampus and parahippocampal place area (PPA) were activated as the drivers were planning which route to take.
Other structures were also activated during the trip, including the visual cortex and PPA, which responded as the taxi
drivers visually inspected buildings along the way. Spiers
and Maguire were thus able to link brain activation to specific navigation tasks.
TEST YOURSELF 7.1
1. What two factors does the ecological approach to
perception emphasize?
2. Where did Gibson look for information for perception? What is the optic array? Optic flow? Gradient
of flow? Focus of expansion? Invariant information?
3. What is observer-produced information? Describe
its role in somersaulting (why is there a difference
between novices and experts when they close their
eyes?).
4. Describe the swinging room experiments. What
principles do they illustrate?
5. What is the evidence (a) that observers can use
optic flow information to guide navigation and
(b) that they always use this information? What do
the results of driving and walking experiments tell
us about information in addition to optic flow that
observers may use for guiding navigation?
6. Describe the following: (a) responding of optic
flow neurons; (b) effect on behavior of stimulating
MST neurons; (c) Maguire’s “virtual town” navigation experiment; (d) Janzen and van Turennout’s
“landmark” experiment; (e) Spiers and Maguire’s
taxi driver experiment. Be sure you understand the
procedures used in these experiments and what
they demonstrate regarding the role of neurons and
different brain areas in navigation.
Acting on Objects: Reaching and Grasping
So far, we have been describing how we move around in the
environment. But our actions go beyond walking or driving. One of the major actions we take is reaching to pick
something up, as Serena did on her bike ride, as she reached
down, grabbed her water bottle, and raised it to her mouth.
One of the characteristics of reaching and grasping is that
it is usually directed toward specific objects, to accomplish
a specific goal. We reach for and grasp doorknobs to open
doors; we reach for a hammer to pound nails. An important
approach to studying reaching and grasping, which originated with J. J. Gibson, starts with the idea that each object has
a property called its affordance, which is related to the object’s function.
Affordances: What Objects Are Used For
Remember that Gibson’s ecological approach involves identifying information in the environment that provides information for perception. Earlier in the chapter we described
information such as optic flow, which is created by movement of the observer. Another type of information that
Gibson specified is affordances, that is, information that indicates what an object is used for. In Gibson’s (1979) words,
“The affordances of the environment are what it offers the
animal, what it provides for or furnishes.” A chair, or anything
that is sit-on-able, affords sitting; an object of the right size
and shape to be grabbed by a person’s hand affords grasping; and so on.
What this means is that our response to an object does
not only include physical properties, such as shape, size,
color, and orientation, that might enable us to recognize
the object; our response also includes information about
how the object is used. For example, when you look at a cup,
you might receive information indicating that it is “a round
white coffee cup, about 5 inches high, with a handle,” but
your perceptual system would also respond with information indicating “can pick the cup up” and “can pour liquid
into it.” Information such as this goes beyond simply seeing or recognizing the cup, because it provides information
that can guide our actions toward it. Another way of saying
this is that “potential for action” is part of our perception
of an object.
One way that affordances have been studied is by looking at the behavior of people with brain damage. As we have
seen in other chapters, loss of function as a result of damage
to one area of the brain can often reveal behaviors or mechanisms that were formerly not obvious. Glyn Humphreys and
Jane Riddoch (2001) studied affordances by testing patient
M.P., who had damage to his temporal lobe that impaired
his ability to name objects.
M.P. was given a cue, either (1) the name of an object
(“cup”) or (2) an indication of the object’s function (“an
item you could drink from”). He was then shown 10 different objects and was told to press a key as soon as he found
the object. The results of this testing showed that M.P. identified the object more accurately and rapidly when given the
cue that referred to the object’s function. Humphreys and
Riddoch concluded from this result that M.P. was using his
knowledge of an object’s affordances to help find it.
Although M.P. wasn’t reaching for these objects, it is
likely that he would be able to use the information about
the object’s function to help him take action with respect
to the object. In line with this idea, there are other patients
with temporal lobe damage who cannot name objects, or
even describe how they can be used, but who can pick them
up and use them nonetheless.
Another study that demonstrated how an object’s affordance can influence behavior was carried out by Giuseppe Di
Pellegrino and coworkers (2005), who tested J.P., a woman
who had a condition called extinction, caused by damage to
her parietal lobe. A person with extinction can identify a
stimulus in the right or left visual field if just one stimulus is presented. However, if two stimuli are presented, one
on the left and one on the right, these people have trouble
detecting the object on the left. For example, when Di Pellegrino briefly presented J.P. with two pictures of cups, one
on the left and one on the right, she detected the right cup
on 94 percent of the trials, but detected the left cup on only
56 percent of the trials (Figure 7.14a).
Extinction is caused by a person’s inability to direct attention to more than one thing at a time. When only one
object is presented, the person can direct his or her attention to that object. However, when two objects are presented, only the right object receives attention, so the left
one is less likely to be detected. Di Pellegrino reasoned that
if something could be done to increase attention directed
toward the object on the left, then perhaps its detection
would increase.
To achieve this, Di Pellegrino added a handle to the left
cup, with the idea that this handle, which provides an affordance for grasping, might activate a system in the brain
that is responsible for reaching and grasping. When he did
this, detection of the cup increased to about 80 percent
(Figure 7.14b). To be sure detection hadn’t increased simply
because the handle made the left cup stand out more, Di
Pellegrino did a control experiment in which he presented
the stimulus in Figure 7.14c, with the handle replaced by an
easily distinguished mark. Even though the mark made the
cup stand out as well as the handle, performance was only
50 percent. Di Pellegrino concluded from this result that
(1) the presence of the handle, which provides an affordance
for grasping, automatically activates a brain system that
is responsible for reaching and grasping the handle, and
(2) this activation increases the person’s tendency to pay attention to the cup on the left. The results of experiments
such as this one and Humphreys and Riddoch’s study of
patient M.P. support the idea that an object’s potential for
action is one of the properties represented when we perceive
and recognize an object.
The Physiology of Reaching and Grasping
(a) 56%   (b) 80%   (c) 50%
Figure 7.14 ❚ Cup stimuli presented to Di Pellegrino et al.’s
(2005) subject J.P. Numbers on the left indicate the percent
of trials on which the left cup was detected (a) when the cups
were the same; (b) when there was a handle on the left cup;
and (c) when there was an easily visible mark on the left cup.
(Adapted from Di Pellegrino, G., Rafal, R, & Tipper, S. P.,
Implicitly evoked actions modulate visual selection: Evidence
from parietal extinction, Current Biology, 15, 1470. Copyright
2005, with permission from Elsevier.)
To study how neurons respond to reaching and grasping,
it is necessary to record from the brain while an animal is
awake and behaving (Milner & Goodale, 2006). Once procedures were developed that make it possible to record from
awake, behaving animals (Evarts, 1966; Hubel, 1959; Jasper
et al., 1958), researchers began studying how neurons in
the brain respond as monkeys carry out tasks that involve
reaching for objects.
One of the first discoveries made by these researchers was that some neurons in the parietal cortex that were
silent when the monkey was not behaving began firing
vigorously when the monkey reached out to press a button
that caused the delivery of food (Hyvärinen & Poranen,
1974; Mountcastle et al., 1975). The most important aspect of this result is that the neurons fired only when the
monkey was reaching to achieve a goal such as obtaining
food. They didn’t fire when the monkey made similar
movements that were not goal-directed. For example, no
response occurred to aggressive movements, even though
the same muscles were activated as were activated during
goal-directed movements.
The idea that there are neurons in the parietal cortex
that respond to goal-directed reaching is supported by the
discovery of neurons in the parietal cortex that respond
before a monkey actually reaches for an object. Jeffrey Calton and coworkers (2002) trained monkeys to look at and
reach for a blue square (Figure 7.15a). Then the square
changed color to either green (which indicated that the
monkey was to look at the next stimulus presented) or red
(which indicated that the monkey was to reach for the next
stimulus) (Figure 7.15b). There was then a delay of about a
second (Figure 7.15c), followed by presentation of a blue target at different positions around the red fixation stimulus
(shown on top in Figure 7.15d). The monkey either reached
(a) Look at and reach
for blue square
(b) Square changes
to green or red
(c) Delay
(d) Target presented
Figure 7.15 ❚ Procedure of the Calton et al. (2002) experiment showing the delay period (shaded)
during which brain activity increased when the monkey was planning to reach. See text for
details. (Adapted by permission from Macmillan Publishers, Ltd., from Calton, J. L., Dickenson,
A. R., & Snyder, L. H., Non-spatial, motor-specific activation in posterior parietal cortex, Nature
Neuroscience, 5, Fig. 1, p. 581. Copyright 2002.)
(a) Cue, 9-second delay, observer points
(b) Signal: Delay vs. Wait
Figure 7.16 ❚ Procedure for Connolly’s (2003) experiment.
(a) The observer looks at the fixation point (+), and the target
(●) appears off to the side. (b) Activation of the PRR during
the 9-second delay or during a waiting period. See text for
details.
for the blue target while still looking at the red fixation
stimulus (Figure 7.15e, top) or looked at it by making an eye
movement away from the fixation stimulus (Figure 7.15e,
bottom). During this sequence, Calton recorded the activity
of neurons in the monkey’s parietal cortex.
(e) Reach for blue target (top) or look at it (bottom)
The key data in this experiment were the neuron firings recorded during the delay period, when the monkey
was waiting to either reach for a target or look at a target
(Figure 7.15c). Calton found that the parietal neurons fired
during this delay if the monkey was planning to reach,
but did not fire if the monkey was planning to look. Neurons in the posterior parietal cortex (see Figure 7.8) that
respond when a monkey is planning to reach, or is actually reaching, constitute the parietal reach region (PRR)
(Snyder et al., 2000).
What about humans? Jason Connolly and coworkers
(2003) did an experiment in which observers looking at a
fixation point were given a cue indicating the location of a
target; in Figure 7.16a, it is located off to the left. The cue
then went off and the observers had to hold the target location in their mind during a 9-second delay period. When
the delay was up, the fixation point disappeared and the observer pointed in the direction of the target, as indicated by
the arrow. Activity in the PRR during the 9-second delay was
measured using fMRI. In a control experiment, a 9-second
waiting period occurred first, followed by the cue and the
observer’s pointing movement. Activity was measured during the waiting period for the control condition.
The results of this experiment, shown in Figure 7.16b,
indicate that activity in the PRR was higher when the observers were holding a location in their mind during the
9-second delay than when they were simply waiting 9 seconds for the trial to begin. Connolly concluded from this
result that the PRR in humans encodes information related
to the observer’s intention to make a movement to a specific
location.
In the next section, we take our description of how the
brain is involved in action one step further by showing that
brain activity can be triggered not only by reaching for an
object or by having the intention to reach for an object, but
also by watching someone else reach for an object.
Observing Other People’s Actions
We not only take action ourselves, but we regularly watch
other people take action. This “watching others act” is most
obvious when we watch other people’s actions on TV or in a
movie, but it also occurs any time we are around someone
else who is doing something. One of the most exciting outcomes of research studying the link between perception and
action was the discovery of neurons in the premotor cortex
(Figure 7.8) called mirror neurons.
Mirroring Others’ Actions in the Brain
In the early 1990s, Giacomo Rizzolatti and coworkers (2006;
also see Gallese et al., 1996) were investigating how neurons
in the monkey’s premotor cortex fired as the monkey performed actions like picking up a toy or a piece of food. Their
goal was to determine how neurons fired as the monkey carried out specific actions.
But as sometimes happens in science, they observed
something they didn’t expect. When one of the experimenters picked up a piece of food while the monkey was
watching, neurons in the monkey’s cortex fired. What was so
unexpected was that the neurons that fired to observing the
experimenter pick up the food were the same ones that had
fired earlier when the monkey had itself picked up the food.
This initial observation, followed by many additional
experiments, led to the discovery of mirror neurons—
neurons that respond both when a monkey observes someone else (usually the experimenter) grasping an object such
as food on a tray (Figure 7.17a) and when the monkey itself
grasps the food (Figure 7.17b; Rizzolatti et al., 1996). They
are called mirror neurons because the neuron’s response to
watching the experimenter grasp an object is similar to the
response that would occur if the monkey were performing
the action. Just looking at the food causes no response, and
watching the experimenter grasp the food with a pair of pliers, as in Figure 7.17c, causes only a small response (Gallese
et al., 1996; Rizzolatti, Fogassi, & Gallese, 2000).
Most mirror neurons are specialized to respond to only
one type of action, such as grasping or placing an object
somewhere. Although you might think that the monkey
may have been responding to the anticipation of receiving
food, the type of object made little difference. The neurons
responded just as well when the monkey observed the experimenter pick up an object that was not food.
Consider what is happening when a mirror neuron fires
in response to seeing someone else perform an action. This
firing provides information about the characteristics of the
action because the neuron’s response to watching someone
else perform the action is the same as the response that occurs when the observer performs the action. This means
that one function of the mirror neurons might be to help
understand another person’s (or monkey’s) actions and react appropriately to them (Rizzolatti & Arbib, 1998; Rizzolatti et al., 2000, 2006).
But what is the evidence that these neurons are actually
involved in helping “understand” an action? The fact that a
response occurs when the experimenter picks up the food
with his hand but not with pliers argues that the neuron
is not just responding to the pattern of motion. As further
evidence that mirror neurons are doing more than just responding to a particular pattern of stimulation, researchers
have discovered neurons that respond to sounds that are associated with actions. These neurons in the premotor cortex,
called audiovisual mirror neurons, respond when a monkey performs a hand action and when it hears the sound associated with this action (Kohler et al., 2002). For example,
the results in Figure 7.18 show the response of a neuron that
fires (a) when the monkey sees and hears the experimenter
break a peanut, (b) when the monkey just sees the experimenter break the peanut, (c) when the monkey just hears
the sound of the breaking peanut, and (d) when the monkey breaks the peanut. What this means is that just hearing a
peanut breaking or just seeing a peanut being broken causes
activity that is also associated with the perceiver’s action of
breaking a peanut. These neurons are responding, therefore, to the characteristics of observed actions—in this case,
what the action of breaking a peanut looks like and what it
sounds like.
Another characteristic of action is the intention to
carry out an action. We saw that there are neurons in the
PRR that respond as a monkey or a human is planning on
reaching for an object. We will now see that there is evidence
for neurons that respond to other people’s intentions to carry
out an action.
Figure 7.17 ❚ Response of a mirror neuron
(a) to watching the experimenter grasp food on
the tray; (b) when the monkey grasps the food;
(c) to watching the experimenter pick up food
with pliers. (Reprinted from Rizzolatti, G., et al.,
Premotor cortex and the recognition of motor
actions, Cognitive Brain Research, 3, 131–141.
Copyright 2000, with permission from Elsevier.)
(a) Sees experimenter break peanut and hears sound
(b) Sees experimenter break peanut
(c) Hears sound
(d) Monkey breaks peanut
Figure 7.18 ❚ Response of an audiovisual mirror neuron to
four different stimuli. (From Kohler, E., et al., 2002, Hearing
sounds, understanding actions: Action representation in
mirror neurons. Science, 297, 846–848. Copyright © 2002 by
AAAS. Reprinted with permission from AAAS.)
Predicting People’s Intentions
Let’s return to Serena as she observes the pedestrian who
looks as though he might step off the curb in front of her
oncoming bike. As Serena observes this pedestrian, she is
attempting to predict that person’s intentions—whether or
not he intends to step off the curb. What information do we
use to predict others’ intentions? Sometimes the cues can
be obvious, such as watching the pedestrian start to step
off the curb and then rapidly step back, indicating that he
intended to step off the curb but suddenly decided not to.
Cues can also be subtle, such as noticing where someone
else is looking.
Andrea Pierno and coworkers (2006) studied the predictive power of watching where someone is looking by having observers view three different 4-second movies: (1) in the
grasping condition, a person reaches and looks at a grasped
target; (2) in the gaze condition, a person looks at a target object; (3) in the control condition, the person does not look at
the object or grasp it (Figure 7.19). Meanwhile, the observers’ brain activity was being measured in a brain scanner.
The researchers measured brain activity in a network of
areas that Pierno has called the human action observation
system. This system encompasses areas that contain mirror neurons, including the premotor cortex, as well as some
other areas.
The results for the activity in two brain areas of the
action observation system are shown in Figure 7.20. The
activation is essentially the same in response to watching
the person grasp the ball (grasp condition) and watching the
person look at the ball (gaze condition). What this means, according to Pierno, is that seeing someone else look at the
ball activates the observer’s action observation system and
therefore indicates the person’s intention to grasp
the ball. (VL 3)
When we described the function of mirror neurons, we
noted that these neurons might help us imitate the actions
of others and that mirror neurons may also help us understand another person’s actions and react appropriately to
them. Pierno’s experiment suggests that neurons in areas
that contain mirror neurons and in some neighboring areas may help us predict what another person is thinking of
doing, and therefore may help us predict what the person
might do next.
Figure 7.19 ❚ Frames from the films shown to Pierno et al.’s (2006) observers: (a) grasping condition; (b) gaze
condition; (c) control condition. (From Pierno, A. C., et al., When gaze turns into grasp, Journal of Cognitive
Neuroscience 18, 12.)
Figure 7.20 ❚ Results of Pierno et al.’s (2006) experiment
showing the increase in brain activity (percent signal change) that occurred for the
three experimental conditions shown in Figure 7.19 (grasp, gaze, and control) in (a) the premotor
cortex and (b) an area in the frontal lobe.
Figure 7.21 ❚ Sequence of frames from 3-second films
shown to Calvo-Merino’s (2005) observers: (a) ballet;
(b) capoeira dancing. (From Calvo-Merino, B., et al., Action
observation and acquired motor skills: An fMRI study with
expert dancers, Cerebral Cortex, August 2005, 15, No. 8,
1243–1249, Fig. 3, by permission of Oxford University Press.)
Mirror Neurons and Experience
Does everybody have similar mirror neurons, or does activation of a person’s mirror neurons depend on that person’s
past experiences? Beatriz Calvo-Merino and coworkers
(2005, 2006) did an experiment to determine whether the
response of mirror neurons is affected by a person’s experience. They tested three groups of observers: (1) dancers
professionally trained in ballet; (2) dancers professionally
trained in capoeira dance (a Brazilian dance that includes
some karate-like movements); and (3) a control group of
nondancers. They showed these groups two videos, one
showing standard ballet movements and the other showing
standard capoeira movements (Figure 7.21).
Activity in the observer’s premotor cortex, where many
mirror neurons are located, was measured while the observers watched the films. The results, shown in Figure 7.22,
indicate that activity in the PM cortex was greatest for the
ballet dancers when they watched ballet and was greatest
for the capoeira dancers when they watched capoeira. There
was no difference for the nondancer control observers.
Thus, even though all of the dancers saw the same videos,
the mirror areas of their brains responded most when they
watched actions that they had been trained to do. Apparently, mirror neurons are shaped by a person’s experience.
This means that each person has some mirror neurons that
fire most strongly when they observe actions they have previously carried out (also see Catmur et al., 2007).
Figure 7.22 ❚ Results of Calvo-Merino’s (2005)
experiment, showing increase in activity (percent signal change) in PM cortex
for three groups: ballet dancers, capoeira dancers, and controls.
Red bars = response to ballet films; blue bars = response
to capoeira films. (From Calvo-Merino, B., et al., Action
observation and acquired motor skills: An fMRI study with
expert dancers, Cerebral Cortex, August 2005, 15, No. 8,
1243–1249, Fig. 3, Copyright © 2005, with permission from
Oxford University Press.)
Now that we are near the end of the chapter, you might
look back and notice that much of the research we have described in the last few sections is very recent. The research
on mirror neurons, which is just a little over a decade old,
has resulted in proposals that these neurons have functions
that include understanding other people’s actions, reading people’s intentions, helping imitate what they are doing, and understanding social situations (Rizzolatti et al.,
2006).
But as amazing as these neurons and their proposed
functions are, it is important to keep in mind that, because
they have just recently been discovered, more research is
needed before we can state with more certainty exactly
what their function is. Consider that when feature detectors that respond to oriented moving lines were discovered
in the 1960s, some researchers proposed that these feature
detectors could explain how we perceive objects. With the
information available at the time, this was a reasonable proposal. However, later, when neurons that respond to faces,
places, and bodies were discovered, researchers revised their
initial proposals to take these new findings into account.
In all likelihood, a similar process will occur for these new
neurons. Some of the proposed functions will be verified,
but others may need to be revised. This evolution of thinking about what research results mean is a basic property
not only of perception research, but of scientific research in
general.
Something to Consider: Controlling Movement With the Mind
Moving a cursor on a computer screen by moving a mouse
is a common example of coordination between perception
and movement. This coordination involves the following sequence (Figure 7.23a):
1. The image of the cursor creates activity in visual
areas of the brain, so the cursor is perceived.
2. Signals from visual areas are sent to the PRR, which
calculates a motor plan that specifies the goal for
movement of the person’s hand that will cause the
cursor to reach its desired location on the screen.
3. Signals from the PRR are sent to the motor area of
the cortex.
4. Signals from the motor area are sent to the muscles.
5. The hand moves the mouse, which moves the cursor
on the screen.
Figure 7.23 ❚ (a) Sequence of events that occur as a person
controls a cursor with a mouse. See text for details.
(b) Situation when there is spinal cord injury. The first three
steps, in which the visual area, the PRR, and the motor area
are activated, are the same as in (a). However, the injury,
indicated by the X, stops the motor signal from reaching the
arm and hand muscles, so the person is paralyzed.
(c) A neural prosthesis picks up signals from the PRR or
motor area that are created as the person thinks about
moving the mouse. This signal is then used to control the
cursor.
When the cursor moves, the process repeats, with movement on the screen creating new activity in the visual area
of the brain (step 1) and the PRR (step 2). The PRR compares the new position of the cursor to the goal that was
set in step 2, and if the movement is off course, the PRR recalculates the motor plan and resends it to the motor area
(step 3). Signals from the motor area are sent to the muscles
(step 4), the hand moves the mouse (step 5), and the process
continues until the cursor has reached its goal location.
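The repeating cycle described above (perceive the cursor, compare it with the goal, correct, and repeat) has the logic of a feedback loop. The sketch below is only an analogy, not a model of the brain: the function name, the `gain` and `tolerance` values, and the rule that each correction is proportional to the remaining error are all assumptions made for illustration.

```python
def move_cursor_to_goal(start, goal, gain=0.5, tolerance=1.0, max_steps=100):
    """Toy feedback loop: compare cursor position with the goal, correct, repeat."""
    x, y = start
    path = [(x, y)]
    for _ in range(max_steps):
        # Steps 1-2 (analogy): "perceive" the cursor and compare it with the goal.
        error_x, error_y = goal[0] - x, goal[1] - y
        if (error_x ** 2 + error_y ** 2) ** 0.5 <= tolerance:
            break  # the cursor has reached its goal location
        # Steps 3-5 (analogy): issue a movement proportional to the remaining error.
        x += gain * error_x
        y += gain * error_y
        path.append((x, y))
    return path


path = move_cursor_to_goal(start=(0.0, 0.0), goal=(100.0, 50.0))
print(len(path) - 1, "corrections; final position:", path[-1])
```

Each pass through the loop corresponds to one repetition of the five-step sequence; the loop ends when the remaining error falls within the (hypothetical) tolerance.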
But what happens if, in step 4, the signals can’t reach
the muscles? This is the situation faced by hundreds of thousands
of people who are paralyzed because of spinal cord injury
or other problems that prevent signals from traveling from
the motor cortex to muscles in the hand (Figure 7.23b). Researchers are working to solve this problem by developing
neural prostheses—devices that substitute for the muscles
that move the mouse (Wolpaw, 2007). Figure 7.23c shows
the basic principle. The first three steps of the sequence
are the same as before. But the signal from the brain is sent,
not to the muscles, but to a computer that transforms these
signals into instructions to move the cursor, or in some
cases to control a robotic arm that can grasp and manipulate objects.
One approach to developing a prosthesis has used
signals from the motor cortex that normally would be sent
to the muscles (Scott, 2006; Serruya et al., 2002; Taylor,
2002). For example, Leigh Hochberg and coworkers (2006)
used this approach with a 25-year-old man (M.N.) who had
been paralyzed by a knife wound that severed his spinal
cord. The first step in designing the neural prosthesis was
to determine activity in M.N.’s brain that would normally
occur when he moved a computer mouse. To do this, Hochberg recorded activity with electrodes implanted in M.N.’s
motor area while he imagined moving his hand as if he were
using a computer mouse to move a cursor on a computer
screen.
The activity recorded from M.N.’s motor cortex was analyzed to determine the connection between brain activity
and cursor position. Eventually, enough data were collected
and analyzed to enable the computer to read out a cursor
position based on M.N.’s brain activity, and this readout
was used to control the position of the cursor based on what
M.N. was thinking. The test of this device was that M.N.
was able to move the cursor to different places on the computer screen just by thinking about where he wanted
the cursor to move (Figure 7.24). (VL 4)
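To give a feel for what "reading out" movement from neural activity can mean, here is a minimal sketch of one classic decoding idea, the population vector. This is not necessarily the analysis Hochberg and coworkers used. The cosine tuning curve, the 16 simulated neurons, and the baseline and gain values are hypothetical assumptions chosen for illustration.

```python
import math


def tuning_rate(move_angle, preferred_angle, baseline=20.0, gain=15.0):
    """Assumed cosine tuning: a neuron fires most for its preferred movement direction."""
    return baseline + gain * math.cos(move_angle - preferred_angle)


def population_vector(rates, preferred_angles, baseline=20.0):
    """Decode direction: sum preferred directions, each weighted by rate above baseline."""
    x = sum((r - baseline) * math.cos(p) for r, p in zip(rates, preferred_angles))
    y = sum((r - baseline) * math.sin(p) for r, p in zip(rates, preferred_angles))
    return math.atan2(y, x)


# 16 hypothetical neurons with evenly spaced preferred directions.
preferred = [2 * math.pi * i / 16 for i in range(16)]
intended = math.radians(60)  # direction the person imagines moving
rates = [tuning_rate(intended, p) for p in preferred]
decoded = population_vector(rates, preferred)
print(f"intended 60.0 deg, decoded {math.degrees(decoded):.1f} deg")
```

In this noise-free toy case the decoded direction matches the intended one; a computer could then move the cursor a small step in the decoded direction on every update.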
Although the majority of research on neural prosthetics has focused on using activity in the motor area to control devices, another promising approach has used signals
from the PRR (Andersen et al., 2004). Sam Musallam and
coworkers (2004) showed that signals recorded from a monkey’s PRR can be used to enable the monkey to move a cursor to different positions on a screen based only on its brain
activity.
Figure 7.24 ❚ Matthew Nagle (M.N.) shown controlling the
location that is illuminated on a screen by imagining that he is
moving a computer mouse. (Courtesy of John Donoghue and
Cyberkinetics Neurotechnology Systems, Inc.)
While these results are impressive, many problems remain to be solved before a device can become routinely
available. One problem is that even under controlled laboratory conditions, using computer-analyzed brain activity to
control movement is much less accurate and more variable
than the control possible when signals are sent directly to
the muscles. One reason for this variability is that signals
are sent to the muscles in tens of thousands of neurons,
and these signals contain all of the information needed to
achieve precise control of the muscles. In contrast, researchers developing neural prostheses are using signals from far
fewer neurons and must determine which aspects of these
signals are most effective for controlling movement. Thus,
just as vision researchers have been working toward determining how nerve firing in visual areas of the brain represents objects and scenes (see Chapter 5, p. 124), so researchers
developing neural prostheses are working toward determining how nerve firing in areas such as the PRR and motor
cortex represents movement.
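The point that fewer neurons give a more variable readout can be illustrated with a small simulation. The sketch assumes cosine-tuned neurons whose responses are contaminated by random noise and uses a population-vector readout; the tuning depth, noise level, and neuron counts are hypothetical values, not measurements from any study.

```python
import math
import random


def mean_decoding_error_deg(n_neurons, trials=200, noise_sd=10.0, seed=0):
    """Average absolute direction error (degrees) of a population-vector readout."""
    rng = random.Random(seed)
    preferred = [2 * math.pi * i / n_neurons for i in range(n_neurons)]
    total = 0.0
    for _ in range(trials):
        intended = rng.uniform(-math.pi, math.pi)
        x = y = 0.0
        for p in preferred:
            # Response above baseline: assumed cosine tuning plus Gaussian noise.
            modulation = 15.0 * math.cos(intended - p) + rng.gauss(0.0, noise_sd)
            x += modulation * math.cos(p)
            y += modulation * math.sin(p)
        decoded = math.atan2(y, x)
        # Smallest angular difference between decoded and intended directions.
        diff = math.atan2(math.sin(decoded - intended), math.cos(decoded - intended))
        total += abs(diff)
    return math.degrees(total / trials)


few = mean_decoding_error_deg(8)
many = mean_decoding_error_deg(512)
print(f"mean error with 8 neurons: {few:.1f} deg; with 512 neurons: {many:.1f} deg")
```

Under these assumptions, the noise in individual neurons averages out as more neurons contribute, so the readout from the larger population is less variable.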
TEST YOURSELF 7.2
1. What is an affordance? Describe the results of two
experiments on brain-damaged patients that illustrate the operation of affordances.
2. Describe the experiments that support the idea of a
parietal reach region. Include monkey experiments
that record from individual neurons (a) as a monkey
reaches, and (b) as a monkey plans to reach; and (c)
human brain scan experiments that study the brain
activity associated with a person’s intention to make
a movement.
3. What are mirror neurons? Audiovisual mirror neurons? What are some of the potential functions of
mirror neurons?
4. Describe the experiment that studied the idea that
an action observation system responds to another
person’s intention to carry out an action.
5. What is the evidence that experience plays a role in
the development of mirror neurons?
6. What is a neural prosthesis? Compare how a neural
prosthesis can result in movement of a cursor on a
computer screen to how an intact brain produces
movement of a cursor.
THINK ABOUT IT
1. We have seen that gymnasts appear to take visual information into account as they are in the act of executing
a somersault. In the sport of synchronized diving, two
people execute a dive simultaneously from two side-by-side diving boards. They are judged based on how well
they execute the dive and how well the two divers are
synchronized with each other. What environmental
stimuli do you think synchronized divers need to take
into account in order to be successful? (p. 157)
2. Can you identify specific environmental information
that you use to help you carry out actions in the environment? This question is often particularly relevant to
athletes. (p. 157)
3. It is a common observation that people tend to slow
down as they are driving through long tunnels. Explain
the possible role of optic flow in this situation. (p. 157)
4. What is the parallel between feeding brain activity into
a computer to control movement and feeding brain activity into a computer to recognize scenes, as discussed
in Chapter 5 (see page 125)? (p. 171)
IF YOU WANT TO KNOW MORE
1. Ecological psychology. Ecological psychologists have
studied many behaviors that occur in the natural environment. Here are a few papers that are associated
with the ecological approach. Also, looking at recent
issues of the journal Ecological Psychology will give you
a feel for modern research by psychologists who identify themselves with the ecological approach. (p. 156)
Lee, D. N., & Reddish, P. E. (1976). Plummeting gannets: A paradigm of ecological optics. Nature, 293,
293–294.
Rind, F. C., & Simmons, P. J. (1999). Seeing what
is coming: Building collision-sensitive neurons.
Trends in Neurosciences, 22, 215–220.
Schiff, W., & Detwiler, M. L. (1979). Information
used in judging impending collision. Perception, 8,
647–658.
Shaw, R. E. (2003). The agent–environment interface:
Simon’s indirect or Gibson’s direct coupling? Ecological Psychology, 15, 37–106.
Turvey, M. T. (2004). Space (and its perception): The
first and final frontier. Ecological Psychology, 16,
25–29.
2. Gibson’s books. J. J. Gibson described his approach in
three books that explain his philosophy and approach
in detail. (p. 156)
Gibson, J. J. (1950). The perception of the visual world.
Boston: Houghton Mifflin.
Gibson, J. J. (1966). The senses considered as perceptual
systems. Boston: Houghton Mifflin.
Gibson, J. J. (1979). The ecological approach to visual
perception. Boston: Houghton Mifflin.
3. Motor area of brain activated when sounds are associated
with actions. Research has shown that the motor area
of the cortex is activated when trained pianists hear
music. This does not occur in nonpianists, presumably because the link between finger movements and
sound is not present in these people. (p. 169)
Haueisen, J., & Knosche, T. R. (2001). Involuntary
motor activity in pianists evoked by music perception. Journal of Cognitive Neuroscience, 13, 786–792.
4. Event perception. Although people experience a continuously changing environment, they are able to divide
this continuous stream of experience into individual
events, such as preheating the oven, mixing the ingredients in a bowl, and putting the dough on a cookie
sheet when baking cookies. Recent research has studied how people divide experience into events, and
what is happening in the brain as they do. (p. 167)
Kurby, C. A., & Zacks, J. M. (2007). Segmentation in
the perception and memory of events. Trends in Cognitive Sciences, 12, 72–79.
Zacks, J. M., Speer, N. K., Swallow, K. M., Braver,
T. S., & Reynolds, J. R. (2007). Event perception: A
mind–brain perspective. Psychological Bulletin, 133,
273–293.
Zacks, J. M., & Swallow, K. M. (2007). Event segmentation. Current Directions in Psychological Science,
16, 80–84.
KEY TERMS
Affordance (p. 165)
Audiovisual mirror neuron (p. 168)
Ecological approach to perception (p. 156)
Extinction (p. 165)
Focus of expansion (FOE) (p. 157)
Gradient of flow (p. 157)
Invariant information (p. 157)
Mirror neuron (p. 168)
Neural prosthesis (p. 172)
Optic array (p. 156)
Optic flow (p. 156)
Parietal reach region (PRR) (p. 167)
Self-produced information (p. 157)
Visual direction strategy (p. 160)
MEDIA RESOURCES
The Sensation and Perception Book
Companion Website
www.cengage.com/psychology/goldstein
See the companion website for flashcards, practice quiz questions, Internet links, updates, critical thinking exercises,
discussion forums, games, and more!
CengageNow
www.cengage.com/cengagenow
Go to this site for the link to CengageNOW, your one-stop
shop. Take a pre-test for this chapter, and CengageNOW
will generate a personalized study plan based on your test
results. The study plan will identify the topics you need
to review and direct you to online resources to help you
master those topics. You can then take a post-test to help
you determine the concepts you have mastered and what
you will still need to work on.
Virtual Lab
Your Virtual Lab is designed to help you get the most out
of this course. The Virtual Lab icons direct you to specific
media demonstrations and experiments designed to help
you visualize what you are reading about. The number
beside each icon indicates the number of the media element
you can access through your CD-ROM, CengageNOW, or
WebTutor resource.
The following lab exercises are related to the material in
this chapter:
1. Flow From Walking Down a Hallway A computer-generated program showing the optic flow that occurs
when moving through a patterned hallway. (Courtesy of
William Warren.)
2. Stimuli Used in Warren Experiment Moving stimulus
pattern seen by observers in William Warren’s experiment.
(Courtesy of William Warren.)
3. Pierno Stimuli Stimuli for the Pierno experiment. (Courtesy of Andrea Pierno.)
4. Neural Prosthesis Video showing a paralyzed person
moving a cursor on a screen by mentally controlling the
cursor’s movement. (Courtesy of Cyberkinetics, Inc.)
CHAPTER 8
Perceiving Motion
Chapter Contents
FUNCTIONS OF MOTION PERCEPTION
Motion Helps Us Understand Events in Our Environment
Motion Attracts Attention
Motion Provides Information About Objects
DEMONSTRATION: Perceiving a Camouflaged Bird
STUDYING MOTION PERCEPTION
When Do We Perceive Motion?
Comparing Real and Apparent Motion
What We Want to Explain
MOTION PERCEPTION: INFORMATION IN THE ENVIRONMENT
NEURAL FIRING TO MOTION ACROSS THE RETINA
Motion of a Stimulus Across the Retina: The Aperture Problem
DEMONSTRATION: Motion of a Bar Across an Aperture
Motion of Arrays of Dots on the Retina
METHOD: Microstimulation
❚ TEST YOURSELF 8.1
TAKING EYE MOTIONS INTO ACCOUNT: THE COROLLARY DISCHARGE
Corollary Discharge Theory
Behavioral Demonstrations of Corollary Discharge Theory
DEMONSTRATION: Eliminating the Image Displacement Signal With an Afterimage
DEMONSTRATION: Seeing Motion by Pushing on Your Eyelid
Physiological Evidence for Corollary Discharge Theory
PERCEIVING BIOLOGICAL MOTION
Brain Activation by Point-Light Walkers
Linking Brain Activity and the Perception of Biological Motion
METHOD: Transcranial Magnetic Stimulation (TMS)
SOMETHING TO CONSIDER: GOING BEYOND THE STIMULUS
Implied Motion
Apparent Motion
❚ TEST YOURSELF 8.2
Think About It
If You Want to Know More
Key Terms
Media Resources
VL VIRTUAL LAB
OPPOSITE PAGE This stop-action photograph captures a sequence of
positions of a bird as it leaves a tree branch. This picture represents
the type of environmental motion that we perceive effortlessly every
day. Although we perceive motion easily, the mechanisms underlying
motion perception are extremely complex. (© Andy Rouse/Corbis)
VL The Virtual Lab icons direct you to specific animations and videos
designed to help you visualize what you are reading about. The number beside
each icon indicates the number of the clip you can access through your
CD-ROM or your student website.
Some Questions We Will Consider:
❚ Why do some animals freeze in place when they sense
danger? (p. 179)
❚ How do films create movement from still pictures? (p. 180)
❚ When we scan or walk through a room, the image of the
room moves across the retina, but we perceive the room
and the objects in it as remaining stationary. Why does
this occur? (p. 184)
Action fills our world. We are always taking action, either dramatically (as in Serena’s bike ride in Chapter 7,
page 156, or a basketball player driving toward the basket) or routinely, as in reaching for a coffee cup or walking
across a room. Whatever form action takes, it involves motion, and one of the things that makes the study of motion
perception both fascinating and challenging is that we are
not simply passive observers of the motion of others. We are
often moving ourselves. Thus, we perceive motion when we
are stationary, as when we are watching other people cross
the street (Figure 8.1a), and we also perceive motion as we
ourselves are moving, as might happen when we participate in a basketball game (Figure 8.1b). We will see in this
chapter that both the “simple” case of a stationary observer
perceiving motion and the more complicated case of a moving observer perceiving motion involve complex “behind the
scenes” mechanisms.
Functions of Motion Perception
Motion perception has a number of different functions,
ranging from providing us with updates about what is happening to helping us perceive things such as the shapes of
objects and people’s moods. Perhaps most important of
all, especially for animals, the perception of motion is intimately linked to survival.
Motion Helps Us Understand Events in Our Environment
As you walk through a shopping mall, looking at the displays in the store windows, you are also observing other actions: a group of people engaged in an animated conversation, a salesperson rearranging piles of clothing and then
walking over to the cash register to help a customer, a program on the TV in a restaurant that you recognize as a dramatic moment in a soap opera.
Much of what you observe involves information provided by motion. The gestures of the people in the group
you observed indicate the intensity of their conversation;
the motions of the salesperson indicate what she is doing and when she has shifted to a new task; and motion
indicates, even in the absence of sound, that something
important is happening in the soap opera (Zacks, 2004;
Zacks & Swallow, 2007).
Figure 8.1 ❚ Motion perception occurs (a) when a stationary observer perceives moving stimuli, such as this couple
crossing the street, and (b) when a moving observer, like this basketball player, perceives moving stimuli, such as the other
players on the court. (George Doyle/Getty Images; © Cathrine Wessel/CORBIS)
Our ability to use motion information to determine
what is happening is an important function of motion perception that we generally take for granted. Motion perception is also essential for our ability to move through the environment. As we saw in Chapter 7 when we described how
people “navigate” (see page 159), one source of information
about where we are going and how fast we are moving is the
way objects in the environment flow past us as we move. As
a person moves forward, objects move relative to the person
in the opposite direction. This movement, called optic flow
(page 157), provides information about the walker’s direction and speed. In Chapter 7 we discussed how we can use
this information to help us stay on course.
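The geometry behind optic flow can be made concrete with a short calculation. The sketch below assumes a simplified observer moving straight ahead, with a pinhole projection onto the image; the function name, the `focal_length` parameter, and the example points are illustrative assumptions. For a stationary point at depth Z, forward movement at speed T makes its image position (x, y) move outward at a rate of (x·T/Z, y·T/Z).

```python
def optic_flow(points, forward_speed, focal_length=1.0):
    """Image motion of stationary points for an observer moving straight ahead.

    points: (X, Y, Z) positions relative to the observer, with Z > 0 the depth ahead.
    Returns a list of ((x, y), (u, v)): image position and image velocity.
    """
    flow = []
    for X, Y, Z in points:
        # Pinhole projection of the point onto the image.
        x, y = focal_length * X / Z, focal_length * Y / Z
        # As the observer advances, Z shrinks, so image points move outward,
        # away from the point straight ahead at (0, 0).
        u, v = x * forward_speed / Z, y * forward_speed / Z
        flow.append(((x, y), (u, v)))
    return flow


points = [(0.0, 0.0, 10.0),  # straight ahead
          (1.0, 0.0, 10.0),  # off to the side, far away
          (1.0, 0.0, 2.0)]   # off to the side, nearby
flow = optic_flow(points, forward_speed=1.0)
for (x, y), (u, v) in flow:
    print(f"image position ({x:.2f}, {y:.2f}) -> image velocity ({u:.3f}, {v:.3f})")
```

In this sketch the point straight ahead does not move in the image (the focus of expansion described in Chapter 7), and the nearby point moves faster than the distant one (the gradient of flow).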
But while motion provides information about what
is going on and where we are moving, it provides information for more subtle actions as well. Consider, for example,
the action of pouring water into a glass. As we perceive the
water, we watch the level rise, and this helps us know when
to stop pouring. We can appreciate the importance of this
ability by considering the case of a 43-year-old woman
who lost the ability to perceive motion when she suffered
a stroke that damaged an area of her cortex involved in
motion perception. Her condition, which is called motion
agnosia, made it difficult for her to pour tea or coffee into
a cup because the liquid appeared frozen, so she couldn’t
perceive the fluid rising in the cup and had trouble knowing
when to stop pouring (Figure 8.2). It was also difficult for
her to follow dialogue because she couldn’t see motions of a
speaker’s face and mouth (Zihl et al., 1983, 1991).
But the most disturbing effect of her brain damage
was the sudden appearance or disappearance of people and
objects. People suddenly appeared or disappeared because
she couldn’t see them walking. Crossing the street presented serious problems because at first a car might seem far
away, but then suddenly, without warning, it would appear very near. This disability was not just a social inconvenience, but enough of a threat to the woman’s well-being
that she rarely ventured outside into the world of moving—
and sometimes dangerous—objects. This case of a breakdown in the ability to perceive motion provides a dramatic
demonstration of the importance of motion perception in
day-to-day life.
Motion Attracts Attention
As you try to find your friend among the sea of faces in the
student section of the stadium, you realize that you have no
idea where to look. But you suddenly see a person waving
and realize it is your friend. The ability of motion to attract
attention is called attentional capture. This effect occurs
not only when you are consciously looking for something,
but also while you are paying attention to something else.
For example, as you are talking with a friend, your attention
may suddenly be captured by something moving in your peripheral vision.
The fact that movement can attract attention plays an
important role in animal survival. You have probably seen
animals freeze in place when they sense danger. For example, if a mouse’s goal is to avoid being detected by a cat, one
thing it can do is to stop moving. Freezing in place not only
eliminates the attention-attracting effects of movement,
but it also makes it harder for the cat to differentiate between the mouse and its background.
Motion Provides Information
About Objects
The idea that not moving can help an animal blend into
the background is illustrated by the following
demonstration. (VL 1, 2)
Figure 8.2 ❚ The woman with motion agnosia was not able
to perceive the rising level as liquid was being poured into a
glass.
DEMONSTRATION
Perceiving a Camouflaged Bird
For this demonstration, you will need to prepare stimuli
by photocopying the bird and the hatched-line pattern in
Figure 8.3. Then cut out the bird and the hatched pattern so
they are separated. Hold the picture of the bird up against a
window during the day. Turn the copy of the hatched pattern
over so the pattern is facing out the window (the white side of
the paper should be facing you) and place it over the bird. If
the window is adequately illuminated by daylight, you should
be able to see the hatched pattern. Notice how the presence
of the hatched pattern makes it more difficult to see the bird.
Then, slide the bird back and forth under the pattern, and
notice what happens to your perception of the bird (from
Regan, 1986). ❚
Figure 8.3 ❚ The bird becomes camouflaged when the
random lines are superimposed on the bird. When the bird
is moved relative to the lines, it becomes visible, an example
of how movement enhances the perception of form. (From
Regan, D. (1986). Luminance contrast: Vernier discrimination.
Spatial Vision, 1, 305–318.)
The stationary bird is difficult to see when covered by
the pattern because the bird and the pattern are made up
of similar lines. But as soon as all of the elements of the
bird begin moving in the same direction, the bird becomes
visible. What is happening here is that movement has perceptually organized all of the elements of the bird, so they
create a figure that is separated from the background. This
is why a mouse should stay stationary even if it is hidden
by other objects, if it wants to avoid becoming perceptually
organized in the cat’s mind!
You might say, after doing the camouflaged bird demonstration, that although motion does make the bird easy
to perceive amid the tangle of obscuring lines, this seems
like a special case, because most of the objects we see are
not camouflaged. But if you remember our discussion from
Chapter 5 (p. 101) about how even clearly visible objects may
be ambiguous, you can appreciate that moving relative to an
object can help us perceive its shape more accurately. For example, moving around the “horse” in Figure 8.4 reveals that
its shape is not exactly what you may have expected based on
your initial view. Thus, our motion relative to objects is constantly adding to the information we have about the objects.
This also happens when objects move relative to us, and
a great deal of research has shown that observers perceive
shapes more rapidly and accurately when an object
is moving (Wexler et al., 2001). (VL 3–7)
Figure 8.4 ❚ Three views of a “horse.” Moving around an object can reveal its true shape.
Bruce Goldstein
Studying Motion Perception
To describe how motion perception is studied, the first question we will consider is: When do we perceive motion?
When Do We Perceive Motion?
The answer to this question may seem obvious: We perceive
motion when something moves across our field of view. Actual motion of an object is called real motion. Perceiving a
car driving by, people walking, or a bug scurrying across a
tabletop are all examples of the perception of real motion.
There are also a number of ways to produce the perception of motion that involve stimuli that are not moving.
Perception of motion when there actually is none is called
illusory motion. The most famous, and well-studied, type
of illusory motion is called apparent motion. We introduced apparent motion in Chapter 5 when we told the story
of Max Wertheimer, who showed that when two stimuli in
slightly different locations are alternated with the correct
timing, an observer perceives one stimulus moving back
and forth smoothly between the two locations (Figure 8.5a).
This perception is called apparent motion because there is
no actual (or real) motion between the stimuli. This is the
basis for the motion we perceive in movies, on television, and
in moving signs that are used for advertising and
entertainment (Figure 8.5b). (VL 8–11)
Induced motion occurs when motion of one object (usually a large one) causes a nearby stationary object (usually
smaller) to appear to move. For example, the moon usually
appears stationary in the sky. However, if clouds are moving
past the moon on a windy night, the moon may appear to
be racing through the clouds. In this case, movement of the
larger object (clouds covering a large area) makes the smaller,
but actually stationary, moon appear to be moving
(Figure 8.6a).
VL 12
Motion aftereffects occur after viewing a moving
stimulus for 30 to 60 seconds and then viewing a stationary stimulus, which appears to move. One example of a
motion aftereffect is the waterfall illusion (Figure 8.6b). If
you look at a waterfall for 30 to 60 seconds (be sure it fills
up only part of your field of view) and then look off to the
side at part of the scene that is stationary, you will see everything you are looking at—rocks, trees, grass—appear to
move up for a few seconds (Figure 8.6c). Motion aftereffects
can also occur after viewing other kinds of motion. For
example, viewing a rotating spiral that appears to move
inward causes the apparent expansion of a stationary object. (See “If You Want to Know More,” item 3, page 196,
for a reference that discusses the mechanisms
responsible for aftereffects.)
VL 13–14
Researchers studying motion perception have investigated all of the types of perceived motion described above—
and some others as well. Our purpose, however, is not to
understand every type of motion perception but to understand some of the principles governing motion perception
in general. To do this, we will focus on real motion and apparent motion.
Figure 8.5 ❚ Apparent motion (a) between these squares
when they are flashed rapidly on and off; (b) on a moving
sign. Our perception of words moving across a display such
as this one is so compelling that it is often difficult to realize
that signs like this are simply dots flashing on and off.
Comparing Real and Apparent Motion
For many years, researchers treated the apparent motion
created by flashing stationary objects or pictures and the
real motion created by actual motion through space as
though they were separate phenomena, governed by different mechanisms. However, there is ample evidence that
these two types of motion have much in common. Consider,
for example, an experiment by Axel Larsen and coworkers
(2006). Larsen presented three types of displays to a
person in an fMRI scanner: (1) a control condition, in which
two dots in slightly different positions were flashed simultaneously (Figure 8.7a); (2) a real motion display, in which a small
dot moved back and forth (Figure 8.7b); and (3) an apparent
motion display, in which dots were flashed one after another
so that they appeared to move back and forth (Figure 8.7c).
Larsen’s results are shown below the dot displays. The
blue-colored area in Figure 8.7a is the area of visual cortex
activated by the control dots, which are perceived as two
dots simultaneously flashing on and off with no motion between them. Each dot activates a separate area of the cortex.
In Figure 8.7b, the red indicates the area of cortex activated
by real movement of the dot. In Figure 8.7c, the yellow indicates the area of cortex activated by the apparent motion
display. Notice that the activation associated with apparent motion is similar to the activation for the real motion
display. Two flashed dots that result in apparent motion activate the area of brain representing the space between the
positions of the flashing dots even though no stimulus was
presented there.
Because of the similarities between the perception of real
and apparent motion, and between the brain mechanisms associated with these two types of motion, researchers study
both types of motion together and concentrate on discovering general mechanisms that apply to both. In this chapter,
we will follow this approach as we look for general mechanisms of motion perception.
Figure 8.6 ❚ (a) Motion of the clouds induces the perception of motion in the stationary moon. (b) Observation of motion in
one direction, such as occurs when viewing a waterfall, can cause (c) the perception of motion in the opposite direction when
viewing stationary objects in the environment.
Figure 8.7 ❚ The three conditions in Larsen’s (2006)
experiment: (a) control; (b) real motion; and (c) apparent
motion (flashing dots). Stimuli are shown on top, and the
resulting brain activation below. In (c) the brain is activated
in the space that represents the area between the two dots,
indicating that movement was perceived even though no
stimuli were present. (Larsen, A., Madsen, K. H., Lund, T. E.,
& Bundesen, C., Images of illusory motion in primary visual
cortex. Journal of Cognitive Neuroscience, 18, 1174–1180.
© 2006 by the Massachusetts Institute of Technology.)
What We Want to Explain
Our goal is to understand how we perceive things that are
moving. At first this may seem like an easy problem. For example, Figure 8.8a shows what Maria sees when she looks
straight ahead as Jeremy walks by. Because she doesn’t move
her eyes, Jeremy’s image sweeps across her retina. Explaining motion perception in this case seems straightforward
because as Jeremy’s image moves across Maria’s retina, it
stimulates a series of receptors one after another, and this
stimulation signals Jeremy’s motion.
Figure 8.8b shows what Maria sees when she follows
Jeremy’s motion with her eyes. In this case, Jeremy’s image
remains stationary on Maria’s foveas as he walks by. This
adds an interesting complication to explaining motion
perception, because although Maria perceives Jeremy’s motion, Jeremy’s image remains stationary on her retina. This
means that motion perception can’t be explained just by
considering what is happening on the retina.
Finally, let’s consider what happens if Jeremy isn’t present, and Maria decides to walk through the room (Figure
8.8c). When Maria does this, the images of the walls and objects in the room move across her retina, but Maria doesn’t
see the room or its contents as moving. In this case, there
is motion across the retina, but no perception that objects
are moving. This is another example of why we can’t simply
consider what is happening on the retina. Table 8.1 summarizes the three situations in Figure 8.8.
In the sections that follow, we will consider a number of
different approaches to explaining motion perception, with
the goal being to explain each of the situations in Table 8.1.
We begin by considering an approach that focuses on how
information in the environment signals motion.
TABLE 8.1 ❚ Conditions for Perceiving and Not Perceiving Motion Depicted in Figure 8.8
SITUATION | OBJECT MOVES? | OBSERVER | IMAGE ON OBSERVER’S RETINA | PERCEPTION
(a) Jeremy walks across room | Yes | Maria’s eyes are stationary | Moves across retina as object moves | Object (Jeremy) moves
(b) Jeremy walks across room | Yes | Maria’s eyes follow Jeremy as he moves | Stationary, because the image stays on the fovea | Object (Jeremy) moves
(c) Maria walks through the room | No, everything in the room is stationary | Maria is moving through the room | Moves across retina as Maria walks | Objects (the room and its contents) do not move
Figure 8.8 ❚ Three motion situations:
(a) Maria is stationary and observes
Jeremy walking past; (b) Maria follows
Jeremy’s movement with her eyes;
(c) Maria walks through the room.
Panel labels: (a) Jeremy walks past Maria; Maria's eyes are stationary (creates local disturbance in optic array). (b) Jeremy walks past Maria; Maria follows him with her eyes (creates local disturbance in optic array). (c) Maria walks through the scene (creates global optic flow).
Motion Perception: Information in the Environment
From the three examples in Figure 8.8, we saw that motion
perception can’t be explained by considering just what is
happening on the retina. So, what if we ignore the retina
altogether and focus instead on information “out there” in
the environment that signals motion? That is exactly what
J. J. Gibson, who founded the ecological approach to perception, did.
In Chapter 7 we noted that Gibson’s approach involves
looking for information in the environment that provides
information for perception (see page 156). This information for perception, according to Gibson, is located not on
the retina but “out there” in the environment. He thought
about information in the environment in terms of the
optic array—the structure created by the surfaces, textures,
and contours of the environment—and he focused on how
movement of the observer causes changes in the optic array.
Let’s see how this works by returning to Jeremy and Maria
in Figure 8.8.
In Figure 8.8a, when Jeremy walks across Maria’s field
of view, portions of the optic array become covered as he
walks by and then are uncovered as he moves on. This result is called a local disturbance in the optic array. A local disturbance in the optic array occurs when one object
moves relative to the environment, covering and uncovering
the stationary background. According to Gibson, this local
disturbance in the optic array provides information that
Jeremy is moving relative to the environment.
In Figure 8.8b, Maria follows Jeremy with her eyes. Remember that Gibson doesn’t care what is happening on the
retina. Even though Jeremy’s image is stationary on the retina, the same local disturbance information that was available when Maria was keeping her eyes still remains available
when she is moving her eyes, and this local disturbance information indicates that Jeremy is moving.
In Figure 8.8c, when Maria walks through the environment, something different happens: As Maria moves,
everything around her moves. The walls, the window, the
trashcan, the clock, and the furniture all move relative to
Maria as she walks through the scene. The fact that everything moves at once is called global optic flow; this signals
that Maria is moving but that the environment
is stationary.
VL 15
In identifying information in the environment that
signals what is moving and what is not, the ecological approach provides a nice solution to the problem that we can’t
explain how we perceive movement in some situations based
just on what is happening on the retina. However, this explanation does not consider what is happening physiologically. We will now consider that.
Figure 8.9 ❚ (a) The rectangle area at the back of the eye
represents the receptive field of a neuron in the cortex that
responds to movement of vertical bars to the right. (b) When
the image of the vertical bar sweeps across the receptive
field, the neuron in the cortex fires. (Figure labels: moving bar;
image of bar on retina; neuron’s receptive field on retina.)
Neural Firing to Motion Across the Retina
Whereas the ecological approach focuses on environmental
information, the physiological approach to motion perception focuses on determining the connection between neural
firing and motion perception. First, let’s return to the case
of the observer looking straight ahead at something moving
across the field of view, as in Figure 8.8a. As we will now see,
even this “simple” case is not so simple.
Motion of a Stimulus Across the Retina: The Aperture Problem
How can we explain how neural firing signals the direction that an object is moving? One possible answer to this
question is that as the stimulus sweeps across the retina, it
activates directionally selective neurons in the cortex that
respond to oriented bars that are moving in a specific direction (see Chapter 4, page 78). This is illustrated in Figure 8.9,
which shows a bar sweeping across a neuron’s receptive field.
Although this appears to be a straightforward solution
to signaling the direction an object is moving, it turns out
that the response of single directionally selective neurons
does not provide sufficient information to indicate the direction in which an object is moving. We can understand
why this is so by considering how a directionally selective
neuron would respond to movement of a vertically oriented
pole like the one being carried by the woman in Figure 8.10.
We are going to focus on the pole, which is essentially
a bar with an orientation of 90 degrees. The circle represents the area of the receptive field of a complex neuron
in the cortex that responds when a vertically oriented bar
moves to the right across the neuron’s receptive field. Figure
8.10 shows the pole entering the receptive field. As the pole
moves to the right, it moves across the receptive field in the
direction indicated by the red arrow, and the neuron fires.
Figure 8.10 ❚ The pole’s overall motion is horizontal to the
right (blue arrows). The circle represents the area in Maria’s
field of view that corresponds to the receptive field of a
cortical neuron. The pole’s motion across the receptive field
(which is located on Maria’s retina) is also horizontal to the
right (red arrows).
Figure 8.11 ❚ In this situation the pole’s overall motion is up
and to the right (blue arrows). The pole’s motion across the
receptive field, however, remains horizontal to the right (red
arrows), as in Figure 8.10. Thus, the receptive field “sees” the
same motion whether the overall motion is horizontal or up
and to the right.
But what happens if the woman climbs some steps?
Figure 8.11 shows that as she walks up the steps she and the
pole are now moving up and to the right (blue arrow). We
know this because we can see the woman and the flag moving up. But the neuron, which only sees movement through
the narrow view of its receptive field, only receives information about the rightward movement. You can appreciate
this by noting that movement of the pole across the receptive field appears the same when the pole is moving to the
right (red arrow) and when it is moving up and to the right
(blue arrow). You can demonstrate this for yourself by doing the following demonstration.
DEMONSTRATION
Motion of a Bar Across an Aperture
Make a small aperture, about 1 inch in diameter, by
creating a circle with the fingers of your left hand, as shown
in Figure 8.12 (or you can create a circle by cutting a hole in a
piece of paper). Then orient a pencil vertically, and move the
pencil from left to right behind the circle, as in Figure 8.12a.
As you do this, focus on the direction that the front edge of
the pencil appears to be moving across the aperture. Now,
again holding the pencil vertically, position the pencil below
the circle, as shown in Figure 8.12b, and move it up behind
the aperture at a 45-degree angle (being careful to keep its
orientation vertical). Again, notice the direction in which
the front edge of the pencil appears to be moving
across the aperture. ❚
VL 16, 17
Figure 8.12 ❚ Moving a pencil across an aperture. See text
for details.
If you were able to focus only on what was happening
inside the aperture, you probably noticed that the direction
that the front edge of the pencil was moving appeared the
same whether the pencil was moving horizontally to the
right or up and to the right. In both cases, the front edge of
the pencil moves across the aperture horizontally. Another
way to state this is that the movement of an edge across an
aperture occurs perpendicular to the direction in which the edge
is oriented. Because the pencil in our demonstration was
oriented vertically, motion through the aperture was horizontal. Because motion of the edge was the same in both
situations, a single directionally selective neuron would fire
similarly in both situations, so the activity of this neuron
would not provide accurate information about the direction of the pencil’s motion.
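The perpendicular-motion rule can be checked with a little vector arithmetic. The following Python sketch is only an illustration: the function name `motion_through_aperture`, the coordinate convention (x to the right, y up), and the unit speeds are assumptions made for this example, not part of the demonstration itself.

```python
import math


def motion_through_aperture(velocity, edge_orientation_deg):
    """Return the part of an edge's velocity that is visible through a small aperture.

    Only the component perpendicular to the edge can be seen; sliding along
    the edge's own length produces no visible change inside the aperture.
    """
    theta = math.radians(edge_orientation_deg)
    # Unit vector perpendicular to an edge oriented at theta (x right, y up).
    normal = (math.sin(theta), -math.cos(theta))
    # Project the true velocity onto that perpendicular direction.
    speed = velocity[0] * normal[0] + velocity[1] * normal[1]
    return (speed * normal[0], speed * normal[1])


# A vertical pencil (orientation 90 degrees) moving horizontally to the right ...
rightward = motion_through_aperture((1.0, 0.0), 90)
# ... and the same vertical pencil moving up and to the right.
up_and_right = motion_through_aperture((1.0, 1.0), 90)

print([round(v, 6) for v in rightward])
print([round(v, 6) for v in up_and_right])
```

Apart from floating-point rounding, both calls return the same rightward vector, which is the situation a single directionally selective neuron faces: the upward part of the pencil's motion never shows up inside the aperture.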
The fact that viewing only a small portion of a larger
stimulus can result in misleading information about the
direction in which the stimulus is moving is called the
aperture problem. The visual system appears to solve the
aperture problem by pooling the responses of a number of
neurons like our complex cell. One place this may occur
is the middle temporal (MT) cortex, an area in the dorsal (where or action) stream, which contains a large number
of directionally selective neurons and which we will see is
important for movement perception. Figure 8.13 shows the
location of MT cortex.
Figure 8.13 ❚ Human brain, showing the location of a
number of the structures we will be discussing in this chapter.
MT = middle temporal cortex (motion perception); V1 =
striate cortex (primary visual receiving area); STS = superior
temporal sulcus (biological motion); FFA = fusiform face area
(face perception; on the underside of the brain); EBA = extrastriate body area (perceiving
bodies).
Evidence that the MT may be involved in pooling the
responses from a number of neurons was provided by an
experiment by Christopher Pack and Richard Born (2001),
in which they determined how neurons in the monkey’s
MT cortex responded to moving oriented lines like the pole
or our pencil. They found that the MT neurons’ initial response to the stimulus, at about 70 ms after the stimulus
was presented, was determined by the orientation of the bar.
Thus the neuron responded in the same way to a vertical bar
moving horizontally to the right and a vertical bar moving
up and to the right (red arrows in Figure 8.12). However,
140 ms after presentation of the moving bars, the neurons
began responding to the actual direction in which the bars
were moving (blue arrows in Figure 8.12). Apparently, MT
neurons receive signals from a number of neurons in the
striate cortex and then combine these signals to determine
the actual direction of motion.
Can you think of another way a neuron might indicate
that the pole in Figure 8.11 is moving up and to the right?
One of my students tried the demonstration in Figure 8.12
and noticed that when he followed the directions for the
demonstration, the edge of the pencil did appear to be moving horizontally across the aperture, whether the pencil
was moving to the right or up and to the right. However,
when he moved the pencil so that he could see its tip moving through the aperture, as in Figure 8.14, he could tell
that the pencil was moving up. Thus, a neuron could use
information about the end of a moving object (such as the
tip of the pencil) to determine its direction of motion. As it
turns out, neurons that could signal this information, because they respond to the ends of moving objects, have been
found in the striate cortex (Pack et al., 2003).
Figure 8.14 ❚ The circle represents a neuron’s receptive
field. When the pencil is moved up and to the right, as shown,
movement of the tip of the pencil provides information
indicating that the pencil is moving up and to the right.
What all of this means is that the “simple” situation of
an object moving across the visual field as an observer looks
straight ahead is not so simple because of the aperture problem. The visual system apparently can solve this problem (1)
by using information from neurons in the MT cortex that
pool the responses of a number of directionally selective
neurons, and (2) by using information from neurons in the
striate cortex that respond to the movement of the ends of
objects (also see Rust et al., 2006; Smith et al., 2005; Zhang &
Britten, 2006).
Motion of Arrays of Dots on the Retina
The bar stimuli used in the research we have been describing are easy to detect. But what about stimuli that are more
difficult to detect? One tactic used in perception research is
to determine how the perceptual system responds to stimuli that we are just able to detect. We have described experiments such as this in Chapter 3 (measuring spectral sensitivity curves, dark adaptation, and visual acuity), Chapter 5
(determining how well people can detect briefly presented
stimuli), and Chapter 6 (determining how attention affects
the perception of contrast between bars in a grating). We
will now describe some experiments using a type of movement stimulus that makes it possible to vary how difficult it
is to determine the direction of motion.
Neural Firing and the Perception of Moving-Dot Stimuli
William Newsome and coworkers (1989)
used a computer to create moving-dot displays in which
the direction of motion of individual dots can be varied.
Figure 8.15a represents a display in which all of the dots are
moving in random directions. Newsome used the term coherence to indicate the degree to which the dots move in
the same direction. When the dots are all moving in random directions, much like the “snow” you see when your
TV set is tuned between channels, coherence is 0 percent.
Figure 8.15b represents a coherence of 50 percent, as indicated by the darkened dots, which means that at any point
in time half of the dots are moving in the same direction.
Figure 8.15c represents 100 percent coherence, which means
that all of the dots are moving in the same direction.
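To make the idea of coherence concrete, here is a minimal Python sketch that assigns a direction to each dot in a display. It is not the procedure Newsome and coworkers used; the function name `dot_directions`, the degree convention, and the use of a seeded random generator are all assumptions made for illustration.

```python
import random


def dot_directions(n_dots, coherence, signal_direction_deg=0.0, rng=None):
    """Assign a motion direction (in degrees) to each dot in a display.

    `coherence` is the fraction of dots (0.0 to 1.0) that share
    `signal_direction_deg`; every other dot gets a random direction.
    """
    rng = rng or random.Random()
    n_signal = round(n_dots * coherence)
    # Dots that move together in the common direction.
    directions = [signal_direction_deg] * n_signal
    # Remaining dots move in random directions.
    directions += [rng.uniform(0.0, 360.0) for _ in range(n_dots - n_signal)]
    # Mix the two groups so position in the list carries no information.
    rng.shuffle(directions)
    return directions


rng = random.Random(0)  # seeded only so the example is repeatable
for coherence in (0.0, 0.5, 1.0):
    dirs = dot_directions(100, coherence, signal_direction_deg=90.0, rng=rng)
    n_together = sum(d == 90.0 for d in dirs)
    print(f"coherence {coherence:.0%}: {n_together} of {len(dirs)} dots move together")
```

Lowering `coherence` toward zero makes the common direction harder to pick out of the random dots, which is what allows the difficulty of the direction judgment to be varied.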
Newsome and coworkers used these stimuli to determine the relationship between (1) a monkey’s ability
to judge the direction in which dots were moving and (2)
the response of a neuron in the monkey’s MT cortex. They
found that as the dots’ coherence increased, two things happened: (1) the monkey judged the direction of motion more
accurately, and (2) the MT neuron fired more rapidly. The
monkey’s behavior and the firing of the MT neurons were
so closely related that the researchers could predict one
from the other. For example, when the dots’ coherence was
0.8 percent, the monkey was not able to judge the direction
of the dots’ motion and the neuron’s response did not differ appreciably from its baseline firing rate. But at a coherence of 12.8 percent, the monkey judged the direction of the
dots that were moving together correctly on virtually every
trial, and the MT neuron always fired faster than its
baseline rate.
VL 18
These experiments are important because by simultaneously measuring the response of MT neurons and the monkey’s perception, Newsome directly measured the relationship between physiology and perception (relationship PH2
in the perceptual cycle in Figure 8.16). This is in contrast to
most of the experiments we have described in this book so
far, which have measured relationship PH1, the relationship
between stimuli and the physiological response. For example, remember Hubel and Wiesel’s (1959, 1965) experiments
from Chapter 4, which showed that moving bars cause neurons in the cortex to fire (see page 78). These experiments
provided important information about neurons in the cortex, but did not provide any direct information about the
connection between these neurons and perception.
The simultaneous measurement of neural firing and
perception is extremely difficult because before the recording experiments can begin, monkeys must be trained for
months to indicate the direction in which they perceive the
dots moving. Only after this extensive behavioral training
can the monkey’s perception and neural firing be measured
simultaneously. The payoff, however, is that relationship
PH2 is measured directly, instead of having to be inferred
from measurements of the relationship between stimuli
and perception (PP) and between stimuli and physiological
responding (PH1).
Effect of Lesioning and Microstimulation
Measuring perception and the firing of neurons in the
MT cortex simultaneously is one way of showing that the
MT cortex is important for motion perception. The role of
the MT cortex has also been studied by determining how
the perception of motion is affected by (1) lesioning
(destroying) some or all of the MT cortex or (2) electrically
stimulating neurons in the MT cortex.
Figure 8.15 ❚ Moving-dot displays used by
Newsome, Britten, and Movshon (1989): (a) no correlation,
coherence = 0; (b) 50% correlation, coherence = 50%;
(c) 100% correlation, coherence = 100%. These
pictures represent moving-dot displays that were
created by a computer. Each dot survives for a
brief interval (20–30 microseconds), after which it
disappears and is replaced by another randomly
placed dot. Coherence is the percentage of dots
moving in the same direction at any point in
time. (From Newsome, W. T., & Paré, E. B. (1988).
A selective impairment of motion perception
following lesions of the middle temporal visual
area (MT). Journal of Neuroscience, 8, 2201–2211.)
Figure 8.16 ❚ The perceptual cycle from Chapter 1.
Newsome measured relationship PH2 by simultaneously
recording from neurons and measuring the monkey’s
behavioral response. Other research we have discussed,
such as Hubel and Wiesel’s receptive field studies, has
measured relationship PH1. (Diagram labels: Stimuli;
Physiological processes; Experience and action; PP;
PH1, Hubel and Wiesel experiments; PH2, Newsome dot experiment.)
A monkey with an intact MT cortex can begin detecting the direction dots are moving when coherence is as low
as 1–2 percent. However, after the MT is lesioned, the coherence must be 10–20 percent before monkeys can begin
detecting the direction of motion (Newsome & Paré, 1988;
also see Movshon & Newsome, 1992; Newsome et al., 1995;
Pasternak & Merigan, 1994). This provides further evidence
linking the firing of MT neurons to the perception of the
direction of motion. Another way this link between MT
cortex and motion perception has been studied is by electrically stimulating neurons in the MT cortex using a technique called microstimulation.
METHOD ❚ Microstimulation
Microstimulation is achieved by lowering a small wire
electrode into the cortex and passing a weak electrical
charge through the tip of the electrode. This weak shock
stimulates neurons that are near the electrode tip and
causes them to fire, just as they would if they were being stimulated by neurotransmitter released from other
neurons.
Remember from Chapter 4 that neurons are organized
in orientation columns in the cortex, with neurons in the
same column responding best to one specific orientation
(page 85). Taking advantage of this fact, Movshon and Newsome (1992) used microstimulation to activate neurons in a
column that responded best to a particular direction of motion while a monkey was judging the direction of dots that
were moving in a different direction.
When they applied the stimulation, the monkey suddenly shifted its judgment toward the direction signaled by
the stimulated neurons. For example, when the monkey was
judging the motion of dots that were moving horizontally
to the right (Figure 8.17a) and a column of MT neurons that
preferred downward motion was stimulated, the monkey
began responding as though the dots were moving downward and to the right (Figure 8.17b). The fact that stimulating the MT neurons shifted the monkey’s perception of the
direction of movement provides more evidence linking MT
neurons and motion perception.
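One way to picture this result is to treat the direction signaled by the dots and the direction signaled by the stimulated column as two vectors that add. This is only a hypothetical sketch: the text does not say that the brain sums the two signals, and the function name `judged_direction_deg`, the `stim_weight` parameter, and the equal weighting are assumptions chosen for illustration.

```python
import math


def judged_direction_deg(visual_deg, stimulated_deg, stim_weight=1.0):
    """Direction of the vector sum of a visual signal and a stimulation signal.

    Angles are in degrees, measured counterclockwise from rightward (0),
    so downward is -90. `stim_weight` is a made-up scaling factor for the
    strength of the stimulation relative to the visual signal.
    """
    vx = math.cos(math.radians(visual_deg)) + stim_weight * math.cos(math.radians(stimulated_deg))
    vy = math.sin(math.radians(visual_deg)) + stim_weight * math.sin(math.radians(stimulated_deg))
    return math.degrees(math.atan2(vy, vx))


# Dots move rightward (0 degrees); the stimulated column prefers downward (-90 degrees).
no_stimulation = judged_direction_deg(0.0, -90.0, stim_weight=0.0)
with_stimulation = judged_direction_deg(0.0, -90.0, stim_weight=1.0)

print(round(no_stimulation, 1))
print(round(with_stimulation, 1))
```

With no stimulation the sketch returns the direction of the dots; with equally weighted stimulation it returns a direction halfway between rightward and downward, matching the "downward and to the right" judgment described above.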
Figure 8.17 ❚ (a) A monkey judges the motion of dots
moving horizontally to the right. (b) When a column of
neurons that prefer downward motion is stimulated, the
monkey judges the same motion as being downward and to
the right.
TEST YOURSELF 8.1
1. Describe four different functions of motion perception.
2. Describe four different situations that can result
in motion perception. Which of these situations
involves real motion, and which involve illusions of
motion?
3. What is the evidence that real motion and apparent
motion may involve similar mechanisms?
4. Describe the ecological approach to motion perception. What is the advantage of this approach? (Give
a specific example of how the ecological approach
can explain the situations in Figure 8.8b and c.)
5. Describe the aperture problem—why the response
of individual directionally selective neurons does not
provide sufficient information to indicate the direction of motion. Also describe two ways that the brain
might solve the aperture problem.
6. Describe the series of experiments that used moving
dots as stimuli and (a) recorded from neurons in the
MT cortex, (b) lesioned the MT cortex, and (c) stimulated neurons in the MT cortex. What do the results
of these experiments enable us to conclude about
the role of the MT cortex in motion perception?
Taking Eye Motions Into Account: The Corollary Discharge
Up until now we have been considering a situation like the
one in Figure 8.8a, in which a stationary person, keeping his
or her eyes still, watches a moving stimulus. But in real life
we often move our eyes to follow a moving stimulus, as in
Figure 8.8b. Remember that when Maria did this, she perceived Jeremy as moving even though his image remained
on the same place on her retina.
How does the perceptual system indicate that the stimulus is moving, even though there is no movement on the
retina? The answer, according to corollary discharge theory, is
that the perceptual system uses a signal called the corollary
discharge to take into account the fact that the observer’s eye
is moving (von Holst, 1954).
Corollary Discharge Theory
Imagine you are watching someone walk past by keeping
your head stationary but following the person with your
eyes, so the image of the moving person remains on the
same place on your retinas. Your eyes move because motor
signals (MS) are being sent from the motor area of your
brain to your eye muscles (Figure 8.18a). According to corollary discharge theory, another neural signal, called the
corollary discharge signal (CDS), splits off from the motor signal. The corollary discharge signal, which occurs
anytime a motor signal is sent to the eye muscles, indicates
that a signal has been sent from the brain to move the eye.
The corollary discharge signal reaches a hypothetical structure called the comparator, which relays information back
to the brain that the eye is moving (Figure 8.18b). Basically,
what corollary discharge theory says is that if there is no
movement of an image across the retina, but the comparator is receiving information indicating that the eye is
moving, then the observer perceives motion.
VL 19
The beauty of corollary discharge theory is that it can
also deal with the situation in which the observer’s eye remains stationary and a stimulus moves across the observer’s
field of view (Figure 8.19a). It does this by proposing that
the comparator not only receives the CDS, but also receives
the signal that occurs when an image moves across the
retina. This movement activates the retinal receptors and
sends a signal out the optic nerve that we will call the image
displacement signal (IDS) because it occurs when a stimulus is displaced across the retina.
According to corollary discharge theory, when the IDS
reaches the comparator, the comparator sends a signal to the
brain that results in the perception of motion (Figure 8.19b).
Corollary discharge theory is therefore a fairly simple idea,
which can be summarized by saying that the perception of
movement occurs if the comparator receives either (1) a signal that the eye is moving (CDS) or (2) a signal that an image is being displaced across the retina (IDS).
Figure 8.18 ❚ According to the corollary discharge model,
(a) when a motor signal (MS) is sent to the eye muscles, so
the eye can follow a moving object, a corollary discharge
signal (CDS) splits off from the motor signal. (b) When the
CDS reaches the comparator, it sends a signal to the brain
that the eye is moving, and motion is perceived.
But what happens if both a CDS and an IDS reach
the comparator simultaneously? This would occur if you
were to move your eyes to inspect a stationary scene, as in
Figure 8.19c. In this case, a CDS is generated because the
eye is moving, and an IDS is generated because images of
the scene are sweeping across the retina. According to corollary discharge theory, when both the CDS and IDS reach
the comparator simultaneously, no signal is sent to the
brain, so no motion is perceived. In other words, if an image is moving across the retina, but the CDS indicates that
this movement of the image is being caused by movements
of the eyes, then no motion is perceived.
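The comparator rule described so far can be written out as a small truth table. The Python sketch below is only an illustration: the function name `comparator_signals_motion` and the true/false encoding of the CDS and IDS are assumptions introduced here, and the case in which neither signal arrives is assumed to produce no perception of motion.

```python
def comparator_signals_motion(cds: bool, ids: bool) -> bool:
    """Return True when the comparator would signal motion to the brain.

    cds: a corollary discharge signal reaches the comparator
    ids: an image displacement signal reaches the comparator
    """
    # Motion is signaled when exactly one signal arrives; when both
    # arrive together, no signal is sent to the brain.
    return cds != ids


# (cds, ids) for each situation; the last case is an added assumption.
cases = {
    "eyes follow a moving person": (True, False),
    "eyes stationary, person walks past": (False, True),
    "eyes scan a stationary scene": (True, True),
    "eyes stationary, scene stationary (assumed)": (False, False),
}

for label, (cds, ids) in cases.items():
    print(f"{label}: motion perceived = {comparator_signals_motion(cds, ids)}")
```

Treating each signal as simply present or absent is a simplification; it leaves out the speed and direction of the eye movement and of the image.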
Upon hearing this explanation, students often wonder where the comparator is located. The answer is that the
comparator is most likely not located in one specific place in
the brain, but may involve a number of different structures.
Similarly, the corollary discharge signal probably originates
from a number of different places in the brain (Sommer &
Crapse, in press; Sommer & Wurtz, 2008). The important
thing, for our purposes, is that corollary discharge theory
proposes that the visual system takes into account information about both stimulation of the receptors and movement
of the eye to determine our perception of motion. And although we can’t pinpoint exactly where the CDS and comparator are located, there is both behavioral and physiological
evidence that supports the theory.
Figure 8.19 ❚ (a) When a stationary observer watches a
moving object, movement of the image across the retina
creates an image displacement signal (IDS). (b) When the
IDS reaches the comparator, it sends a signal to the brain,
and motion is perceived. (c) If both a CDS and IDS reach
the comparator simultaneously, as would occur if a person
is scanning a stationary scene, then no signal is sent to the
brain, and no motion is perceived.
Behavioral Demonstrations of
Corollary Discharge Theory
Here are two demonstrations that enable you to create situations in which motion perception occurs even though there
is no motion across the retina.
DEMONSTRATION
Eliminating the Image Displacement Signal
With an Afterimage
Illuminate the circle in Figure 8.20 with your desk lamp and
look at it for about 60 seconds. Then go into your closet (or
a completely dark room) and observe what happens to the
circle’s afterimage (blink to make it come back if it fades) as
you look around. Notice that the afterimage moves in synchrony with your eye motions (Figure 8.21). ❚
Figure 8.20 ❚ Afterimage stimulus.
Figure 8.21 ❚ When the eye moves in the dark, the image
remains stationary (the bleached area on the retina), but a
corollary discharge signal is sent to the comparator, so the
afterimage appears to move.
Why does the afterimage appear to move when you
move your eyes? The answer cannot be that an image is moving across your retina because the circle’s image always remains at the same place on the retina. (The circle’s image
on the retina has created a circular area of bleached visual
pigment, which remains in the same place no matter where
the eye is looking.) Without motion of the stimulus across
the retina, there is no image displacement signal. However,
a corollary discharge signal accompanies the motor signals
sent to your eye muscles as you move your eyes, as in Figure
8.18a. Thus, only the corollary discharge signal reaches the
comparator, and you see the afterimage move.
DEMONSTRATION
Seeing Motion by Pushing on Your Eyelid
Pick a point in the environment and keep looking at it while
very gently pushing back and forth on the side of your eyelid,
as shown in Figure 8.22. As you do this, you will see the
scene move. ❚
Figure 8.22 ❚ Why is this woman smiling? Because when
she pushes on her eyelid while keeping her eye fixed on one
place, she sees the world jiggle. (Bruce Goldstein)
Why do you see motion when you push on your eyeball?
Lawrence Stark and Bruce Bridgeman (1983) did an experiment in which they instructed observers to keep looking at
a particular point while pushing on their eyelid. Because
the observers were paying strict attention to the instructions (“Keep looking at that point!”), the push on their eyelid
didn’t cause their eyes to move. This lack of movement occurred because the observer’s eye muscles were pushing back
against the force of the finger to keep the eye in place. According to corollary discharge theory, the motor signal sent
to the eye muscles to hold the eye in place created a corollary
discharge signal, which reached the comparator alone, as in
Figure 8.18b, so Stark and Bridgeman’s observers saw the
scene move (also see Bridgeman & Stark, 1991; Ilg, Bridgeman, & Hoffmann, 1989). (See “Think About It” #3 on page
196 for a question related to this explanation.)
These demonstrations support the central idea proposed by corollary discharge theory that there is a signal
(the corollary discharge) that indicates when the observer
moves, or tries to move, his or her eyes. (Also see “If You
Want to Know More” #5, at the end of the chapter, for another demonstration.) When the theory was first proposed,
there was little physiological evidence to support it, but now
there is a great deal of physiological evidence for the theory.
Physiological Evidence for Corollary
Discharge Theory
In both of our demonstrations, there was a corollary discharge signal but no image displacement signal. What
would happen if there was no corollary discharge but there
was an image displacement signal? That is apparently what
happened to R.W., a 35-year-old male who experienced vertigo (dizziness) anytime he moved his eyes or experienced
motion when he looked out the window of a moving car.
A brain scan revealed that R.W. had lesions in an area of
his cortex called the medial superior temporal area (MST),
which is just above the MT cortex (Figure 8.13). Behavioral
testing of R.W. also revealed that as he moved his eyes, the
stationary environment appeared to move with a velocity
that matched the velocity with which he was moving his eyes
(Haarmeier et al., 1997). Thus, when he moved his eyes, there
was an IDS, because images were moving across his retina,
but the damage to his brain had apparently eliminated the
CDS. Because only the IDS reached the comparator, R.W.
saw motion when there actually was none.
Other physiological evidence for the theory comes from
experiments that involve recording from neurons in the
monkey’s cortex. Figure 8.23 shows the response recorded
from a motion-sensitive neuron in the monkey’s extrastriate cortex. This neuron responds strongly when the monkey looks steadily at the fixation point (FP) as a moving
bar sweeps across the cell’s receptive field (Figure 8.23a),
but does not respond when the monkey follows a moving
fixation point with its eyes and the bar remains stationary
(Figure 8.23b; Galletti & Fattori, 2003).
Figure 8.23 ❚ Responses of a real-motion neuron in the extrastriate cortex of a
monkey. In both cases, a bar sweeps across
the neuron’s receptive field. (a) The neuron
fires when the bar moves to the left across
the retina. (b) The neuron doesn’t fire when
the eye moves to the right past the bar.
(Adapted from Galletti, C., & Fattori, P.
(2003). Neuronal mechanisms for
detection of motion in the field of view.
Neuropsychologia, 41, 1717–1727.)
This neuron is called a real-motion neuron because
it responds only when the stimulus moves and doesn’t respond when the eye moves, even though the stimulus on the
retina (a bar sweeping across the cell’s receptive field) is the
same in both situations. This real-motion neuron must be
receiving information like the corollary discharge, which
tells the neuron when the eye is moving. Real-motion neurons have also been observed in many other areas of the
cortex (Battaglini, Galletti, & Fattori, 1996; Robinson &
Wurtz, 1976), and more recent research has begun to determine where the corollary discharge is acting in the brain
(Sommer & Wurtz, 2006; Wang et al., 2007).
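The behavioral demonstrations and the case of R.W. can all be checked against the same comparator rule. The Python sketch below is a simplified illustration, not a physiological model: the `Situation` fields, the function `perceives_motion`, and the `cds_intact` flag (used to mimic the loss of the CDS proposed for R.W.) are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    motor_command: bool      # a motor signal (MS) is sent to the eye muscles
    image_slips: bool        # the image moves across the retina, producing an IDS
    cds_intact: bool = True  # assumption: False mimics the lost CDS proposed for R.W.


def perceives_motion(s: Situation) -> bool:
    """Comparator rule: motion is signaled when exactly one of CDS or IDS arrives."""
    cds = s.motor_command and s.cds_intact  # the CDS splits off from the motor signal
    ids = s.image_slips
    return cds != ids


cases = {
    "afterimage viewed in the dark": Situation(motor_command=True, image_slips=False),
    "pushing on the eyelid while fixating": Situation(motor_command=True, image_slips=False),
    "R.W. scanning a stationary scene": Situation(True, True, cds_intact=False),
    "typical observer scanning a stationary scene": Situation(True, True),
}

for label, situation in cases.items():
    print(f"{label}: motion perceived = {perceives_motion(situation)}")
```

In this sketch the first three situations yield perceived motion for different reasons: the first two have a CDS without an IDS, and the third has an IDS without a CDS.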
Perceiving Biological Motion
One of the most common and important types of motion
we perceive is the movement of people. We watch other people’s movements not only to see where they are going but
also to determine their intentions, what they are doing, and
perhaps also their moods and feelings.
Although information about people’s actions, intentions, and moods can be determined from many types
of cues, including facial expressions and what they are saying, this information can also be obtained based solely
on motion information (Puce & Perrett, 2003). This was
demonstrated by Gunnar Johansson (1973, 1975), who
created point-light walker stimuli by placing small lights
on people’s joints and then filming the patterns created
by these lights when people worked and carried out other
actions in the dark (Figure 8.24). When the person wearing the lights is stationary, the lights look like a meaningless pattern. However, as soon as the person starts
walking, with arms and legs swinging back and forth
and feet moving in flattened arcs, first one leaving the
ground and touching down, and then the other, the lights
are immediately perceived as being caused by a walking person. This motion of a person or other
living organism is called biological motion. (VL 20, 21)
Figure 8.24 ❚ A person wearing lights for a biological
motion experiment. In the actual experiment, the room is
totally dark, and only the lights can be seen.
Brain Activation by Point-Light
Walkers
The perception of the point-light walker stimulus as a person walking is an example of how movement can
create perceptual organization, because the movement
transforms dots that appear unrelated into a pattern that
is almost immediately seen as a meaningful figure. One reason we are particularly good at perceptually organizing the
complex motion of an array of moving dots into the perception of a walking person is that we see biological motion all
the time. Every time you see a person walking, running, or
behaving in any way that involves movement, you are seeing
biological motion. Our ability to easily organize biological
motions into meaningful perceptions led some researchers to suspect that there may be an area in the brain that
responds to biological motion, just as there are areas such
as the extrastriate body area (EBA) and fusiform face area
(FFA) that are specialized to respond to bodies and faces,
respectively (Figure 8.13).
Emily Grossman and Randolph Blake (2001) provided
evidence supporting the idea of a specialized area in the
brain for biological motion by measuring observers’ brain activity as they viewed the moving dots created by a point-light
walker (Figure 8.25a) and as they viewed dots that moved
similarly to the point-light walker dots, but were scrambled
so they did not result in the impression of a person walking
(Figure 8.25b). They found that activity in a small area in the
superior temporal sulcus (STS; see Figure 8.13) was greater
for biological motion than for scrambled motion in all eight
of their observers. In another experiment, Grossman and
Blake (2002) showed that other regions, such as the FFA,
were activated more by biological motion than by scrambled
motion, but that activity in the EBA did not distinguish between biological and scrambled motion. Based on these results, they concluded that there is a network of areas, which
includes the STS and FFA, that are specialized for the perception of biological motion (also see Pelphrey et al., 2003).
Figure 8.25 ❚ Frames from the stimuli used by Grossman
and Blake (2001). (a) Sequence from the point-light walker
stimulus. (b) Sequence from the scrambled point-light
stimulus.
Linking Brain Activity and the
Perception of Biological Motion
One of the principles we have discussed in this book is that
just showing that a structure responds to a specific type of
stimulus does not prove that the structure is involved in
perceiving that stimulus. Earlier in the chapter we described
how Newsome used a number of different methods to show
that MT cortex is specialized for the perception of motion.
In addition to showing that MT cortex is activated by motion, he also showed that perception of motion is decreased
by lesioning MT cortex and is influenced by stimulating
neurons in MT cortex. Directly linking brain processes and
perception enabled Newsome to conclude that the MT cortex is important for the perception of motion.
Just as Newsome showed that disrupting operation of
the MT cortex decreases a monkey’s ability to perceive the
direction of moving dots, Emily Grossman and coworkers
(2005) showed that disrupting operation of the STS in humans decreases the ability to perceive biological motion.
Newsome disrupted operation of the monkey’s MT cortex
by lesioning that structure. Because Grossman’s experiments were on humans, she used a more gentle and temporary method of disrupting brain activity—a procedure
called transcranial magnetic stimulation.
METHOD ❚ Transcranial Magnetic
Stimulation (TMS)
One way to investigate whether an area of the brain is involved in determining a particular function is to remove
that part of the brain in animals or study cases of brain
damage in humans. Of course, we cannot purposely remove a portion of a person’s brain, but it is possible to
temporarily disrupt the functioning of a particular area
by applying a pulsating magnetic field using a stimulating coil placed over the person’s skull. A series of pulses
presented to a particular area of the brain for a few seconds decreases or eliminates brain functioning in that
area for seconds or minutes. A participant’s behavior is
tested while the brain area is deactivated. If the behavior
is disrupted, researchers conclude that the deactivated
area of the brain is causing that behavior.
Figure 8.26 ❚ (a) Biological motion stimulus; (b) scrambled
stimulus; (c) stimulus from a, with “noise” added (dots
corresponding to the walker are indicated by lines, which
were not seen by the observer); (d) how the stimulus
appears to the observer. (From Grossman, E. D., Battelli, L., &
Pascual-Leone, A. (2005). Repetitive TMS over posterior STS
disrupts perception of biological motion. Vision Research, 45,
2847–2853.)
The observers in Grossman’s (2005) experiment viewed
point-light stimuli for activities such as walking, kicking, and throwing (Figure 8.26a), and they also viewed
scrambled point-light displays (Figure 8.26b). Their task
was to determine whether a display was biological motion
or scrambled motion. This is normally an extremely easy
task, but Grossman made it more difficult by adding extra
dots to create “noise” (Figure 8.26c and d). The amount of
noise was adjusted for each observer so that they could distinguish between biological and scrambled motion with 71
percent accuracy.
The key result of this experiment was that presenting
TMS to the area of the STS that is activated by biological motion caused a significant decrease in the observers’
ability to perceive biological motion. TMS stimulation of
other motion-sensitive areas, such as the MT cortex, had no
effect on the perception of biological motion. From this result, Grossman concluded that normal functioning of the
“biological motion” area, STS, is necessary for perceiving biological motion. This conclusion is also supported by studies that have shown that people who have suffered damage to this area have trouble perceiving biological motion
(Battelli et al., 2003). What all of this means is that biological motion is more than just “motion”—it is a special type of
motion that is served by specialized areas of the brain.
Image not available due to copyright restrictions
Something to Consider: Going
Beyond the Stimulus
We have seen that the brain responds to a number of different types of stimuli, including moving bars, moving dots,
and moving people. But is our perception of motion determined solely by automatic responding to different types
of stimuli? There is evidence that the meaning of a stimulus and the knowledge people have gained from their past
experiences in perceiving motion can influence both the
perception of motion and the activity of the brain. One example of how meaning and knowledge influence perception
and brain activity is provided by a phenomenon called
implied motion.
Implied Motion
Look at the picture in Figure 8.27. Most people perceive this
picture as a “freeze frame” of an action—dancing—that involves motion. It is not hard to imagine the person’s dress
and feet moving to a different position in the moments following the situation depicted in this picture. A situation
such as this, in which a still picture depicts a situation involving motion, is called implied motion.
Jennifer Freyd (1983) did an experiment involving implied motion pictures by briefly showing observers pictures
that depicted a situation involving motion, such as a person
jumping off of a low wall. After a pause, she showed her observers either (1) the same picture; (2) a picture slightly forward in time (the person who had jumped off the wall was
closer to the ground); or (3) a picture slightly backward in
time (the person was further from the ground). The observers’ task was to indicate, as quickly as possible, whether the
second picture was the same as or different from the first
picture.
Freyd predicted that her observers would “unfreeze” the
implied motion depicted in the picture, and therefore anticipate the motion that was going to occur in a scene. If this
occurred, observers might “remember” a picture as depicting a situation that occurred slightly later in time. For the
picture of the person jumping off the wall, that would mean
the observers might remember the person as being closer to
the ground than he was in the initial picture. Freyd’s results
confirmed this prediction, because observers took longer
to decide whether the “time-forward” picture was different
from the original picture.
The idea that the motion depicted in a picture tends to
continue in the observers’ mind is called representational
momentum (David & Senior, 2000; Freyd, 1983). Representational momentum is an example of experience influencing
perception because it depends on our knowledge of the way
situations involving motion typically unfold.
Catherine Reed and Norman Vinson (1996) studied
the effect of experience on representational momentum by
presenting a sequence of pictures, as in Figure 8.28. Each
picture was seen as a still picture because the sequence was
presented slowly enough so that no apparent motion occurred. Thus, any motion that did occur was implied by the
positions and meanings of the objects in the pictures. After
the third picture, which was called the memory picture, the
observer saw the test picture. The test picture could appear in
the same position as the memory picture or slightly lower
or slightly higher. The observer’s task was to indicate as
quickly as possible whether the test picture was in the same
position as the memory picture.
Reed and Vinson wanted to determine whether the
meaning of a picture had any effect on representational
momentum, so they used pictures with different meanings.
Figure 8.28a shows rocket pictures, and Figure 8.28b shows
weight pictures. They found that the rocket pictures showed a
greater representational momentum effect than the weight
pictures. That is, observers were more likely to say that
the test picture of the rocket that appeared in a position
higher than the memory picture was in the same position as
the memory picture. Reed and Vinson therefore concluded
that the representational momentum effect is affected by
a person’s expectations about the motion of an object and
that learned properties of objects (that rockets go up, for example) contribute to these expectations (Vinson & Reed,
2002).
Figure 8.28 ❚ Stimuli used by Reed and Vinson (1996) to
demonstrate the effect of experience on representational
momentum. In this example, the test pictures are lower than
the memory picture. On other trials, the rocket or weight
would appear in the same position as or higher than the
memory picture.
If implied motion causes an object to continue moving in a person’s mind, then it would seem reasonable that
this continued motion would be reflected by activity in the
brain. When Zoe Kourtzi and Nancy Kanwisher (2000)
measured the fMRI response in areas MT and MST to pictures like the ones in Figure 8.29, they found that the area
of the brain that responds to actual motions also responds
to pictures of motion and that implied-motion pictures (IM)
caused a greater response than non-implied-motion pictures
(No-IM), rest pictures (R), or house pictures (H). Thus, activity occurs in the brain that corresponds to the continued
motion that implied-motion pictures create in a person’s
mind (also see Lorteije et al., 2006; Senior et al., 2000).
Apparent Motion
The effect of a person’s past experience on motion perception
has also been determined using apparent motion displays.
Remember that apparent motion occurs when one stimulus is flashed, followed by another stimulus at a slightly different position (see Figure 8.5). When V. S. Ramachandran
and Stuart Anstis (1986) flashed the two dots on the left in
Figure 8.30a followed by the single dot on the right, their
observers saw the top dot move horizontally to the right
and the bottom one move diagonally, so both dots appeared
to move to the dot on the right (Figure 8.30b). But adding a
square, as in Figure 8.30c, caused a change in this perception. Now observers perceived both dots as moving horizontally to the right, with the bottom dot sliding behind the
square. According to Ramachandran and Anstis, this perception occurs because of our past experience in
seeing objects disappear behind other objects. (VL 22–25)
Figure 8.29 ❚ Examples of pictures used by Kourtzi and
Kanwisher (2000) to depict implied motion (IM), no implied
motion (no-IM), at rest (R), and a house (H). The height of the
bar below each picture indicates the average fMRI response
of the MT cortex to that type of picture. (From Kourtzi, Z., &
Kanwisher, N., Activation in human MT/MST by static images
with implied motion, Journal of Cognitive Neuroscience, 12, 1,
January 2000, 48–55. © 2000 by Massachusetts Institute of
Technology. All rights reserved. Reproduced by permission.)
Figure 8.30 ❚ Stimuli from the Ramachandran and Anstis
(1986) experiment. (a) The initial stimulus condition. (b) Both
dots move to the position of the dot on the right. (c) Placing
a square in the position shown changes the perception of the
movement of the lower dot, which now moves to the right and
under the square.
TEST YOURSELF 8.2
1. Describe the corollary discharge model. In your description, indicate (1) what the model is designed to
explain; (2) the three types of signals: motor signal,
corollary discharge signal, and image displacement
signal; and (3) when these signals cause motion perception when reaching the comparator, and when
they do not cause motion perception when reaching
the comparator.
2. What is biological motion, and how has it been studied using point-light displays?
3. Describe the experiments that have shown that an
area in the STS is specialized for perceiving biological motion.
4. What is implied motion, and what does it tell us
about the role of experience in perceiving motion?
Describe Ramachandran and Anstis’ apparent
motion experiment.
THINK ABOUT IT
1. We perceive real motion when we see things that are
physically moving, such as cars on the road and people
on the sidewalk. But we also see motion on TV, in movies, on our computer screens, and in electronic displays
such as those in Las Vegas or Times Square. How are
images presented in these situations in order to result
in the perception of motion? (This may require some research.) (p. 180)
2. In this chapter, we described a number of principles
that also hold for object perception (Chapter 5). Find
examples from Chapter 5 of the following (page numbers are for this chapter):
• There are neurons that are specialized to respond to
specific stimuli. (p. 187)
• More complex stimuli are processed in higher areas
of the cortex. (p. 186)
• Top-down processing and experience affect perception. (p. 194)
• There are parallels between physiology and perception. (pp. 187, 193)
3. Stark and Bridgeman explained the perception of
movement that occurs when pushing gently on the
eyelid by a corollary discharge generated when muscles
are pushing back to counteract the push on the side
of the eye. What if the push on the eyelid causes the
eye to move, and the person sees the scene move? How
would perception of the scene’s movement in this
situation be explained by corollary discharge theory?
(p. 191)
4. In the “Something to Consider” section, we stated that
the representational momentum effect shows how
knowledge can affect perception. Why could we also say
that representational momentum illustrates an interaction between perception and memory? (p. 194)
IF YOU WANT TO KNOW MORE
1. Perceiving events. People are able to segment the ongoing stream of behavior into individual events, such
as when the salesperson in the mall first was sorting
clothes and then moved to check people out at the
cash register. New research has shown that motion is
central to perceiving different events in our environment. (p. 178)
Zacks, J. M. (2004). Using movement and intentions
to understand simple events. Cognitive Science, 28,
979–1008.
Zacks, J. M., & Swallow, K. M. (2007). Event segmentation. Current Directions in Psychological Science,
16, 80–84.
2. Effect of early experience on motion perception. When kittens are raised in an environment that is illuminated
by flashing lights, they lose the ability to detect the
direction of moving stimuli. Experience in perceiving
motion is necessary in order for motion perception to
develop.
Pasternak, T. (1990). Vision following loss of cortical directional selectivity. In M. A. Berkley & W. C.
Stebbins (Eds.), Comparative perception (Vol. 1, pp.
407–428). New York: Wiley.
3. Motion aftereffects and the brain. After viewing a waterfall, a rotating spiral, or moving stripes, an illusion
of motion called a motion aftereffect occurs. These
effects have been linked to activity in the brain.
(p. 181)
Anstis, S. M., Verstraten, F. A. J., & Mather, G.
(1998). The motion aftereffect: A review. Trends in
Cognitive Sciences, 2, 111–117.
4. New research on the corollary discharge signal. When neurons in an area in the monkey’s thalamus are deactivated by a chemical injection, the monkeys have trouble locating objects after moving their eyes because of
a disruption in the corollary discharge that signals
when the eyes are moving. (p. 191)
Sommer, M., & Wurtz, R. H. (2006). Influence of the
thalamus on spatial visual processing in frontal
cortex. Nature, 444, 374–376.
5. Eliminating the image movement signal by paralysis.
Experiments have been done in which a person has
been temporarily paralyzed by a drug injection. When
the person tries to move his or her eyes, a motor signal
(MS) and corollary discharge signal (CDS) are sent
from the brain, but no image displacement signal
(IDS) occurs because the person can’t actually move
the eyes. Corollary discharge theory predicts that the
person should see the environment move, which is
what happens. (p. 191)
Matin, L., Picoulet, E., Stevens, J., Edwards, M., &
McArthur, R. (1982). Oculoparalytic illusion:
Visual-field dependent spatial mislocations by
humans partially paralyzed by curare. Science, 216,
198–201.
6. Cats perceive biological motion. The perception of biological motion is not restricted to humans. There is
evidence that cats can perceive it as well. (p. 192)
Blake, R. (1993). Cats perceive biological motion.
Psychological Science, 4, 54–57.
7. Motions of face and body as social signals. Motion of faces
and bodies provides information that can be used to
decode complex social signals. Neurons in the superior temporal sulcus (STS) play a role in perceiving
this motion. (p. 193)
Puce, A., & Perrett, D. (2003). Electrophysiology
and brain imaging of biological motion. Philosophical Transactions of the Royal Society of London, 358,
435–445.
KEY TERMS
Aperture problem (p. 186)
Apparent motion (p. 180)
Attentional capture (p. 179)
Biological motion (p. 192)
Coherence (p. 187)
Comparator (p. 189)
Corollary discharge signal (CDS)
(p. 189)
Global optic flow (p. 184)
Illusory motion (p. 180)
Image displacement signal (IDS)
(p. 189)
Implied motion (p. 194)
Induced motion (p. 181)
Local disturbance in the optic array
(p. 183)
Microstimulation (p. 188)
Motion aftereffect (p. 181)
The following lab exercises are related to the material
in this chapter:
MEDIA RESOURCES
The Sensation and Perception
Book Companion Website
Motion Providing Organization: The Hidden Bird How
movement can cause an image to stand out from a complex
background. (Courtesy of Michael Bach.)
2. Perceptual Organization: The Dalmatian Dog How a
black-and-white pattern can be perceived as a Dalmatian.
(Courtesy of Michael Bach.)
3. Motion Parallax and Object Form How the image of
an object changes when it is viewed from different
angles.
4. Shape From Movement How movement of some dots in a
field of dots can create perception of an object.
5. Form and Motion How moving dot patterns can create
the perception of three-dimensional forms. Click on
“parameters” to set up this demonstration.
6. Motion Reference How the presence of two moving
“reference” dots can influence the perceived movement of
another dot that is moving between them.
7. Motion Binding Like the Motion Reference demonstration, this illustrates how adding an object to a display can
influence how we perceive motion. (Courtesy of Michael
Bach.)
8. The Phi Phenomenon, Space, and Time How the perception of apparent motion created by flashing two spheres
depends on the distance and time interval between the
spheres.
1.
www.cengage.com/psychology/goldstein
See the companion website for flashcards, practice quiz
questions, Internet links, updates, critical thinking
exercises, discussion forums, games, and more!
CengageNOW
www.cengage.com/cengagenow
Go to this site for the link to CengageNOW, your one-stop
shop. Take a pre-test for this chapter, and CengageNOW
will generate a personalized study plan based on your test
results. The study plan will identify the topics you need to
review and direct you to online resources to help you master those topics. You can then take a post-test to help you
determine the concepts you have mastered and what you
will still need to work on.
Virtual Lab
Motion agnosia (p. 179)
Motor signal (MS) (p. 189)
Optic array (p. 183)
Point-light walker (p. 192)
Real motion (p. 180)
Real-motion neuron (p. 192)
Representational momentum (p. 194)
Waterfall illusion (p. 181)
VL
Your Virtual Lab is designed to help you get the most out
of this course. The Virtual Lab icons direct you to specific
media demonstrations and experiments designed to help
you visualize what you are reading about. The number
beside each icon indicates the number of the media element
you can access through your CD-ROM, CengageNOW, or
WebTutor resource.
9. Illusory Contour Motion How alternating two displays
that contain illusory contours can result in perception of a
moving contour.
10. Apparent Movement and Figural Selection How movement is perceived when vertical and horizontal rectangles
are flashed in different positions.
11. Motion Capture How dots on a surface are “captured”
by apparent movement of that surface.
12. Induced Movement How the perception of movement
can be influenced by movement of the background.
13. Waterfall Illusion How viewing a moving horizontal
grating can cause an aftereffect of motion.
14. Spiral Motion Aftereffect How viewing a rotating spiral
can cause an aftereffect of motion that is opposite to the
direction of rotation.
15. Flow From Walking Down a Hallway Global optical flow.
(Courtesy of William Warren.)
16. Aperture Problem (Wenderoth) A demonstration of
why viewing movement through an aperture poses a problem for motion perception. (Courtesy of Peter Wenderoth.)
17. Barberpole Illusion (Wenderoth) A version of the aperture problem with an elongated aperture. (Courtesy of
Peter Wenderoth.)
18. Cortical Activation by Motion Video showing how motion activates areas outside the primary visual receiving
area. (Courtesy of Geoffrey Boynton.)
19. Corollary Discharge Model How the corollary discharge
model operates for movement of objects and movement of
the observer.
20. Biological Motion 1 Shows how biological motion
stimuli for a human walker change when gender, weight,
and mood are varied. (Courtesy of Nikolaus Troje.)
21. Biological Motion 2 Illustrates biological motion
stimuli for humans, cats, and pigeons and what happens
when these stimuli are inverted, scrambled, and masked.
(Courtesy of Nikolaus Troje.)
22. Motion and Introduced Occlusion How placing your
finger over an apparent movement display can influence the
perception of an object’s motion.
23. Field Effects and Apparent Movement How introducing
an occluder in an apparent-movement display can influence
the perception of an object’s motion.
24. Line-Motion Effect An illusion of motion that is
created by directing attention to one location and then
flashing a line. (Courtesy of Peter Wenderoth.)
25. Context and Apparent Speed How the perceived speed of
a bouncing ball changes when it is near a border.
CHAPTER 9
Perceiving Color
Chapter Contents
INTRODUCTION TO COLOR
What Are Some Functions of Color Vision?
What Colors Do We Perceive?
Color and Wavelength
Wavelengths Do Not Have Color!
TRICHROMATIC THEORY OF COLOR VISION
Behavioral Evidence for the Theory
The Theory: Vision Is Trichromatic
Physiology of Trichromatic Vision
❚ TEST YOURSELF 9.1
COLOR DEFICIENCY
Monochromatism
Dichromatism
Physiological Mechanisms of Receptor-Based Color Deficiency
OPPONENT-PROCESS THEORY OF
COLOR VISION
Behavioral Evidence for the Theory
DEMONSTRATION: The Colors of the Flag
DEMONSTRATION: Afterimages and
Simultaneous Contrast
DEMONSTRATION: Visualizing Colors
The Theory: Vision Is an Opponent Process
The Physiology of Opponent-Process Vision
COLOR IN THE CORTEX
❚ TEST YOURSELF 9.2
PERCEIVING COLORS UNDER
CHANGING ILLUMINATION
DEMONSTRATION: Color Perception
Under Changing Illumination
Chromatic Adaptation
DEMONSTRATION: Adapting to Red
The Effect of the Surroundings
DEMONSTRATION: Color and the
Surroundings
Memory and Color
LIGHTNESS CONSTANCY
Intensity Relationships: The Ratio Principle
Lightness Perception Under Uneven
Illumination
DEMONSTRATION: The Penumbra and
Lightness Perception
DEMONSTRATION: Perceiving Lightness at
a Corner
Something to Consider: Experiences
That Are Created by the Nervous
System
❚ TEST YOURSELF 9.3
Think About It
If You Want to Know More
Key Terms
Media Resources
OPPOSITE PAGE The multicolored facades of buildings in the La Placita
Village in downtown Tucson, Arizona, which houses the Chamber of
Commerce and corporate offices. (Bruce Goldstein)
VL VIRTUAL LABS
The Virtual Lab icons direct you to specific animations and videos
designed to help you visualize what you are reading about. The number beside
each icon indicates the number of the clip you can access through your
CD-ROM or your student website.
Some Questions We Will Consider:
❚ What does someone who is “color-blind” see? (p. 211)
❚ Why do we perceive blue dots when a yellow flash bulb
goes off? (p. 214)
❚ What colors does a honeybee perceive? (p. 224)
Color is one of the most obvious and pervasive qualities in
our environment. We interact with it every time we note
the color of a traffic light, choose clothes that are color coordinated, or appreciate the colors of a painting. We pick favorite
colors (blue is the most favored; Terwogt & Hoeksma, 1994),
we associate colors with emotions (we turn purple with rage,
red with embarrassment, green with envy, and feel blue; Terwogt & Hoeksma, 1994; Valdez & Mehrabian, 1994), and we
imbue colors with special meanings (for example, red signifies danger; purple, royalty; green, ecology). But for all of our
involvement with color, we sometimes take it for granted,
and—just as with our other perceptual abilities—we may not
fully appreciate color unless we lose our ability to experience
it. The depth of this loss is illustrated by the case of Mr. I, a
painter who became color-blind at the age of 65 after suffering a concussion in an automobile accident.
In March of 1986, the neurologist Oliver Sacks¹
received an anguished letter from Mr. I, who,
identifying himself as a “rather successful
artist,” described how ever since he had been
involved in an automobile accident, he had lost
his ability to experience colors, and he exclaimed
with some anguish, that “My dog is gray. Tomato
juice is black. Color TV is a hodge-podge. . . .”
In the days following his accident, Mr. I
became more and more depressed. His studio,
normally awash with the brilliant colors of his
abstract paintings, appeared drab to him, and
his paintings, meaningless. Food, now gray,
became difficult for him to look at while eating;
and sunsets, once seen as rays of red, had become
streaks of black against the sky.
Mr. I’s color blindness was caused by cortical injury after a lifetime of experiencing color, whereas most cases of
total color blindness or of color deficiency (partial color
blindness, which we’ll discuss in more detail later in this
chapter) occur at birth because of the genetic absence of
one or more types of cone receptors. Most people who are
born color-blind are not disturbed by their lack of color
perception, because they have never experienced color.
However, some of their reports, such as the darkening of
reds, are similar to Mr. I’s. People with total color blindness often echo Mr. I’s complaint that it is sometimes difficult to distinguish one object from another, as when his
brown dog, which he could easily see silhouetted against a light-colored road, became very
difficult to perceive when seen against irregular foliage.
¹ Dr. Sacks, well known for his elegant writings describing interesting
neurological cases, came to public attention when he was played by Robin
Williams in the 1990 film Awakenings.
Eventually, Mr. I overcame his strong psychological
reaction and began creating striking black-and-white pictures. But his account of his color-blind experiences provides an impressive testament to the central place of color in
our everyday lives. (See Heywood et al., 1991; Nordby, 1990;
Young et al., 1980; and Zeki, 1990, for additional descriptions of cases of complete color blindness.)
In this chapter, we consider color perception in three
parts. We first consider some basic facts about color perception, and then focus on two questions: (1) What is the connection between color perception and the firing of neurons?
(2) How do we perceive the colors and lightness of objects in
the environment under changing illumination?
Introduction to Color
Why do we perceive different colors? We will begin answering this question by first speculating about some of the
functions that color serves in our lives and in the lives of
monkeys. We will then look at how we describe our experience of color and how this experience is linked to the properties of light.
What Are Some Functions
of Color Vision?
Color adds beauty to our lives, but it does more than that.
Color serves important signaling functions, both natural
and contrived by humans. The natural and human-made
world provides many color signals that help us identify and
classify things. I know the rock on my desk contains copper
by the rich blue vein that runs through it; I know a banana
is ripe when it has turned yellow; and I know to stop when
the traffic light turns red.
In addition to its signaling function, color helps facilitate perceptual organization, the process we discussed in
Chapter 5 (p. 105) by which small elements become grouped
perceptually into larger objects. Color perception greatly
facilitates the ability to tell one object from another and
especially to pick out objects within scenes, an ability crucial to the survival of many species. Consider, for example,
a monkey foraging for fruit in the forest or jungle. A monkey with good color vision easily detects red fruit against a
green background (Figure 9.1a), but a color-blind monkey
would find it more difficult to find the fruit (Figure 9.1b).
Color vision thus enhances the contrast of objects that, if
they didn’t appear colored, would appear more similar.
This link between good color vision and the ability to
detect colored food has led to the proposal that monkey
and human color vision may have evolved for the express
purpose of detecting fruit (Mollon, 1989, 1997; Sumner &
Mollon, 2000; Walls, 1942). This suggestion sounds reasonable when we consider the difficulty color-blind human
observers have when confronted with the seemingly simple
task of picking berries. Knut Nordby (1990), a totally color-blind
visual scientist who sees the world in shades of gray,
described his experience as follows: “Picking berries has
always been a big problem. I often have to grope around
among the leaves with my fingers, feeling for the berries by
their shape” (p. 308). If Nordby’s experience, which is similar to Mr. I’s difficulty in seeing his dog against foliage, is
any indication, a color-blind monkey would have difficulty
finding berries or fruit and might be less likely to survive
than monkeys with color vision.
Figure 9.1 ❚ (a) Red berries in green foliage. (b) These
berries become more difficult to detect without color vision. (Bruce Goldstein)
Our ability to perceive color not only helps us detect objects that might otherwise be obscured by their surroundings; it also helps us recognize and identify things we can
see easily. James W. Tanaka and L. M. Presnell (1999) demonstrated this by asking observers to identify objects like
the ones in Figure 9.2, which appeared either in their normal colors, like the yellow banana, or in inappropriate colors, like the purple banana. The result was that observers
recognized the appropriately colored objects more rapidly
and accurately. Thus, knowing the colors of familiar objects
helps us to recognize these objects (Tanaka et al., 2001).
(Remember from Chapter 5, page 115, that color also helps
us process complex scenes.)
Image not available due to copyright restrictions
Figure 9.2 ❚ Normally colored fruit and inappropriately
colored fruit. (From Tanaka, J. W., Weiskopf, D., & Williams, P.
The role of color in high-level vision. Trends in Cognitive
Sciences, 5, 211–215. Copyright 2001, with permission from
Elsevier.)
What Colors Do We Perceive?
We can describe all the colors we can perceive by using
the terms red, yellow, green, blue, and their combinations
(Abramov & Gordon, 1994; Hurvich, 1981). When people
are presented with many different colors and are asked to
describe them, they can describe all of them when they are
allowed to use all four of these terms, but they can’t when
one of these terms is omitted. Other colors, such as orange,
violet, purple, and brown, are not needed to achieve these
descriptions (Fuld et al., 1981; Quinn et al., 1988). Color researchers
therefore consider red, yellow, green, and blue to
be basic colors (Backhaus, 1998).
Figure 9.3 shows the four basic colors arranged in a
circle, so that each is perceptually similar to the one next
to it. The order of the four basic colors in the color circle
(blue, green, yellow, and red) matches the order of the colors
in the visible spectrum, shown in Figure 9.4, in which
the short-wavelength end of the spectrum is blue, green is
in the middle of the spectrum, and yellow and red are at
the long-wavelength end of the spectrum. The color circle
also contains the colors brown and purple, which are called
extraspectral colors because they do not appear in the spectrum.
Brown is actually a mixture of either red, orange,
or yellow with black, and purple is created by mixing red
and blue.
Figure 9.4 ❚ The visible spectrum. (Horizontal axis: wavelength, 400 to 700 nm.)
Although the color circle is based on four colors, there
are more than four colors in the circle. In fact, people can
discriminate between about 200 different colors across
the length of the visible spectrum (Gouras, 1991). Furthermore,
we can create even more colors by changing the intensity
to make colors brighter or dimmer, or by adding white
to change a color’s saturation. White is equal amounts
of all wavelengths across the spectrum, and adding white
decreases a color’s saturation. For example, adding white
to the deep red at the top of the color circle makes it become
pink, which is a less saturated (or desaturated) form
of red.
By changing the wavelength, the intensity, and the
saturation, we can create about a million or more different
discriminable colors (Backhaus, 1998; Gouras, 1991). But
although we may be able to discriminate millions of colors,
we encounter only a fraction of that number in everyday experience. The paint chips at the paint store total fewer than a
thousand, and the Munsell Book of Colors, once the color “bible” for designers, contained 1,225 color samples (Wyszecki &
Stiles, 1965). The Pantone Matching System in current use
by graphic artists has about 1,200 color choices.
Having described the different colors we can perceive,
we now turn to the question of how these colors come about.
What causes us to perceive a tomato as red or a banana as
yellow? Our first answer to this question is that these colors
are related to the wavelength of light.
Figure 9.5 ❚ Reflectance curves for surfaces that appear
white, gray, and black, and for blue, green and yellow
pigments. (Adapted from Clulow, F. W. (1972). Color: Its
principles and their applications. New York: Morgan &
Morgan.) Axes: reflectance (percentage), 0 to 90, versus wavelength (nm), 400 to 700.
Curves shown: white paper, gray card, black paper, blue pigment, green pigment,
yellow pigment, and tomato.
Color and Wavelength
The first step in understanding how our nervous system
creates our perception of color is to consider the visible
spectrum in Figure 9.4. When we introduced this spectrum
in Chapter 3 (page 44), we saw that the perception of color
is associated with the physical property of wavelength.
The spectrum stretches from short wavelengths (400 nm)
to long wavelengths (700 nm), and bands of wavelengths
within this range are associated with different colors. Wavelengths from about 400 to 450 nm appear violet; 450 to 490
nm, blue; 500 to 575 nm, green; 575 to 590 nm, yellow; 590
to 620 nm, orange; and 620 to 700 nm, red.
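The band boundaries just listed can be written as a small lookup. The Python sketch below is only an illustration: the function name `color_name` is hypothetical, treating each band as half-open is an assumption made so that shared endpoints such as 450 nm fall in exactly one band, and the 490 to 500 nm gap that the text leaves unlabeled is returned as `None` rather than guessed.

```python
# Approximate wavelength bands quoted in the text (nanometers).
# ASSUMPTION: each band is treated as half-open [low, high) so that a shared
# endpoint such as 450 nm belongs to exactly one band.
BANDS = [
    (400, 450, "violet"),
    (450, 490, "blue"),
    (500, 575, "green"),
    (575, 590, "yellow"),
    (590, 620, "orange"),
    (620, 700, "red"),
]


def color_name(wavelength_nm):
    """Return the color name usually associated with a wavelength, or None."""
    for low, high, name in BANDS:
        if low <= wavelength_nm < high:
            return name
    if wavelength_nm == 700:
        # Include the upper limit of the visible spectrum in the red band.
        return "red"
    # Outside 400-700 nm, or in the 490-500 nm gap the text leaves unlabeled.
    return None


if __name__ == "__main__":
    for nm in (420, 480, 495, 530, 580, 600, 650):
        print(nm, color_name(nm))
```

Keep in mind that these names describe how isolated wavelengths usually appear; as the rest of the chapter explains, the color is created by the nervous system, not carried by the wavelength.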
TABLE 9.1 ❚ Relationship Between Predominant Wavelengths Reflected and Color Perceived
WAVELENGTHS REFLECTED        PERCEIVED COLOR
Short                        Blue
Medium                       Green
Long                         Red
Long and medium              Yellow
Long, medium, and short      White
Reflectance and Transmission The colors of
light in the spectrum are related to their wavelengths, but
what about the colors of objects? The colors of objects are
largely determined by the wavelengths of light that are reflected from the objects into our eyes. This is illustrated in
Figure 9.5, which shows reflectance curves—plots of the
percentage of light reflected versus wavelength—for a number of objects. Notice that black paper and white paper
both reflect all wavelengths equally across the spectrum,
but blue, green, and yellow paint and a tomato reflect some
wavelengths but not others.
When some wavelengths are reflected more than
others—as for the colored paints and the tomato—we call
these chromatic colors, or hues.² This property of reflecting some wavelengths more than others,
which is a characteristic of chromatic colors, is called selective reflection.
Table 9.1 indicates the relationship between the wavelengths reflected and the color perceived.
When light reflection is similar across the full spectrum (that is, contains no
hue), as in white, black, and all the grays between these two
extremes, we call these colors achromatic colors.
² The term hue is rarely used in everyday language. We usually say “The color
of the fire engine is red” rather than “The hue (or chromatic color) of the fire
engine is red.” Therefore, throughout the rest of this book, we will use the word
color to mean “chromatic color” or “hue,” and we will use the term achromatic
color to refer to white, gray, or black.
Most colors in the environment are created by the way
objects selectively reflect some wavelengths. But in the case
of things that are transparent, such as liquids, plastics, and
glass, chromatic color is created by selective transmission,
meaning that only some wavelengths pass through the object or substance. For example, cranberry juice selectively
transmits long-wavelength light and appears red, whereas
limeade selectively transmits medium-wavelength light and
appears green.
The idea that the color we perceive depends largely on
the wavelengths of light that are reflected into our eye provides a way to explain what happens when we mix different
colors together. We will describe two ways of mixing colors:
mixing lights and mixing paints.
Mixing Lights If a light that appears blue is projected
onto a white surface and a light that appears yellow is superimposed
onto the blue, the area of overlap
is perceived as white (Figure 9.6). Although this result may
surprise you if you have ever mixed blue and yellow paints
to create green, we can understand why this occurs by considering
the wavelengths that are reflected into the eye when blue and
yellow lights are mixed. Because the two spots of light
are projected onto a white surface, all of the wavelengths
that hit the surface are reflected into an observer’s eye (see
the reflectance curve for white paper in Figure 9.5). The blue
spot consists of a band of short wavelengths; when it is projected
alone, the short-wavelength light is reflected into the
observer’s eyes (Table 9.2). Similarly, the yellow spot consists
of medium and long wavelengths, so when presented alone,
these wavelengths are reflected into the observer’s eyes. The
key to understanding what happens when colored lights
are superimposed is that all of the light that is reflected from the
surface by each light when alone is also reflected when the lights are
superimposed. Thus, where the two spots are superimposed,
the light from the blue spot and the light from the yellow
spot are still reflected into the observer’s eye. The added-together
light therefore contains short, medium, and long
wavelengths, which results in the perception of white. Because
mixing lights involves adding up the wavelengths
of each light in the mixture, mixing lights is called
additive color mixture. (VL 1)
TABLE 9.2 ❚ Mixing Blue and Yellow Lights (Additive Color Mixture)
Parts of the spectrum that are reflected from a white surface
for blue and yellow spots of light projected onto the surface.
Wavelengths that are reflected are highlighted.
WAVELENGTHS                          SHORT           MEDIUM          LONG
Spot of blue light                   Reflected       No Reflection   No Reflection
Spot of yellow light                 No Reflection   Reflected       Reflected
Overlapping blue and yellow spots    Reflected       Reflected       Reflected
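The additive rule can be sketched in a few lines of Python by treating each light as the set of wavelength bands it contains. This is a simplified illustration, not a colorimetric model: the three band labels come from Tables 9.1 and 9.2, while the names `PERCEIVED`, `BLUE_LIGHT`, `YELLOW_LIGHT`, and `mix_lights` are hypothetical.

```python
# Table 9.1 as a lookup from the set of reflected bands to the perceived color.
PERCEIVED = {
    frozenset({"short"}): "blue",
    frozenset({"medium"}): "green",
    frozenset({"long"}): "red",
    frozenset({"medium", "long"}): "yellow",
    frozenset({"short", "medium", "long"}): "white",
}

# Bands contained in each spot of light, following Table 9.2.
BLUE_LIGHT = frozenset({"short"})
YELLOW_LIGHT = frozenset({"medium", "long"})


def mix_lights(*lights):
    """Additive mixture: the overlap contains every band present in any light."""
    return frozenset().union(*lights)


if __name__ == "__main__":
    overlap = mix_lights(BLUE_LIGHT, YELLOW_LIGHT)
    print(sorted(overlap), "->", PERCEIVED[overlap])
```

Using a set union mirrors the statement that each light adds its wavelengths to the mixture; intensities are ignored in this sketch.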
Mixing Paints We can appreciate why we see different
colors when mixing paints than when mixing lights by
considering the blobs of paint in Figure 9.7. The blue blob
absorbs long-wavelength light and reflects some short-wavelength
light and some medium-wavelength light (see
the reflectance curve for “blue pigment” in Figure 9.5). The
yellow blob absorbs short-wavelength light and reflects
some medium- and long-wavelength light (see the reflectance
curve for “yellow pigment” in Figure 9.5).
The key to understanding what happens when colored
paints are mixed together is that when mixed, both paints still
absorb the same wavelengths they absorbed when alone, so the only
wavelengths reflected are those that are reflected by both paints in
common. Because medium wavelengths are the only ones reflected
by both paints in common, a mixture of blue and
yellow paints appears green (Table 9.3). Because each blob
of paint absorbs wavelengths and these wavelengths are
still absorbed by the mixture, mixing paints is called subtractive
color mixture. The blue and yellow blobs subtract
all of the wavelengths except some that are associated with
green.
Figure 9.6 ❚ Color mixing with light. Superimposing a blue
light and a yellow light creates the perception of white in the
area of overlap. This is additive color mixing.
Figure 9.7 ❚ Color mixing with paint. Mixing blue paint
and yellow paint creates a paint that appears green. This is
subtractive color mixture.
TABLE 9.3 ❚ Mixing Blue and Yellow Paints (Subtractive Color Mixture)
Parts of the spectrum that are absorbed and reflected by
blue and yellow paint. Wavelengths that are reflected are
highlighted for each paint. Light that is usually seen as green
is the only light that is reflected in common by both paints.
WAVELENGTHS                          SHORT          MEDIUM          LONG
Blob of blue paint                   Reflects all   Reflects some   Absorbs all
Blob of yellow paint                 Absorbs all    Reflects some   Reflects some
Mixture of blue and yellow blobs     Absorbs all    Reflects some   Absorbs all
The reason that our blue and yellow mixture results
in green is that both paints reflect a little green (see the
overlap between the blue and yellow pigment curves in
Figure 9.5). If our blue paint had reflected only short wavelengths and our yellow paint had reflected only medium
and long wavelengths, these paints would reflect no color
in common, so mixing them would result in little or no
reflection across the spectrum, and the mixture would appear black. It is rare, however, for paints to reflect light in
only one region of the spectrum. Most paints reflect a broad
band of wavelengths. If paints didn’t reflect a range of wavelengths, then many of the color-mixing effects of paints
that we take for granted would not occur.
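The subtractive rule, including the hypothetical case of paints that share no reflected wavelengths, can be sketched as a set intersection. As before, this is only an illustration: the band sets follow Table 9.3, the names `mix_paints` and `appearance` are hypothetical, and "some" versus "all" reflection is ignored.

```python
# Bands each paint reflects, following Table 9.3 (every other band is absorbed).
BLUE_PAINT = frozenset({"short", "medium"})
YELLOW_PAINT = frozenset({"medium", "long"})

# The narrow-band paints imagined in the text, which reflect nothing in common.
NARROW_BLUE = frozenset({"short"})
NARROW_YELLOW = frozenset({"medium", "long"})


def mix_paints(*paints):
    """Subtractive mixture: only bands reflected by every paint survive."""
    reflected = frozenset({"short", "medium", "long"})
    for paint in paints:
        reflected &= paint
    return reflected


def appearance(reflected):
    """Name the result for the two cases the text discusses."""
    if not reflected:
        return "black"  # nothing is reflected in common
    if reflected == {"medium"}:
        return "green"  # medium wavelengths alone (Table 9.1)
    return "not covered by this sketch"


if __name__ == "__main__":
    print(appearance(mix_paints(BLUE_PAINT, YELLOW_PAINT)))
    print(appearance(mix_paints(NARROW_BLUE, NARROW_YELLOW)))
```

The intersection captures the idea that each paint keeps absorbing what it absorbed when alone, so a band survives only if every paint in the mixture reflects it.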
We can summarize the connection between wavelength
and color as follows:
■ Colors of light are associated with wavelengths in the
visible spectrum.
■ The colors of objects are associated with which wavelengths
are reflected (for opaque objects) or transmitted
(for transparent objects).
■ The colors that occur when we mix colors are also associated
with which wavelengths are reflected into
the eye. Mixing lights causes more wavelengths to be
reflected (each light adds wavelengths to the mixture);
mixing paints causes fewer wavelengths to be reflected
(each paint subtracts wavelengths from the mixture).
We will see later in the chapter that things other than
the wavelengths reflected into our eye can influence color
perception. For example, our perception of an object’s color
can be influenced by the background on which the object is
seen. But for now our main goal is to focus on the connection between wavelength and color.
Wavelengths Do Not Have Color!
After establishing that our perception of color is closely
linked to wavelength, how can the title of this section—that
wavelengths don’t have color—be true? Our explanation begins with the following statement by Isaac Newton.
The Rays to speak properly are not coloured. In
them there is nothing else than a certain Power
and Disposition to stir up a Sensation of this
or that Colour. . . . So Colours in the Object are
nothing but a Disposition to reflect this or that
sort of Rays more copiously than the rest. . . .
(Opticks, 1704)
Newton’s idea is that the colors that we see in response
to different wavelengths are not contained in the rays of
light themselves. Instead, these colors are created by our perceptual system. What this means is that although we can relate specific colors to specific wavelengths, the connection
between wavelength and the experience we call “color” is
an arbitrary one. Light rays are simply energy, and there
is nothing intrinsically “blue” about short wavelengths or
“red” about long wavelengths. Looking at it this way, color
is not a property of wavelength but is the brain’s way of informing us what wavelengths are present.
We can appreciate the role of the nervous system in creating color experience by considering that people like Mr. I
see no colors, even though they are receiving the same stimuli as people with normal color vision. Also, many animals
perceive either no color or a greatly reduced palette of colors
compared to humans. This occurs not because they receive
different kinds of light energy than humans, but because
their nervous system processes wavelength information differently and doesn’t transform wavelength information into
the perception of color.
The question of exactly how the nervous system accomplishes the transformation from wavelengths into the experience of color has not been answered. Rather than try to
answer the extremely difficult question of how the nervous
system creates experiences (see “The Mind–Body Problem,”
Chapter 2, p. 39), researchers have instead focused on the
question of how the nervous system determines which wavelengths are present. We will now consider two theories of
color vision that deal with that question. Both of these theories were proposed in the 1800s based on behavioral data,
and both are basically correct. As we will see, the physiological evidence to support them didn’t become available until
more than 100 years after they were originally proposed.
We will consider each of the theories in turn, first describing the behavioral evidence on which the theory was
based and then describing the physiological evidence that
became available later.
Trichromatic Theory
of Color Vision
The trichromatic theory of color vision, which states that
color vision depends on the activity of three different receptor mechanisms, was proposed by two eminent 19th-century researchers, Thomas Young (1773–1829) and Hermann von Helmholtz (1821–1894). They based their theory
on the results of a psychophysical procedure called color
matching.
Behavioral Evidence for the Theory
In Helmholtz’s color-matching experiments, observers adjusted the amounts of three different wavelengths of light
mixed together in a “comparison field” until the color of
this mixture matched the color of a single wavelength in
a “test field.” For example, an observer might be asked to
adjust the amount of 420-nm, 560-nm, and 640-nm light
in a comparison field until the field matched the color of a
500-nm light presented in the test field (Figure 9.8). (Any
three wavelengths can be used, as long as none of them can
be matched by mixing the other two.) The key findings of
these color-matching experiments were as follows:
1. By correctly adjusting the proportions of three wavelengths in the comparison field, it was possible to
match any wavelength in the test field.
2. People with normal color vision cannot match all
wavelengths in the spectrum with only two wavelengths. For example, if they were given only the 420nm and 640-nm lights to mix, they would be unable
to match certain colors. People who are color deficient, and therefore can’t perceive all colors in the
spectrum, can match the colors of all wavelengths in
the spectrum by mixing only two other wavelengths.
The Theory: Vision Is Trichromatic
Thomas Young (1802) proposed the trichromatic theory of
color vision based on the finding that people with normal
color vision need at least three wavelengths to match any
wavelength in the test field. This theory was later championed
and refined by Helmholtz (1852) and is therefore also
called the Young-Helmholtz theory of color vision. The
central idea of the theory is that color vision depends on
three receptor mechanisms, each with different spectral
sensitivities. (Remember from Chapter 3 that spectral sensitivity
indicates the sensitivity to wavelengths across the visible spectrum, as shown in Figure 3.22.)
Figure 9.8 ❚ In a color-matching experiment, the observer
adjusts the amount of three wavelengths in one field (right)
until it matches the color of the single wavelength in another
field (left). In the example shown, a 500-nm test field is matched by
mixing 420-nm, 560-nm, and 640-nm lights in the comparison field.
According to this theory, light of a particular wavelength stimulates the three receptor mechanisms to different degrees, and the pattern of activity in the three mechanisms results in the perception of a color. Each wavelength
is therefore represented in the nervous system by its own
pattern of activity in the three receptor mechanisms.
Physiology of Trichromatic Theory
More than a century after the trichromatic theory was first
proposed, physiological research identified the three receptor mechanisms proposed by the theory.
Cone Pigments Physiological researchers who were
working to identify the receptor mechanisms proposed by
trichromatic theory asked the following question: Are there
three mechanisms, and if so, what are their physiological
properties? This question was answered in the 1960s, when
researchers were able to measure the absorption spectra of
three different cone visual pigments, with maximum absorption in the short- (419-nm), middle- (531-nm), and long-wavelength
(558-nm) regions of the spectrum (S, M, and L in
Figure 9.9; P. K. Brown & Wald, 1964; Dartnall et al., 1983;
Schnapf et al., 1987). All visual pigments are made up of
a large protein component called opsin and a small light-sensitive
component called retinal (see Chapter 3, page 48).
Differences in the structure of the long opsin part of the
pigments are responsible for the three different absorption
spectra (Nathans et al., 1986).
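To see how three pigments with different peaks give each wavelength its own pattern of absorption, consider the rough Python sketch below. Only the three peak wavelengths come from the text. The Gaussian shape, the 40-nm width, and the function name `cone_responses` are assumptions made for illustration; the measured spectra in Figure 9.9 are broader and not symmetric.

```python
import math

# Peak absorption wavelengths (nm) quoted in the text for the three pigments.
PEAKS = {"S": 419.0, "M": 531.0, "L": 558.0}

# ASSUMPTION: a Gaussian curve of this width stands in for each measured
# absorption spectrum; real pigment spectra are broader and asymmetric.
WIDTH_NM = 40.0


def cone_responses(wavelength_nm):
    """Relative absorption (0 to 1) of the S, M, and L pigments at one wavelength."""
    return {
        cone: math.exp(-((wavelength_nm - peak) ** 2) / (2 * WIDTH_NM ** 2))
        for cone, peak in PEAKS.items()
    }


if __name__ == "__main__":
    for nm in (450, 530, 600):
        pattern = cone_responses(nm)
        print(nm, {cone: round(value, 2) for cone, value in pattern.items()})
```

Even with these crude stand-in curves, a short wavelength produces a pattern dominated by S and a long wavelength a pattern dominated by L, which is the kind of difference the theory relies on.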
Cone Responding and Color Perception If
color perception is based on the pattern of activity of these
three receptor mechanisms, we should be able to determine
which colors will be perceived if we know the response of
each of the receptor mechanisms. Figure 9.10 shows the
relationship between the responses of the three kinds of
receptors and our perception of color. In this figure, the responses in the S, M, and L receptors are indicated by the size
of the receptors. For example, blue is signaled by a large response in the S receptor, a smaller response in the M receptor, and an even smaller response in the L receptor. Yellow
is signaled by a very small response in the S receptor and
large, approximately equal responses in the M and L receptors. White is signaled by equal activity in all of the receptors. (VL 2, 3)
Thinking of wavelengths as causing certain patterns of receptor responding helps us to predict which colors should result when we combine lights of different colors. We have already seen that combining blue and yellow lights results in white. The patterns of receptor activity in Figure 9.10 show that blue light causes high activity in the S receptors and that yellow light causes high activity in the M and L receptors. Thus, combining both lights should stimulate all three receptors equally, which is associated with the perception of white.
Figure 9.9 ❚ Absorption spectra of the three cone pigments (S, M, and L), plotted as relative proportion of light absorbed against wavelength (nm). (From Dartnall, H. J. A., Bowmaker, J. K., & Mollon, J. D. (1983). Human visual pigments: Microspectrophotometric results from the eyes of seven persons. Proceedings of the Royal Society of London B, 220, 115–130.)
Figure 9.10 ❚ Patterns of firing of the three types of cones to different colors (blue, green, yellow, red, and white). The size of the cone symbolizes the size of the receptor’s response.
Now that we know that our perception of colors is determined by the pattern of activity in different kinds of receptors, we can explain the physiological basis behind the
color-matching results that led to the proposal of trichromatic theory. Remember that in a color-matching experiment, a wavelength in one field is matched by adjusting the
proportions of three different wavelengths in another field
(Figure 9.8). This result is interesting because the lights in
the two fields are physically different (they contain different
wavelengths) but they are perceptually identical (they look
the same). This situation, in which two physically different
stimuli are perceptually identical, is called metamerism,
and the two identical fields in a color-matching experiment
are called metamers.
The reason metamers look alike is that they both result in the same pattern of response in the three cone receptors. For example, when the proportions of a 620-nm red light and a 530-nm green light are adjusted so the mixture matches the color of a 580-nm light, which looks yellow, the two mixed wavelengths create the same pattern of activity in the cone receptors as the single 580-nm light (Figure 9.11). The 530-nm green light causes a large response in the M receptor, and the 620-nm red light causes a large response in the L receptor. Together, they result in a large response in the M and L receptors and a much smaller response in the S receptor. This is the pattern for yellow and is the same as the pattern generated by the 580-nm light. Thus, even though the lights in these two fields are physically different, the two lights result in identical physiological responses and so are identical, as far as the visual system is concerned.
Figure 9.11 ❚ Principle behind metamerism. The proportions of 530- and 620-nm lights in the field on the left have been adjusted so that the mixture appears identical to the 580-nm light in the field on the right. The numbers indicate the responses of the short-, medium-, and long-wavelength receptors. Because there is no difference in the responses of the two sets of receptors, the two fields are perceptually indistinguishable.
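The logic of metamerism can be sketched in a few lines of Python. The cone responses per unit of intensity used below are hypothetical numbers chosen for illustration, not measured sensitivities, and the names `RESPONSE_PER_UNIT`, `cone_response`, and `match_with_two_primaries` are invented for this sketch. The numbers were also picked so that two wavelengths happen to reproduce the S response as well; in general a color match requires three wavelengths.

```python
from fractions import Fraction

# Hypothetical (S, M, L) cone responses per unit of light intensity.
# Illustrative assumptions only, not measured values.
RESPONSE_PER_UNIT = {
    530: (1, 8, 4),    # green light: largest response in M
    620: (0, 1, 6),    # red light: largest response in L
    580: (1, 9, 10),   # yellow light: large M and L, small S
}

def cone_response(lights):
    """Total (S, M, L) response to a mixture given as {wavelength_nm: intensity}."""
    return tuple(
        sum(intensity * RESPONSE_PER_UNIT[w][i] for w, intensity in lights.items())
        for i in range(3)
    )

def match_with_two_primaries(target, w1, w2):
    """Intensities of w1 and w2 that reproduce the target M and L responses."""
    _, m1, l1 = RESPONSE_PER_UNIT[w1]
    _, m2, l2 = RESPONSE_PER_UNIT[w2]
    _, tm, tl = target
    det = Fraction(m1 * l2 - m2 * l1)   # 2 x 2 system solved by Cramer's rule
    return {w1: (tm * l2 - m2 * tl) / det, w2: (m1 * tl - tm * l1) / det}

target = cone_response({580: 5})                    # single 580-nm light
mixture = match_with_two_primaries(target, 530, 620)
is_metamer = cone_response(mixture) == target       # same pattern in all three cones
```

With these assumed numbers, the two fields contain different wavelengths but produce the same three cone responses, which is all the visual system has to go on.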
Are Three Receptor Mechanisms Necessary for Color Vision? According to trichromatic
theory, a light’s wavelength is signaled by the pattern of activity of three receptor mechanisms. But do we need three
different mechanisms to see colors? The answer to this
question is that color vision is possible with two receptor
types but not with one. Let’s first consider why color vision
does not occur with just one receptor type.
We can understand why color vision is not possible with
just one receptor type by considering how Jay, who has just
one type of receptor, which contains a single visual pigment,
perceives the dresses worn by two women, Mary and Barbara. Mary and Barbara have just purchased dresses from
the “Monochromatic Dress Company,” which specializes in
dresses that reflect only one wavelength. (Such dresses don’t
exist, but let’s assume they do, for the purposes of this example.) Mary’s dress reflects only 550-nm light, and Barbara’s reflects only 590-nm light.
Let’s assume that Mary’s and Barbara’s dresses are illuminated by spotlights that are adjusted so that each dress
reflects 1,000 photons of light into Jay’s eye. (Remember
from page 49 in Chapter 3 that a photon is a small packet
of light energy, and that a visual pigment molecule is activated if it absorbs one photon.) To determine how this light
affects the pigment in Jay’s receptor, we refer to the absorption spectrum of Jay’s pigment, shown in Figure 9.12a. This
absorption spectrum indicates the fraction of light at each
wavelength that the pigment absorbs.
By taking into account the amount of light present
(1,000 photons) and the absorption spectrum, we can see
that 100 photons of the 550-nm light from Mary’s dress are
absorbed by Jay’s visual pigment (1,000 × 0.10 = 100) (Figure 9.12b), and 50 photons of the 590-nm light from Barbara’s dress are absorbed (1,000 × 0.05 = 50) (Figure 9.12c).
Because each photon of light activates one visual pigment
molecule, and each activated molecule increases the receptor’s electrical response, this means that Mary’s dress generates a larger signal in Jay’s retina than Barbara’s dress.
At this point you might say that Jay’s single pigment did, in fact, enable him to distinguish Mary’s dress from Barbara’s dress. However, if we increase the intensity of the spotlight on Barbara’s dress so that 2,000 photons of 590-nm light are reflected into Jay’s eyes, his pigment absorbs 100 photons of 590-nm light; now 100 pigment molecules are activated, the same number as were activated by Mary’s dress when illuminated by the dimmer light (Figure 9.12d). (Notice that it doesn’t matter if the light absorbed by the pigment is 550-nm light or 590-nm light. Once a photon is absorbed, no matter what its wavelength, it has the same effect on the visual pigment.) Thus, by adjusting the intensity of the light, we can cause Mary’s and Barbara’s dresses to have exactly the same effect on Jay’s pigment. Therefore, Jay cannot tell the difference between the two dresses based on the wavelengths they reflect.
Figure 9.12 ❚ (a) Absorption spectrum of Jay’s visual pigment. The fractions of 550-nm and 590-nm lights absorbed are indicated by the dashed lines. (b) The size of the cone indicates activation caused by the reflection of 1,000 photons of 550-nm light by Mary’s dress. (c) The activation caused by the reflection of 1,000 photons of 590-nm light by Barbara’s dress. (d) The activation caused by the reflection of 2,000 photons of 590-nm light from Barbara’s dress. Notice that the cone response is the same in (b) and (d).
Another way to state this result is that a person with
only one visual pigment can match any wavelength in the
spectrum by adjusting the intensity of any other wavelength.
Thus, by adjusting the intensity appropriately, Jay can make
the 550-nm and 590-nm lights (or any other wavelengths)
look identical. Furthermore, Jay will perceive all of these
wavelengths as shades of gray.
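Jay’s situation can be worked through as a short calculation. The sketch below uses the absorption fractions from Figure 9.12a; the names `PIGMENT_1` and `molecules_activated` are made up for this illustration.

```python
# Fraction of light absorbed by Jay's single pigment (from Figure 9.12a).
PIGMENT_1 = {550: 0.10, 590: 0.05}

def molecules_activated(photons, wavelength_nm, pigment=PIGMENT_1):
    """Pigment molecules activated: photons reaching the eye times fraction absorbed."""
    return round(photons * pigment[wavelength_nm])

mary = molecules_activated(1000, 550)            # Mary's dress, dim spotlight
barbara_dim = molecules_activated(1000, 590)     # Barbara's dress, same intensity
barbara_bright = molecules_activated(2000, 590)  # Barbara's dress, doubled intensity

# With one pigment, doubling the 590-nm intensity erases the difference.
indistinguishable = (mary == barbara_bright)
```

The single number the receptor produces cannot say whether it came from a dim 550-nm light or a brighter 590-nm light, which is why one pigment cannot support color vision.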
How can the nervous system tell the difference between
Mary’s and Barbara’s dresses, no matter what the light intensity? The answer to this question is that adding a second pigment makes it possible to distinguish between wavelengths
independent of light intensity. We can see why this is so by
considering Dan, who has two visual pigments, pigment 1,
which is the same as Jay’s pigment, and pigment 2, which
has an absorption spectrum that indicates that the fraction
of light absorbed for 550-nm is 0.05 and the fraction for
590-nm is 0.01 (Figure 9.13a).
Figure 9.13b shows that when Mary’s dress is illuminated by the dim light, 100 molecules of pigment 1 are activated, as before, and 50 molecules of pigment 2 are activated (1,000 × 0.05 = 50). Figure 9.13c shows that for Barbara’s dress, 50 molecules of pigment 1 are activated, as before, and 10 molecules of pigment 2 are activated (1,000 × 0.01 = 10).
Figure 9.13 ❚ The same as Figure 9.12, but with a second pigment added. (a) Absorption spectrum of pigment 2, with the fraction absorbed by 550-nm and 590-nm indicated by the dashed lines. (b) Response of the two types of cones when they absorb light from Mary’s dress. The response of cone 1 is on the right. (c) Response caused by light reflected from Barbara’s dress at the same intensity. (d) Response from Barbara’s dress at a higher intensity. Notice that the cone response is different in (b) and (d).
Thus, when both Mary and Barbara are illuminated
by the dim light, their dresses activate the receptors differently, just as occurred in the single-pigment example. But
when we increase the illumination on Barbara, as we did before, we see that the pattern of receptor activation caused by
Barbara’s dress is still different from the pattern for Mary’s
dress (Figure 9.13d). Adding the second pigment causes
Mary’s and Barbara’s dresses to have different effects, even
when we change the illumination. So color vision becomes
possible when there are two pigments.
Notice that the ratios of response caused by the two pigments are the same for a particular wavelength, no matter
what the intensity. The ratio for the 550-nm light is always 2
to 1, and the ratio for the 590-nm light is always 5 to 1. Thus,
the visual system can use this ratio information to determine the wavelength of any light. This is what trichromatic
theory proposes when it states that color perception depends
on the pattern of activity in three receptor mechanisms.
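The constancy of these ratios can be checked with a short calculation. The sketch below uses the absorption fractions given for pigments 1 and 2 in Figures 9.12a and 9.13a; the names `PIGMENTS`, `activation`, and `response_ratio` are invented for this illustration, and exact fractions are used only to keep the arithmetic free of rounding.

```python
from fractions import Fraction

# Fraction of light absorbed by each pigment at each wavelength
# (pigment 1 from Figure 9.12a, pigment 2 from Figure 9.13a).
PIGMENTS = {
    1: {550: Fraction(10, 100), 590: Fraction(5, 100)},
    2: {550: Fraction(5, 100), 590: Fraction(1, 100)},
}

def activation(photons, wavelength_nm):
    """Molecules activated in (pigment 1, pigment 2) for a given light."""
    return tuple(photons * PIGMENTS[p][wavelength_nm] for p in (1, 2))

def response_ratio(photons, wavelength_nm):
    """Ratio of pigment 1 activation to pigment 2 activation."""
    p1, p2 = activation(photons, wavelength_nm)
    return p1 / p2

mary = activation(1000, 550)             # 100 and 50 molecules
barbara_bright = activation(2000, 590)   # 100 and 20 molecules
```

Doubling the intensity doubles both activations, so the ratio stays at 2 to 1 for 550 nm and 5 to 1 for 590 nm. The ratio depends on wavelength alone, which is what lets Dan tell the dresses apart at any intensity.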
As we will see when we consider color deficiency in the
next section, there are people with just two types of cone
pigment. These people, called dichromats, do see colors, just
as our calculations predict, but they see fewer colors than
people with three visual pigments, who are called trichromats. The addition of a third pigment, although not necessary for creating color vision, increases the number of colors
that can be seen across the visual spectrum.
TEST YOURSELF 9.1
1. What are the various functions of color vision?
2. What physical characteristic is most closely associated with color perception? How is this demonstrated by differences in reflection of different objects?
3. Describe additive color mixture and subtractive color mixture. How can the results of these two types of color mixing be related to the wavelengths that are reflected into an observer’s eyes?
4. Describe trichromatic theory and the experiments on which it was based. How does this theory explain the results of color-matching experiments?
5. Describe how trichromatic theory is based on cone pigments and how the code for color can be determined by the activity of the cones.
6. What are metamers, and how can our perception of metamers be explained by the code for color described above?
7. Why is color vision possible when there are only two different cone pigments but not possible when there is just one pigment? What is the effect on color vision of having three pigments rather than just two?
Color Deficiency
It has long been known that some people have difficulty
perceiving certain colors. We have described the case of Mr.
I, who lost his ability to see color due to brain damage. However, most problems with color vision involve only a partial
loss of color perception, called color deficiency, and are associated with problems with the receptors in the retina.
In a famous early report of color deficiency, the well-known 18th-century chemist John Dalton (1798/1948) described his own color perceptions as follows: “All crimsons
appear to me to consist chiefly of dark blue: but many of
them seem to have a tinge of dark brown. I have seen specimens of crimson, claret, and mud, which were very nearly
alike” (p. 102).
Dalton’s descriptions of his abnormal color perceptions
led to the early use of the term Daltonism to describe color deficiency. We now know that there are a number of different
types of color deficiency. This has been determined by color
vision tests like the ones shown in Figure 9.14a, which are
called Ishihara plates. In this example, people with normal
color vision see a “74,” but people with a form of red–green
color deficiency might see something like the depiction in
Figure 9.14b, in which the “74” is not visible. Another way
to determine the presence of color deficiency is by using the
color-matching procedure to determine the minimum number of wavelengths needed to match any other wavelength
in the spectrum. This procedure has revealed the following three types of color deficiency (VL 4):
1. A monochromat can match any wavelength in the
spectrum by adjusting the intensity of any other
wavelength. Thus, a monochromat needs only one
wavelength to match any color in the spectrum and
sees only in shades of gray. Jay, from our example in
Figure 9.12, is a monochromat.
2. A dichromat needs only two wavelengths to match
all other wavelengths in the spectrum. Dan, from
Figure 9.13, is a dichromat.
3. An anomalous trichromat needs three wavelengths
to match any wavelength, just as a normal trichromat
does. However, the anomalous trichromat mixes these
wavelengths in different proportions from a trichromat, and an anomalous trichromat is not as good as
a trichromat at discriminating between wavelengths
that are close together.
Once we have determined whether a person’s vision
is color deficient, we are still left with the question: What
colors does a person with color deficiency see? When I pose
this question in my class, a few students suggest that we can
answer it by pointing to objects of various colors and asking a color deficient person what he sees. (Most color deficient people are male; see page 212.) This method does not
really tell us what the person perceives, however, because
a color deficient person may say “red” when we point to a
strawberry simply because he has learned that people call
strawberries “red.” It is quite likely that the color deficient
person’s experience of “red” is very different from the experience of the person without color deficiency. For all we
know, he may be having an experience similar to what a person without deficient color vision would call “yellow.”
To determine what a dichromat perceives, we need
to locate a unilateral dichromat—a person with trichromatic vision in one eye and dichromatic vision in the other.
Both of the unilateral dichromat’s eyes are connected to
the same brain, so this person can look at a color with his
dichromatic eye and then determine which color it corresponds to in his trichromatic eye. Although unilateral dichromats are extremely rare, the few who have been tested
have helped us determine the nature of a dichromat’s color
experience (Alpern et al., 1983; Graham et al., 1961; Sloan &
Wollach, 1948). Let’s now look at the nature of the color experience of both monochromats and dichromats.
Monochromatism
Monochromatism is a rare form of color blindness that is
usually hereditary and occurs in only about 10 people out
of 1 million (LeGrand, 1957). Monochromats usually have
no functioning cones; therefore, their vision has the characteristics of rod vision in both dim and bright lights. Monochromats see everything in shades of lightness (white, gray,
and black) and can therefore be called color-blind (as opposed to dichromats, who see some chromatic colors and
therefore should be called color deficient).
In addition to a loss of color vision, people with hereditary monochromatism have poor visual acuity and are so
sensitive to bright lights that they often must protect their
eyes with dark glasses during the day. The rod system is not
designed to function in bright light and so becomes overloaded in strong illumination, creating a perception of glare. (VL 5)
Figure 9.14 ❚ (a) Ishihara plate for testing color deficiency. A person with normal color vision sees a “74” when the plate is viewed under standardized illumination. (b) Ishihara plate as perceived by a person with a form of red–green color deficiency.
Dichromatism
Dichromats experience some colors, though a lesser range
than trichromats. There are three major forms of dichromatism: protanopia, deuteranopia, and tritanopia. The
two most common kinds, protanopia and deuteranopia,
are inherited through a gene located on the X chromosome
(Nathans et al., 1986).
Males (XY) have only one X chromosome, so a defect in
the visual pigment gene on this chromosome causes color
deficiency. Females (XX), on the other hand, with their
two X chromosomes, are less likely to become color deficient, because only one normal gene is required for normal
color vision. These forms of color deficiency are therefore called
sex-linked because women can carry the gene for color deficiency without being color deficient themselves, and they
can pass the condition to their male offspring. Thus, many
more men than women are dichromats. As we describe what
the three types of dichromats perceive, we use as our reference points Figures 9.15d and 9.16d, which show how a trichromat perceives a bunch of colored paper flowers and the visible spectrum, respectively. (VL 6)
■ Protanopia affects 1 percent of males and 0.02 percent of females and results in the perception of colors shown in Figure 9.15a. A protanope perceives short-wavelength light as blue, and as wavelength is increased, the blue becomes less and less saturated until, at 492 nm, the protanope perceives gray (Figure 9.16a). The wavelength at which the protanope perceives gray is called the neutral point. At wavelengths above the neutral point, the protanope perceives yellow, which becomes increasingly saturated as wavelength is increased, until at the long-wavelength end of the spectrum the protanope perceives a saturated yellow.
■ Deuteranopia affects about 1 percent of males and 0.01 percent of females and results in the perception of color in Figure 9.15b. A deuteranope perceives blue at short wavelengths, sees yellow at long wavelengths, and has a neutral point at about 498 nm (Figure 9.16b) (Boynton, 1979).
■ Tritanopia is very rare, affecting only about 0.002 percent of males and 0.001 percent of females. A tritanope sees colors as in Figure 9.15c, and sees the spectrum as in Figure 9.16c: blue at short wavelengths, red at long wavelengths, and a neutral point at 570 nm (Alpern et al., 1983). (VL 7)
Figure 9.15 ❚ How colored paper flowers appear to (a) protanopes; (b) deuteranopes; (c) tritanopes; and (d) trichromats. (Color processing courtesy of John Carroll.) Photo credit: Bruce Goldstein.
Physiological Mechanisms of Receptor-Based Color Deficiency
What are the physiological mechanisms of color deficiency?
Most monochromats have no color vision because they have
just one type of cone or no cones. Dichromats are missing
one visual pigment, with the protanope missing the long-wavelength pigment and the deuteranope missing the
medium-wavelength pigment (W. A. H. Rushton, 1964).
Because of the tritanope’s rarity and because of the low
number of short-wavelength cones even in normal retinas,
it has been difficult to determine which pigment tritanopes are missing, but they are probably missing the short-wavelength pigment.
Genetic research has identified differences in the genes
that determine visual pigment structure in trichromats and
dichromats (Nathans et al., 1986). Based on this research, it
has also been suggested that anomalous trichromats probably match colors differently from normal trichromats and
have more difficulty discriminating between some wavelengths because their M and L pigment spectra have been
shifted so they are closer together (Neitz et al., 1991).
Opponent-Process Theory of Color Vision
Although trichromatic theory explains a number of color
vision phenomena, including color matching and color mixing, and some facts about color deficiency, there are some
color perceptions it cannot explain. These color perceptions
were demonstrated by Ewald Hering (1834–1918), another
eminent physiologist who was working at about the same
time as Helmholtz. Hering used the results of phenomenological observations, in which stimuli were presented and
observers described what they perceived, to propose the
opponent-process theory of color vision. This theory states
that color vision is caused by opposing responses generated
by blue and yellow and by red and green.
Figure 9.16 ❚ How the visible spectrum appears to (a) protanopes; (b) deuteranopes; (c) tritanopes; and (d) trichromats. The number indicates the wavelength of the neutral point.
Behavioral Evidence for the Theory
You can make some phenomenological observations similar
to Hering’s by doing the following demonstrations.
DEMONSTRATION
The Colors of the Flag
Look at the cross at the center of the strangely colored American flag in Figure 9.17 for about 30 seconds. If you then look at a piece of white paper and blink, the image you see, which is called an afterimage, has colors that probably match the red, white, and blue of the American flag. Notice that the green area of the flag in Figure 9.17 created a red afterimage, and the yellow area created a blue afterimage. (VL 8) ❚
Figure 9.17 ❚ Stimulus for afterimage demonstration.
Although Hering didn’t use a strangely colored flag to create afterimages, he did observe that viewing a green field generates a red afterimage, and viewing a yellow field creates a blue afterimage. He also observed the opposite: viewing red causes a green afterimage, and viewing blue causes a yellow afterimage. You can demonstrate that this works both ways by looking at the center of Figure 9.18 for 30 seconds and then looking at a white surface and noticing how red and green, and blue and yellow, have changed places. (Note that the colors associated with long wavelengths, red and yellow, are on the right in the figure, and switch to the left in the afterimage.) Based on observations such as these, Hering proposed that red and green are paired and blue and yellow are paired. Here is another demonstration that illustrates this pairing.
DEMONSTRATION
Afterimages and Simultaneous Contrast
Cut out a 1/2-inch square of white paper and place it in the center of the green square in Figure 9.18. Cover the other squares with white paper and stare at the center of the white square for about 30 seconds. Then look at a white background and blink to observe the afterimage. What color is the outside area of the afterimage? What color is the small square in the center? Repeat your observations on the red, blue, and yellow squares in Figure 9.18. ❚
When you made your observations using the green
square, you probably confirmed your previous observation
that green and red are paired because the afterimage corresponding to the green area of the original square is red.
But the color of the small square in the center also shows
that green and red are paired: Most people see a green
square inside the red afterimage. This green afterimage is
due to simultaneous color contrast, an effect that occurs
when surrounding an area with a color changes the appearance of the surrounded area. In this case, the red afterimage surrounds a white area and causes the white area to appear green. Table 9.4 summarizes this result and the results
that occur when we repeat this demonstration on the other
squares. All of these results show a clear pairing of red and green and of blue and yellow. (VL 9)
TABLE 9.4 Results of Afterimage and Simultaneous Contrast Demonstration
ORIGINAL SQUARE    COLOR OF OUTSIDE AFTERIMAGE    COLOR OF INSIDE AFTERIMAGE
Green              Red                            Green
Red                Green                          Red
Blue               Yellow                         Blue
Yellow             Blue                           Yellow
Figure 9.18 ❚ Color matrix for afterimage and simultaneous contrast demonstrations.
DEMONSTRATION
Visualizing Colors
This demonstration involves visualizing colors. Start by
visualizing the color red, with your eyes either opened or
closed, whichever works best for you. Attach this color to
a specific object such as a fire engine, if that makes your
visualizing easier. Now visualize a reddish-yellow and then a
reddish-green. Which of these two combinations is easier to
visualize? Now do the same thing for blue. Visualize a pure
blue, then a bluish-green and a bluish-yellow. Again, which of
these combinations is easier to visualize? ❚
Most people find it easy to visualize a bluish-green or
a reddish-yellow, but find it difficult (or impossible) to visualize a reddish-green or a bluish-yellow. In other experiments, in which observers were shown patches of color and
were asked to estimate the percentages of blue, green, yellow, and red in each patch, they rarely reported seeing blue
and yellow or red and green at the same time (Abramov &
Gordon, 1994), just as the results of the visualization demonstration would predict.
The above observations, plus Hering’s observation
that people who are color-blind to red are also color-blind
to green, and that people who can’t see blue also can’t see
yellow, led to the conclusion that red and green are paired
and that blue and yellow are paired. Based on this conclusion, Hering proposed the opponent-process theory of color
vision (Hering, 1878, 1905, 1964).
The Theory: Vision Is an
Opponent Process
The basic idea underlying Hering’s theory is shown in Figure 9.19. He proposed three mechanisms, each of which
responds in opposite ways to different intensities or wavelengths of light. The Black (⫺) White (⫹) mechanism responds positively to white light and negatively to the absence of light. Red (⫹) Green (⫺) responds positively to red
and negatively to green, and Blue (⫺) Yellow (⫹) responds
negatively to blue and positively to yellow. Although Hering’s phenomenological observations supported his theory,
it wasn’t until many years later that modern physiological
research showed that these colors do cause physiologically
opposite responses.
vided physiological evidence for neurons that respond in
opposite ways to blue and yellow and to red and green.
Opponent Neurons In the 1950s and ’60s research-
ers began finding opponent neurons in the retina and lateral geniculate nucleus that responded with an excitatory
response to light from one part of the spectrum and with
an inhibitory response to light from another part (R. L.
DeValois, 1960; Svaetichin, 1956). For example, the left
column of Figure 9.20 shows records for a neuron that responds to short-wavelength light with an increase in firing
and to long-wavelength light with a decrease in firing. (Notice that firing decreases to below the level of spontaneous
activity.) This neuron is called a B⫹ Y⫺ neuron because the
wavelengths that cause an increase in firing are in the blue
part of the spectrum, and the wavelengths that cause a decrease are in the yellow part of the spectrum.
The right column of Figure 9.20 shows records for an
R⫹ G⫺ neuron, which increases firing to light in the red
part of the spectrum and decreases firing to light in the
green part of the spectrum. There are also B⫺ Y⫹ and G⫹
R⫺ neurons (R. L. DeValois et al., 1966).
How Opponent Responding Can Be Created by Three Receptors The discovery of oppo-
nent neurons provided physiological evidence for opponent
process theory to go with the three different cone pigments
of trichromatic theory. This evidence, which was not available in the 1800s, showed modern researchers that both
trichromatic and opponent-process theories are correct and
that each one describes physiological mechanisms at different places in the visual system. Figure 9.21 shows how this
works.
Trichromatic theory describes what is happening at
the beginning of the visual system, in the receptors of the
retina. Each wavelength causes a different ratio of response
in the three different kinds of cone receptors, and it takes
a minimum of three wavelengths to match any wavelength
B+Y–
R+G–
Spontaneous
The Physiology of Opponent-Process
Vision
Modern physiological research, which has measured the response of single neurons to different wavelengths, has pro-
450 nm (blue)
510 nm (green)
580 nm (yellow)
B
–
W
+
R
+
G
–
B
–
Y
+
660 nm (red)
Figure 9.20 ❚ Responses of B⫹ Y⫺ and R⫹ G⫺ opponent
Figure 9.19 ❚ The three opponent mechanisms proposed
by Hering.
cells in the monkey’s lateral geniculate nucleus. (From
DeValois, R. L., & Jacobs, G. H. (1968). Primate color vision.
Science, 162, 533–540.)
Opponent-Process Theory of Color Vision
215
Trichromatic
Opponent-process
Receptors
Opponent
cells
Light
To brain
of inhibitory and excitatory synapses. Another way to describe this is that processing for color vision takes place in
two stages: First, the receptors respond with different patterns to different wavelengths (trichromatic theory), and
then other neurons integrate the inhibitory and excitatory
signals from the receptors (opponent-process V
L 10–12
theory).
Why Are Opponent Neurons Necessary? Our
Afterimages,
simultaneous
contrast
Figure 9.21 ❚ Our experience of color is shaped by
physiological mechanisms, both in the receptors and in
opponent neurons.
in the spectrum. Opponent-process theory describes events
later in the visual system. Opponent neurons are responsible for perceptual experiences such as afterimages and simultaneous contrast.
Figure 9.22 shows two neural circuits in which the
cones are wired in a way that creates two kinds of opponent neurons. In circuit 1, the short-wavelength cone sends
an excitatory signal to the ganglion cell, and the mediumand long-wavelength cones pool their activity and then
send inhibitory signals to this cell. (The bipolar cells have
been omitted to simplify the circuits.) This creates a B⫹ Y⫺
opponent neuron because stimulation of the shortwavelength cone increases firing and stimulation of the medium- or long-wavelength cones decreases firing. In circuit
2, the medium-wavelength cone sends excitatory signals and
the long-wavelength cone sends inhibitory signals to the
ganglion cell. This creates a G⫹ R⫺ opponent neuron, in
which stimulation of the medium-wavelength cone causes
an increase in firing and stimulation of the long-wavelength
cone causes a decrease in firing.
The important thing about these two circuits is that
their responses are determined both by the wavelengths to
which the receptors respond best and by the arrangement
neural circuit shows that wavelengths can be signaled in
two ways: (1) by trichromatic signals from the receptors,
and (2) by opponent signals in later neurons. But why are
two different ways of signaling wavelength necessary? Specifically, since the firing pattern of the three types of cone
receptors contains enough information to signal which
wavelength has been presented, why is this information
changed into opponent responses? The answer to this question is that opponent responding provides a way of specifying wavelengths that may be clearer and more efficient than
the ratio of the cone receptor responses.
Figure 9.22 ❚ Neural circuit showing how the blue–yellow
and red–green mechanisms can be created by excitatory and
inhibitory inputs from the three types of cone receptors.
Figure 9.23 ❚ (a) Response curves for the M and L
receptors. (b) Bar graph indicating the size of the responses
generated in the receptors by wavelengths 1 (left pair of
bars) and 2 (right pair). (c) Bar graph showing the opponent
response of the R+ G− cell to wavelengths 1 and 2.
The response to 1 is inhibitory, and the response to 2 is
excitatory.
To understand how this works, let’s consider how the
two cones in Figure 9.23a respond to two wavelengths, labeled 1 and 2. Figure 9.23b shows that when wavelength
1 is presented, receptor M responds more than receptor L,
and when wavelength 2 is presented, receptor L responds
more than receptor M. Although we can tell the difference
between the responses to these two wavelengths, the two
pairs of bars in Figure 9.23b look fairly similar. But taking
the difference between the response of the L cone and the
response of the M cone enables us to tell the difference between wavelengths 1 and 2 much more easily (Figure 9.23c).
Thus, the firing of opponent
cells transmits information about wavelength more efficiently than the receptor responses do (Buchsbaum & Gottschalk, 1983).
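A small numerical sketch may make the comparison concrete. The response values below are invented for illustration and are not read from Figure 9.23; the only property that matters is that M responds more than L to wavelength 1 and L responds more than M to wavelength 2.

```python
# Hypothetical receptor responses (arbitrary units) to wavelengths 1 and 2.
responses = {
    "wavelength 1": {"M": 10, "L": 8},
    "wavelength 2": {"M": 8, "L": 10},
}


def l_minus_m(resp):
    """Opponent (L minus M) signal, as in an R+ G- cell."""
    return resp["L"] - resp["M"]


opponent = {name: l_minus_m(r) for name, r in responses.items()}

for name, value in opponent.items():
    kind = "inhibitory" if value < 0 else "excitatory"
    print(f"{name}: M={responses[name]['M']}, L={responses[name]['L']}, "
          f"L-M = {value:+d} ({kind})")
```

The two pairs of receptor responses differ only modestly (10 and 8 versus 8 and 10), but the opponent signal changes sign between the two wavelengths, so the difference is easy to read off.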
Color in the Cortex
How is color represented in the cortex? One possible answer is that there is a specific area in the cortex, a specialized “color center,” that processes information about color
(Livingstone & Hubel, 1988; Zeki, 1983a, 1983b). Cerebral
achromatopsia, color blindness due to damage to the cortex, supports this idea. Although Mr. I’s cerebral achromatopsia meant that he could no longer see color, he still
had excellent visual acuity and could still see form and
movement. This absence of color perception, while other
visual functions remained relatively normal, supports the
idea that an area specialized for color perception had been
damaged.
However, when researchers record from neurons in
the cortex, a different picture emerges. They find cortical neurons that respond to just some wavelengths in the
spectrum, and some neurons that have opponent responses
in many areas of the cortex, including the striate cortex
(V1) and other areas in the ventral processing stream (Figure 4.27). But these neurons that respond to color also usually respond to specific forms and orientations (Lennie et
al., 1990; Leventhal et al., 1995; Schein & Desimone, 1990).
Also, many of the wavelength-selective neurons in the area
originally designated as the “color module” respond to
white, leading some researchers to question the idea that
these neurons determine our perception of color (Gordon &
Abramov, 2001; also see Girard et al., 2002; Heywood &
Cowey, 1998; Hinkle & Connor, 2002).
Taken together, the evidence seems to show that there
may not be a single “module” for color vision (Engel et al.,
1997; Gegenfurtner, 2001; Zeki & Marini, 1998). Thus, color
vision presents an example of distributed processing in the
cortex, with a number of areas being involved in processing
wavelength information and creating color perception (Gegenfurtner, 2003; Solomon & Lennie, 2007).
Discovering the cortical mechanism for color perception is complicated because there are two issues involved in
determining how the cortex processes color information: (1)
Where is information about wavelength processed? (2) Where
is the perception of color determined? You might think these
are equivalent questions because color is determined largely
by wavelength. However, there are people who can use information about wavelength but can’t see colors. An example
is M.S., who suffered from cerebral achromatopsia due to
an illness that left his cone pigments intact but damaged
his cortex (Stoerig, 1998). Although he was able to use wavelength information being sent to the brain by the cones, he
could not see color. For example, he could detect the line
separating two adjacent fields consisting of different wavelengths, even though they both appeared the same shade
of gray.
Apparently, what is happening for M.S. is that wavelength information is being processed by the undamaged
area of his brain, but this information is not being transformed into the experience of color, presumably because
of damage to another area. Understanding how color perception occurs in the brain, therefore, involves determining both how wavelength information is processed and
how further processing of this information creates the
experience of color (Cowey & Heywood, 1997; Solomon &
Lennie, 2007).
TEST YOURSELF 9.2
1. What is color deficiency? How can it be detected
using the procedure of color mixing? How can we
determine how a color deficient person perceives
different wavelengths?
2. How is color deficiency caused by (a) problems with
the receptors? (b) damage to the cortex?
3. Describe opponent-process theory, including the
observations on which it is based and the physiological basis of this theory.
4. What is the evidence that a number of areas in the
cortex are involved in color vision? Why is it important to distinguish between processing information
about wavelength and perceiving color?
Perceiving Colors Under
Changing Illumination
It is midday, with the sun high in the sky, and as you are
walking to class you notice a classmate who is wearing a
green sweater. Then, a few minutes later, as you are sitting
in class, you again notice the same green sweater. The fact
that the sweater appears green both outside under sunlight
and inside under artificial indoor illumination may not
seem particularly remarkable. After all, the sweater is green,
isn’t it? However, when we consider the interaction between
the illumination and the properties of the sweater, we can
appreciate that your perception of the sweater as green,
both outside and inside, represents a remarkable achievement of the visual system. This achievement is called color
constancy: we perceive the colors of objects as being relatively constant even under changing illumination.
Figure 9.24 ❚ The wavelength distribution of sunlight and of
light from a tungsten light bulb. (From Judd, D. B., MacAdam,
D. L., & Wyszecki, G. (1964). Spectral distribution of typical
daylight as a function of correlated color temperature. Journal
of the Optical Society of America, 54, 1031–1040.)
We can appreciate why color constancy is an impressive achievement by considering the interaction between
illumination, such as sunlight or lightbulbs, and the reflection properties of an object, such as the green sweater.
First, let’s consider the illumination. Figure 9.24 shows the
wavelengths that are contained in sunlight and the wavelengths that are contained in light from a lightbulb. The
sunlight contains approximately equal amounts of energy
at all wavelengths, which is a characteristic of white light.
The light from the bulb contains much more energy at long wavelengths.
This wavelength distribution is sometimes called “tungsten” light because it is produced by the tungsten filament
inside old-style lightbulbs (which are in the process of being
replaced with screw-in “twisty” fluorescent lightbulbs). This
large amount of long-wavelength light is why light from the tungsten
bulb looks slightly yellow.
Now consider the interaction between the wavelengths
produced by the illumination and the wavelengths reflected
from the green sweater. The reflectance curve of the sweater
is indicated by the green line in Figure 9.25. It reflects mostly
medium-wavelength light, as we would expect of something
that is green.
Figure 9.25 ❚ Reflectance curve of sweater, and light
reflected from sweater when illuminated by tungsten light
and white light.
The actual light that is reflected from the sweater depends on both its reflectance curve and the illumination
that reaches the sweater and is then reflected from it. To determine the wavelengths that are actually reflected from the
sweater, we multiply the reflectance curve by the amount of
light provided by the illumination source (sunlight or tungsten) at each wavelength. This calculation indicates that the
sweater reflects more long-wavelength light when it is seen
under tungsten illumination (orange line) than when it is
seen under sunlight (white line). The fact that we still see
the sweater as green even though the wavelength composition of the reflected light differs under different illuminations is color constancy. Without color constancy, the color
we see would depend on how the sweater was being illuminated (Delahunt & Brainard, 2004). Luckily, color constancy works, so we can refer to objects as having a particular well-defined color. You can experience color constancy
for yourself by doing the following demonstration.
DEMONSTRATION
Color Perception Under Changing
Illumination
View the color circle of Figure 9.3 so it is illuminated by natural light by taking it outdoors or illuminating it with light from a
window. Then illuminate it with the tungsten lightbulb of your
desk lamp. Notice whether the colors change and, if so, how
much they change. ❚
In this demonstration, you may have noticed some
change in color as you changed the illumination, but the
change was probably much less than we would predict based
on the change in the wavelength distribution of the light.
Even though the wavelengths reflected from a blue object illuminated by long-wavelength-rich tungsten light can match
the wavelengths reflected by a yellow object illuminated by
sunlight (Jameson, 1985), our perception of color remains
relatively constant with changing illumination. As color vision researcher Dorothea Jameson puts it, “A blue bird would
not be mistaken for a goldfinch if it were brought indoors”
(1985, p. 84). (Note, however, that color constancy breaks
down under extreme kinds of illumination such as sodium
vapor lamps that emit narrow bands of wavelengths.)
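The calculation described earlier, multiplying the sweater's reflectance curve by the illumination at each wavelength, can be sketched as follows. All numbers are made up for illustration and are not taken from Figures 9.24 or 9.25; the names `reflected_light` and `long_wavelength_share` are hypothetical helpers, and the four sample wavelengths are a coarse stand-in for a full spectrum.

```python
wavelengths = [400, 500, 600, 700]  # nm, coarse sample points

# Illustrative values only (assumptions, not measured data).
sunlight = [100, 100, 100, 100]             # roughly equal energy at all wavelengths
tungsten = [20, 60, 120, 180]               # much more energy at long wavelengths
sweater_reflectance = [0.1, 0.6, 0.2, 0.1]  # reflects mostly medium wavelengths


def reflected_light(illumination, reflectance):
    """Multiply illumination by reflectance, wavelength by wavelength."""
    return [i * r for i, r in zip(illumination, reflectance)]


def long_wavelength_share(spectrum):
    """Fraction of the total reflected light at 600 nm and above."""
    long_part = sum(v for w, v in zip(wavelengths, spectrum) if w >= 600)
    return long_part / sum(spectrum)


under_sunlight = reflected_light(sunlight, sweater_reflectance)
under_tungsten = reflected_light(tungsten, sweater_reflectance)

print("share of long wavelengths under sunlight:",
      round(long_wavelength_share(under_sunlight), 3))
print("share of long wavelengths under tungsten:",
      round(long_wavelength_share(under_tungsten), 3))
```

With these assumed values, a larger share of the light reflected from the sweater is long-wavelength light under tungsten than under sunlight, even though the sweater's reflectance curve is the same in both cases. That change in the reflected light is what color constancy has to overcome.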
Researchers still do not completely understand why
color constancy occurs; however, it is likely caused by
a number of factors working together. We will
consider some of these factors, beginning with chromatic
adaptation.
Chromatic Adaptation
One reason why color constancy occurs lies in the results of
the following demonstration.
DEMONSTRATION
Adapting to Red
Illuminate Figure 9.26 with a bright light from your desk lamp;
then, with your left eye near the page and your right eye
closed, look at the field with your left eye for about 30 to 45
seconds. Then look at various colored objects in your environment, first with your left eye and then with your right. ❚
Figure 9.26 ❚ Red adapting field.
This demonstration shows that color perception can
be changed by chromatic adaptation—prolonged exposure
to chromatic color. Adaptation to the red light selectively
bleaches your long-wavelength cone pigment, which decreases your sensitivity to red light and causes you to see the
reds and oranges viewed with your left (adapted) eye as less
saturated and bright than those viewed with the right eye.
We can understand how chromatic adaptation contributes to color constancy by realizing that when you walk
into a room illuminated with tungsten light, the eye adapts
to the long-wavelength-rich light, which decreases your eye’s
sensitivity to long wavelengths. This decreased sensitivity
causes the long-wavelength light reflected from objects to
have less effect than before adaptation, and this compensates for the greater amount of long-wavelength tungsten
light that is reflected from everything in the room. Because
of this adaptation, the tungsten illumination has only a
small effect on your perception of color.
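One way to picture this compensation is as a change in the gain of each cone type. The sketch below divides each cone signal by that cone's response to the illuminant, an idealized scheme often called von Kries-style scaling. The text does not name or specify this model, so treat it as an assumption, along with the invented (S, M, L) values and the helper names `signal` and `adapt`.

```python
def signal(surface, illuminant):
    """Cone signals (S, M, L) produced by a surface under an illuminant."""
    return tuple(s * i for s, i in zip(surface, illuminant))


def adapt(cone_signal, illuminant):
    """Idealized adaptation: scale each cone signal by the inverse of
    that cone's response to the illuminant (assumed model)."""
    return tuple(c / i for c, i in zip(cone_signal, illuminant))


# Hypothetical (S, M, L) values in arbitrary units.
sunlight_illuminant = (1.0, 1.0, 1.0)
tungsten_illuminant = (0.5, 1.0, 2.0)  # rich in long wavelengths
greenish_surface = (0.2, 0.8, 0.4)     # affects the M cones most

raw_sun = signal(greenish_surface, sunlight_illuminant)
raw_tungsten = signal(greenish_surface, tungsten_illuminant)
adapted_tungsten = adapt(raw_tungsten, tungsten_illuminant)

print("under sunlight:           ", raw_sun)
print("under tungsten, unadapted:", raw_tungsten)
print("under tungsten, adapted:  ", adapted_tungsten)
```

Before adaptation, the L signal under tungsten is much larger than under sunlight. After the gain change, the adapted signals match the sunlight signals. This complete compensation is an idealization; actual adaptation leaves a small effect of the illumination on perceived color.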
The idea that chromatic adaptation is responsible for
color constancy has been tested in an experiment by Keiji
Uchikawa and coworkers (1989). Observers viewed isolated
patches of colored paper under three different conditions
(Figure 9.27): (a) baseline: paper and observer illuminated
by white light; (b) observer not adapted: paper illuminated by
red light, observer by white (the illumination of the object
is changed, but the observer is not chromatically adapted);
and (c) observer adapted to red: both paper and observer illuminated by red light (the illumination of the object is
changed, and the observer is chromatically adapted).
The results from these three conditions are shown
above each condition. In the baseline condition, a green paper
is perceived as green. In the observer not adapted condition, the
observer perceives the paper’s color as being shifted toward
the red. Thus, color constancy does not occur in this condition. But in the observer adapted to red condition, perception is
shifted only slightly to the red, so it appears more yellowish.
Thus, the chromatic adaptation has created partial color
constancy—the perception of the object is shifted after
adaptation, but not as much as when there was no adaptation. This means that the eye can adjust its sensitivity to
different wavelengths to keep color perception approximately constant as illumination changes.
Figure 9.27 ❚ The three conditions in Uchikawa et al.’s (1989) experiment. See text for details. (a) Baseline. Perception: paper is green. (b) Observer not adapted. Perception: paper shifted toward red. (c) Observer adapted to red. Perception: paper shifted only slightly toward red.
The Effect of the Surroundings
An object’s perceived color is affected not only by the observer’s state of adaptation, but also by the object’s surroundings, as shown by the following demonstration.
DEMONSTRATION
Color and the Surroundings
Illuminate the green quadrant of Figure 9.18 with tungsten
light, and then look at the square through a small hole
punched in a piece of paper so that all you see through the
hole is part of the green area. Now repeat this observation
while illuminating the same area with daylight from your
window. ❚
When the surroundings are masked, most people perceive the green area to be slightly more yellow under the
tungsten light than in daylight, which shows that color constancy works less well when the surroundings are masked.
A number of investigators have shown that color constancy
works best when an object is surrounded by objects of many
different colors, a situation that often occurs when viewing
objects in the environment (E. H. Land, 1983, 1986; E. H.
Land & McCann, 1971).
The surroundings help us achieve color constancy because the visual system—in ways that are still not completely
understood—uses the information provided by the way objects in a scene are illuminated to estimate the characteristics of the illumination and to make appropriate corrections. (For some theories about exactly how the presence of
the surroundings enhances color constancy, see Brainard &
Wandell, 1986; E. H. Land, 1983, 1986; Pokorny et al., 1991.)
Memory and Color
Another thing that helps achieve color constancy is our
knowledge about the usual colors of objects in the environment. The effect of this prior knowledge on perception
is called memory color. Research
has shown that because people know the colors of familiar
objects, like a red stop sign, or a green tree, they judge these
familiar objects as having richer, more saturated colors than
unfamiliar objects that reflect the same wavelengths (Jin &
Shevell, 1996; Ratner & McCarthy, 1990).
In a recent experiment, Thorsten Hansen and coworkers (2006) demonstrated an effect of memory color by presenting observers with pictures of fruits with characteristic
colors, such as lemons, oranges, and bananas, against a gray
background. Observers also viewed a spot of light against
the same gray background. When the intensity and wavelength of the spot of light were adjusted so the spot was
physically the same as the background, observers reported
that the spot appeared the same gray as the background.
But when the intensity and wavelength of the fruits were
set to be physically the same as the background, observers
reported that the fruits appeared slightly colored. For example, a banana that was physically the same as the gray
background appeared slightly yellowish, and an orange
looked slightly orange. This led Hansen to conclude that
the observer’s knowledge of the fruit’s characteristic colors
actually changed the colors they were experiencing. This effect of memory on our experience of color may help us accurately perceive the colors of familiar objects under different
illuminations and so makes a small contribution to color
constancy (Jin & Shevell, 1996).
Lightness Constancy
We not only perceive chromatic colors like red and green as
remaining relatively constant, even when the illumination
changes; we perceive achromatic colors, like white, gray, and
black, as remaining fairly constant as well. Thus, we perceive a Labrador retriever as black when it is inside under
dim illumination, and it remains black even when it runs
out of the house into bright sunlight.
Consider what is happening in this situation. The Labrador retriever lying on the rug in the living room is illuminated by a 100-watt lightbulb in the overhead light fixture.
Some of the light that hits the retriever’s black coat is reflected, and we see the coat as black. When the dog goes outside into bright sunlight, much more light hits its coat, and
therefore much more light is reflected. But the dog still appears black. Even though more light is reflected, the perception of the shade of achromatic color (white, gray, and black),
which we call lightness, remains the same. The fact that we
see whites, grays, and blacks as staying about the same shade
under different illuminations is called lightness constancy.
The visual system’s problem is that the amount of light
reaching the eye from an object depends on two things: (1)
the illumination—the total amount of light that is striking the
object’s surface—and (2) the object’s reflectance—the proportion of this light that the object reflects into our eyes. When
lightness constancy occurs, our perception of lightness is
determined not by the illumination hitting an object, but
by the object’s reflectance. Objects that look black reflect
about 5 percent of the light; objects that look gray reflect
about 10 to 70 percent of the light (depending on the shade
of gray); and objects that look white, like the paper in this
book, reflect 80 to 95 percent of the light. Thus, our perception of an object’s lightness is related not to the amount of
light that is reflected from the object, which can change depending on the illumination, but to the percentage of light
reflected from the object, which remains the same no matter what the illumination.
Figure 9.28 ❚ A black-and-white
checkerboard illuminated by (a)
tungsten light and (b) sunlight. In (a), 100 units of illumination yield 90 units reflected from the white squares and 9 units from the black squares; in (b), 10,000 units yield 9,000 units and 900 units.
You can appreciate the existence of lightness constancy
by imagining a checkerboard illuminated by room light,
like the one in Figure 9.28. Let’s assume that the white
squares have a reflectance of 90 percent, and the black
squares have a reflectance of 9 percent. If the light intensity inside the room is 100 units, the white squares reflect
90 units and the black squares reflect 9 units. Now, if we take
the checkerboard outside into bright sunlight, where the intensity is 10,000 units, the white squares reflect 9,000 units
of light, and the black squares reflect 900 units. But even
though the black squares when outside reflect much more
light than the white squares did when the checkerboard was
inside, the black squares still look black. Your perception is
determined by the reflectance, not the amount of light reflected. What is responsible for lightness constancy? There
are a number of possible causes.
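The checkerboard arithmetic can be checked with a short sketch. The reflectances and illumination levels are the ones given in the text; the function name `light_reflected` and the use of whole-number percentages are choices made here for illustration.

```python
# Reflectances from the text, as percentages.
REFLECTANCE_PERCENT = {"white": 90, "black": 9}


def light_reflected(illumination, reflectance_percent):
    """Amount of light reflected = illumination x reflectance."""
    return illumination * reflectance_percent / 100


inside = {sq: light_reflected(100, r) for sq, r in REFLECTANCE_PERCENT.items()}
outside = {sq: light_reflected(10_000, r) for sq, r in REFLECTANCE_PERCENT.items()}

print("inside (100 units):    ", inside)
print("outside (10,000 units):", outside)
print("black outside reflects more than white inside:",
      outside["black"] > inside["white"])
```

The black squares outside reflect 900 units, ten times the 90 units reflected by the white squares inside, yet they still look black. The amount of reflected light changes with the illumination, while the reflectance values (90 and 9 percent) do not.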
Intensity Relationships: The Ratio
Principle
One observation about our perception of lightness is that
when an object is illuminated evenly—that is, when the illumination is the same over the whole object, as in our checkerboard example—then lightness is determined by the ratio
of reflectance of the object to the reflectance of surrounding objects. According to the ratio principle, as long as this
ratio remains the same, the perceived lightness will remain
the same (Jacobson & Gilchrist, 1988; Wallach, 1963). For
example, consider one of the black squares in the checkerboard. The ratio of the light reflected from a black square to the light reflected from the surrounding
white squares is 9/90 = 0.10 under low illumination and
900/9,000 = 0.10 under high illumination. Because the
ratio of the reflectances is the same, our perception of the
lightness remains the same.
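The same point can be written as a quick calculation. The light values are the ones from the checkerboard example; `lightness_ratio` is a hypothetical helper, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction


def lightness_ratio(object_light, surround_light):
    """Ratio of light reflected from an object to light reflected from its surround."""
    return Fraction(object_light, surround_light)


low_illumination = lightness_ratio(9, 90)        # black vs. white squares, inside
high_illumination = lightness_ratio(900, 9000)   # black vs. white squares, outside

print("low illumination: ", float(low_illumination))
print("high illumination:", float(high_illumination))
print("ratios equal:", low_illumination == high_illumination)
```

Because the illumination multiplies both the object and its surround by the same factor, it cancels out of the ratio. Under even illumination, the ratio therefore depends only on the two reflectances.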
The ratio principle works well for flat, evenly illuminated objects like our checkerboard. However, things get
more complicated in three-dimensional scenes, which are
usually illuminated unevenly.
Lightness Perception Under
Uneven Illumination
If you look around, you will probably notice that the illumination is not even over the entire scene, as was the case
for our two-dimensional checkerboard. The illumination in
three-dimensional scenes is usually uneven because of shadows cast by one object onto another or because one part of
an object faces the light and another part faces away from
the light. For example, in Figure 9.29, in which a shadow is
cast across a wall, it is necessary to determine whether the
changes in appearance we see across the wall are due to differences in the properties of different parts of the wall or to
differences in the way the wall is illuminated.
The problem for the perceptual system is that it has to
somehow take the uneven illumination into account. One
way to state this problem is that the perceptual system
needs to distinguish between reflectance edges and illumination edges. A reflectance edge is an edge where the reflectance
of two surfaces changes. The border between areas a and c in
Figure 9.29 is a reflectance edge because they are made of
different materials that reflect different amounts of light.
An illumination edge is an edge where the lighting changes.
The border between a and b is an illumination edge because area a is receiving more light than area b, which is in
shadow.
Some explanations for how the visual system distinguishes between these two types of edges have been proposed (see Adelson, 1999; Gilchrist, 1994; and Gilchrist
et al., 1999, for details). The basic idea behind these explanations is that the perceptual system uses a number of sources
of information to take the illumination into account. Let’s
look at a few of these sources of information.
The Information in Shadows
In order for lightness constancy to work, the visual system needs to be able
to take the uneven illumination created by shadows into account. It must determine that this change in illumination
caused by a shadow is due to an illumination edge and not
due to a reflectance edge. Obviously, the visual system usually succeeds in doing this because although the light intensity is reduced by shadows, you don’t usually see shadowed
areas as gray or black. For example, in the case of the wall
in Figure 9.30, you assume that the shadowed and unshadowed areas are bricks with the same lightness, but that less
light falls on some areas than on others. (See “Think About
It” #4 on page 225 for another example of an image of a tree
on a wall.)
Figure 9.29 ❚ This unevenly
illuminated wall contains both
reflectance edges (between
a and c) and illumination
edges (between a and b).
The perceptual system must
distinguish between these two
types of edges to accurately
perceive the actual properties
of the wall, as well as other
parts of the scene. Credit: Bruce Goldstein.
How does the visual system know that the change in intensity caused by the shadow is an illumination edge and
not a reflectance edge? One thing the visual system may
take into account is the shadow’s meaningful shape. In this
particular example, we know that the shadow was cast by a
tree, so we know it is the illumination that is changing, not
the color of the bricks on the wall. Another clue is provided
by the nature of the shadow’s contour, as illustrated by the
following demonstration.
DEMONSTRATION
The Penumbra and Lightness Perception
Place an object, such as a cup, on a white piece of paper on
your desk. Then illuminate the cup at an angle with your desk
lamp and adjust the lamp’s position to produce a shadow
with a slightly fuzzy border, as in Figure 9.31a. (Generally,
moving the lamp closer to the cup makes the border get
fuzzier.) The fuzzy border at the edge of the shadow is called
the shadow’s penumbra. Now take a marker and draw a
thick line, as shown in Figure 9.31b, so you can no longer
see the penumbra. What happens to your perception of the
shadowed area inside the black line? ❚
Figure 9.30 ❚ Shadow of a tree. Credit: Bruce Goldstein.
Covering the penumbra causes most people to perceive
a change in the appearance of the shadowed area. Apparently, the penumbra provides information to the visual system that the dark area next to the cup is a shadow, so the
edge between the shadow and the paper is an illumination
edge. However, masking off the penumbra eliminates that
information, so the area covered by the shadow is seen as a
change in reflectance. In this demonstration, lightness constancy occurs when the penumbra is present, but does not
occur when it is masked.
Figure 9.31 ❚ (a) A cup
and its shadow. (b) The same
cup and shadow with the
penumbra covered by a black
border. Credit: Bruce Goldstein.
The Orientation of Surfaces
The following
demonstration provides an example of how information
about the orientation of a surface affects our perception of
lightness.
DEMONSTRATION
Perceiving Lightness at a Corner
Stand a folded index card on end so that it resembles the
outside corner of a room, and illuminate it so that one side is
illuminated and the other is in shadow. When you look at the
corner, you can easily tell that both sides of the corner are
made of the same white material but that the nonilluminated
side is shadowed (Figure 9.32a). In other words, you perceive
the edge between the illuminated and shadowed “walls” as
an illumination edge.
Now create a hole in another card and, with the hole a
few inches from the corner of the folded card, view the corner
with one eye about a foot from the hole (Figure 9.32b). If,
when viewing the corner through the hole, you perceive the
corner as a flat surface, your perception of the left and right
surfaces will change. ❚
Figure 9.32 ❚ Viewing a shaded corner. (a) Illuminate a
folded card so one side is illuminated and the other is in
shadow. (b) View the folded card through a small hole so the
two sides of the corner are visible, as shown.
In this demonstration, the illumination edge you perceived at first became transformed into an erroneous perception of a reflectance edge. The erroneous perception occurs because viewing the shaded corner through a small hole
eliminated information about the conditions of illumination and the orientation of the corner. In order for lightness
constancy to occur, it is important that the visual system
have adequate information about the conditions of illumination. Without this information, lightness
constancy can break down. (VL 13–16)
How Images Are Perceptually Organized
Figure 9.33 provides an example of how lightness
perception can be affected by the way elements are perceptually organized (Anderson & Winawer, 2005). The four
disks on the left are identical to the four disks on the right
in terms of how much light is reflected from the disks. To
prove this to yourself, mask the surroundings by viewing
each disk through a hole punched in a piece of paper.
Figure 9.33 ❚ (a) Four dark disks partially covered by a
white mist; (b) four light disks partially covered by a dark
mist. The disks are identical in (a) and (b). (Anderson, B. L.,
& Winawer, J. (2005). Image segmentation and lightness
perception. Nature, 434, 79–83.)
Despite being physically identical, the disks on the left
appear dark and the ones on the right appear light. This is
because the visual system organizes the dark areas differently in the two displays. In the display on the left, the dark
areas within the circles are seen as belonging to dark disks
that are partially obscured by a light “mist.” In the display
on the right, the same dark areas inside the circles are seen
as belonging to dark “mist” that is partially obscuring white
disks. Thus, the way the various parts of the display are perceptually organized influences our perception of lightness.
(See “If You Want to Know More” #4 at the end of the chapter for some additional examples of how our perception
of lightness can be affected by characteristics of a
display.) (VL 17)
Something to Consider:
Experiences That Are Created
by the Nervous System
At the beginning of the chapter we introduced the idea that
wavelengths themselves aren’t colored but that our experience of color is created by our nervous system. We can appreciate that color isn’t the only perceptual quality that is
created by the nervous system by considering our experience of hearing sounds. We will see in Chapter 11 that our
experience of hearing is caused by pressure changes in the
air. But why do we perceive rapid pressure changes as high
pitches (like the sound of a piccolo) and slower pressure
changes as low pitches (like a tuba)? Is there anything intrinsically “high-pitched” about rapid pressure changes? Or
consider the sense of smell. We perceive some substances as
“sweet” and others as “rancid,” but where is the “sweetness”
or “rancidity” in the molecular structure of the substances
that enter the nose? Again, the answer is that these perceptions are not in the molecular structures. They are created
by the action of the molecular structures on the nervous
system.
We can better understand the idea that some perceptual
qualities, such as color, pitch, or smell, are created by our
nervous system by considering animals that can perceive
energy that humans can’t perceive at all. For example, Figure 9.34 shows the absorption spectra of a honeybee’s visual
pigments. The pigment that absorbs short-wavelength light
enables the honeybee to see short wavelengths that can’t be
detected by humans (Menzel & Backhaus, 1989; Menzel
et al., 1986). What “color” do you think bees perceive at 350
nm, which you can’t see? You might be tempted to say “blue”
because humans see blue at the short-wavelength end of the
spectrum, but you really have no way of knowing what the
honeybee is seeing, because, as Newton stated, “The Rays . . .
are not coloured” (see page 206). There is no color in the
wavelengths, so the bee’s nervous system creates its experience of color. For all we know, the honeybee’s experience of
color at short wavelengths is quite different from ours, and
may also be different for wavelengths in the middle of the
spectrum that humans and honeybees can both see.
Figure 9.34 ❚ Absorption spectra of honeybee visual
pigments.
One of the themes of this book has been that our experience is filtered through our nervous system, so the properties of the nervous system can affect what we experience. For
example, in Chapter 3 (page 58) we saw that the way the rods
and cones converge onto other neurons results in high sensitivity for rod vision and good detail vision for cone vision.
The idea we have introduced here, that the nervous system
creates the way we experience the qualities of color, sound,
taste, and smell, adds another dimension to the idea that
properties of the nervous system can affect what we experience. Experience is not only shaped by the nervous system, as
in the example of rod and cone vision, but—in cases such as
color vision, hearing, taste, and smell—the very essence of
our experience is created by the nervous system.
TEST YOURSELF 9.3
1. What is color constancy? Describe three factors that
help achieve color constancy.
2. What is lightness constancy? Describe the factors
that are responsible for lightness constancy.
3. What does it mean to say that color is created by the
nervous system?
THINK ABOUT IT
1. A person with normal color vision is called a trichromat. This person needs to mix three wavelengths to
match all other wavelengths and has three cone pigments. A person who is color deficient is called a dichromat. This person needs only two wavelengths to match
all other wavelengths and has only two operational
cone pigments. A tetrachromat needs four wavelengths
to match all other wavelengths and has four cone pigments. If a trichromat were to meet a tetrachromat,
would the tetrachromat think that the trichromat was
color deficient? How would the tetrachromat’s color vision be “better than” the trichromat’s? (p. 207)
2. When we discussed color deficiency, we noted the difficulty in determining the nature of a color deficient person’s color experience. Discuss how this is related to the
idea that color experience is a creation of our nervous
system. (p. 211)
3. When you walk from outside, which is illuminated by
sunlight, to inside, which is illuminated by tungsten
illumination, your perception of colors remains fairly
constant. But under some illuminations, such as streetlights called “sodium-vapor” lights that sometimes illuminate highways or parking lots, colors do seem to
change. Why do you think color constancy would hold
under some illuminations, but not others? (p. 218)
4. Look at the photograph in Figure 9.35. Are the edges
between the dark areas and the lighter areas reflectance
edges or illumination edges? What characteristics
of the dark area did you take into account in determining your answer? (Compare this picture to the one in
Figure 9.30.) (p. 221)
5. We have argued that the link between wavelength and
color is created by our nervous system. What if you met
a person whose nervous system was wired differently
than yours, so he experienced the entire spectrum as
“inverted,” as shown in Figure 9.36, with short wavelengths perceived as red and long wavelengths perceived
as blue? Can you think of a way to determine whether
this person’s perception of color is different from yours?
(p. 224)
6. The “Something to Consider” section pointed out that
properties of color, sound, taste, and smell are created
by the nervous system. Do you think the same thing
holds for perceptions of shape (“I see a square shape”)
or distance (“that person appears to be 10 feet away”)?
(Hint: Think about how you might determine the accuracy of a person’s color perception or taste perception,
and the accuracy of their shape perception or distance
perception.) (p. 224)
Figure 9.35 ❚ Are these shadows on the wall, or paintings of trees?
Bruce Goldstein
Figure 9.36 ❚ In this “inverted” spectrum, short wavelengths appear red and long wavelengths appear blue.
IF YOU WANT TO KNOW MORE
1. The adaptive nature of animal coloration. Some
birds have bright plumage to attract mates. This
could be dangerous if these colors also made these
birds more obvious to predators. It has been shown
that Swedish songbirds reflect light in the ultraviolet
area of the spectrum, which other songbirds can see.
However, these wavelengths are not very conspicuous
to potential predators. (p. 202)
Hastad, O., Victorsson, J., & Odeen, A. (2005). Differences in color vision make passerines less conspicuous in the eyes of their predators. Proceedings
of the National Academy of Sciences, 102, 6391–6394.
2. Color vision in animals. What does your cat or dog
see? Are there animals other than humans that have
trichromatic vision? Are there animals that have better color vision than humans? (p. 211)
Jacobs, G. H. (1993). The distribution and nature of
colour vision among the mammals. Biological Review, 68, 413–471.
Jacobs, G. H. (in press). Color vision in animals. In
E. B. Goldstein (Ed.), Sage encyclopedia of perception.
Thousand Oaks, CA: Sage.
Neitz, J., Geist, T., & Jacobs, G. H. (1989). Color vision in the dog. Visual Neuroscience, 3, 119–125.
Varela, F. J., Palacios, A. G., & Goldsmith, T. H.
(1993). Color vision of birds. In H. P. Zeigler &
H.-J. Bishof (Eds.), Vision, brain and behavior in birds
(pp. 77–98). Cambridge, MA: MIT Press.
3. The strength of opponent color mechanisms. The
strengths of the blue, yellow, red, and green mechanisms shown in Virtual Labs 10–12 were determined
using a psychophysical procedure. (p. 215)
Hurvich, L. (1981). Color vision. Sunderland, MA:
Sinauer Associates.
Hurvich, L. M., & Jameson, D. (1957). An opponent-process theory of color vision. Psychological Review,
64, 384–404.
4. Lightness perception in three-dimensional displays. At the end of the chapter, we saw that our perception of lightness depends on a number of things in
addition to the amount of light reflected from objects.
Figure 9.37 is another example of this because the intensity distributions are identical in both displays.
This display shows that surface curvature can affect
lightness displays. Other displays have been created
that show how lightness depends on the perception of
surface layout. (p. 224)
Knill, D. C., & Kersten, D. (1991). Apparent surface
curvature affects lightness perception. Nature, 351,
228–230.
Adelson, E. H. (1999). Light perception and lightness illusions. In M. Gazzaniga (Ed.), The new cognitive neurosciences (pp. 339–351). Cambridge, MA:
MIT Press.
Figure 9.37 ❚ The light distribution is identical for (a) and
(b), though it appears to be different. (Figure courtesy of
David Knill and Daniel Kersten.)
KEY TERMS
Achromatic color (p. 205)
Additive color mixture (p. 205)
Anomalous trichromat (p. 211)
Cerebral achromatopsia (p. 217)
Chromatic adaptation (p. 219)
Chromatic color (p. 204)
Color-blind (p. 212)
Color constancy (p. 218)
Color deficiency (p. 211)
Color-matching experiment (p. 207)
Desaturated (p. 204)
Deuteranopia (p. 212)
Dichromat (p. 211)
Hue (p. 204)
Illumination edge (p. 221)
Ishihara plates (p. 211)
Lightness (p. 220)
Lightness constancy (p. 220)
Memory color (p. 220)
Metamer (p. 208)
Metamerism (p. 208)
Monochromat (p. 211)
Neutral point (p. 212)
Opponent neurons (p. 215)
Opponent-process theory of color vision (p. 213)
Partial color constancy (p. 219)
Penumbra (p. 222)
Protanopia (p. 212)
Ratio principle (p. 221)
Reflectance (p. 220)
Reflectance curves (p. 204)
Reflectance edge (p. 221)
Saturation (p. 203)
Selective reflection (p. 204)
Selective transmission (p. 205)
Simultaneous color contrast (p. 214)
Subtractive color mixture (p. 206)
Trichromat (p. 211)
Trichromatic theory of color vision (p. 207)
Tritanopia (p. 213)
Unilateral dichromat (p. 212)
Young-Helmholtz theory of color vision (p. 207)
MEDIA RESOURCES
The Sensation and Perception Book Companion Website
www.cengage.com/psychology/goldstein
See the companion website for flashcards, practice quiz
questions, Internet links, updates, critical thinking exercises, discussion forums, games, and more!
CengageNOW
www.cengage.com/cengagenow
Go to this site for the link to CengageNOW, your one-stop
shop. Take a pre-test for this chapter, and CengageNOW
will generate a personalized study plan based on your test
results. The study plan will identify the topics you need to
review and direct you to online resources to help you master those topics. You can then take a post-test to help you
determine the concepts you have mastered and what you
will still need to work on.
Virtual Lab
Your Virtual Lab is designed to help you get the most out
of this course. The Virtual Lab icons direct you to specific
media demonstrations and experiments designed to help
you visualize what you are reading about. The number
beside each icon indicates the number of the media element
you can access through your CD-ROM, CengageNOW, or
WebTutor resource.
The following lab exercises are related to the material
in this chapter:
1. Color Mixing Mixing colored lights. (Ignore the “Color
Space” on the right).
2. Cone Response Profiles and Hue How the relative response
of each type of cone changes across the visible spectrum.
3. Cone Response Profiles and Perceived Color Relative cone
responses for colors arranged in the color circle.
4. Color Arrangement Test A color vision test that involves
placing colors that appear similar next to each other.
5. Rod Monochromacy How the spectrum appears to a rod
monochromat.
6. Dichromacy How removing one type of cone affects
color perception.
7. Missing Blue–Yellow Channel Which colors are most likely
to be confused by a tritanope?
8. “Oh Say Can You See” Afterimage Demonstration An
American flag afterimage that illustrates the opponent
nature of afterimages.
9. Mixing Complementary Colors How mixing blue and yellow, and red and green, results in gray when mixed in the
correct proportions.
10. Strength of Blue–Yellow Mechanisms The strength of
blue and yellow components of the blue–yellow opponent
mechanism across the spectrum.
11. Strength of Red–Green Mechanism The strength of the
red and green components of the red–green opponent
mechanism across the spectrum.
12. Opponent-Process Coding of Hue The strengths of opponent mechanisms across the spectrum (combining the
blue–yellow and red–green demonstrations).
13. Checker-Shadow Illusion How interpretation of a display
as a three-dimensional scene can affect our judgment of
the lightness of a surface. (Courtesy of Michael Bach.)
14. Corrugated Plaid Illusion 1 Another demonstration of
how interpretation of a display as three-dimensional can
affect our perception of lightness. (Courtesy of Edward
Adelson.)
15. Corrugated Plaid Illusion 2 Another version of this illusion. (Courtesy of Michael Bach.)
16. Impossible Steps How the three-dimensional interpretation of a display can change a reflectance edge into an
illumination edge. (Courtesy of Edward Adelson.)
17. Troxler Effect How vision fades when contours are
blurred.
CHAPTER 10
Perceiving Depth and Size
Chapter Contents
OCULOMOTOR CUES
DEMONSTRATION: Feelings in Your Eyes
MONOCULAR CUES
Pictorial Cues
Motion-Produced Cues
DEMONSTRATION: Deletion and Accretion
BINOCULAR DEPTH INFORMATION
Binocular Disparity
DEMONSTRATION: Two Eyes: Two Viewpoints
Connecting Disparity Information and the Perception of Depth
DEMONSTRATION: Binocular Depth From a Picture, Without a Stereoscope
The Correspondence Problem
DEPTH INFORMATION ACROSS SPECIES
THE PHYSIOLOGY OF DEPTH PERCEPTION
Neurons That Respond to Pictorial Depth
Neurons That Respond to Binocular Disparity
Connecting Binocular Depth Cells and Depth Perception
❚ TEST YOURSELF 10.1
PERCEIVING SIZE
The Holway and Boring Experiment
Size Constancy
DEMONSTRATION: Perceiving Size at a Distance
DEMONSTRATION: Size-Distance Scaling and Emmert’s Law
VISUAL ILLUSIONS
The Müller-Lyer Illusion
DEMONSTRATION: Measuring the Müller-Lyer Illusion
DEMONSTRATION: The Müller-Lyer Illusion With Books
The Ponzo Illusion
The Ames Room
The Moon Illusion
SOMETHING TO CONSIDER: DISTANCE PERCEPTION AND PERCEIVED EFFORT
❚ TEST YOURSELF 10.2
Think About It
If You Want to Know More
Key Terms
Media Resources
OPPOSITE PAGE This scene near the California coast illustrates how
the sizes of objects relative to one another can provide information
about an object’s size. The size of the house in the lower part of the
picture indicates that the surrounding trees are extremely tall. The
sizes of objects in the field of view can also provide information about
depth. The smallness of the trees on the top of the hill suggests that
the hill is far away.
Bruce Goldstein
VIRTUAL LAB The Virtual Lab (VL) icons direct you to specific animations and videos
designed to help you visualize what you are reading about. The number beside
each icon indicates the number of the clip you can access through your
CD-ROM or your student website.
Some Questions We Will Consider:
❚ How can we see far into the distance based on the flat
image on the retina? (p. 230)
❚ Why do we see depth better with two eyes than with one
eye? (p. 235)
❚ Why don’t people appear to shrink in size when they
walk away? (p. 244)
You can easily tell that this book is about 18 inches
away and, when you look up at the scene around you,
that other objects are located at distances ranging from
your nose (very close!) to across the room, down the street,
or even as far as the horizon, depending on where you are.
What’s amazing about this ability to see the distances of
objects in your environment is that your perception of
these objects, and the scene as a whole, is based on a two-dimensional image on your retina.
We can begin to appreciate the problem of perceiving
depth based on two-dimensional information on the retina
by focusing on two points on the retina, N and F, shown in
Figure 10.1. These points represent where rays of light have
been reflected onto the retina from the tree, which is near
(N) and the house, which is farther away (F). If we look just
at these places on the retina, we have no way of knowing
how far the light has traveled to reach points N and F. For
all we know, the light stimulating either point on the retina
could have come from 1 foot away or from a distant star.
Clearly, we need to expand our view beyond single points on
the retina to determine where objects are located in space.
When we expand our view from two isolated points to
the entire retinal image, we increase the amount of information available to us because now we can see the images
of the house and the tree. However, because this image is
two-dimensional, we still need to explain how we get from
the flat image on the retina to the three-dimensional perception of the scene. One way researchers have approached
this problem is to ask what information is contained in
this two-dimensional image that enables us to perceive
depth in the scene. This is called the cue approach to depth
perception.
The cue approach to depth perception focuses on
identifying information in the retinal image that is correlated with depth in the scene. For example, when one object
partially covers another object, as the tree in the foreground
in Figure 10.1 covers part of the house, the object that is
partially covered must be at a greater distance than the
object that is covering it. This situation, which is called
occlusion, is a signal, or cue, that one object is in front of
another. According to cue theory, we learn the connection
between this cue and depth through our previous experience with the environment. After this learning has occurred, the association between particular cues and depth
becomes automatic, and when these depth cues are present, we experience the world in three dimensions. A number
of different types of cues that signal depth in a scene have
been identified. We can divide these cues into three major
groups:
1. Oculomotor. Cues based on our ability to sense the position of our eyes and the tension in our eye muscles.
2. Monocular. Cues that work with one eye.
3. Binocular. Cues that depend on two eyes.
Figure 10.1 ❚ (a) The house is farther away than the
tree, but (b) the images of points F on the house and N on
the tree both fall on the two-dimensional surface of the
retina, so (c) these two points, considered by themselves,
do not tell us the distances to the house and the tree.
Oculomotor Cues
The oculomotor cues are created by (1) convergence, the
inward movement of the eyes that occurs when we look at
nearby objects, and (2) accommodation, the change in the
shape of the lens that occurs when we focus on objects at various distances. The idea behind these cues is that we can feel
the inward movement of the eyes that occurs when the eyes
converge to look at nearby objects, and we feel the tightening of eye muscles that change the shape of the lens to focus
on a nearby object. You can experience the feelings in your
eyes associated with convergence and accommodation
by doing the following demonstration. (VL 1)
DEMONSTRATION: Feelings in Your Eyes
Look at your finger as you hold it at arm’s length. Then, as
you slowly move your finger toward your nose, notice how
you feel your eyes looking inward and become aware of the
increasing tension inside your eyes.
The feelings you experience as you move your finger
closer are caused by (1) the change in convergence angle
as your eye muscles cause your eyes to look inward, as in
Figure 10.2a, and (2) the change in the shape of the lens as
the eye accommodates to focus on a near object (Figure 3.3).
If you move your finger farther away, the lens flattens, and
your eyes move away from the nose until they are both looking straight ahead, as in Figure 10.2b. Convergence and accommodation indicate when an object is close and are useful up to a distance of about arm’s length, with convergence
being the more effective of the two (Cutting & Vishton, 1995;
Mon-Williams & Tresilian, 1999; Tresilian et al., 1999).
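The geometry behind convergence can be sketched numerically. The following is a minimal illustration, not a model taken from the studies cited above. It assumes the fixated object lies straight ahead on the midline and uses an eye separation of 6 cm, the approximate adult value this chapter gives when it introduces binocular disparity. The function name `convergence_angle_deg` and the sample distances are hypothetical.

```python
import math


def convergence_angle_deg(distance_cm, interocular_cm=6.0):
    """Angle (degrees) between the two lines of sight when both eyes
    fixate a point straight ahead at distance_cm.

    Assumption: the point lies on the midline between the eyes, so each
    eye rotates inward by atan((interocular / 2) / distance).
    """
    if distance_cm <= 0:
        raise ValueError("distance must be positive")
    half_separation = interocular_cm / 2.0
    return math.degrees(2.0 * math.atan(half_separation / distance_cm))


if __name__ == "__main__":
    # Hypothetical viewing distances, from very near to 10 meters.
    for d in (10, 25, 60, 200, 1000):
        print(f"{d:5d} cm -> {convergence_angle_deg(d):6.2f} deg")
```

Under these assumptions the angle changes steeply within about arm's length and only slightly beyond a few meters, which is consistent with convergence being informative mainly at close range.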
Monocular Cues
Monocular cues work with only one eye. They include accommodation, which we have described under oculomotor
cues; pictorial cues, which are sources of depth information that can
be depicted in a two-dimensional picture; and movement-based cues, which are based on depth information created
by movement.
Pictorial Cues
Pictorial cues are sources of depth information that can be
depicted in a picture, such as the illustrations in this book
or the image on the retina (Goldstein, 2001).
Occlusion We have already described the depth cue of
occlusion. Occlusion occurs when one object hides or partially hides another from view. The partially hidden object
is seen as being farther away, so the mountains in Figure
10.3 are perceived as being farther away than the hill. Note
that occlusion does not provide information about an object’s absolute distance; it only indicates relative distance.
We know that the object that is partially covered is farther
away than another object, but from occlusion alone we can’t
tell how much farther.
Relative Height According to the cue of relative
height, objects that are below the horizon and have their
bases higher in the field of view are usually seen as being
more distant. Notice how this applies to the two motorcycles in Figure 10.3. The base of the far motorcycle (where its
tires touch the road) is higher in the picture than the base
of the near motorcycle. When objects are above the horizon,
like the clouds, being lower in the field of view indicates
more distance. There is also a connection between an observer’s gaze and distance. Looking straight out at an object
high in the visual field, near the horizon, indicates greater
depth than looking down, as you would for an object lower
in the visual field (Ooi et al., 2001).
Relative Size According to the cue of relative size,
when two objects are of equal size, the one that is farther
away will take up less of your field of view than the one that
is closer. This cue depends, to some extent, on a person’s
knowledge of physical sizes—for example, that the two telephone poles in Figure 10.3 are about the same size, as are
the two motorcycles.
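The relative size cue follows from the geometry of visual angle: an object of fixed physical size subtends a smaller angle the farther away it is. The sketch below is only an illustration of that relationship. The pole height and the two distances are hypothetical values, and `visual_angle_deg` is a name chosen here, not one used in the text.

```python
import math


def visual_angle_deg(size_m, distance_m):
    """Visual angle (degrees) subtended by an object of height size_m
    viewed head-on from distance_m."""
    if size_m <= 0 or distance_m <= 0:
        raise ValueError("size and distance must be positive")
    return math.degrees(2.0 * math.atan(size_m / (2.0 * distance_m)))


if __name__ == "__main__":
    pole_height_m = 10.0          # hypothetical telephone pole
    near_m, far_m = 20.0, 80.0    # hypothetical distances
    near_angle = visual_angle_deg(pole_height_m, near_m)
    far_angle = visual_angle_deg(pole_height_m, far_m)
    print(f"near pole: {near_angle:.2f} deg, far pole: {far_angle:.2f} deg")
```

Because both poles are assumed to have the same physical height, the smaller angle can be attributed to greater distance, which is the inference the relative size cue relies on.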
Perspective Convergence When parallel lines
extend out from an observer, they are perceived as converging (becoming closer together) as distance increases. This
perceptual coming-together of parallel lines, which is illustrated by the road in Figure 10.3, is called perspective
convergence.
Figure 10.2 ❚ (a) Convergence of the eyes occurs
when a person looks at something that is very close.
(b) The eyes look straight ahead when the person
observes something that is far away.
Figure 10.3 ❚ A scene in Tucson, Arizona,
containing a number of depth cues: occlusion
(the cactus occludes the hill, which occludes the
mountain); perspective convergence (the sides
of the road converge in the distance); relative
size (the far motorcycle and telephone pole are
smaller than the near ones); and relative height
(the far motorcycle is higher in the field of view;
the far cloud is lower).
Bruce Goldstein
Familiar Size We use the cue of familiar size when we
judge distance based on our prior knowledge of the sizes of
objects. We can apply this idea to the coins in Figure 10.4.
If you are influenced by your knowledge of the actual size
of dimes, quarters, and half-dollars, you would probably
say that the dime is closer than the quarter. An experiment
by William Epstein (1965) shows that under certain conditions, our knowledge of an object’s size influences our perception of that object’s distance. The stimuli in Epstein’s
experiment were equal-sized photographs of a dime, a quarter, and a half-dollar, which were positioned the same distance from an observer. By placing these photographs in a
darkened room, illuminating them with a spot of light, and
having subjects view them with one eye, Epstein created the
illusion that these pictures were real coins.
Figure 10.4 ❚ Drawings of the stimuli used in Epstein’s
(1965) familiar-size experiment. The actual stimuli were
photographs that were all the same size as a real quarter.
When the observers judged the distance of each of the
coin photographs, they estimated that the dime was closest, the quarter was farther than the dime, and the half-dollar was the farthest of them all. The observers’ judgments
were influenced by their knowledge of the sizes of real
dimes, quarters, and half-dollars. This result did not occur,
however, when the observers viewed the scene with both
eyes, because the use of two eyes provided information indicating the coins were at the same distance. The cue of familiar size is therefore most effective when other information
about depth is absent (see also Coltheart, 1970; Schiffman,
1967).
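The direction of the observers' judgments is what simple geometry predicts if a viewer treats each equal-sized photograph as a real coin. The sketch below is a geometric idealization, not Epstein's analysis. The coin diameters are nominal U.S. coin sizes supplied here as an assumption, the 1.5 m viewing distance is hypothetical, and `inferred_distance_m` is an illustrative name.

```python
import math

# Assumed nominal diameters of U.S. coins in millimeters (not from the text).
COIN_DIAMETER_MM = {"dime": 17.91, "quarter": 24.26, "half-dollar": 30.61}


def inferred_distance_m(familiar_mm, image_mm, true_distance_m):
    """Distance at which an object of familiar size familiar_mm would
    subtend the same visual angle as a photograph of size image_mm
    viewed from true_distance_m."""
    theta = 2.0 * math.atan((image_mm / 1000.0) / (2.0 * true_distance_m))
    return (familiar_mm / 1000.0) / (2.0 * math.tan(theta / 2.0))


if __name__ == "__main__":
    photo_mm = COIN_DIAMETER_MM["quarter"]  # all photos are quarter-sized
    viewing_distance_m = 1.5                # hypothetical
    for coin, diameter in COIN_DIAMETER_MM.items():
        d = inferred_distance_m(diameter, photo_mm, viewing_distance_m)
        print(f"{coin:12s} inferred at {d:.2f} m")
```

With these assumptions the dime photograph is placed nearest and the half-dollar farthest, the same ordering the observers reported under one-eyed viewing.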
Atmospheric Perspective Atmospheric perspective occurs when more distant objects appear less sharp and
often have a slight blue tint. The farther away an object is,
the more air and particles (dust, water droplets, airborne
pollution) we have to look through, making objects that are
farther away look less sharp and bluer than close objects.
Figure 10.5 illustrates atmospheric perspective. The details
in the foreground are sharp and well defined, but as we look
out at the rocks, details become less and less visible as we
look farther into the distance.
If, instead of viewing these hills, you were standing on
the moon, where there is no atmosphere, and hence no atmospheric perspective, far craters would look just as clear
as near ones. But on Earth, there is atmospheric perspective,
with the exact amount depending on the nature of the atmosphere. An example of how atmospheric perspective depends on the nature of the atmosphere occurred when one
of my friends took a trip from Philadelphia to Montana. He
started walking toward a mountain that appeared to be perhaps a two- or three-hour hike away but found after three
hours of hiking that he was still far from the mountain. Because my friend’s perceptions were “calibrated” for Philadelphia, he found it difficult to accurately estimate distances in
the clearer air of Montana, so a mountain that would have
looked three hours away in Philadelphia was more than six
hours away in Montana!
Figure 10.5 ❚ A scene on the coast of Maine showing the
effect of atmospheric perspective.
Bruce Goldstein
Texture Gradient Another source of depth information is the texture gradient: Elements that are equally
spaced in a scene appear to be more closely packed as distance increases, as with the textured ground in the scene in
Figure 10.6. Remember that according to the cue of relative
size, more distant objects take up less of our field of view.
This is exactly what happens to the faraway elements in the
texture gradient.
Figure 10.6 ❚ A texture gradient in Death Valley, California.
Bruce Goldstein
Shadows Shadows that are associated with objects
can provide information regarding the locations of these
objects. Consider, for example, Figure 10.7a, which shows
seven spheres and a checkerboard. In this picture, the location of the spheres relative to the checkerboard is unclear.
They could be resting on the surface of the checkerboard,
or floating above it. But adding shadows, as shown in
Figure 10.7b, makes the spheres’ locations clear: the ones
on the left are resting on the checkerboard, and the ones
on the right are floating above it. This illustrates how shadows can help determine the location of objects (Mamassian
et al., 1998).
Shadows also enhance the three-dimensionality of objects. For example, shadows make the circles in Figure 10.7
appear spherical, and help define some of the contours in
the mountains in Figure 10.3. In the middle of the day,
when the sun is directly overhead and there are no
shadows, the mountains appear almost flat. (VL 2)
Figure 10.7 ❚ (a) Where are the spheres located in relation
to the checkerboard? (b) Adding shadows makes their
location clear. (Courtesy of Pascal Mamassian.)
Motion-Produced Cues
All of the cues we have described so far work if the observer
is stationary. If, however, we decide to take a walk, new cues
emerge that further enhance our perception of depth. We
will describe two different motion-produced cues: (1) motion parallax and (2) deletion and accretion.
Motion Parallax Motion parallax occurs when, as
we move, nearby objects appear to glide rapidly past us, but
more distant objects appear to move more slowly. Thus,
when you look out the side window of a moving car or train,
nearby objects appear to speed by in a blur, whereas objects
on the horizon may appear to be moving only slightly.
We can understand why motion parallax occurs by noting how the image of a near object (the tree in Figure 10.8a)
and a far object (the house in Figure 10.8b) move across the
retina as the eye moves from position 1 to position 2. First
let’s consider the tree: Figure 10.8a shows that when the eye
moves to position 2, the tree’s image moves all the way across
the retina from T1 to T2, as indicated by the dashed arrow.
Figure 10.8b shows that the house’s image moves a shorter
distance, from H1 to H2. Because the image of the near object travels a large distance across the retina, it appears to
move rapidly as the observer moves. The image of the far object travels a much smaller distance across the retina, so it
appears to move more slowly as the observer moves.
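The difference in image movement can be made concrete with a small calculation. The sketch below is a simplified illustration: it assumes each object starts straight ahead of the eye and that the eye translates sideways without rotating, so the change in the object's direction stands in for how far its image travels across the retina. The distances and the size of the eye movement are hypothetical, and `angular_shift_deg` is a name introduced here.

```python
import math


def angular_shift_deg(eye_shift_m, object_distance_m):
    """Change in direction (degrees) of an object that starts straight
    ahead when the eye moves sideways by eye_shift_m without rotating."""
    if object_distance_m <= 0:
        raise ValueError("distance must be positive")
    return math.degrees(math.atan(eye_shift_m / object_distance_m))


if __name__ == "__main__":
    eye_shift_m = 0.5        # hypothetical move from position 1 to position 2
    tree_distance_m = 2.0    # hypothetical nearby tree
    house_distance_m = 50.0  # hypothetical faraway house
    tree_shift = angular_shift_deg(eye_shift_m, tree_distance_m)
    house_shift = angular_shift_deg(eye_shift_m, house_distance_m)
    print(f"tree image shifts {tree_shift:.2f} deg")
    print(f"house image shifts {house_shift:.2f} deg")
```

For the same eye movement, the near object's direction changes by many degrees and the far object's by a fraction of a degree, which matches the large movement from T1 to T2 and the small movement from H1 to H2 described above.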
Figure 10.8 ❚ Eye moving past (a) a nearby tree; (b) a faraway house. Notice how the image of
the tree moves farther on the retina than the image of the house.
Motion parallax is one of the most important sources
of depth information for many animals. The information
provided by motion parallax has also been used to enable
human-designed mechanical robots to determine how far
they are from obstacles as they navigate through the environment (Srinivasan & Venkatesh, 1997). Motion parallax
is also widely used to create an impression of depth in cartoons and video games.
Deletion and Accretion As an observer moves sideways, some things become covered, and others become uncovered. Try the following demonstration.
DEMONSTRATION: Deletion and Accretion
Close one eye. Position your hands out as shown in Figure
10.9, so your right hand is at arm’s length and your left hand
at about half that distance, just to the left of the right hand.
Then as you look at your right hand, move your head sideways
to the left and then back again, keeping your hands still. As
you move, your left hand will appear to move back and forth,
covering and uncovering your right hand. Covering the right
hand is deletion. Uncovering is accretion. ❚
Figure 10.9 ❚ Position of the hands for deletion and
accretion demonstration.
Bruce Goldstein
Deletion and accretion are related to both motion parallax and overlap because they occur when overlapping surfaces appear to move relative to one another. They are especially effective for detecting the differences in the depths of
two surfaces (Kaplan, 1969).
Our discussion so far has described a number of the
cues that contribute to our perception of depth. As shown
in Table 10.1, these cues work over different distances, some
only at close range (accommodation and convergence), some
at close and medium ranges (motion parallax), some at long
range (atmospheric perspective), and some at the whole
range of depth perception (occlusion and relative size; Cutting & Vishton, 1995). For example, we can appreciate how
occlusion operates over a wide range of distances by noticing how this cue works over a distance of a few inches for
the cactus flower in Figure 10.10a, and over a distance of
many miles for the scene in Figure 10.10b.
Binocular Depth Information
In addition to the cues we have described so far, there is one
other important source of depth information—the differences in the images received by our two eyes. Because our
eyes view the world from positions that are about 6 cm apart
in the average adult, this difference in the viewpoint of the
two eyes creates the cue of binocular disparity.
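The size of the difference between the two eyes' views can be estimated with a short calculation. The sketch below is a simplified illustration rather than a definition from the text: it places a near point and a far point on the midline, uses the 6 cm eye separation mentioned above, and takes the difference between the angles the two points subtend at the eyes as a measure of how differently they are positioned in the two images. The function names and the sample distances (a finger at about 15 cm, roughly 6 inches, and an object at 3 m) are hypothetical.

```python
import math


def vergence_deg(distance_cm, interocular_cm=6.0):
    """Angle (degrees) subtended at a midline point by the two eyes."""
    if distance_cm <= 0:
        raise ValueError("distance must be positive")
    return math.degrees(2.0 * math.atan(interocular_cm / (2.0 * distance_cm)))


def relative_disparity_deg(near_cm, far_cm, interocular_cm=6.0):
    """Difference in vergence angle between a near and a far midline
    point; zero when both points are at the same distance."""
    return vergence_deg(near_cm, interocular_cm) - vergence_deg(far_cm, interocular_cm)


if __name__ == "__main__":
    finger_cm = 15.24      # about 6 inches
    far_object_cm = 300.0  # hypothetical distant object
    shift = relative_disparity_deg(finger_cm, far_object_cm)
    print(f"finger vs. far object: {shift:.1f} deg")
    print(f"same distance: {relative_disparity_deg(100.0, 100.0):.1f} deg")
```

Under these assumptions the difference is large when one object is very near and the other far, and it shrinks to zero as the two objects approach the same distance.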
Binocular Disparity
Binocular disparity is the difference in the images in the
left and right eyes. The following demonstration illustrates
this difference.
D E M O N S T R AT I O N
Two Eyes: Two Viewpoints
Close your right eye. Hold your finger vertically about 6 inches in front of you and position it so it is partially covering an object in the distance. Look directly at the distant object with your left eye, then close your left eye and look directly at the distant object with your right eye. When you switch eyes, how does the position of your finger change relative to the far object? ❚

When you switched from looking with your left eye to your right, you probably noticed that your finger appeared to move to the left relative to the far object. Figure 10.11 diagrams what happened on your retinas. The green line in Figure 10.11a shows that when the left eye was open, the images of the finger and far object both fell on the same place on the retina. This occurred because you were looking right at both objects, so their images would fall on the foveas. The green lines in Figure 10.11b show that when the right eye was open, the image of the far object still fell on the fovea because you were looking at it, but the image of the finger was now off to the side.

TABLE 10.1 ❚ Range of Effectiveness of Different Depth Cues

DEPTH INFORMATION                0–2 METERS    2–20 METERS    ABOVE 30 METERS
Occlusion                            ✓              ✓               ✓
Relative size                        ✓              ✓               ✓
Accommodation and convergence        ✓
Motion parallax                      ✓              ✓
Relative height                                     ✓               ✓
Atmospheric perspective                                             ✓

Source: Based on Cutting & Vishton, 1995.

Figure 10.10 ❚ (a) Occlusion operating on a small scale: the flower near the center occludes the cactus, so the flower appears closer. (b) Occlusion operating on a larger scale: The green shrubbery occludes the river; the buildings in Pittsburgh occlude one another; the city occludes the hills in the far distance. Occlusion indicates only that one object is closer than another object. What other depth cues make us aware of the actual distances in this scene?
Bruce Goldstein

Figure 10.11 ❚ Location of images on the retina for the “Two Eyes: Two Viewpoints” demonstration. See text for explanation.

The difference between the images in the left and right eyes shown in Figure 10.11 creates binocular disparity. To describe how disparity works, we need to introduce the idea of corresponding retinal points—the places on each retina that would overlap if one retina could be slid on top of the other. In Figure 10.12, we see that the two foveas, marked F, fall on corresponding points, and that the two A’s and the two B’s also fall on corresponding points.

Figure 10.12 ❚ Corresponding points on the two retinas. To determine corresponding points, imagine that one eye is slid on top of the other one.

To take the idea of corresponding points into the real world, let’s consider the lifeguard in Figure 10.13a, who is looking directly at Frieda. The dashed line that passes through Harry, Frieda, and Susan is part of the horopter, which is an imaginary surface that passes through the point of fixation and indicates the location of objects that fall on corresponding points on the two retinas. In this example, Frieda is the point of fixation because the lifeguard is looking directly at her, and so her image falls on the foveas, which are corresponding points, indicated by F in Figure 10.13b. Because Harry and Susan are also on the horopter, their images, indicated by H and S, also fall on corresponding points.

Figure 10.14 shows where Carole’s image falls on the lifeguard’s retinas when he is looking at Frieda. Frieda’s image falls on corresponding points FL and FR. Carole’s images fall on noncorresponding points CL in the left
eye and CR in the right eye. Note that if you slid the retinas on top of each other, point CL would not overlap with point CR. The difference between where Carole’s image falls on the right eye (CR) and the corresponding point is called the angle of disparity. Carole’s angle of disparity, which in this example is about 26 degrees, is the absolute angle of disparity, or simply the absolute disparity for Carole’s image when the lifeguard is looking at Frieda.

Figure 10.13 ❚ (a) When the lifeguard looks at Frieda, the images of Frieda, Susan, and Harry fall on corresponding points on the lifeguard’s retinas. (b) The locations of the images of Susan, Frieda, and Harry on the lifeguard’s retinas.

Figure 10.14 ❚ The location of the images of Frieda and Carole in the lifeguard’s eyes when the lifeguard is looking at Frieda. Because Carole is not located on the horopter, her images fall on noncorresponding points. The absolute angle of disparity is the angle between the point on the right eye that corresponds to Carole’s image on the left eye (CL), and the point where the image actually falls (CR).

Figure 10.15 ❚ The location of the images of Frieda and Carole in the lifeguard’s eyes when the lifeguard is looking at Carole. Because Frieda is not located on the horopter, her images fall on noncorresponding points. The absolute angle of disparity is the angle between the point on the right eye that corresponds to Frieda’s image on the left eye (FL), and the point where the image actually falls (FR).
Absolute disparity is important because it provides information about the distances of objects. The amount of absolute disparity indicates how far an object is from the horopter. Greater disparity is associated with greater distance
from the horopter. Thus, if Carole were to swim toward the
lifeguard while the lifeguard kept looking at Frieda, the
angle of disparity of Carole’s image on the lifeguard’s retina would increase. (Notice that as Carole approaches, the
dashed red lines in Figure 10.14 would move outward, creating greater disparity.)
One of the properties of absolute disparity is that it
changes every time the observer changes where he or she is
looking. For example, if the lifeguard decided to shift his
fixation from Frieda to Carole, as shown in Figure 10.15, the
absolute disparity for Carole’s images at CL and CR would
become zero, because they would fall on the lifeguard’s foveas. But Frieda’s images are no longer on corresponding
points, and when we determine the disparity of her images,
it turns out to be about 26 degrees.1
What this means is that the absolute disparity of every
object in an observer’s visual field is constantly changing as
the observer looks around. When we consider that a person
makes as many as 3 fixations per second when scanning a
scene and that every new fixation establishes a new horopter, this means that the absolute disparities for every object
in a scene have to be constantly recalculated.
There is, however, disparity information that remains
the same no matter where an observer looks. This information is called relative disparity—the difference between two
objects’ absolute disparities. We can see how this works by
comparing the situations in Figures 10.14 and 10.15. We saw
in Figure 10.14 that when the lifeguard is looking at Frieda,
her absolute disparity is zero, and Carole’s is about 26 degrees. The relative disparity for Carole and Frieda is therefore 26 degrees (the difference between 0 and 26 degrees).
1 The disparities in the real world are much smaller than the large disparities in these pictures, because in the environment, objects are much farther away relative to the spacing between the eyes.

When the lifeguard shifts his fixation to Carole, as shown in Figure 10.15, her absolute disparity becomes 0 degrees, and Frieda’s becomes about 26 degrees. As before, the relative disparity is 26 degrees. Although both Carole’s and Frieda’s absolute disparities changed when the lifeguard shifted his fixation from Frieda to Carole, the difference between them remained the same. The same thing happens for all objects in the environment. As long as the objects stay in the same position relative to an observer, the difference in their disparities remains the same, no matter where the observer looks. Thus, relative disparity, which remains constant, offers an advantage over absolute disparity, which changes as a person looks around. As we will see below, there is evidence that both absolute and relative disparity information is represented by neural activity in the visual system.
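The constancy of relative disparity can be checked numerically. The following is a minimal sketch under stated assumptions, not a procedure taken from the text: it uses a top-down view with the eyes 6.5 cm apart, treats absolute disparity as the difference between the vergence angle of an object and the vergence angle of the fixation point (a common geometric formulation, assumed here), and places Frieda and Carole at hypothetical distances. All function names are illustrative.

```python
import math

EYE_SEPARATION_M = 0.065  # assumed spacing between the eyes, in meters


def vergence_angle(point, eye_sep=EYE_SEPARATION_M):
    """Angle in degrees between the two eyes' lines of sight to point (x, z)."""
    x, z = point
    left_eye_angle = math.atan2(x + eye_sep / 2, z)   # left eye sits at x = -eye_sep/2
    right_eye_angle = math.atan2(x - eye_sep / 2, z)  # right eye sits at x = +eye_sep/2
    return math.degrees(left_eye_angle - right_eye_angle)


def absolute_disparity(point, fixation, eye_sep=EYE_SEPARATION_M):
    """Signed disparity of `point` while the observer fixates `fixation`."""
    return vergence_angle(point, eye_sep) - vergence_angle(fixation, eye_sep)


def relative_disparity(p, q, fixation, eye_sep=EYE_SEPARATION_M):
    """Difference between the absolute disparities of p and q."""
    return (absolute_disparity(p, fixation, eye_sep)
            - absolute_disparity(q, fixation, eye_sep))


# Hypothetical positions (x, z) in meters, straight ahead of the lifeguard.
frieda = (0.0, 10.0)
carole = (0.0, 5.0)

for name, fixation in (("Frieda", frieda), ("Carole", carole)):
    print(
        f"fixating {name}: "
        f"Frieda {absolute_disparity(frieda, fixation):+.3f} deg, "
        f"Carole {absolute_disparity(carole, fixation):+.3f} deg, "
        f"relative {relative_disparity(carole, frieda, fixation):+.3f} deg"
    )
```

In this sketch the two absolute disparities change when fixation moves from Frieda to Carole, while the relative disparity printed on each line stays the same. With realistic distances the values come out far smaller than the 26 degrees used in the figures, which is consistent with the footnote about real-world disparities.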
Connecting Disparity Information and the Perception of Depth

We have seen that both absolute and relative disparity information contained in the images on the retinas provides information indicating an object’s distance from an observer. Notice, however, that our description of disparity has focused on geometry—where an object’s images fall on the retina—but has not mentioned perception, the observer’s experience of an object’s depth or its relation to other objects in the environment. We now consider the relationship between disparity and what observers perceive. To do this we introduce stereopsis—the impression of depth that results from information provided by binocular disparity.

An example of stereopsis is provided by the depth effect achieved by the stereoscope, a device introduced by the physicist Charles Wheatstone (1802–1875), which produces a convincing illusion of depth by using two slightly different pictures. This device, extremely popular in the 1800s and reintroduced as the View Master in the 1940s, presents two photographs that are made with a camera with two lenses separated by the same distance as the eyes. The result is two slightly different views, like those shown in Figure 10.16. The stereoscope presents the left picture to the left eye and the right picture to the right eye. This creates the same binocular disparity that occurs when a person views the scene naturally, so that slightly different images appear in the left and right eyes. In this next demonstration, the binocular disparity created by two pictures creates a perception of depth.

Figure 10.16 ❚ The two images of a stereoscopic photograph: (a) left eye image; (b) right eye image. The difference between the two images, such as the distances between the front cactus and the window in the two views, creates retinal disparity. This creates a perception of depth when the left image is viewed by the left eye and the right image is viewed by the right eye.
Bruce Goldstein

DEMONSTRATION
Binocular Depth From a Picture, Without a Stereoscope
Place a 4 × 6 card vertically, long side up, between the stairs in Figure 10.17, and place your nose against the card so that you are seeing the left-hand drawing with just your left eye and the right-hand drawing with just your right eye. (Blink back and forth to confirm this separation.) Then relax and wait for the two drawings to merge. When the drawings form a single image, you should see the stairs in depth, just as you would if you looked at them through a stereoscope. ❚

Figure 10.17 ❚ See text for instructions for viewing these stairs.

The principle behind the stereoscope is also used in 3-D movies. The left-eye and right-eye images are presented simultaneously on the screen, slightly displaced from one another, to create disparity. These images can be presented separately to the left and right eyes by coloring one red and the other green and viewing the film through glasses with a red filter for one eye and a green filter for the other eye (Figure 10.18). Another way to separate the images is to create the left and right images from polarized light—light waves that vibrate in only one direction. One image is polarized so its vibration is vertical, and the other is polarized so its vibration is horizontal. Viewing the film through polarizing lenses, which let vertically polarized light into one eye and horizontally polarized light into the other eye, creates the disparity that results in three-dimensional perception.

Figure 10.18 ❚ A scene in a movie theater in the 1950s, when three-dimensional movies were first introduced. The glasses create different images in the left and right eyes, and the resulting disparity leads to a convincing impression of depth.
© Bettmann/CORBIS

Our conclusion that disparity creates stereopsis seems to be supported by the demonstration above, which shows that we perceive depth when two slightly displaced views are presented to the left and right eyes. However, this demonstration alone doesn’t prove that disparity creates a perception of depth because images such as those in Figure 10.16 also contain potential depth cues, such as occlusion and relative height, which could contribute to our perception of depth. In order to show that disparity alone can result in depth perception, Bela Julesz (1971) created a stimulus called the random-dot stereogram, which contained no pictorial cues.

By creating stereoscopic images of random-dot patterns, Julesz showed that observers can perceive depth in displays that contain no depth information other than disparity. Two such random-dot patterns, which constitute a random-dot stereogram, are shown in Figure 10.19. These patterns were constructed by first generating two identical random-dot patterns on a computer and then shifting a square-shaped section of the dots one or more units to the side. In the stereogram in Figure 10.19a, a section of dots on the right pattern has been shifted one unit to the right. This shift is too subtle to be seen in these dot patterns, but we can understand how it is accomplished by looking at the diagrams below the dot patterns (Figure 10.19b). In these
diagrams, the black dots are indicated by 0’s, A’s, and X’s and the white dots by 1’s, B’s, and Y’s. The A’s and B’s indicate the square-shaped section where the shift is made in the pattern. Notice that the A’s and B’s are shifted one unit to the right in the right-hand pattern. The X’s and Y’s indicate areas uncovered by the shift that must be filled in with new black dots and white dots to complete the pattern.

Figure 10.19 ❚ (a) A random-dot stereogram. (b) The principle for constructing the stereogram. See text for explanation.
The effect of shifting one section of the pattern in this
way is to create disparity. When the two patterns are presented simultaneously to the left and the right eyes in a stereoscope, observers perceive a small square floating above
the background. Because binocular disparity is the only
depth information present in these stereograms, disparity
alone must be causing the perception of depth.
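The construction described above can be sketched in a few lines of code. This is an illustrative sketch, not Julesz’s original program: the function name, the grid size, the size of the shifted square, and the use of a seeded random generator are all assumptions made for the example. As in the diagrams, 0 stands for a black dot and 1 for a white dot.

```python
import random


def make_random_dot_stereogram(size=10, square=4, shift=1, seed=0):
    """Return (left, right) grids of 0s (black dots) and 1s (white dots)."""
    start = (size - square) // 2  # top-left corner of the central square
    if start + square + shift > size:
        raise ValueError("shifted square would fall outside the pattern")

    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]  # begin with an identical copy

    for r in range(start, start + square):
        # Copy the square's dots `shift` units to the right in the right-eye pattern.
        for c in range(start, start + square):
            right[r][c + shift] = left[r][c]
        # Fill the strip uncovered by the shift with new random dots.
        for c in range(start, start + shift):
            right[r][c] = rng.randint(0, 1)
    return left, right


left, right = make_random_dot_stereogram()
for left_row, right_row in zip(left, right):
    print("".join(map(str, left_row)), " ", "".join(map(str, right_row)))
```

Outside the central square the two printed patterns are identical, and inside it the right-hand pattern is a displaced copy of the left-hand one. Neither pattern on its own contains a visible square, so the shift is the only depth information available.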
Psychophysical experiments, particularly those using
Julesz’s random-dot stereograms, show that retinal disparity creates a perception of depth. But before we can fully
understand the mechanisms responsible for depth perception, we must answer one more question: How does the visual system match the parts of the images in the left and
right eyes that correspond to one another? This is called the
correspondence problem, and as we will see, it has still not
been fully explained.
The Correspondence Problem
Let’s return to the stereoscopic images of Figure 10.16.
When we view this image in a stereoscope, we see different
parts of the image at different depths because of the disparity between images on the left and right retinas. Thus, the
cactus and the window appear to be at different distances
when viewed through the stereoscope because they create
different amounts of disparity. But in order for the visual
system to calculate this disparity, it must compare the images of the cactus on the left and right retinas and the images of the window on the left and right retinas. This is the
correspondence problem. How does the visual system match
up the images in the two eyes?
A possible answer to this question is that the visual system may match the images on the left and right retinas on
the basis of the specific features of the objects. For example,
the upper-left window pane on the left could be matched
with the upper-left pane on the right, and so on. Explained
in this way, the solution seems simple: Most things in the
world are quite discriminable from one another, so it is easy
to match an image on the left retina with the image of the
same thing on the right retina. But what about images in
which matching similar points would be extremely difficult,
as with Julesz’s random-dot stereogram?
You can appreciate the problem involved in matching
similar parts of a stereogram by trying to match up the
points in the left and right images of the stereogram in Figure 10.19. Most people find this to be an extremely difficult
task, involving switching their gaze back and forth between
the two pictures and comparing small areas of the pictures
one after another. But even though matching similar features on a random-dot stereogram is much more difficult and time-consuming than matching features in the real world, the visual system somehow matches similar parts of the two stereogram images, calculates their disparities, and creates a perception of depth. A number of proposals, all too complex to describe here, have been put forth to explain how the visual system solves the correspondence problem, but a totally satisfactory answer has yet to be proposed (see Blake & Wilson, 1991; Menz & Freeman, 2003; Ohzawa, 1998; Ringach, 2003).
Depth Information Across Species
Humans make use of a number of different sources of depth
information in the environment. But what about other species? Many animals have excellent depth perception. Cats
leap on their prey; monkeys swing from one branch to the
next; a male housefly follows a flying female, maintaining
a constant distance of about 10 cm; and a frog accurately
jumps across a chasm (Figure 10.20).
There is no doubt that many animals are able to judge
distances in their environment, but what depth information do they use? A survey of mechanisms used by different
animals reveals that animals use the entire range of cues
described in this chapter. Some animals use many cues, and
others rely on just one or two.
To make use of binocular disparity, an animal must
have eyes that have overlapping visual fields. Thus, animals
such as cats, monkeys, and humans that have frontal eyes
(Figure 10.21a), which result in overlapping fields of view,
can use disparity to perceive depth. Animals with lateral
eyes, such as the rabbit (Figure 10.21b), do not have overlapping visual fields and therefore cannot use disparity to
perceive depth. Note, however, that in sacrificing binocular disparity, animals with lateral eyes gain a wider field of view—something that is extremely important for animals that need to constantly be on the lookout for predators.

The pigeon is an example of an animal with lateral eyes that are placed so the visual fields of the left and right eyes overlap only in a 35-degree area surrounding the pigeon’s beak. This overlapping area, however, happens to be exactly where pieces of grain would be located when the pigeon is pecking at them, and psychophysical experiments have shown that the pigeon does have a small area of binocular depth perception right in front of its beak (McFadden, 1987; McFadden & Wild, 1986).

Movement parallax is probably insects’ most important method of judging distance, and they use it in a number of different ways (Collett, 1978; Srinivasan & Venkatesh, 1997). For example, the locust uses a “peering” response—moving its body from side to side to create movement of its head—as it observes potential prey. T. S. Collett (1978) measured a locust’s “peering amplitude”—the distance of this side-to-side sway—as it observed prey at different distances, and found that the locust swayed more when targets were farther away. Since more distant objects move less across the retina than nearer objects for a given amount of observer movement (Figure 10.8), a larger sway would be needed to cause the image of a far object to move the same distance across the retina as the image of a near object. The locust may therefore be judging distance by noting how much sway is needed to cause the image to move a certain distance across its retina (also see Sobel, 1990).

The above examples show how depth can be determined from different sources of information in light. But bats, some of which are blind to light, use a form of energy we usually associate with sound to sense depth. Bats sense objects by using a method similar to the sonar system used in World War II to detect underwater objects such as submarines and mines. Sonar, which stands for sound navigation and ranging, works by sending out pulses of sound and using information contained in the echoes of this sound to determine the location of objects. Donald Griffin (1944) coined the term echolocation to describe the biological sonar system used by bats to avoid objects in the dark.

Bats emit pulsed sounds that are far above the upper limit of human hearing, and they sense objects’ distances by noting the interval between when they send out the pulse and when they receive the echo (Figure 10.22). Since they use sound echoes to sense objects, they can avoid obstacles even when it is totally dark (Suga, 1990). Although we don’t have any way of knowing what the bat experiences when these echoes return, we do know that the timing of these echoes provides the information the bat needs to locate objects in its environment. (Also see von der Emde et al., 1998, for a description of how electric fish sense depth based on “electrolocation.”) From the examples we have described, we can see that animals use a number of different types of information to determine depth, with the type of information used depending on the animal’s specific needs and on its anatomy and physiological makeup.

Figure 10.20 ❚ These drawings, which are based on photographs of frogs jumping, show that the frog adjusts the angle of its jump based on its perception of the distance across the chasm, with steeper takeoffs being associated with greater distances. (Adapted from Collett, T. S., & Harkness, L. I. K. (1982). Depth vision in animals. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 111–176). Cambridge, MA: MIT Press.)

Figure 10.21 ❚ (a) Frontal eyes such as those of the cat have overlapping fields of view that provide good depth perception. (b) Lateral eyes such as those of the rabbit provide a panoramic view but poorer depth perception.
Bruce Goldstein
Barbara Goldstein

Figure 10.22 ❚ When a bat sends out its pulses, it receives echoes from a number of objects in the environment. This figure shows the echoes received by the bat from (a) a moth located about half a meter away; (b) a tree, located about 2 meters away; and (c) a house, located about 4 meters away. The echoes from each object return to the bat at different times, with echoes from more distant objects taking longer to return. The bat locates the positions of objects in the environment by sensing how long it takes the echoes to return.

The Physiology of Depth Perception

Most of the research on the physiology of depth perception has concentrated on looking for neurons that signal information about binocular disparity. But neurons have also been found that signal the depth indicated by pictorial depth cues.

Neurons That Respond to Pictorial Depth

Ken-Ichiro Tsutsui and coworkers (2002, 2005) studied the physiology of neurons that respond to the depth indicated by texture gradients by having monkeys match stimuli like the ones in Figure 10.23 to three-dimensional displays created by stereograms. The results showed that monkeys perceive the pattern in Figure 10.23a as slanting to the right, 10.23b as flat, and 10.23c as slanting to the left.

The records below the texture gradient patterns are the responses of a neuron in an area in the parietal cortex that had been associated with depth perception in other studies. This neuron does not fire to the right-slanting gradient, or to a flat pattern, but does fire to the left-slanting gradient. Thus, this neuron fires to a display in which depth is indicated by the pictorial depth cues of texture gradients. This neuron also responds when depth is indicated by disparity, so it is tuned to respond to depth whether it is determined by pictorial depth cues or by binocular disparity. (Also see Sereno et al., 2002, for a description of a neuron that responds to the depth cue of motion parallax.)

Figure 10.23 ❚ Top: gradient stimuli (a), (b), and (c). Bottom: response of neurons in the parietal cortex to each gradient. (From Tsutsui, K. I., Sakata, H., Naganuma, T., & Taira, M. (2002). Neural correlates for perception of 3D surface orientation from texture gradient. Science, 298, 402–412; Tsutsui, K. I., Taira, M., & Sakata, H. (2005). Neural mechanisms of three-dimensional vision. Neuroscience Research, 51, 221–229.)

Neurons That Respond to Binocular Disparity

One of the most important discoveries about the physiology of depth perception was the finding that there are neurons that are tuned to respond to specific amounts of disparity (Barlow et al., 1967; Hubel & Wiesel, 1970). The first research on these neurons described neurons in the striate cortex (V1) that responded to absolute disparity. These neurons are called binocular depth cells or disparity-selective cells. A given cell responds best when stimuli presented to the left and right eyes create a specific amount of absolute disparity. Figure 10.24 shows a disparity tuning curve for one of these neurons (Uka & DeAngelis, 2003). This particular neuron responds best when the left and right eyes are stimulated to create an absolute disparity of about 1 degree. Further research has shown that there are also neurons higher up in the visual system that respond to relative disparity (Parker, 2007) (see page 237).

Figure 10.24 ❚ Disparity tuning curve for a neuron sensitive to absolute disparity. This curve indicates the neural response that occurs when stimuli presented to the left and right eyes create different amounts of disparity. Axes: response (spikes/sec), 0 to 80, against horizontal disparity (deg), from –2 (near) to 2 (far). (From Uka, T., & DeAngelis, G. C. (2003). Contribution of middle temporal area to coarse depth discrimination: Comparison of neuronal and psychophysical sensitivity. Journal of Neuroscience, 23, 3515–3530.)

Connecting Binocular Depth Cells and Depth Perception

Just because disparity-selective neurons fire best to a specific angle of disparity doesn’t prove that these neurons have anything to do with depth perception. To show that binocular depth cells are actually involved in depth perception, we need to demonstrate a connection between disparity and behavior.
Randolph Blake and Helmut Hirsch (1975) demonstrated this connection by doing a selective rearing experiment that resulted in the elimination of binocular neurons.
(See Chapter 4, page 80, for another example of a selective
rearing experiment.) They reared cats so that their vision was
alternated between the left and right eyes every other day
during the first 6 months of their lives. After this 6-month
period of presenting stimuli to just one eye at a time, Blake
and Hirsch recorded from neurons in the cat’s cortex and
found that (1) these cats had few binocular neurons, and (2)
they were not able to use binocular disparity to perceive
depth. Thus, eliminating binocular neurons eliminates stereopsis and confirms what everyone suspected all along—
that disparity-selective neurons are responsible for stereopsis (also see Olson & Freeman, 1980).
Another technique that has been used to demonstrate
a link between neural responding and depth perception
is microstimulation (see Method: Microstimulation in
Chapter 8, page 188). Microstimulation is achieved by inserting a small electrode into the cortex and passing an
electrical charge through the electrode to activate the neurons near the electrode (M. R. Cohen & Newsome, 2004).
Neurons that are sensitive to the same disparities tend to be
organized in clusters, so stimulating one of these clusters
activates a group of neurons that respond best to a specific
disparity.
Gregory DeAngelis and coworkers (1998) trained a
monkey to indicate the depth created by presenting images
with different absolute disparities to the left and right eyes.
Presumably, the monkey perceived depth because the disparate images on the monkey’s retina activated disparityselective neurons in the cortex. But what would happen if
microstimulation were used to activate a different group of
disparity-selective neurons? DeAngelis and coworkers stimulated disparity-selective neurons that were tuned to a disparity different from what was indicated by the images on
the retina. When they did this, the monkey shifted its depth
judgment toward the disparity signaled by the stimulated
neurons (Figure 10.25).
DeAngelis’ experiment provides another demonstration of a connection between disparity-selective neurons
and depth perception. (This result is like the result we described on page 188 in Chapter 8, in which stimulating neurons that preferred specific directions of movement shifted
a monkey’s perception toward that direction of movement.)
In addition, brain-imaging experiments on humans show
that a number of different areas are activated by stimuli
that create binocular disparity (Backus et al., 2001; Kwee et
al., 1999; Ts’o et al., 2001). Experiments on monkeys have
determined that neurons sensitive to absolute disparity are
found in the primary visual receiving area, and neurons
sensitive to relative disparity are found higher in the visual
system, in the temporal lobe and other areas. Apparently,
depth perception involves a number of stages of processing that begins in the primary visual cortex and extends to
many different areas in both the ventral and dorsal streams
(Parker, 2007).
Figure 10.25 ❚ DeAngelis and coworkers (1998) stimulated neurons in the monkey’s cortex that were sensitive to a particular amount of disparity, while the monkey was observing a random-dot stereogram. This stimulation shifted perception of the dots from position 1 to position 2.
TEST YOURSELF 10.1
1. What is the basic problem of depth perception, and how does the cue approach deal with this problem?
2. What monocular cues provide information about depth in the environment?
3. What is binocular disparity? What is the difference between absolute disparity and relative disparity? How are absolute and relative disparity related to the depths of objects in a scene? What is the advantage of relative disparity?
4. What is stereopsis? What is the evidence that disparity creates stereopsis?
5. What does perception of depth from a random-dot stereogram demonstrate?
6. What is the correspondence problem? Has this problem been solved?
7. What kinds of information do other species use to perceive depth? How does the information they use depend on the animals’ sensory systems?
8. What is the relationship between the firing of neurons in the cortex and depth perception? Be sure to distinguish between (a) experiments that demonstrated a connection between neurons that respond to depth information and (b) experiments that demonstrate a connection between neural responding and depth perception.
9. Where does the neural processing for depth perception occur in the brain?
Perceiving Size
We discuss size perception in this chapter because our perception of size can be affected by our perception of depth.
This link between size perception and depth perception is
graphically illustrated by the following example.
Whiteout—one of the most treacherous weather
conditions possible for flying—can arise quickly
and unexpectedly. As Frank pilots his helicopter across the Antarctic wastes, blinding light,
reflected down from thick cloud cover above and
up from the pure white blanket of snow below,
makes it difficult to see the horizon, details on
the surface of the snow, or even up from down.
He is aware of the danger because he has known
pilots dealing with similar conditions who flew
at full power directly into the ice. He thinks he
can make out a vehicle on the snow far below,
and he drops a smoke grenade to check his
altitude. To his horror, the grenade falls only
three feet before hitting the ground. Realizing
that what he thought was a truck was actually a
discarded box, Frank pulls back on the controls
and soars up, his face drenched in sweat, as he
comprehends how close he just came to becoming another whiteout fatality.
This account is based on descriptions of actual flying
conditions at an Antarctic research base. It illustrates that
our ability to perceive an object’s size can sometimes be
drastically affected by our ability to perceive the object’s
distance. A small box seen close up can, in the absence of accurate information about its distance, be misperceived as a
large truck seen from far away (Figure 10.26). The idea that
we can misperceive size when accurate depth information
is not present was demonstrated in a classic experiment by
A. H. Holway and Edwin Boring (1941).
The Holway and Boring Experiment
Observers in Holway and Boring’s experiment sat at the intersection of two hallways and saw a luminous test circle when
looking down the right hallway and a luminous comparison
circle when looking down the left hallway (Figure 10.27).
The comparison circle was always 10 feet from the observer,
but the test circles were presented at distances ranging from
10 feet to 120 feet. The observer’s task on each trial was to
adjust the diameter of the comparison circle on the left to
match their perception of the size of the test circle on the
right.
An important feature of the test stimuli in the right corridor was that they all cast exactly the same-sized image on
the retina. We can understand how this was accomplished
by introducing the concept of visual angle.
What Is Visual Angle?
Visual angle is the angle that an object subtends at the observer’s eye. Figure 10.28a shows
how we determine the visual angle of a stimulus (a person,
how we determine the visual angle of a stimulus (a person,
in this example) by extending lines from the person to the
lens of the observer’s eye. The angle between the lines is the
visual angle. Notice that the visual angle depends both on
the size of the stimulus and on its distance from the observer, so when the person moves closer, as in Figure 10.28b,
the visual angle becomes larger.
The visual angle tells us how large the object will be on
the back of the eye. There are 360 degrees around the entire circumference of the eyeball, and an object with a visual
angle of 1 degree would take up 1/360 of this circumference—about 0.3 mm in an average-sized adult eye. One way
to get a feel for visual angle is to fully extend your arm and
look at your thumb, as the woman in Figure 10.29 is doing.
The approximate visual angle of the width of the thumb at
arm’s length is 2 degrees. Thus, an object that is exactly covered by the thumb held at arm’s length, such as the iPod in
Figure 10.29, has a visual angle of approximately 2 degrees.
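The dependence of visual angle on both size and distance can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `visual_angle_deg` and the thumb measurements (about 2 cm wide, held about 57 cm from the eye) are assumptions chosen for the example, not values from the text.

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by an object of a given size
    at a given distance; both must be in the same units."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

# Assumed illustrative values (not from the text): a thumb about 2 cm wide
# held about 57 cm from the eye.
thumb = visual_angle_deg(2.0, 57.0)

# Same thumb at half the distance: the angle roughly doubles.
thumb_near = visual_angle_deg(2.0, 28.5)

print(round(thumb, 2), round(thumb_near, 2))
```

With these assumed measurements the thumb subtends roughly 2 degrees at arm's length and roughly 4 degrees at half that distance, which matches the point that moving closer enlarges the visual angle.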
Figure 10.26 ❚ When a helicopter pilot loses the ability to
perceive distance, due to “whiteout,” a small box that is close
can be mistaken for a truck that is far away.
Figure 10.27 ❚ Setup of Holway and Boring’s (1941)
experiment. The observer changes the diameter of the
comparison circle in the left corridor to match his or her
perception of the size of test circles presented in the right
corridor. Each test circle has a visual angle of 1 degree and is
presented separately. This diagram is not drawn to scale. The
actual distance of the far test circle was 100 feet.
(Diagram labels: comparison circle; near and far test circles, presented one at a time at different distances; visual angle = 1°.)
Figure 10.28 ❚ (a) The visual angle depends
on the size of the stimulus (the woman in this
example) and its distance from the observer. (b)
When the woman moves closer to the observer,
the visual angle and the size of the image on the
retina increase. This example shows how halving
the distance between the stimulus and the observer
doubles the size of the image on the retina.
(Diagram labels: visual angle; observer’s eye; size of retinal image.)
This “thumb technique” provides a way to determine
the approximate visual angle of any object in the environment. It also illustrates an important property of visual
angle: A small object that is near (like the thumb) and a
larger object that is far (like the iPod) can have the same visual angle. This is illustrated in Figure 10.30, which shows
a photograph taken by Jennifer, a student in my sensation
and perception class. To take this picture, Jennifer adjusted
the distance between her fingers so that the Eiffel Tower just
fit between them. When she did this, the space between her
fingers had the same visual angle as the Eiffel Tower.
How Holway and Boring Tested Size Perception in a Hallway
The idea that objects with different sizes can have the same visual angle was used in the
creation of the test circles in Holway and Boring’s experiment. You can see from Figure 10.27 that small circles were
positioned close to the observer and larger circles were positioned farther away, and that all of the circles had a visual
angle of 1 degree. Objects with the same visual angle create
the same-sized image on the retina, so all of the test circles
had the same-sized image on the observers’ retinas, no matter where in the hallway they were located.
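To see how the test circles could all subtend 1 degree, it helps to work out how large a circle must be at each distance. The following is a rough sketch, not a reconstruction of Holway and Boring's apparatus: the function name `diameter_for_angle` is hypothetical, the 10-foot and 120-foot distances come from the text, and the intermediate distances are illustrative.

```python
import math

def diameter_for_angle(angle_deg, distance):
    """Diameter (same units as distance) that subtends angle_deg at distance."""
    return 2 * distance * math.tan(math.radians(angle_deg) / 2)

# Distances in feet. 10 and 120 are the range given in the text;
# the intermediate values are illustrative only.
for d_ft in (10, 50, 100, 120):
    inches = diameter_for_angle(1.0, d_ft) * 12
    print(f"{d_ft:>3} ft -> {inches:.1f} in")
```

The required diameter grows in direct proportion to distance, so a circle 12 times farther away must be 12 times larger to produce the same retinal image.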
Figure 10.29 ❚ The “thumb” method of
determining the visual angle of an object.
When the thumb is at arm’s length, whatever
it covers has a visual angle of about 2
degrees. The woman’s thumb covers the
width of her iPod, so the visual angle of
the iPod, from the woman’s point of view,
is 2 degrees. Note that the visual angle will
change if the distance between the woman
and the iPod changes.
In the first part of Holway and Boring’s experiment,
many depth cues were available, including binocular disparity, motion parallax, and shading, so the observer could
easily judge the distance of the test circles. The results, indicated by line 1 in Figure 10.31, show that even though all of
the retinal images were the same size, observers based their
judgments on the physical sizes of the circles. When they
viewed a large test circle that was located far away (far circle in Figure 10.27), they made the comparison circle large
(point F in Figure 10.31); when they viewed a small test circle that was located nearby (near circle in Figure 10.27), they
made the comparison circle small (point N in Figure 10.31).
The observers’ adjustment of the comparison circle to
match the physical size of the test circles means that they
were accurately judging the physical sizes of the circles.
Holway and Boring then determined how accurate the
observers’ judgments would be when they eliminated depth
information. They did this by having the observer view the
test circles with one eye, which eliminated binocular disparity (line 2 in Figure 10.31); then by having the observer view
the test circles through a peephole, which eliminated motion parallax (line 3); and finally by adding drapes to the
hallway to eliminate shadows and reflections (line 4). The results of these experiments indicate that as it became harder
to determine the distance of the test circles, the observer’s
perception of the sizes of the circles became inaccurate.
Jennifer Bittel
Figure 10.30 ❚ The visual angle between the two fingers is
the same as the visual angle of the Eiffel Tower.
Figure 10.31 ❚ Results of Holway and Boring’s (1941)
experiment. The dashed line marked “Physical size” is the
result that would be expected if the observers adjusted
the diameter of the comparison circle to match the actual
diameter of each test circle. The line marked “Visual angle”
is the result that would be expected if the observers adjusted
the diameter of the comparison circle to match the visual
angle of each test circle.
(Graph: size of comparison circle, in inches from 0 to 30, plotted against distance of test circle, in feet from 0 to 100; lines 1 through 4, with points N and F marked on line 1.)
Eliminating depth information made it more difficult
to judge the physical sizes of the circles. Without depth information, the perception of size was determined not by the
actual size of an object but by the size of the object’s image
on the observer’s retina. Because all of the test circles in Holway and Boring’s experiment had the same retinal size, they
were judged to be about the same size once depth information was eliminated. Thus, the results of this experiment indicate that size estimation is based on the actual sizes of objects when there is good depth information (blue lines), but
that size estimation is strongly influenced by the object’s visual angle when depth information is eliminated (red lines).
An example of size perception that is determined by visual angle is our perception of the sizes of the sun and the
moon, which, due to a cosmic coincidence, have the same visual angle. The fact that they have identical visual angles becomes most obvious during an eclipse of the sun. Although
we can see the flaming corona of the sun surrounding the
moon, as shown in Figure 10.32, the moon’s disk almost exactly covers the disk of the sun.
If we calculate the visual angles of the sun and the
moon, the result is 0.5 degrees for both. As you can see in
Figure 10.32, the moon is small (diameter 2,200 miles) but
close (245,000 miles from Earth), whereas the sun is large
(diameter 865,400 miles) but far away (93 million miles
from Earth). Even though these two celestial bodies are
vastly different in size, we perceive them to be the same size
because, as we are unable to perceive their distance, we base
our judgment on their visual angles.
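The 0.5-degree figure can be checked directly from the diameters and distances quoted above. This is a minimal sketch; the function name `visual_angle_deg` is simply a label chosen for the example.

```python
import math

def visual_angle_deg(diameter, distance):
    """Visual angle (degrees) of a disk of the given diameter at the given distance."""
    return math.degrees(2 * math.atan(diameter / (2 * distance)))

# Diameters and distances in miles, as given in the text.
moon = visual_angle_deg(2_200, 245_000)
sun = visual_angle_deg(865_400, 93_000_000)

print(f"moon: {moon:.2f} deg, sun: {sun:.2f} deg")
```

Both values come out at roughly half a degree. They are close rather than exactly identical, which is consistent with the statement that the moon's disk "almost exactly" covers the sun.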
In yet another example, we perceive objects viewed from
a high-flying airplane as very small. Because we have no way
of accurately estimating the distance from the airplane to
the ground, we perceive size based on objects’ visual angles,
which are very small because we are so high up.
Size Constancy
The examples just described all demonstrate a link between
our perception of size and our perception of depth, with
good depth perception favoring accurate size perception.
And even though our perception of size is not always totally
accurate (Gilinsky, 1951), it is good enough to cause psychologists to propose the principle of size constancy. This
principle states that our perception of an object’s size remains relatively constant, even when we view an object from
different distances, which changes the size of the object’s
image on the retina.
To introduce the idea of size constancy to my perception
class, I ask someone in the front row to estimate my height
when I am standing about 3 feet away. Their guess is usually
accurate, around 5 feet 9 inches. I then take one large step
back so I am now 6 feet away and ask the person to estimate
my height again. It probably doesn’t surprise you that the
second estimate of my height is about the same as the first.
The point of this demonstration is that even though my image on the person’s retina becomes half as large when I step
back to 6 feet (Figure 10.28), I do not appear to shrink to
less than 3 feet tall, but still appear to be my normal size.
This perception of size as remaining constant no matter
what the viewing distance is size constancy. The following
demonstration illustrates size constancy in another way.
DEMONSTRATION
Perceiving Size at a Distance
Hold a quarter between the fingertips of each hand so you
can see the faces of both coins. Hold one coin about a foot
from you and the other at arm’s length. Observe the coins
with both of your eyes open and note their sizes. Under these
conditions, most people perceive the near and far coins as
being approximately the same size. Now close one eye, and
holding the coins so they appear side-by-side, notice how
your perception of the size of the far coin changes so that it
now appears smaller than the near coin. This demonstrates
how size constancy is decreased under conditions of poor
depth information. ❚
Size Constancy as a Calculation
The link between size constancy and depth perception has led to
the proposal that size constancy is based on a mechanism
called size–distance scaling that takes an object’s distance
into account (Gregory, 1966). Size–distance scaling operates according to the equation S = K (R × D), where S is the
object’s perceived size, K is a constant, R is the size of the
retinal image, and D is the perceived distance of the object.
(Since we are mainly interested in R and D, and K is a scaling factor that is always the same, we will omit K in the rest
of our discussion.)
According to the size–distance equation, as a person
walks away from you, the size of the person’s image on your
retina (R) gets smaller, but your perception of the person’s
distance (D) gets larger. These two changes balance each
other, and the net result is that you perceive the person’s
size (S) as remaining constant.
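The way these two changes cancel can be made concrete with a short calculation. The sketch below assumes, for simplicity, that retinal image size is proportional to physical size divided by distance and that perceived distance equals actual distance; both are simplifying assumptions of the example, and the function names are hypothetical. The height and distances are the ones used in the classroom example.

```python
def perceived_size(retinal_size, perceived_distance, k=1.0):
    """Size-distance scaling, S = K * (R * D)."""
    return k * retinal_size * perceived_distance

def retinal_size(physical_size, distance):
    # Small-angle assumption (the example's, not the text's): retinal image
    # size is proportional to physical size divided by distance.
    return physical_size / distance

height_ft = 5.75  # 5 feet 9 inches, as in the classroom example

for d in (3.0, 6.0):
    r = retinal_size(height_ft, d)
    # R halves when D doubles, so the product R * D stays the same.
    print(d, round(r, 3), round(perceived_size(r, d), 2))
```

Doubling the distance halves R, and multiplying by the doubled D restores the same value of S, which is the arithmetic behind size constancy in this account.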
Figure 10.32 ❚ The
moon’s disk almost
exactly covers the sun
during an eclipse because
the sun and the moon
have the same visual
angle.
DEMONSTRATION
Size–Distance Scaling and Emmert’s Law
You can demonstrate size–distance scaling to yourself by
looking back at Figure 8.20 in Chapter 8 (page 190). Look at
the center of the circle for about 60 seconds. Then look at
the white space to the side of the circle and blink to see the
circle’s afterimage. Before the afterimage fades, also look
at a wall far across the room. You should see that the size of
the afterimage depends on where you look. If you look at a
distant surface, such as the far wall of the room, you see a
large afterimage that appears to be far away. If you look at a
near surface, such as the page of this book, you see a small
afterimage that appears to be close. ❚
Figure 10.33 illustrates the principle underlying the effect you just experienced, which was first described by Emmert in 1881. Staring at the circle bleached a small circular area of visual pigment on your retina (see page 55). This
bleached area of the retina determined the retinal size of
the afterimage and remained constant no matter where you
were looking.
The perceived size of the afterimage, as shown in Figure
10.33, is determined by the distance of the surface against
which the afterimage is viewed. This relationship between
the apparent distance of an afterimage and its perceived size
is known as Emmert’s law: The farther away an afterimage
appears, the larger it will seem. This result follows from our
size–distance scaling equation, S = R × D. The size of the
bleached area of pigment on the retina (R) always stays the
same, so that increasing the afterimage’s distance (D) increases the magnitude of R × D. We therefore perceive the
size of the afterimage (S) as larger when it is viewed against
the far wall.
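Emmert's law can be illustrated by holding the afterimage's visual angle fixed and asking how large a patch it would span on surfaces at different distances. This is a sketch under stated assumptions: the 2-degree afterimage and the 0.4 m and 4 m viewing distances are made-up values, the function name is hypothetical, and perceived distance is treated as equal to physical distance.

```python
import math

def afterimage_size(angle_deg, surface_distance):
    """Linear size that an image of fixed visual angle would span on a
    surface at the given distance (same units as the distance)."""
    return 2 * surface_distance * math.tan(math.radians(angle_deg) / 2)

ANGLE_DEG = 2.0  # assumed retinal extent of the afterimage
BOOK_M = 0.4     # assumed distance to the page of the book
WALL_M = 4.0     # assumed distance to the far wall

book = afterimage_size(ANGLE_DEG, BOOK_M)
wall = afterimage_size(ANGLE_DEG, WALL_M)

print(f"book: {book * 100:.1f} cm, wall: {wall * 100:.1f} cm")
```

Because R is fixed, the computed size scales directly with D: a surface ten times farther away yields an afterimage ten times larger.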
Other Information for Size Perception
Although we have been stressing the link between size constancy and depth perception and how size–distance scaling
works, other sources of information in the environment
also help achieve size constancy. One source of information
for size perception is relative size. We often use the sizes
of familiar objects as a yardstick to judge the size of other
objects, as in Figure 10.34, in which the size of the woman
indicates that the wheel is very large. (Also see the chapter
opening picture, facing page 229, in which the size of the
house indicates that the trees are very tall.) This idea that
our perception of the sizes of objects can be influenced by
the sizes of nearby objects explains why we often fail to appreciate how tall basketball players are, when all we see for
comparison are other basketball players. But as soon as a
person of average height stands next to one of these players,
the player’s true height becomes evident.
Another source of information for size perception
is the relationship between objects and texture information on the ground. We saw that a texture gradient occurs
when elements that are equally spaced in a scene appear to
be more closely packed as distance increases (Figure 10.6).
Figure 10.35 shows two cylinders sitting on a texture gradient formed by a cobblestone road. Even if we have trouble
perceiving the depth of the near and far cylinders, we can
tell that they are the same size because their bases both
cover the same portion of a paving stone.
Figure 10.33 ❚ The principle behind the
observation that the size of an afterimage increases
as the afterimage is viewed against more distant
surfaces.
(Diagram labels: retinal image of circle (bleached pigment); afterimage on book; afterimage on wall.)
Bruce Goldstein
Figure 10.34 ❚ The size of this wheel becomes apparent
when it can be compared to an object of known size, such as
the person. If the wheel were seen in total isolation, it would
be difficult to know that it is so large.
Bruce Goldstein
Figure 10.35 ❚ Two cylinders resting on a texture gradient.
According to Gibson (1950), the fact that the bases of both
cylinders cover the same number of units on the gradient
indicates that the bases of the two cylinders are the same size.
Visual Illusions
Visual illusions fascinate people because they demonstrate
how our visual system can be “tricked” into seeing inaccurately (Bach & Poloschek, 2006). We have already described
a number of types of illusions: Illusions of lightness include
Mach bands (page 64), in which small changes in lightness
are seen near a border even though no changes are present in the physical pattern of light; simultaneous contrast
(page 66) and White’s illusion (page 67), in which two physically identical fields can appear different; and the Hermann
grid (page 63), in which small gray spots are seen that aren’t
there in the light. Attentional effects include change blindness (page 139), in which two alternating scenes appear
similar even though there are differences between them. Illusions of motion are those in which stationary stimuli are
perceived as moving (page 180).
We will now describe some illusions of size—situations
that lead us to misperceive the size of an object. We will see
that some explanations of these illusions involve the connection we have described between the perception of size
and the perception of depth. We will also see that some of
the most familiar illusions have yet to be fully explained.
A good example of this situation is provided by the
Müller-Lyer illusion.
The Müller-Lyer Illusion
In the Müller-Lyer illusion, the right vertical line in Figure
10.36 appears to be longer than the left vertical line, even
though they are both exactly the same length (measure
them). It is obvious by just looking at these figures that one
line appears longer than the other, but you can measure
how much longer the right line appears by using the simple matching procedure described in the following
demonstration (VL 9).
DEMONSTRATION
Measuring the Müller-Lyer Illusion
The first step in measuring the Müller-Lyer illusion is to create
a “standard stimulus” by drawing a line 30 millimeters long
on an index card and adding outward-going fins, as in the
right figure in Figure 10.36. Then, on separate cards, create
“comparison stimuli” by drawing lines that are 28, 30, 32, 34,
36, 38, and 40 millimeters long with inward-going fins, as in
the left figure. Then ask your observer to pick the comparison
stimulus that most closely matches the length of the standard
stimulus. The difference in length between the standard stimulus and the comparison stimulus chosen by your observer
(typically between 10 percent and 30 percent) defines the
size of the illusion. Try this procedure on a number of people
to see how variable it is. ❚
Figure 10.36 ❚ The Müller-Lyer illusion. Both lines are
actually the same length.
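The size of the illusion in this matching procedure is just the difference between the chosen comparison length and the standard length, expressed as a percentage of the standard. The sketch below shows that arithmetic. The standard and comparison lengths are those of the demonstration, while the list of observers' choices is made-up data included only to show how variability across people might be summarized.

```python
STANDARD_MM = 30                               # standard stimulus length
COMPARISON_MM = (28, 30, 32, 34, 36, 38, 40)   # comparison stimulus lengths

def illusion_percent(chosen_mm, standard_mm=STANDARD_MM):
    """Illusion size as a percentage of the standard's length."""
    return 100 * (chosen_mm - standard_mm) / standard_mm

# Hypothetical choices from five observers (made-up data for illustration).
choices = [34, 36, 34, 38, 36]

sizes = [illusion_percent(c) for c in choices]
mean = sum(sizes) / len(sizes)
spread = max(sizes) - min(sizes)

print(f"mean illusion: {mean:.1f}%, range across observers: {spread:.1f}%")
```

An observer who picks the 36-millimeter comparison line, for example, shows a 20 percent illusion, which falls inside the typical 10 to 30 percent range.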
Misapplied Size Constancy Scaling
Why does
the Müller-Lyer display cause a misperception of size? Richard Gregory (1966) explains the illusion on the basis of a
mechanism he calls misapplied size constancy scaling. He
points out that size constancy normally helps us maintain a
stable perception of objects by taking distance into account
(as expressed in the size–distance scaling equation). Thus,
size constancy scaling causes a 6-foot-tall person to appear
6 feet tall no matter what his distance. Gregory proposes,
however, that the very mechanisms that help us maintain
stable perceptions in the three-dimensional world sometimes create illusions when applied to objects drawn on a
two-dimensional surface.
We can see how misapplied size constancy scaling works
by comparing the left and right lines in Figure 10.36 to the
left and right lines that have been superimposed on the corners in Figure 10.37. Gregory suggests that the fins on the
right line in Figure 10.37 make this line look like part of an
inside corner, and that the fins on the left line make this
line look like part of an outside corner. Because inside corners appear to “recede” and outside corners “jut out,” our
size–distance scaling mechanism treats the inside corner as
if it is farther away, so the term D in the equation S = R × D
is larger and this line therefore appears longer. (Remember
that the retinal sizes, R, of the two lines are the same, so perceived size, S, is determined by the perceived distance, D.)
At this point, you could say that although the Müller-Lyer figures may remind Gregory of inside and outside corners, they don’t look that way to you (or at least they didn’t
until Gregory told you to see them that way). But according
to Gregory, it is not necessary that you be consciously aware
that these lines can represent three-dimensional structures;
your perceptual system unconsciously takes the depth information contained in the Müller-Lyer figures into account, and your size–distance scaling mechanism adjusts
the perceived sizes of the lines accordingly.
Bruce Goldstein
Figure 10.37 ❚ According to Gregory (1966), the Müller-Lyer line on the left corresponds to an outside corner, and the line on
the right corresponds to an inside corner. Note that the two vertical lines are the same length (measure them!).
Gregory’s theory of visual illusions has not, however,
gone unchallenged. For example, figures like the dumbbells in Figure 10.38, which contain no obvious perspective
or depth, still result in an illusion. And Patricia DeLucia
and Julian Hochberg (1985, 1986, 1991; Hochberg, 1987)
have shown that the Müller-Lyer illusion occurs for a three-dimensional display like the one in Figure 10.39, in which it
is obvious that the spaces between the two sets of fins are not
at different depths. (Measure distances x and y to convince
yourself that they are the same.) You can experience this effect for yourself by doing the following demonstration.
DEMONSTRATION
The Müller-Lyer Illusion With Books
Pick three books that are the same size and arrange two of
them with their corners making a 90-degree angle and standing in positions A and B, as shown in Figure 10.39. Then,
without using a ruler, position the third book at position C, so
that distance x appears to be equal to distance y. Check your
placement, looking down at the books from the top and from
other angles as well. When you are satisfied that distances x
and y appear about equal, measure the distances with a ruler.
How do they compare? ❚
Figure 10.38 ❚ The “dumbbell” version of the Müller-Lyer
illusion. As in the original Müller-Lyer illusion, the two straight
lines are actually the same length.
Figure 10.39 ❚ A three-dimensional Müller-Lyer illusion.
The 2-foot-high wooden “fins” stand on the floor. Although
the distances x and y are the same, distance y appears larger,
just as in the two-dimensional Müller-Lyer illusion.
(Diagram labels: fins A, B, and C; distances x and y.)
If you set distance y so that it was smaller than distance x,
this is exactly the result you would expect from the two-dimensional Müller-Lyer illusion, in which the distance between the outward-going fins appears enlarged compared
to the distance between the inward-going fins. You can
also duplicate the illusion shown in Figure 10.39 with your
books by using your ruler to make distances x and y equal.
Then, notice how the distances actually appear. The fact
that we can create the Müller-Lyer illusion by using three-dimensional stimuli such as these, along with demonstrations like the dumbbell in Figure 10.38, is difficult for Gregory’s theory to explain.
Conflicting Cues Theory
R. H. Day (1989, 1990)
has proposed the conflicting cues theory, which states that
our perception of line length depends on two cues: (1) the
actual length of the vertical lines, and (2) the overall length
of the figure. According to Day, these two conflicting cues
are integrated to form a compromise perception of length.
Because the overall length of the right figure in Figure 10.36
is larger due to its outward-oriented fins, the vertical line
appears larger.
Another version of the Müller-Lyer illusion, shown in
Figure 10.40, results in the perception that the space between the dots is greater in the lower figure than in the upper figure, even though the distances are actually the same.
According to Day’s conflicting cues theory, the space in
the lower figure appears greater because the overall extent
of the figure is greater. Notice that conflicting cues theory
can also be applied to the dumbbell display in Figure 10.38.
Thus, although Gregory believes that depth information is
involved in determining illusions, Day rejects this idea and
says that cues for length are what is important. Let’s now
look at some more examples of illusions and the mechanisms that have been proposed to explain them.
Image not available due to copyright restrictions
The Ponzo Illusion
In the Ponzo (or railroad track) illusion, shown in Figure 10.41, both animals are the same size on the page, and so
have the same visual angle, but the one on top appears longer. According to Gregory’s misapplied scaling explanation,
the top animal appears larger because of depth information
provided by the converging railroad tracks that make the top
animal appear farther away. Thus, just as in the Müller-Lyer
illusion, the scaling mechanism corrects for this apparently
increased depth (even though there really isn’t any, because
the illusion is on a flat page), and we perceive the top animal
to be larger. (Also see Prinzmetal et al., 2001; Shimamura
& Prinzmetal, 1999, for another explanation of
the Ponzo illusion.) (VL 10, 11)
William Vann/www.edupic.net
Figure 10.41 ❚ The Ponzo (or railroad track) illusion. The
two animals are the same length on the page (measure them),
but the far one appears larger. (Courtesy of Mary Bravo.)
The Ames Room
The Ames room causes two people of equal size to appear
very different in size (Ittleson, 1952). In Figure 10.42, you can
see that the woman on the right looks much taller than the
woman on the left. This perception occurs even though both
women are actually about the same height. The reason for
this erroneous perception of size lies in the construction of
the room. The shapes of the wall and the windows at the rear
of the room make it look like a normal rectangular room
when viewed from a particular observation point; however,
as shown in the diagram in Figure 10.43, the Ames room is
actually shaped so that the left corner of the room is almost
twice as far from the observer as the right corner.
What’s happening in the Ames room? The construction
of the room causes the woman on the left to have a much
smaller visual angle than the one on the right. We think
that we are looking into a normal rectangular room at two
women who appear to be at the same distance, so we perceive
the one with the smaller visual angle as shorter. We can understand why this occurs by returning to our size–distance
scaling equation, S = R × D. Because the perceived distance
(D) is the same for the two women, but the size of the retinal
image (R) is smaller for the woman on the left, her perceived
size (S) is smaller.
Another explanation for the Ames room is based not on
size–distance scaling, but on relative size. The relative size
explanation states that our perception of the size of the two
women is determined by how they fill the distance between
the bottom and top of the room. Because the woman on the
right fills the entire space and the woman on the left occupies only a little of it, we perceive the woman on the right as
taller (Sedgwick, 2001).
© Phil Schermeister/CORBIS
Figure 10.42 ❚ The Ames room. Both women are actually
the same height, but the woman on the right appears taller
because of the distorted shape of the room.
Figure 10.43 ❚ The Ames room, showing its true shape.
The woman on the left is actually almost twice as far from the
observer as the one on the right; however, when the room is
viewed through the peephole, this difference in distance is
not seen. In order for the room to look normal when viewed
through the peephole, it is necessary to enlarge the left side
of the room.
(Diagram label: peephole.)
The Moon Illusion
You may have noticed that when the moon is on the horizon,
it appears much larger than when it is higher in the sky. This
enlargement of the horizon moon compared to the elevated
moon, shown in Figure 10.44, is called the moon illusion.
When I discuss this in class, I first explain that visual angles
of the horizon moon and elevated moon are the same. This
must be so because the moon’s physical size (2,200 miles in
diameter) and distance from Earth (245,000 miles) are constant throughout the night; therefore, the moon’s visual angle must be constant. (If you are still skeptical, photograph
the horizon and the elevated moons with a digital camera.
When you compare the two images, you will find that the
diameters in the resulting two pictures are identical. Or you
can view the moon through a quarter-inch-diameter hole
held at about arm’s length. For most people, the moon just
fits inside this hole, wherever it is in the sky.)
Figure 10.44 ❚ An artist’s conception of how the moon
is perceived when it is on the horizon and when it is high in
the sky. Note that the visual angle of the horizon moon is
depicted as larger than the visual angle of the moon high in
the sky. This is because the picture is simulating the illusion.
In the environment, the visual angles of the two moons are
the same.
Once students are convinced that the moon’s visual angle remains the same throughout the night, I ask why they
think the moon appears larger on the horizon. One common
response is “When the moon is on the horizon, it appears
closer, and that is why it appears larger.” When I ask why
it appears closer, I often receive the explanation “Because it
appears larger.” But saying “It appears larger because it appears closer, and it appears closer because it appears larger”
is clearly a case of circular reasoning that doesn’t really explain the moon illusion.
One explanation that isn’t circular is called the apparent distance theory. This theory does take distance into account, but in a way opposite to our hypothetical student’s
explanation. According to apparent distance theory, the
moon on the horizon appears more distant because it is
viewed across the filled space of the terrain, which contains
depth information; but when the moon is higher in the sky,
it appears less distant because it is viewed through empty
space, which contains little depth information.
The idea that the horizon is perceived as farther away
than the sky overhead is supported by the fact that when
people estimate the distance to the horizon and the distance to the sky directly overhead, they report that the horizon appears to be farther away. That is, the heavens appear
“flattened” (Figure 10.45).
The key to the moon illusion, according to apparent
distance theory, is that both the horizon and the elevated
moons have the same visual angle, but because the horizon
moon is seen against the horizon, which appears farther
than the zenith sky, it appears larger. This follows from
the size–distance scaling equation, S = R × D, because
retinal size, R, is the same for both locations of the moon
(remember that the visual angle is always the same), so the
moon that appears farther away will appear larger. This is
the principle we invoked to explain why an afterimage appears larger if it is viewed against a faraway surface in the
Emmert’s law demonstration.
Just as the near and far afterimages in the Emmert’s
law demonstration have the same visual angles, so do the
horizon and the elevated moons. The afterimage that appears on the wall far across the room simulates the horizon
moon; the circle appears farther away, so your size–distance
scaling mechanism makes it appear larger. The afterimage that is viewed on a close surface simulates the elevated
moon; the circle appears closer, so your scaling mechanism
makes it appear smaller (King & Gruber, 1962).
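The scaling argument above can be sketched in a few lines of Python. This is only an illustration of the size–distance scaling equation, not a model from the text: the function name `perceived_size` and the distance values are hypothetical, chosen so that both moons share one retinal size while the horizon moon is assigned the larger perceived distance.

```python
def perceived_size(retinal_size: float, perceived_distance: float) -> float:
    """Size-distance scaling, S = R x D, in arbitrary units."""
    return retinal_size * perceived_distance


# Hypothetical values chosen only for illustration: both moons have the
# same retinal size R, but the horizon moon is assigned a larger
# perceived distance D than the elevated moon.
R = 1.0
horizon_moon = perceived_size(R, perceived_distance=2.0)
elevated_moon = perceived_size(R, perceived_distance=1.0)

# With R held constant, the moon that seems farther away is scaled up.
print(horizon_moon > elevated_moon)
```

Because R is the same in both calls, any difference in the result comes entirely from the perceived distance, which is the core claim of apparent distance theory.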
Lloyd Kaufman and Irvin Rock (1962a, 1962b) have
done a number of experiments that support the apparent
distance theory. In one of their experiments, they showed
that when the horizon moon was viewed over the terrain,
which made it seem farther away, it appeared 1.3 times
larger than the elevated moon; however, when the terrain was masked off so that the horizon moon was viewed
through a hole in a sheet of cardboard, the illusion vanished
(Kaufman & Rock, 1962a, 1962b; Rock & Kaufman, 1962).

Figure 10.45 ❚ When observers are asked to consider the sky as a
surface and to compare the distance to the horizon (H) and the distance
to the top of the sky on a clear moonless night, they usually say that
the horizon appears farther away. This results in the “flattened heavens”
shown here.

Some researchers, however, are skeptical of the apparent distance theory. They question the idea that the horizon
moon appears farther, as shown in the flattened heavens effect in Figure 10.45, because some observers see the horizon
moon as floating in space in front of the sky (Plug & Ross,
1994).
Another theory of the moon illusion is the angular
size contrast theory, which states that the moon appears
smaller when it is surrounded by larger objects. Thus, when
the moon is elevated, the large expanse of sky surrounding
it makes it appear smaller. However, when the moon is on
the horizon, less sky surrounds it, so it appears larger (Baird
et al., 1990).
Even though scientists have been proposing theories to
explain the moon illusion for hundreds of years, there is still
no agreement on an explanation (Hershenson, 1989). Apparently a number of factors are involved, in addition to the
ones we have considered here, including atmospheric perspective (looking through haze on the horizon can increase
size perception), color (redness increases perceived size), and
oculomotor factors (convergence of the eyes, which tends to
occur when we look toward the horizon and can cause an increase in perceived size; Plug & Ross, 1994). Just as many different sources of depth information work together to create
our impression of depth, many different factors may work
together to create the moon illusion, and perhaps
the other illusions as well (VL 12–16).
Something to Consider:
Distance Perception and
Perceived Effort
Imagine the following situation: You are hiking in the
woods with a friend. You have agreed to take turns carrying
a heavy backpack, and it is your turn. In the distance you
see the small lake where you plan to set up camp. Just as you
are thinking that it is pretty far to the lake, your friend says,
“There’s the lake. It’s pretty close.”
The idea that wearing a heavy backpack may make
things appear more distant has been confirmed in the laboratory, by having people judge the distance to various targets while wearing a heavy backpack and while not wearing a backpack (Proffitt et al., 2003). The people in this
experiment did not have to walk the distances wearing the
backpack; they just wore the backpack while making their
distance estimates. The result, in Figure 10.46a, shows that
people estimated the distance as farther when wearing the
backpack.
To test the idea that judging distance might depend on
the effort that people believe is associated with a particular
distance, Janice Witt and coworkers (2004) had participants
throw balls to targets ranging from 4 to 10 meters away.
After they had thrown either a light ball or a heavy ball,
participants estimated the distances to the targets. The results for the 10-meter target, shown in Figure 10.46b, indicate that distance estimates were larger after throwing the
heavy ball.
Finally, here’s an additional twist to these findings: Apparently, distance judgments are determined not only by the
amount of effort people actually exert, but also by their expectation
that they will have to exert some effort. This was demonstrated by dividing participants who had previously thrown
heavy balls into two groups. One group was told that they
were going to have to throw the balls at the targets while
blindfolded, and the other group was told that they were going to have to walk to the targets while blindfolded. Because
throwing heavy balls involves more effort than walking, we
might expect that the group that was told they would be
throwing would estimate the distance as greater than those
who were told they would be walking. The results, in Figure 10.46c, indicate that this is what happened. Apparently
just thinking about expending effort over a distance can increase people’s judgment of distance.
What all of this adds up to is that distance perception
depends not only on optical information, such as monocular and binocular depth cues, but also on actions we intend
to perform and the effort associated with these actions.
This is consistent with our discussion in Chapter 7 (Taking Action), in which we saw how perception and action are
closely linked.

Figure 10.46 ❚ Results of the Witt et al. (2004) experiment.
See text for explanation. [Bar graphs of estimated distance (m); actual
distance = 10 m. (a) No backpack versus wore backpack. (b) Threw light
ball versus threw heavy ball. (c) Intend to walk blindfolded versus
intend to throw ball blindfolded.]

TEST YOURSELF 10.2
1. Describe the Holway and Boring experiment.
What do the results of this experiment tell us
about how size perception is influenced by depth
perception?
2. What are some examples of situations in which our
perception of an object’s size is determined by the
object’s visual angle? Under what conditions does
this occur?
3. What is size constancy, and under what conditions
does it occur?
4. What is size–distance scaling? How does it explain
size constancy?
5. Describe two other types of information (other than
depth) that can influence our perception of size.
6. Describe how illusions of size, such as the Müller-Lyer
illusion, the Ponzo illusion, the Ames room,
and the moon illusion, can be explained in terms of
size–distance scaling.
7. What are some problems with the size–distance
scaling explanation of (a) the Müller-Lyer illusion and
(b) the moon illusion? What alternative explanations
have been proposed?
8. What does it mean to say that the perception of distance depends not only on optical information but
also on perceived effort?
THINK ABOUT IT
1. Texture gradients are said to provide information for
depth perception because elements in a scene become
more densely packed as distance increases. The classic example of a texture gradient is a tiled floor, like
the one in Figure 10.47, which has regularly spaced elements. But regularly spaced elements are more the
exception than the rule in the environment. Make an
informal survey of your environment, both inside and
outside, and decide (1) whether texture gradients are
present in your environment and (2) if you think the
principle behind texture gradients could contribute to
the perception of depth even if the texture information
in the environment is not as obvious as the information
in Figure 10.47. (p. 233)

Figure 10.47 ❚ Texture gradients in a hallway in the
Versailles Palace in France. How prevalent is texture gradient
information in the environment in general? (Photo: Bruce Goldstein)

2. How could you determine the contribution of binocular
vision to depth perception? One way would be to close
one eye and notice how this affects your perception. Try
this, and describe any changes you notice. Then devise
a way to quantitatively measure the accuracy of depth
perception that is possible with two-eyed and one-eyed
vision. (p. 235)
3. One of the triumphs of art is creating the impression of
depth on a two-dimensional canvas. Go to a museum or
look at pictures in an art book, and identify the depth
information that helps increase the perception of depth
in these pictures. You may also notice that you perceive
less depth in some pictures, especially abstract ones. In
fact, some artists purposely create pictures that are perceived
as “flat.” What steps do these artists have to take
to accomplish this? (p. 231)
IF YOU WANT TO KNOW MORE
1. Perception of spatial layout can affect the perception of
lightness. A classic early paper showed that our perception of light and dark can be strongly influenced by
our perception of the locations of surfaces in space.
(p. 231)
Gilchrist, A. L. (1977). Perceived lightness depends
on perceived spatial arrangement. Science, 195,
185–187.
2. Achieving stereopsis after decades without it. Neurologist
Oliver Sacks gives an account of a woman who had
been unable to achieve stereopsis for decades because
of a condition that prevented coordination of her left
and right eyes. He describes how, through therapy that
included wearing prisms and doing eye exercises, she
was able to achieve stereopsis and an enhanced perception of depth. (p. 238)
Sacks, O. (2006, June 19). Stereo Sue. New Yorker,
64–73.
3. How depth cues are combined in the brain. Our perception
of depth is determined by a combination of different
cues working together. The experiments described in
the following article show which brain structures may
be involved in combining these cues. (p. 242)
Welchman, A. E., Deubelius, A., Conrad, V.,
Bülthoff, H. H., & Kourtzi, Z. (2005). 3D shape
perception from combined depth cues in human
visual cortex. Nature Neuroscience, 8, 820–827.
4. Information about depth and size in the primary visual
cortex. The mechanism responsible for how depth
perception can influence our perception of an object’s
size was originally thought to be located in higher
areas of the visual system, where size and depth information were combined. Recent research has shown
that this process may occur as early as the primary
visual cortex. (p. 242)
Murray, S. O., Boyaci, H., & Kersten, D. (2006). The
representation of perceived angular size in human primary visual cortex. Nature Neuroscience, 9,
429–434.
Sterzer, P., & Rees, G. (2006). Perceived size matters.
Nature Neuroscience, 9, 302–304.
5. Action and depth perception. Actions such as locomotion, eye and hand movements, and the manipulation
of objects can influence our perception of three-dimensional
space and an object’s shape. (p. 253)
Wexler, M., & van Boxtel, J. J. A. (2005). Depth perception by the active observer. Trends in Cognitive
Sciences, 9, 431–438.
KEY TERMS
Absolute disparity (p. 237)
Accretion (p. 234)
Ames room (p. 251)
Angle of disparity (p. 237)
Angular size contrast theory (p. 253)
Apparent distance theory (p. 253)
Atmospheric perspective (p. 232)
Binocular depth cell (p. 242)
Binocular disparity (p. 235)
Conflicting cues theory (p. 251)
Correspondence problem (p. 240)
Corresponding retinal points (p. 236)
Cue approach to depth perception (p. 230)
Deletion (p. 234)
Disparity-selective cell (p. 242)
Echolocation (p. 241)
Emmert’s law (p. 247)
Familiar size (p. 232)
Frontal eyes (p. 240)
Horopter (p. 236)
Lateral eyes (p. 240)
Misapplied size constancy scaling (p. 249)
Monocular cue (p. 231)
Moon illusion (p. 252)
Motion parallax (p. 233)
Müller-Lyer illusion (p. 249)
Noncorresponding points (p. 236)
Occlusion (p. 230)
Oculomotor cue (p. 231)
Perspective convergence (p. 232)
Pictorial cue (p. 231)
Ponzo illusion (p. 251)
Random-dot stereogram (p. 239)
Relative disparity (p. 237)
Relative height (p. 231)
Relative size (p. 231)
Size constancy (p. 246)
Size–distance scaling (p. 247)
Stereopsis (p. 238)
Stereoscope (p. 238)
Texture gradient (p. 233)
Visual angle (p. 244)
MEDIA RESOURCES
The Sensation and Perception Book Companion Website
www.cengage.com/psychology/goldstein
See the companion website for flashcards, practice quiz
questions, Internet links, updates, critical thinking
exercises, discussion forums, games, and more!
CengageNOW
www.cengage.com/cengagenow
Go to this site for the link to CengageNOW, your one-stop
shop. Take a pre-test for this chapter, and CengageNOW
will generate a personalized study plan based on your test
results. The study plan will identify the topics you need to
review and direct you to online resources to help you master those topics. You can then take a post-test to help you
determine the concepts you have mastered and what you
will still need to work on.
VL Virtual Lab
Your Virtual Lab is designed to help you get the most out
of this course. The Virtual Lab icons direct you to specific
media demonstrations and experiments designed to help
you visualize what you are reading about. The number
beside each icon indicates the number of the media element
you can access through your CD-ROM, CengageNOW, or
WebTutor resource.
The following lab exercises are related to the material
in this chapter:
1. Convergence Shows how convergence of the eyes depends
on an object’s distance.
2. Shape From Shading How the shadows that result from
illumination can help define the shape of a rotating three-dimensional object.
3. The Horopter and Corresponding Points How corresponding points on the two eyes can be determined by sliding one
eye over the other. How the angle of convergence changes
with different distances of fixation.
4. Disparity and Retinal Location How disparity changes as
one object is moved closer to the eye as a person fixates on
another object.
5. Pictures Some “classic” stereograms of photographs.
Red–green glasses required.
6. Outlines Stereogram of a Necker cube. Red–green
glasses required.
7. Depth Perception An experiment in which you can
determine how your perception of depth changes with the
amount of binocular disparity. Red–green glasses required.
8. Random-Dot Stereogram How the perception of depth
can be created by random-dot stereograms. Red–green
glasses required.
9. The Müller-Lyer Illusion Measure the effect of the
Müller-Lyer illusion with both inward and outward fins.
10. The Ponzo Illusion Measure the size of the Ponzo
(railroad track) illusion.
11. Size Perception and Depth How perspective cues can
cause two “monsters” to appear different in size.
12. Horizontal–Vertical Illusion Measure the size of the
horizontal–vertical illusion.
13. Zollner Illusion How context can affect the perceived
orientation of parallel lines.
14. Context and Perception: The Hering Illusion How background lines can make straight parallel lines appear to
curve outward.
15. Poggendorf Illusion Measure the size of the Poggendorf
illusion.
16. Context and Perception: The Poggendorf Illusion How
interrupting a straight line makes the segments of the line
look as though they don’t line up. (Courtesy of Michael
Bach.)
Also see VL2 (Measuring Illusions) in Chapter 1.
[Chapter-opening graph: sound levels in decibels (dB), on a scale from 0 to 150.
Threshold of hearing (0); whisper at 5 feet (20); library (40); normal
conversation (60); heavy traffic (80); loud basketball or hockey crowd (100);
rock concert in front row (120); pain threshold (130); military jet on
runway (140); Space Shuttle launch, ground zero (150).]
CHAPTER 11
Sound, the Auditory System, and Pitch Perception
Chapter Contents
THE SOUND STIMULUS
Sound as Pressure Changes
Pressure Changes: Pure Tones
Pressure Changes: Complex Tones
PERCEIVING SOUND
Loudness
Pitch
The Range of Hearing
Timbre
❚ TEST YOURSELF 11.1
THE EAR
The Outer Ear
The Middle Ear
The Inner Ear
THE REPRESENTATION OF FREQUENCY IN THE COCHLEA
Békésy’s Place Theory of Hearing
Evidence for Place Theory
METHOD: Neural Frequency Tuning Curves
METHOD: Auditory Masking
How the Basilar Membrane Vibrates to Complex Tones
Updating Békésy
How the Timing of Neural Firing Can Signal Frequency
Hearing Loss Due to Hair Cell Damage
❚ TEST YOURSELF 11.2
CENTRAL AUDITORY PROCESSING
Pathway From the Cochlea to the Cortex
Auditory Areas in the Cortex
What and Where Streams for Hearing
PITCH AND THE BRAIN
Linking Physiological Responding and Perception
How the Auditory Cortex Is Shaped by Experience
SOMETHING TO CONSIDER:
COCHLEAR IMPLANTS—WHERE
SCIENCE AND CULTURE MEET
The Technology
The Controversy
❚ TEST YOURSELF 11.3
Think About It
If You Want to Know More
Key Terms
Media Resources
OPPOSITE PAGE We hear sounds ranging from a quiet whisper to the
roar of a rocket blasting off. The graph shows how this range of sound
stimuli can be plotted using a measure called decibels (dB), which is
described in the chapter.
© Roger Ressmeyer/CORBIS
VL VIRTUAL LAB
The Virtual Lab icons direct you to specific animations and videos
designed to help you visualize what you are reading about. The number beside
each icon indicates the number of the clip you can access through your
CD-ROM or your student website.
Some Questions We Will Consider:
❚ If a tree falls in the forest and no one is there to hear it,
is there a sound? (p. 261)
❚ What is it that makes sounds high pitched or low
pitched? (p. 265)
❚ How do sound vibrations inside the ear lead to the
perception of different pitches? (p. 273)
❚ How are sounds represented in the auditory cortex?
(p. 280)
Hearing has an extremely important
function in my life. I was born legally
blind, so although I can see, my vision is highly
impaired and is not correctable. Even though
I am not usually shy or embarrassed, sometimes
I do not want to call attention to myself and my
disability. . . . There are many methods that I can
use to improve my sight in class, like sitting close
to the board or copying from a friend, but sometimes these things are impossible. Then
I use my hearing to take notes. . . . My hearing
is very strong. While I do not need my hearing
to identify people who are very close to me, it is
definitely necessary when someone is calling my
name from a distance. I can recognize their voice,
even if I cannot see them.
This statement, written by one of my students, Jill Robbins, illustrates a special effect hearing has had on her life.
The next statement, by student Eileen Lusk, illustrates her
reaction to temporarily losing her ability to hear.
In an experiment I did for my sign language
class, I bandaged up my ears so I couldn’t hear
a sound. I had a signing interpreter with me to
translate spoken language. The two hours that I
was “deaf” gave me a great appreciation for deaf
people and their culture. I found it extremely
difficult to communicate, because even though
I could read the signing, I couldn’t keep up with
the pace of the conversation. . . . Also, it was
uncomfortable for me to be in that much silence.
Knowing what a crowded cafeteria sounds like
and not being able to hear the background noise
was an uncomfortable feeling. I couldn’t hear the
buzzing of the fluorescent light, the murmur of
the crowd, or the slurping of my friend’s Coke
(which I usually object to, but which I missed
when I couldn’t hear it). I saw a man drop his
tray, and I heard nothing. I could handle the
signing, but not the silence.
You don’t have to bandage up your ears for two hours
to appreciate what hearing adds to your life. Just close your
eyes for a few minutes, observe the sounds you hear, and notice what they tell you about your environment. What most
people experience is that by listening closely they become
aware of many events in the environment that without hearing they would not be aware of at all.
As I sit here in my office in the psychology department,
I hear things that I would be unaware of if I had to rely only
on my sense of vision: people talking in the hall; a car passing by on the street below; and an ambulance, siren blaring, heading up the hill toward the hospital. If it weren’t for
hearing, my world at this particular moment would be limited to what I can see in my office and the scene directly outside my window. Although the silence might make it easier
to concentrate on writing this book or studying my lecture
notes, without hearing I would be unaware of many of the
events in my environment.
Our ability to hear events that we can’t see serves an important signaling function for both animals and humans.
For an animal living in the forest, the rustle of leaves or
the snap of a twig may signal the approach of a predator.
For humans, hearing provides signals such as the warning
sound of a smoke alarm or an ambulance siren, the distinctive high-pitched cry of a baby who is distressed, or telltale
noises that signal problems in a car engine.
But hearing has other functions, too. On the first day
of my perception class, I ask my students which sense they
would choose to keep if they had to pick between hearing
and vision. Two of the strongest arguments for keeping
hearing instead of vision are music and speech. Many of my
students wouldn’t want to give up hearing because of the
pleasure they derive from listening to music, and they also
realize that speech is important because it facilitates communication between people.
Helen Keller, who was both deaf and blind, stated that
she felt being deaf was worse than being blind because blindness isolated her from things, but deafness isolated her from
people. Being unable to hear people talking creates an isolation that makes it difficult to relate to hearing people and
sometimes makes it difficult even to know what is going on.
To appreciate this last point, try watching a dramatic program on television with the sound turned off. You may be
surprised at how little, beyond physical actions and perhaps
some intense emotions, you can understand about the story.
Our goal in this chapter is to describe the basic mechanisms responsible for our ability to hear. We begin by
describing the nature of sound and how we experience
both laboratory-produced sounds and naturally occurring
sounds in the environment. We then consider the physiology behind our perception of pitch, starting with how structures in the ear respond to sound and then how different
parts of the brain respond to sound.
As you read this chapter, you will see important differences between vision and hearing, especially when we
consider the complex path that the sound stimulus must
negotiate in order to reach the receptors. You will also see
similarities, especially in the cortex, where there is evidence
for what and where streams in the auditory system that are
similar to the what and where streams we have described for
vision.
The Sound Stimulus
The first step in understanding hearing is to define what
we mean by sound and to show how we measure the characteristics of sound. One way to answer the question “What
is sound?” is to consider the following question: If a tree
falls in the forest and no one is there to hear it, would there be
a sound?
This question is useful because it shows that we can
use the word sound in two different ways. Sometimes sound
refers to a physical stimulus, and sometimes it refers to a
perceptual response. The answer to the question about the
tree depends on which of the following definitions of sound
we use.
■
Physical definition: Sound is pressure changes in the air or
other medium.
Answer to the question: “Yes,” because the falling tree
causes pressure changes whether or not someone is there to
hear them.
■
Perceptual definition: Sound is the experience we have
when we hear.
Answer to the question: “No,” because if no one is in the forest, there would be no experience.
This difference between physical and perceptual is important to be aware of as we discuss hearing in this chapter
and the next two. Luckily, it is usually easy to tell from the
context in which the terms are used whether “sound” refers
to the physical stimulus or to the experience of hearing. For
example, “the sound of the trumpet pierced the air” refers
to the experience of sound, but “the sound’s level was 10
decibels” refers to sound as a physical stimulus. We will first
describe sound as a physical stimulus and then describe
sound as a perceptual experience.
Sound as Pressure Changes
A sound stimulus occurs when the movements or vibrations
of an object cause pressure changes in air, water, or any
other elastic medium that surrounds the object. Let’s begin
by considering your radio or stereo system’s loudspeaker,
which is really a device for producing vibrations to be transmitted to the surrounding air. People have been known to
turn up the volume control on their stereos so high that vibrations can be felt through a neighbor’s wall, but even at
lower levels the vibrations are there.
The speaker’s vibrations affect the surrounding air, as
shown in Figure 11.1a. When the diaphragm of the speaker
moves out, it pushes the surrounding air molecules together, a process called condensation, which causes a slight
increase in the density of molecules near the diaphragm.
This increased density results in a local increase in the air
pressure that is superimposed on the atmospheric pressure.
When the speaker diaphragm moves back in, air molecules
spread out to fill in the increased space, a process called
rarefaction. The decreased density of air molecules caused
by rarefaction causes a slight decrease in air pressure. By
repeating this process many hundreds or thousands of
times a second, the speaker creates a pattern of alternating high- and low-pressure regions in the air as neighboring
air molecules affect each other. This pattern of air pressure
changes, which travels through air at 340 meters per second
(and through water at 1,500 meters per second), is called a
sound wave.
You might get the impression from Figure 11.1a that
this traveling sound wave causes air to move outward from
the speaker into the environment. What is actually happening is analogous to the ripples created by a pebble dropped
into a still pool of water (Figure 11.1b). As the ripples move
outward from the pebble, the water at any particular place
moves up and down. This becomes obvious when you realize that the ripples would cause a toy boat to bob up and
down, not to move outward. Similarly, although air pressure
changes move outward from the speaker, the air molecules at
each location move back and forth, but stay in about the
same place. What is transmitted is the pattern of increases
and decreases in pressure that eventually reach the listener’s
ear. (Note that this is different from what occurs when waves
are pounding on a beach. In that case, water moves in and
back; in contrast to our boat in the pond, a small boat near
the shore could be carried ashore on an incoming wave.)

Figure 11.1 ❚ (a) The effect of a vibrating
speaker diaphragm on the surrounding air. Dark
areas represent regions of high air pressure, and
light areas represent areas of low air pressure.
(b) When a pebble is dropped into still water,
the resulting ripples appear to move outward.
However, the water is actually moving up and
down, as indicated by movement of the boat.
A similar situation exists for the sound waves
produced by the speaker in (a). [Figure labels: increase in pressure
(condensation); decrease in pressure (rarefaction).]

Pressure Changes: Pure Tones
To describe the pressure changes associated with sound, we
will first focus on a simple kind of sound wave called a pure
tone. A pure tone occurs when pressure changes in the air
occur in a pattern described by a mathematical function
called a sine wave, as shown in Figure 11.2. Tones with this
pattern of pressure changes are occasionally found in the
environment. A person whistling or the high-pitched notes
produced by a flute are close to pure tones. Tuning forks,
which are designed to vibrate with a sine-wave motion, also
produce pure tones. For laboratory studies of hearing, computers generate pure tones that cause a speaker diaphragm
to vibrate in and out with a sine-wave motion. This vibration can be described by noting its amplitude (the size of
the pressure change) and its frequency (the number of times
per second that the pressure changes repeat).

Figure 11.2 ❚ (a) Plot of sine-wave pressure changes
for a pure tone. (b) Pressure changes are indicated, as in
Figure 11.1, by darkening (pressure increased relative to
atmospheric pressure) and lightening (pressure decreased
relative to atmospheric pressure). [Axes: air pressure versus time.
Figure labels: increased pressure; atmospheric pressure; decreased
pressure; amplitude; one cycle.]

Amplitude
One way to specify a sound’s amplitude
would be to indicate the difference in pressure between the
high and low peaks of the sound wave. Figure 11.3 shows
three pure tones with different amplitudes. The physical
property of amplitude is associated with our experience of
loudness, with higher amplitudes associated with louder
sounds.

Figure 11.3 ❚ Three different amplitudes of a pure tone.
Larger amplitude is associated with the perception of greater
loudness. [Axes: air pressure (physical) versus time, with amplitude
running from low to high and loudness (perceptual) from less loud to louder.]

The range of amplitudes we can encounter in the environment is extremely large, as shown in Table 11.1, which
shows the relative amplitudes of some environmental
sounds. We can dramatize the size of the range of amplitudes as follows: If the pressure change plotted in the middle record of Figure 11.3, in which the sine wave is about
1/2-inch high on the page, represented the amplitude associated with a sound we can just barely hear, then to plot the
graph for a very loud sound, such as you might hear at a
rock concert, you would need to make the sine wave several
miles high! Since this is somewhat impractical, auditory researchers have devised a unit of sound called the decibel,
which converts the large range of sound pressure into
a more manageable scale (VL 1).

TABLE 11.1 ❚ Relative Amplitudes and Decibels for Environmental Sounds

SOUND                                      RELATIVE AMPLITUDE    DECIBELS (DB)
Barely audible (threshold)                 1                     0
Leaves rustling                            10                    20
Quiet residential community                100                   40
Average speaking voice                     1,000                 60
Express subway train                       100,000               100
Propeller plane at takeoff                 1,000,000             120
Jet engine at takeoff (pain threshold)     10,000,000            140

The following equation is used to convert sound pressure into decibels:

$$\mathrm{dB} = 20 \times \log(p/p_o)$$

where dB stands for decibels, log is the base-10 logarithm, $p$ is the sound pressure of
the stimulus, and $p_o$ is a standard sound pressure, usually
set at 20 micropascals, where a pascal is a unit of pressure
and 20 micropascals is a pressure near the threshold for
human hearing. We can use this equation to calculate the
decibels for a 20 micropascal tone ($p = 20$) as follows:

$$\mathrm{dB} = 20\log(p/p_o) = 20\log(20/20) = 20 \times \log 1 = 20 \times 0 = 0\ \mathrm{dB\ SPL}$$

(Note: $\log 1 = 0$)
Adding the notation SPL, for sound pressure level, indicates that we have used the standard pressure of 20 micropascals as $p_o$ in our calculation. In referring to the decibels or sound pressure of a sound stimulus, the term level
or sound level is usually used.
Now let’s calculate dB for two higher pressure levels.
First, we multiply pressure by 10, so $p = 200$:

$$\mathrm{dB} = 20\log(p/p_o) = 20\log(200/20) = 20 \times \log 10 = 20 \times 1 = 20\ \mathrm{dB\ SPL}$$

(Note: $\log 10 = 1$)
Notice that multiplying pressure by 10 adds 20 dB.
Now let’s multiply by 10 again, so $p = 2{,}000$:

$$\mathrm{dB} = 20\log(p/p_o) = 20\log(2{,}000/20) = 20 \times \log 100 = 20 \times 2 = 40\ \mathrm{dB\ SPL}$$

(Note: $\log 100 = 2$)
Notice that multiplying pressure by 10 again adds another 20 dB.
Because multiplying pressure by 10 only adds 20 dB, a
large increase in amplitude causes a much smaller increase
in dB. The right column of Table 11.1 shows that a range of
amplitudes from 1 to 10,000,000 results in a range of decibels from 0 to 140.
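The decibel calculations above can be checked with a short Python sketch. The function name `db_spl` and the constant `P_REF` are illustrative choices, not notation from the text; the only assumptions are the ones the text states, namely a reference pressure of 20 micropascals and a base-10 logarithm.

```python
import math

P_REF = 20.0  # standard pressure p_o in micropascals, as given in the text


def db_spl(pressure_micropascals: float) -> float:
    """Convert a sound pressure (micropascals) to decibels SPL."""
    return 20.0 * math.log10(pressure_micropascals / P_REF)


# The three worked examples from the text: p = 20, 200, and 2,000.
for p in (20.0, 200.0, 2000.0):
    print(f"p = {p:g} micropascals -> {db_spl(p):.0f} dB SPL")

# Relative amplitudes from Table 11.1, treated as multiples of the
# threshold pressure P_REF.
for relative_amplitude in (1, 10, 100, 1_000, 100_000, 1_000_000, 10_000_000):
    level = db_spl(relative_amplitude * P_REF)
    print(f"{relative_amplitude:>10,} x threshold -> {level:.0f} dB SPL")
```

Each tenfold step in pressure adds 20 dB, so the seven-order-of-magnitude range of amplitudes in Table 11.1 is compressed into the 0 to 140 dB range.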
Frequency Frequency, the other characteristic of a pure
tone, is illustrated in Figure 11.4, which shows three different frequencies. Frequency, the number of cycles per second
the change in pressure repeats, is the physical measure associated with our perception of pitch, with higher frequencies
associated with higher pitches.
Frequency is indicated in units called Hertz (Hz), in
which 1 Hz is 1 cycle per second. Thus, the middle stimulus
in Figure 11.4, which repeats five times in a second, would be
a 5-Hz tone. As we will see, humans can perceive frequencies
ranging from about 20 Hz to 20,000 Hz.
Pure tones are important because they are simple and
because they have been used extensively in auditory research. Pure tones are, however, rare in the environment.
Sounds in the environment, such as those produced by musical instruments, people speaking, and the various sounds
produced by nature and machines, have waveforms that
are more complex than the pure tone’s sine-wave pattern of
pressure changes.
Figure 11.4 ❚ Three different frequencies of a pure tone.
Higher frequencies are associated with the perception of
higher pitches.
Pressure Changes: Complex Tones
To describe complex tones, we will focus on sounds created
by musical instruments (in Chapter 13 we will consider
sound produced when people speak). Figure 11.5a shows the
waveform of a complex tone that would be created by a musical instrument. Notice that the waveform repeats. For example, the waveform in Figure 11.5a repeats four times. This
property of repetition means that this complex tone, like a
pure tone, is a periodic tone. The repetition rate of a complex
tone is called the fundamental frequency of the tone.
An important property of periodic complex tones is that
they consist of a number of pure tones. Because of this, we
can “build” a complex tone by using a technique called additive synthesis, in which a number of sine-wave components
are added together to create the complex tone. The starting
point for creating a complex tone by additive synthesis is
a single pure tone, like the one in Figure 11.5b, which has
a frequency equal to the complex tone’s fundamental frequency. The frequency of this fundamental is 200 Hz. We
then add to the fundamental additional pure tones, each of
which has a frequency that is a multiple of the fundamental. For the 200-Hz fundamental, the frequency of the second tone is 400 Hz (Figure 11.5c), the frequency of the third
tone is 600 Hz (Figure 11.5d), and the fourth is 800 Hz
(Figure 11.5e). These additional tones are higher harmonics
of the tone. Adding the fundamental (also called the first
harmonic) and the higher harmonics results in the waveform of the complex tone.
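Additive synthesis as described above can be sketched in Python. The example below is only an illustration: it assumes that all four harmonics have equal amplitude and uses an arbitrary sampling rate of 8,000 samples per second, neither of which is specified in the text, so the resulting waveform will not match Figure 11.5a in detail.

```python
import math

FUNDAMENTAL_HZ = 200.0   # fundamental (first harmonic) from the text
SAMPLE_RATE_HZ = 8000    # assumed sampling rate for this sketch

def complex_tone(t, harmonic_numbers=(1, 2, 3, 4)):
    """Sum equal-amplitude sine waves at multiples of the fundamental."""
    return sum(
        math.sin(2 * math.pi * FUNDAMENTAL_HZ * h * t)
        for h in harmonic_numbers
    )

# 20 ms of the tone, which is four repetitions at 200 Hz.
samples = [complex_tone(n / SAMPLE_RATE_HZ) for n in range(160)]

# One period of a 200-Hz repetition rate is 5 ms, or 40 samples here.
period_samples = int(SAMPLE_RATE_HZ / FUNDAMENTAL_HZ)
repeats = all(
    abs(samples[n] - samples[n + period_samples]) < 1e-9
    for n in range(len(samples) - period_samples)
)
print("harmonic frequencies:", [FUNDAMENTAL_HZ * h for h in (1, 2, 3, 4)])
print("waveform repeats every 5 ms:", repeats)
```

Changing the amplitudes of the individual harmonics would change the shape of the waveform but not its repetition rate.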
Another way to represent the harmonic components of
a complex tone is by frequency spectra, shown on the right
of Figure 11.5. The position of each line on the horizontal
axis indicates the harmonic’s frequency, and the height of
the line indicates the harmonic’s amplitude. Frequency
spectra provide a way of indicating a complex tone’s fundamental frequency and harmonics without drawing the
tone’s waveform.
Figure 11.5 ❚ Left: Waveforms of (a) a complex periodic
sound with a fundamental frequency of 200 Hz; (b) fundamental (first harmonic) = 200 Hz; (c) second harmonic =
400 Hz; (d) third harmonic = 600 Hz; (e) fourth harmonic =
800 Hz. Right: Frequency spectra for the tones on the left.
(Adapted from Plack, 2005.)
Figure 11.6 ❚ (a) The complex tone from Figure 11.5a,
with its frequency spectrum; (b) the same tone with its first
harmonic removed. (Adapted from Plack, 2005.)
Figure 11.6 shows what happens if we remove the first
harmonic of a complex tone. The tone in Figure 11.6a is the
one from Figure 11.5a, which has a fundamental frequency
of 200 Hz. The tone in Figure 11.6b is the same tone with
the first harmonic (200 Hz) removed. Note that removing a
harmonic changes the tone’s waveform, but that the repetition rate remains the same. Even though the fundamental
is no longer present, the repetition rate, which is still 200
Hz, indicates the frequency of the fundamental.
You may wonder why the repetition rate remains the
same even though the fundamental has been removed.
Looking at the frequency spectrum on the right, we can see
that the distance between harmonics equals the fundamental frequency. When the fundamental is removed, this spacing remains, so there is still information in the waveform indicating the frequency of the fundamental. In the following
section, we will see that since a tone’s pitch (perceiving a tone
as “high” or “low”) is related to repetition rate, removing
the fundamental does not change the tone’s pitch, but the
changed waveform does affect our perception of other qualities of the tone.
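The claim that the repetition rate survives removal of the fundamental can be checked numerically. In the sketch below, the repetition rate is taken to be the greatest common divisor of the harmonic frequencies, which equals the spacing between adjacent harmonics for the tones discussed here. Equal harmonic amplitudes and the helper names `tone` and `repetition_rate` are assumptions made for illustration.

```python
import math
from functools import reduce

def tone(t, harmonic_freqs_hz):
    """Equal-amplitude sum of sine waves at the given frequencies."""
    return sum(math.sin(2 * math.pi * f * t) for f in harmonic_freqs_hz)

def repetition_rate(harmonic_freqs_hz):
    """Greatest common divisor of the (integer) harmonic frequencies."""
    return reduce(math.gcd, harmonic_freqs_hz)

full_tone = [200, 400, 600, 800]        # fundamental plus harmonics 2 to 4
missing_fundamental = [400, 600, 800]   # first harmonic removed

print(repetition_rate(full_tone))            # 200
print(repetition_rate(missing_fundamental))  # 200

# The waveform without the fundamental still repeats every 1/200 s (5 ms),
# but its shape differs from that of the full tone.
t = 0.0012
print(abs(tone(t, missing_fundamental) - tone(t + 0.005, missing_fundamental)) < 1e-9)
print(abs(tone(t, full_tone) - tone(t, missing_fundamental)) > 1e-3)
```

Both tones report a 200-Hz repetition rate, while the last line confirms that the two waveforms are not identical.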
Perceiving Sound
As we described the physical characteristics of the sound
stimulus, we mentioned the connection between amplitude (physical) and loudness (perceptual) (Figure 11.3) and
between frequency (physical) and pitch (perceptual) (Figure
11.4). Let’s now look more closely at the perceptual qualities
of sound.
Loudness
Loudness is the quality most closely related to the amplitude or sound pressure, which is also called the level of an
auditory stimulus. Thus, decibels are often associated with
loudness, as shown in Table 11.1, which indicates that a
sound with zero decibels is just barely detectable and 120 dB
is extremely loud.
Figure 11.7 shows the relationship between decibels
and loudness for a pure tone, determined by S. S. Stevens’s
magnitude estimation procedure (see Chapter 1, page 16). In
this experiment, loudness was judged relative to a standard
of a 1,000-Hz tone at 40 dB, which was assigned a value of 1.
Thus, a tone that sounds 10 times louder than this standard would be judged to have a loudness of 10. This curve
indicates that increasing the sound level by 10 dB
almost (but not quite) doubles the sound’s loudness.
Figure 11.7 ❚ Loudness of a 1,000-Hz tone as a function
of intensity, determined using magnitude estimation. The
dashed lines show that increasing the intensity by 10 dB
almost doubles the loudness. (Adapted from Gulick,
Gescheider, & Frisina, 1989.)
Although decibels and loudness are related, it is important to distinguish between them. Decibels are a physical
measure, whereas loudness is psychological. To appreciate the
physical nature of dB, all you have to do is look back at the
equation that indicates how dB are calculated. Notice that
decibels are defined in terms of pressure, not perception.
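The difference between the physical decibel scale and perceived loudness can be illustrated with a small calculation. The sketch below assumes that loudness grows as a power function of sound intensity with an exponent of 0.3. That exponent is a commonly quoted approximation and is not given in the text, so the numbers are illustrative rather than a reproduction of Figure 11.7. The standard is a tone at 40 dB, which is assigned a loudness of 1.

```python
LOUDNESS_EXPONENT = 0.3  # assumed power-law exponent (not from the text)

def relative_loudness(level_db, standard_db=40.0):
    """Estimated loudness relative to a standard tone assigned a value of 1."""
    intensity_ratio = 10 ** ((level_db - standard_db) / 10.0)
    return intensity_ratio ** LOUDNESS_EXPONENT

for level in (40, 50, 60, 70):
    print(level, "dB ->", round(relative_loudness(level), 3))
```

Under this assumption each 10-dB step multiplies the estimated loudness by a little less than 2, consistent with the statement that a 10-dB increase almost doubles loudness.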
Pitch
Pitch, the perceptual quality we describe as “high” or “low,”
is defined as the attribute of auditory sensation in terms of which
sounds may be ordered on a musical scale (Bendor & Wang,
2005). We have seen that pitch is most closely related to
the physical property of frequency. Low fundamental frequencies are associated with low pitches (like the sound of
a tuba), and high fundamental frequencies are associated
with high pitches (like the sound of a piccolo).
Tone height is the perceptual experience of increasing
pitch that accompanies increases in a tone’s fundamental frequency. Starting at the lowest note on the piano, at
the left end of the keyboard (fundamental frequency =
27.5 Hz), and moving to the right toward the highest note
(fundamental = 4,186 Hz) creates the perception of
increasing tone height (Figure 11.8).
In addition to the increase in tone height that occurs as
we move from the low to the high end of the piano keyboard,
something else happens: the letters of the notes A, B, C, D, E,
F, and G repeat, and we notice that notes with the same letter
sound similar. Because of this similarity, we say that notes
with the same letter have the same tone chroma. Every time
we pass the same letter on the keyboard, we have gone up an
interval called an octave. Tones separated by octaves have the
same tone chroma. For example, each of the A’s in Figure 11.8,
indicated by the arrows, has the same tone chroma.
Interestingly, notes with the same chroma have fundamental frequencies that are multiples of one another. Thus,
A0 has a fundamental frequency of 27.5 Hz, A1’s is 55 Hz,
A2’s is 110 Hz, and so on. Somehow this doubling of frequency for each octave results in similar perceptual experiences. Thus, a male with a low-pitched voice and a female
with a high-pitched voice can be regarded as singing “in
unison,” even when their voices are separated by an octave
or more.
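The doubling of frequency with each octave is easy to verify. The short sketch below starts from the lowest A on the piano (27.5 Hz, as given in the text) and doubles the frequency for each successive A. The function names are introduced here for illustration only.

```python
import math

LOWEST_A_HZ = 27.5  # lowest note on the piano, from the text

def a_frequencies(count=8):
    """Fundamental frequencies of successive A's, one octave apart."""
    return [LOWEST_A_HZ * 2 ** k for k in range(count)]

def octaves_apart(f_low_hz, f_high_hz):
    """Number of octaves separating two frequencies."""
    return math.log2(f_high_hz / f_low_hz)

print(a_frequencies())             # 27.5, 55.0, 110.0, ... 3520.0
print(octaves_apart(27.5, 440.0))  # 4.0
```

Two tones have the same tone chroma when the number of octaves between them is a whole number.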
We have been describing how pitch is associated with
fundamental frequency, but let’s consider what happens
when the fundamental frequency is not present in a complex tone. Remember, from Figure 11.6, that removing the
first harmonic changes a tone’s waveform but not its repetition rate and that because the tone’s repetition rate remains
the same, the tone’s pitch remains the same. The pitch,
therefore, is determined not by the presence of the fundamental frequency, but by information, such as the spacing
of the harmonics and the repetition rate of the waveform,
that indicates the fundamental frequency.
The constancy of pitch, even when the fundamental
or other harmonics are removed, is called the effect of the
missing fundamental, and the pitch that we perceive in
tones that have had harmonics removed is called periodicity pitch. We will see soon, when we discuss a quality of
tones called timbre, that although removing the fundamental does not affect a tone’s pitch, it does cause a tone to sound
different, just as an oboe and a trumpet that are
playing the same note sound different.
The phenomenon of periodicity pitch has a number of
practical consequences. Consider, for example, what happens when you listen to someone talking to you on the
telephone. Even though the telephone does not reproduce
frequencies below about 300 Hz, we hear the low pitch of
a male voice, which contains frequencies below 300 Hz,
because of periodicity pitch created by higher harmonics
(Truax, 1984).
The Range of Hearing
Just as we see light only within a narrow band of wavelengths called the visible spectrum, we hear sound only
within a specific range of frequencies, called the range of
hearing.
The Audibility Curve The human range of hearing
is depicted by the green curve in Figure 11.9. This is the audibility curve, which indicates the threshold for hearing
determined by free-field presentation (listening to a loudspeaker) versus frequency. This curve indicates that the
range of hearing is between about 20 Hz and 20,000 Hz
and that we are most sensitive (the threshold for hearing is
lowest) at frequencies between 2,000 and 4,000 Hz, which
happens to be the range of frequencies that is most
important for understanding speech.
The light green area above the audibility curve is called
the auditory response area because we can hear tones
that fall within this area. At intensities below the audibility curve, we can’t hear a tone. For example, we wouldn’t be
able to hear a 30-Hz tone at 40 dB SPL (point A). The upper
boundary of the auditory response area is the curve marked
“threshold of feeling.” Tones with these high amplitudes
are the ones we can “feel”; they can become painful and can
cause damage to the auditory system.
Figure 11.8 ❚ A piano keyboard, indicating the frequency associated with each key. Moving up the
keyboard to the right increases frequency and tone height. Notes with the same letter, like the A’s
(arrows), have the same tone chroma.
Frequency (Hz) of each key, by note letter (rows) and octave number (columns):
Note   0      1      2      3      4      5      6       7       8
C      -      32.7   65.4   130.8  261.6  523.2  1046.5  2093.0  4186.0
D      -      36.7   73.4   146.8  293.7  587.3  1174.7  2349.3  -
E      -      41.2   82.4   164.8  329.6  659.2  1318.5  2637.0  -
F      -      43.7   87.3   174.6  349.2  698.5  1396.9  2793.0  -
G      -      49.0   98.0   196.0  392.0  784.0  1568.0  3136.0  -
A      27.5   55.0   110.0  220.0  440.0  880.0  1760.0  3520.0  -
B      30.9   61.7   123.5  246.9  493.9  987.8  1975.5  3951.1  -
Figure 11.9 ❚ The audibility curve and the auditory
response area. Hearing occurs in the light green area
between the audibility curve (the threshold for hearing)
and the upper curve (the threshold for feeling). Tones with
combinations of dB and frequency that place them in the
light red area below the audibility curve cannot be heard.
Tones above the threshold of feeling result in pain. Where
the dashed line at 10 dB traverses the auditory response
area indicates which frequencies can be heard at 10 dB SPL.
(From Fletcher & Munson, 1933.)
Although humans hear frequencies between about 20
and 20,000 Hz, other animals can hear frequencies outside
the range of human hearing. Elephants can hear stimuli below 20 Hz. Above the high end of the human range, dogs
can hear frequencies above 40,000 Hz, cats can hear above
50,000 Hz, and the upper range for dolphins extends as
high as 150,000 Hz.
Loudness Depends on Sound Pressure
and Frequency The audibility curve and auditory response
area indicate that the loudness of pure tones depends not
only on sound pressure but also on frequency. We can appreciate how loudness depends on frequency by comparing
the loudness of two tones that have the same dB level but
different frequencies. For example, point B in Figure 11.9
indicates where a 40-dB SPL 100-Hz tone is located in the
response area, and point C indicates where a 40-dB SPL
1,000-Hz tone is located.
We can tell that these two tones would have very different loudnesses by considering their location relative to
the audibility curve. The 100-Hz tone is located just above
the audibility curve, so it is just above threshold and would
just barely be heard. However, the 1,000-Hz tone is far above
threshold, well into the auditory response area, so it would
be much louder than the 100-Hz tone. Thus, to determine
the loudness of any tone we need to know both its dB level
and its frequency.
Another way to understand the relationship between
loudness and frequency is by looking at the equal loudness
curves in Figure 11.9. These curves indicate the number of
decibels that create the same perception of loudness at different frequencies. An equal loudness curve is determined
by presenting a standard tone of one frequency and dB level
and having a listener adjust the level of tones with frequencies across the range of hearing to match the loudness of the
standard. For example, the curve marked 40 in Figure 11.9
was determined by matching the loudness of frequencies
across the range of hearing to the loudness of a 1,000-Hz
40-dB SPL tone. Similarly, the curve marked 80 was determined by matching the loudness of different frequencies to
a 1,000-Hz 80-dB SPL tone.
Notice that the audibility curve and the equal loudness
curve marked 40 bend up at high and low frequencies, but
the equal loudness curve marked 80 is flat between 30 and
5,000 Hz, meaning that tones at a level of 80 dB are equally
loud between these frequencies. The difference between the
relatively flat 80 curve and the upward-bending curves at
lower decibel levels explains something that happens as you
adjust the volume control on your stereo system.
If you are playing music at a fairly high level (say, 80 dB
SPL), you should be able to easily hear each of the frequencies in the music because, as the equal loudness curve for 80
indicates, all frequencies between about 20 Hz and 5,000 Hz
sound equally loud at this level. However, when you turn the
level down to 10 dB SPL, not all frequencies sound equally
loud. In fact, from the audibility curve in Figure 11.9 we can
see that frequencies below about 400 Hz (the bass notes)
and above about 12,000 Hz (the treble notes) are inaudible
at 10 dB. (Notice that the dashed 10-dB line crosses the audibility curve at about 400 Hz and 12,000 Hz.) This means
that frequencies lower than 400 Hz and higher than 12,000
Hz are not audible at 10 dB.
Being unable to hear very low and very high frequencies
at low dB levels means that when you play music softly you
won’t hear the very low or very high frequencies. To compensate for this, some stereo receivers have a button labeled
“loudness” which boosts the level of very high and very low
frequencies when the volume control is turned down. (There
are also loudness settings on some MP3 players.) This enables
you to hear these frequencies even when the music is soft.
Timbre
Another perceptual quality of tones, in addition to pitch
and loudness, is timbre (pronounced TIM-ber or TAM-ber).
Timbre is the quality that distinguishes between two tones
that have the same loudness, pitch, and duration, but still
sound different. For example, when a flute and a bassoon
play the same note with the same loudness, we can still tell
the difference between these two instruments. We might describe the sound of the flute as clear or mellow and the sound
of the bassoon as nasal or reedy. When two tones have the
same loudness, pitch, and duration, but sound different,
this difference is a difference in timbre.
Timbre is closely related to the harmonic structure of
a tone. In Figure 11.10, frequency spectra indicate the harmonics of a guitar, a bassoon, and an alto saxophone playing the note G3 with a fundamental frequency of 196 Hz.
Both the relative strengths of the harmonics and the number of harmonics are different in these instruments. For example, the guitar has more high-frequency harmonics than
either the bassoon or the alto saxophone. Although the frequencies of the harmonics are always multiples of the fundamental frequency, harmonics may be absent, as is true of
some of the high-frequency harmonics of the bassoon and
the alto saxophone.
The difference in the harmonics of different instruments is one factor that causes musical instruments to have
different timbres. Timbre also depends on the time course
of the tone’s attack (the buildup of sound at the beginning
of the tone) and on the time course of the tone’s decay (the
decrease in sound at the end of the tone). Thus, it is easy to
tell the difference between a tape recording of a high note
played on the clarinet and a recording of the same note
played on the flute when the attack, the decay, and the sustained portion of the tone are heard. It is, however, difficult
to distinguish between the same instruments when the
tone’s attack and decay are eliminated by erasing the first
and last 1/2-second of the recording (Berger, 1964;
also see Risset & Mathews, 1969).
Figure 11.10 ❚ Frequency spectra for a guitar, a bassoon,
and an alto saxophone playing a tone with a fundamental
frequency of 196 Hz. The position of the lines on the
horizontal axis indicates the frequencies of the harmonics,
and their height indicates their intensities. (From Olson, 1967.)
Another way to make it difficult to distinguish one
instrument from another is to play an instrument’s tone
backward. Even though this does not affect the tone’s harmonic structure, a piano tone played backward sounds
more like an organ than a piano because the tone’s original decay has become the attack and the attack has become
the decay (Berger, 1964; Erickson, 1975). Thus, timbre depends both on the tone’s steady-state harmonic structure
and on the time course of the attack and decay of the
tone’s harmonics.
The sounds we have been considering so far—pure
tones and the tones produced by musical instruments—are
all periodic sounds. That is, the pattern of pressure changes
repeats, as in the tone in Figure 11.5a. There are also aperiodic sounds, which have sound waves that do not repeat.
Examples of aperiodic sounds would be a door slamming
shut, people talking, and noises such as the static on a radio not tuned to a station. The sounds produced by these
events are more complex than musical tones, but many of
these sound stimuli can also be analyzed into a number
of simpler frequency components. We will describe how
we perceive speech stimuli in Chapter 13. We will focus in
this chapter on pure tones and musical tones because these
sounds are the ones that have been used in most of the basic
research on the operation of the auditory system. In the next
section, we will begin considering how the sound stimuli we
have been describing are processed by the auditory system
so that we can experience sound.
TEST YOURSELF 11.1
1. What are some of the functions of sound? Especially
note what information sound provides that is not
provided by vision.
2. What are two possible definitions of sound?
(Remember the tree falling in the forest.)
3. What are the amplitude and frequency of
sound? Why was the decibel scale developed to
measure amplitude? Is decibel “perceptual” or
“physical”?
4. What is the relationship between amplitude and
loudness? Which one is physical, and which one is
perceptual?
5. How is frequency related to pitch, tone height, and
tone chroma? Which of these is physical, and which
is perceptual?
6. What is the audibility curve, and what does it tell
us about what tones we can experience and about
the relationship between a tone’s frequency and its
loudness?
7. What is timbre? Describe the characteristics of complex tones and how these characteristics determine
timbre.
The Ear
The auditory system must accomplish three basic tasks before we can hear. First, it must deliver the sound stimulus to
the receptors. Second, it must transduce this stimulus from
pressure changes into electrical signals, and third, it must
process these electrical signals so they can indicate qualities of the sound source such as pitch, loudness, timbre, and
location.
We begin our description of how the auditory system accomplishes these tasks by focusing on the ear, at the beginning of the auditory system. Our first question, “How does
energy from the environment reach the receptors?” takes us
on a journey through what Diane Ackerman (1990) has described as a device that resembles “a contraption some ingenious plumber has put together from spare parts.” An overall view of this “contraption” is shown in Figure 11.11. The
ear is divided into three divisions: outer, middle, and inner.
We begin with the outer ear.
The Outer Ear
When we talk about ears in everyday conversation, we are
usually referring to the pinnae, the structures that stick
out from the sides of the head. Although this most obvious
part of the ear is important in helping us determine the location of sounds and is of great importance for those who
wear eyeglasses, it is the part of the ear we could most easily
do without. The major workings of the ear are found within
the head, hidden from view.
Sound waves first pass through the outer ear, which
consists of the pinna and the auditory canal (Figure 11.11).
The auditory canal is a tubelike structure about 3 cm long
in adults that protects the delicate structures of the middle
ear from the hazards of the outside world. The auditory canal’s 3-cm recess, along with its wax, protects the delicate
tympanic membrane, or eardrum, at the end of the canal
and helps keep this membrane and the structures in the
middle ear at a relatively constant temperature.
In addition to its protective function, the outer ear
has another effect: to enhance the intensities of some
sounds by means of the physical principle of resonance.
Resonance occurs in the auditory canal when sound
waves that are reflected back from the closed end of the auditory canal interact with sound waves that are entering
the canal. This interaction reinforces some of the sound’s
frequencies, with the frequency that is reinforced the most
being determined by the length of the canal. The frequency
reinforced the most is called the resonant frequency of
the canal.
We can appreciate how the resonant frequency depends
on the length of the canal by noting how the tone produced
by blowing across the top of a soda bottle changes as we
drink more soda. Drinking more soda increases the length
of the air path inside the bottle, which decreases the resonant frequency, and this creates a lower-pitched tone. Measurements of the sound pressures inside the ear indicate
that the resonance that occurs in the auditory canal has a
slight amplifying effect on frequencies between about 1,000
and 5,000 Hz.
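The dependence of resonant frequency on length can be approximated with a simple physical model. The sketch below treats the auditory canal as an idealized straight tube that is open at one end and closed at the other, whose lowest resonance occurs when the tube is one quarter of a wavelength long. This quarter-wavelength model and the speed of sound used (343 m/s, for air at roughly room temperature) are assumptions introduced here, not values from the text, and a real ear canal or soda bottle is only roughly described by them.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at about 20 degrees C

def quarter_wave_resonance_hz(tube_length_m):
    """Lowest resonant frequency of an idealized tube closed at one end."""
    return SPEED_OF_SOUND_M_PER_S / (4.0 * tube_length_m)

# A 3-cm canal, the adult length given in the text.
print(round(quarter_wave_resonance_hz(0.03)))

# A longer air column resonates at a lower frequency.
for length_m in (0.05, 0.10, 0.20):
    print(length_m, "m ->", round(quarter_wave_resonance_hz(length_m)), "Hz")
```

For a 3-cm tube this idealized estimate falls within the 1,000 to 5,000 Hz range in which the canal is reported to amplify sound.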
The Middle Ear
When airborne sound waves reach the tympanic membrane
at the end of the auditory canal, they set it into vibration,
and this vibration is transmitted to structures in the middle
ear, on the other side of the tympanic membrane. The middle ear is a small cavity, about 2 cubic centimeters in volume, which separates the outer and inner ears (Figure 11.12).
This cavity contains the ossicles, the three smallest bones in
the body. The first of these bones, the malleus (also known
as the hammer), is set into vibration by the tympanic membrane, to which it is attached, and transmits its vibrations
to the incus (or anvil), which, in turn, transmits its vibrations to the stapes (or stirrup). The stapes then transmits
its vibrations to the inner ear by pushing on the membrane
covering the oval window.
Why are the ossicles necessary? We can answer this
question by noting that both the outer ear and middle ear
are filled with air, but the inner ear contains a watery liquid
that is much denser than the air (Figure 11.13). The mismatch between the low density of the air and the high density of this liquid creates a problem: pressure changes in the
air are transmitted poorly to the much denser liquid. This
mismatch is illustrated by the difficulty you would have
hearing people talking to you if you were underwater and
they were above the surface.
Figure 11.11 ❚ The ear, showing its three subdivisions—
outer, middle, and inner. (From Lindsay & Norman, 1977.)
If vibrations had to pass directly from the air in the
middle ear to the liquid in the inner ear, less than 1 percent of the vibrations would be transmitted (Durrant &
Lovrinic, 1977). The ossicles help solve this problem in two
ways: (1) by concentrating the vibration of the large tympanic membrane onto the much smaller stapes, which increases the pressure by a factor of about 20 (Figure 11.14a);
and (2) by being hinged to create a lever action that creates
an effect similar to what happens when a fulcrum is placed
under a board, so pushing down on the long end of the
board makes it possible to lift a heavy weight on the short
end (Figure 11.14b). We can appreciate the effect of the ossicles by noting that in patients whose ossicles have been
damaged beyond surgical repair, it is necessary to increase
the sound by a factor of 10 to 50 to achieve the same hearing as when the ossicles were functioning.
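The two mechanisms can be expressed as simple ratios. In the sketch below, the membrane areas and lever-arm lengths are hypothetical values chosen only so that the area ratio comes out near the factor of 20 mentioned above; they are not measurements from the text. The calculation assumes that the same force acts on both surfaces, so that pressure (force divided by area) rises as the area shrinks.

```python
def pressure_gain(large_area, small_area):
    """Pressure increase when the same force acts on a smaller area."""
    return large_area / small_area

def lever_force_gain(long_arm, short_arm):
    """Force gain of an ideal lever: long arm length over short arm length."""
    return long_arm / short_arm

# Hypothetical areas in square centimeters (illustrative only).
TYMPANIC_MEMBRANE_AREA_CM2 = 0.6
STAPES_FOOTPLATE_AREA_CM2 = 0.03

# Hypothetical lever-arm lengths in arbitrary units (illustrative only).
LONG_ARM = 1.3
SHORT_ARM = 1.0

area_gain = pressure_gain(TYMPANIC_MEMBRANE_AREA_CM2, STAPES_FOOTPLATE_AREA_CM2)
lever_gain = lever_force_gain(LONG_ARM, SHORT_ARM)
print("gain from area ratio:", round(area_gain, 1))
print("gain from lever action:", round(lever_gain, 1))
print("combined gain:", round(area_gain * lever_gain, 1))
```

With these illustrative numbers, most of the gain comes from concentrating the vibration onto the smaller stapes, and the lever action adds a smaller further increase.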
Not all animals require the concentration of pressure
and lever effect provided by the ossicles in the human. For
example, there is only a small mismatch between the density of water, which transmits sound in a fish’s environment,
and the liquid inside the fish’s ear. Thus, fish have no outer
or middle ear.
The middle ear also contains the middle-ear muscles,
the smallest skeletal muscles in the body. These muscles are
attached to the ossicles, and at very high sound intensities
they contract to dampen the ossicles’ vibration, thereby protecting the structures of the inner ear against potentially
painful and damaging stimuli.
Figure 11.12 ❚ The middle ear. The three bones of the
middle ear transmit the vibrations of the tympanic membrane
to the inner ear.
Figure 11.13 ❚ Environments inside the outer, middle, and
inner ears. The fact that liquid fills the inner ear poses a
problem for the transmission of sound vibrations from the air
of the middle ear.
Figure 11.14 ❚ (a) A diagrammatic representation of the
tympanic membrane and the stapes, showing the difference
in size between the two. (b) How lever action can amplify a
small force, presented on the right, to lift the large weight
on the left. The lever action of the ossicles amplifies the
sound vibrations reaching the inner ear. (Adapted
from Schubert, 1980.)
The Inner Ear
The main structure of the inner ear is the liquid-filled
cochlea, the snail-like structure shown in green in Figure
11.11, and shown partially uncoiled in Figure 11.15a. The
liquid inside the cochlea is set into vibration by the movement of the stapes against the oval window. We can see the
structure inside the cochlea more clearly by imagining how
it would appear if uncoiled to form a long straight tube
(Figure 11.15b). The most obvious feature of the uncoiled
cochlea is that the upper half, called the scala vestibuli, and
the lower half, called the scala tympani, are separated by a
structure called the cochlear partition. This partition extends almost the entire length of the cochlea, from its base
near the stapes to its apex at the far end. Note that this diagram is not drawn to scale and so doesn’t show the cochlea’s
true proportions. In reality, the uncoiled cochlea would be a
cylinder 2 mm in diameter and 35 mm long.
We can best see the structures within the cochlear partition by taking a cross section cut of the cochlea, as shown
in Figure 11.15b, and looking at the cochlea end-on and in
cross section, as in Figure 11.16a. When we look at the cochlea in this way, we see that the cochlear partition contains
a large structure called the organ of Corti. Figure 11.16b
shows the following key structures of the organ of Corti.
■ The hair cells, shown in red in Figure 11.16b, and in
Figure 11.17, which is a view looking down on the
organ of Corti, are the receptors for hearing. The cilia,
which protrude from the tops of the cells, are where
the sound acts to produce electrical signals. There
are two types of hair cells, the inner hair cells and the
outer hair cells. There are about 3,500 inner hair cells
and 12,000 outer hair cells in the human ear (Møller,
2000).
■ The basilar membrane supports the organ of Corti
and vibrates in response to sound.
■ The tectorial membrane extends over the hair cells.
Figure 11.15 ❚ (a) A partially uncoiled
cochlea. (b) A fully uncoiled cochlea. The
cochlear partition, indicated here by a line,
actually contains the basilar membrane
and the organ of Corti, which are shown in
Figures 11.16 and 11.17.
Figure 11.16 ❚ (a) Cross section of the cochlea. (b) Close-up of the organ of Corti, showing how it
rests on the basilar membrane. Arrows indicate the motions of the basilar membrane and tectorial
membrane that are caused by vibration of the cochlear partition. (Adapted from Denes & Pinson, 1993.)
One of the most important events in the auditory process is the bending of the cilia of the inner hair cells, which
are responsible for transduction—the conversion of the vibrations caused by the sound stimulus into electrical signals. As we will see later, the major role of the outer hair
cells is to increase the vibration of the basilar membrane.
The cilia bend because the in-and-out movement of the
stapes creates pressure changes in the liquid inside the cochlea that set the cochlear partition into an up-and-down
motion, as indicated by the blue arrow in Figure 11.16b.
This up-and-down motion of the cochlear partition causes
two effects: (1) it sets the organ of Corti into an up-and-down vibration, and (2) it causes the tectorial membrane
to move back and forth, as shown by the red arrow. These
two motions cause the cilia of the inner hair cells to bend
because of their movement against the surrounding liquid
and affect the outer hair cells because some of their cilia are
in contact with the tectorial membrane (Dallos, 1996).
Figure 11.18 shows what happens when the cilia bend.
Movement in one direction (Figure 11.18a) opens channels
in the membrane, and ions flow into the cell. Remember
from our description of the action potential in Chapter 2
(see page 28) that electrical signals occur in neurons when
ions flow across the cell membrane. The ion flow in the inner hair cells has the same effect, creating electrical signals
that result in the release of neural transmitter from
the inner hair cell.
When the cilia bend in the other direction (Figure
11.18b), the ion channels close, so electrical signals are not
generated. Thus, the back-and-forth bending of the hair cells’ cilia
causes alternating bursts of electrical signals (when the
cilia bend in one direction) and no electrical signals (when
the cilia bend in the opposite direction).
Figure 11.18 ❚ (a) Movement of hair cilia in one direction opens ion channels in the hair cell, which results in the release of neurotransmitter onto an auditory nerve fiber. (b) Movement in the opposite direction closes the ion channels, so there is no ion flow and no transmitter release.
Figure 11.19 ❚ The distance the cilia of a hair cell move at the threshold for hearing is so small that if the volume of an individual cilium were scaled up to that of the Eiffel Tower, the equivalent movement of the Eiffel Tower would be about 1 cm.
Bruce Goldstein
The amount the cilia of the inner hair cells must bend to
cause an electrical signal is extremely small. At the threshold for hearing, cilia movements as small as 100 trillionths
of a meter (100 picometers) can generate a response in the
hair cell. To give you an idea of just how small a movement
this is, consider that if we were to increase the size of a cilium so it was as big as the 325-meter-high Eiffel Tower, the
movement of the cilia would translate into a movement of
the pinnacle of the Eiffel Tower of only 1 cm (Figure 11.19;
Hudspeth, 1983, 1989).
Given the small amount of movement needed to hear
a sound, it isn’t surprising that the auditory system can
detect extremely small pressure changes. In fact, the auditory system can detect pressure changes so small that they
cause the eardrum to move only 10⁻¹¹ cm, a dimension that
is less than the diameter of a hydrogen atom (Tonndorf &
Khanna, 1968), and the auditory system is so sensitive that
the air pressure at threshold in the most sensitive range of
hearing is only 10 to 15 dB above the air pressure generated
by the random movement of air molecules. This means that
if our hearing were much more sensitive than it is now, we
would hear the background hiss of colliding air molecules!
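The Eiffel Tower comparison can be checked with a few lines of arithmetic. The sketch below uses only the three numbers quoted in the text (100 picometers, 325 meters, and 1 cm) and treats the comparison as a scaling of height. The cilium length it prints is an inference from those numbers, not a value reported in the text, and the variable names are chosen here for illustration.

```python
# Check the Eiffel Tower comparison using only the numbers quoted in the text.
THRESHOLD_MOVEMENT_M = 100e-12   # 100 picometers of cilium movement at threshold
TOWER_HEIGHT_M = 325.0           # height of the Eiffel Tower quoted in the text
TOWER_MOVEMENT_M = 0.01          # 1 cm movement of the tower's pinnacle

# Magnification that turns the threshold movement into 1 cm.
scale_factor = TOWER_MOVEMENT_M / THRESHOLD_MOVEMENT_M

# Cilium length implied by the analogy (an inference, not a value from the text).
implied_cilium_length_m = TOWER_HEIGHT_M / scale_factor

print(f"scale factor: {scale_factor:.1e}")
print(f"implied cilium length: {implied_cilium_length_m * 1e6:.2f} micrometers")
```

The magnification works out to about one hundred million, which implies a cilium a few micrometers long. In other words, the threshold movement is roughly one part in 30,000 of the cilium's own length.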
The Representation of Frequency in the Cochlea
One of the major goals of research on hearing has been to
understand the physiological mechanisms behind our perception of pitch. Because our perception of pitch is closely
linked to a tone’s frequency, a great deal of research has
focused on determining how frequency is represented by
the firing of neurons in the auditory system. The classic research on this problem was done by Georg von Békésy, who
won the Nobel Prize in physiology and medicine in 1961 for
his research on the physiology of hearing.
Figure 11.20 ❚ Hair cells all along the cochlea send signals to nerve fibers that combine to form the auditory nerve. According to place theory, low frequencies cause maximum activity at the apex end of the cochlea, and high frequencies cause maximum activity at the base. Activation of the hair cells and auditory nerve fibers indicated in red would signal that the stimulus is in the middle of the frequency range for hearing.
Figure 11.21 ❚ A perspective view showing the traveling wave motion of the basilar membrane. This picture shows what the membrane looks like when the vibration is “frozen” with the wave about two thirds of the way down the membrane. (From Tonndorf, 1960.)
Békésy’s Place Theory of Hearing
Békésy proposed the place theory of hearing, which states
that the frequency of a sound is indicated by the place along
the cochlea at which nerve firing is highest. Figure 11.20
represents the basilar membrane, which stretches from the
base of the cochlea, near the vibrating stapes, to the apex,
near the end of the cochlea. There are hair cells associated
with each place along the basilar membrane and auditory
nerve fibers associated with the hair cells.
According to place theory, low frequencies cause maximum activity in the hair cells and auditory nerve fibers at
the apex end of the basilar membrane, and high frequencies
cause maximum activity in hair cells and auditory nerve
fibers at the base of the membrane. Thus, the frequency of a
tone is indicated by the place along the basilar membrane at
which auditory nerve fibers are activated.
Békésy came to this conclusion by determining how
the basilar membrane vibrated in response to different frequencies. He determined this in two ways: (1) by actually
observing the vibration of the basilar membrane and (2) by
building a model of the cochlea that took into account the
physical properties of the basilar membrane.
Békésy observed the vibration of the basilar membrane
by boring a hole in cochleas taken from animal and human
cadavers, presenting different frequencies of sound, and
observing the membrane’s vibration by using a technique
similar to that used to create “stop-action” photographs of
high-speed events, which enabled him to see the membrane’s
position at different points in time (Békésy, 1960). He found
that the vibrating motion of the basilar membrane is similar to the motion that occurs when one person holds the end
of a rope and “snaps” it, sending a wave traveling down the
rope. This traveling wave motion of the basilar membrane
is shown in Figure 11.21.
Békésy also determined how the basilar membrane vibrates by analyzing its structure. In this analysis he took note of two important facts: (1) the base of the basilar membrane (the end located nearest the stapes) is three or four times narrower than the apex of the basilar membrane (the end of the membrane located at the far end of the cochlea; Figure 11.22); and (2) the base of the membrane is about 100 times stiffer than the apex. Using this information, Békésy constructed models of the cochlea that revealed that the pressure changes in the cochlea cause the basilar membrane to vibrate in a traveling wave.
Figure 11.22 ❚ A perspective view of an uncoiled cochlea, showing how the basilar membrane gets wider at the apex end of the cochlea. The spiral lamina is a supporting structure that makes up for the basilar membrane’s difference in width at the base and the apex ends of the cochlea. (From Schubert, 1980.)
Figure 11.23 shows the traveling wave caused by a pure
tone, at three successive moments in time. The solid horizontal line represents the basilar membrane at rest. Curve 1
shows the position of the basilar membrane at one moment
during its vibration, and curves 2 and 3 show the positions
of the membrane at two later moments. From these curves
we can see that over a period of time most of the membrane
vibrates, but that some parts vibrate more than others. The
envelope of the traveling wave, which is indicated by the
dashed line, indicates the maximum displacement that
the traveling wave causes at each point along the membrane. This maximum displacement is important because
the amount that the hair cell’s cilia move depends on how
far the basilar membrane is displaced. Therefore, hair cells
located near the place where the basilar membrane vibrates
the most will be stimulated the most strongly, and the nerve
fibers associated with these hair cells will therefore fire the
most strongly.
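The idea of an envelope can be made concrete with a toy calculation. The sketch below is not Békésy's model: the bump-shaped amplitude profile, the peak location, and the wavelength are all assumptions chosen only to illustrate how the envelope is obtained, by taking the largest displacement reached at each place over one cycle of the tone, and how point P is then read off as the place where the envelope is largest.

```python
import math

# Toy traveling wave on the basilar membrane. The bump-shaped amplitude
# profile and every constant here are illustrative assumptions, not
# measurements from Békésy (1960).
LENGTH_MM = 35.0      # uncoiled cochlea length quoted earlier in the chapter
PEAK_MM = 25.0        # assumed place of maximum vibration (point P)
WIDTH_MM = 5.0        # assumed spread of the vibration around P
WAVELENGTH_MM = 8.0   # assumed spatial wavelength of the wave


def displacement(x_mm, phase):
    """Membrane displacement at place x_mm for a given phase of the tone."""
    amplitude = math.exp(-((x_mm - PEAK_MM) / WIDTH_MM) ** 2)
    return amplitude * math.sin(phase - 2 * math.pi * x_mm / WAVELENGTH_MM)


places = [i * 0.5 for i in range(int(LENGTH_MM / 0.5) + 1)]   # 0, 0.5, ..., 35 mm
phases = [2 * math.pi * k / 200 for k in range(200)]          # one full cycle

# Envelope: the largest displacement reached at each place over the cycle.
envelope = [max(abs(displacement(x, p)) for p in phases) for x in places]

# Point P: the place where the envelope is largest.
peak_index = max(range(len(places)), key=lambda i: envelope[i])
print(f"peak of envelope (point P) at {places[peak_index]:.1f} mm from the stapes")
```

Evaluating `displacement` at three fixed phases would give three snapshot curves like curves 1, 2, and 3, and the envelope is the outline that contains all such snapshots.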
Figure 11.23 ❚ Vibration of the basilar membrane, showing the position of the membrane at three instants in time, indicated by the blue, green, and red lines, and the envelope of the vibration, indicated by the black dashed line. P indicates the peak of the basilar membrane vibration. (From Békésy, 1960.)
Békésy’s (1960) observations of the basilar membrane’s
vibrations led him to conclude that the envelope of the
traveling wave of the basilar membrane has two important
properties:
1. The envelope has a peak amplitude at one point on
the basilar membrane. The envelope of Figure 11.23
indicates that point P on the basilar membrane is displaced the most by the traveling wave. Thus, the hair
cells near point P will send out stronger signals than
those near other parts of the membrane.
2. The position of this peak on the basilar membrane is
a function of the frequency of the sound. We can see
in Figure 11.24, which shows the envelopes of vibration for stimuli ranging from 25 to 1,600 Hz, that low
frequencies cause maximum vibration near the apex.
High frequencies cause less of the membrane to vibrate, and the maximum vibration is near the base.
(One way to remember this relationship is to imagine
low-frequency waves as being long waves that reach
farther.)
Evidence for Place Theory
Békésy’s linking of the place on the cochlea with the frequency of the tone has been confirmed by measuring the
electrical response of the cochlea and of individual hair
cells and auditory nerve fibers. For example, placing disc
electrodes at different places along the length of the cochlea
and measuring the electrical response to different frequencies results in a tonotopic map—an orderly map of frequencies along the length of the cochlea (Culler et al., 1943). This
result, shown in Figure 11.25, confirms the idea that the
apex of the cochlea responds best to low frequencies and
the base responds best to high frequencies. More precise
electrophysiological evidence for place coding is provided
by determining that auditory nerve fibers that signal activity at different places on the cochlea respond to different
frequencies.
Figure 11.24 ❚ The envelope of the basilar membrane’s vibration at frequencies ranging from 25 to 1,600 Hz, as measured by Békésy (1960). These envelopes were based on measurements of damaged cochleas. The envelopes are more sharply peaked in healthy cochleas. (Axes: relative amplitude versus distance from stapes, in mm.)
METHOD ❚ Neural Frequency Tuning Curves
Each hair cell and auditory nerve fiber responds to a
narrow range of frequencies. This range is indicated by
each neuron’s frequency tuning curve. This curve is determined by presenting pure tones of different frequencies and measuring how many decibels are necessary to
cause the neuron to fire. This decibel level is the threshold for that frequency. Plotting the threshold for each frequency results in frequency tuning curves like the ones
in Figure 11.26. The arrow under each curve indicates
the frequency to which the neuron is most sensitive. This
frequency is called the characteristic frequency of the
particular auditory nerve fiber.
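The procedure just described amounts to finding the lowest point on a curve. The following sketch assumes a small table of thresholds for a single fiber. The numbers are invented for illustration and are not taken from Figure 11.26, and the function name is hypothetical.

```python
# Hypothetical thresholds (dB SPL) for one auditory nerve fiber, keyed by the
# frequency (Hz) of the pure tone. The values are invented for illustration.
thresholds_db = {
    500: 85.0,
    1000: 70.0,
    2000: 45.0,
    4000: 20.0,
    8000: 60.0,
}


def characteristic_frequency(tuning_curve):
    """Return the frequency with the lowest threshold on the tuning curve."""
    return min(tuning_curve, key=tuning_curve.get)


cf_hz = characteristic_frequency(thresholds_db)
print(f"characteristic frequency: {cf_hz} Hz at {thresholds_db[cf_hz]} dB SPL")
```

The characteristic frequency is simply the frequency at which the fiber needs the least sound to respond, which is where the arrow under each curve points.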
The frequency tuning curves in Figure 11.26 were recorded from auditory nerve fibers that originated at different places along the cochlea. As we would expect from
Békésy’s place theory, the fibers originating near the
base of the cochlea have high characteristic frequencies,
and those originating near the apex have low characteristic
frequencies.
The idea that the frequency of a tone is represented
by the firing of fibers located at specific places along the
cochlea has also been supported by the results of psychophysical experiments that make use of the phenomenon of
auditory masking. Auditory masking occurs in everyday experience any time your ability to hear a sound is decreased
by the presence of other sounds. For example, if you are
standing on the street having a conversation with a friend
and the sound of a passing bus makes it difficult to hear
what your friend is saying, the sound of the bus has masked
the sound of your friend’s voice.
METHOD ❚ Auditory Masking
In the laboratory, an auditory masking experiment is carried out using the procedure diagramed in Figure 11.27.
First, the threshold intensity is determined at a number
of frequencies, by presenting test tones (blue arrows) and
determining the lowest intensity for each test tone that
can just be heard (Figure 11.27a). Then, an intense masking stimulus (red arrow) is presented at one frequency.
This stimulus, which corresponds to the passing bus in
the example above, makes it more difficult to hear the
low-intensity test tones. While the masking stimulus is
sounding, the thresholds for all of the test tones are redetermined (Figure 11.27b). The increased sizes of some
of the arrows indicate that the intensity of the test tones
must be increased to hear them. Typically, the presence
of the masking tone causes the largest increase in threshold for test tones at or near the masking tone’s frequency,
but the effect does spread to test tones that are above and
below the masking tone’s frequency.
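The masking effect is the difference between the two sets of thresholds. The sketch below assumes invented threshold values for a masker imagined to sit near 400 Hz. None of the numbers come from an actual experiment, and the variable names are chosen here for illustration.

```python
# Hypothetical test-tone thresholds (dB), keyed by test-tone frequency (Hz).
# All numbers are invented; the masker is imagined to sit near 400 Hz.
quiet_db = {200: 12.0, 300: 10.0, 400: 9.0, 500: 8.0, 800: 7.0, 1600: 8.0}
masked_db = {200: 14.0, 300: 25.0, 400: 60.0, 500: 52.0, 800: 38.0, 1600: 15.0}

# Masking effect: how much each threshold rose once the masker was turned on.
shift_db = {freq: masked_db[freq] - quiet_db[freq] for freq in quiet_db}

most_masked_hz = max(shift_db, key=shift_db.get)
for freq in sorted(shift_db):
    print(f"{freq:>5} Hz: threshold raised by {shift_db[freq]:.0f} dB")
print(f"largest increase at {most_masked_hz} Hz")
```

Plotting the threshold increase against test-tone frequency gives a masking function, with the largest increase at or near the masking frequency.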
Figure 11.25 ❚ Tonotopic map of the guinea pig cochlea. Numbers indicate the location of the maximum electrical response for each frequency. (From Culler, E. A., Coakley, J. D., Lowy, K., & Gross, N., A revised frequency map of the Guinea pig cochlea, American Journal of Psychology, 56, 1943, 475–500, figure 11. Copyright © 1943 by the Board of Trustees of the University of Illinois. Used with the permission of the University of Illinois.)
Figure 11.28 shows the results of a masking experiment
in which the masking tone contained frequencies between
365 and 455 Hz (Egan & Hake, 1950). The height of the
curve indicates how much the intensity of the test tone had
to be increased to be heard. Notice that the thresholds for
frequencies near the masking tone are raised the most. Also
notice that this curve is not symmetrical. That is, the masking effect spreads more to high frequencies than
to low frequencies.
We can relate the larger effect of masking on high-frequency tones to the vibration of the basilar membrane by looking at Figure 11.29, which reproduces the vibration patterns from Figure 11.24 caused by 200- and 800-Hz test tones and a 400-Hz masking tone. We can see how a 400-Hz masking tone would affect the 200- and 800-Hz tones by noting how their vibration patterns overlap. Notice that the pattern for the 400-Hz tone, which is shaded, almost totally overlaps the pattern for the higher-frequency 800-Hz tone, but does not overlap the peak vibration of the lower-frequency 200-Hz tone. We would therefore expect the masking tone to interfere more with the 800-Hz tone than with the 200-Hz tone, and this greater interference is what causes the greater masking effect at higher frequencies. Thus, Békésy’s description of the envelope of the basilar membrane’s vibration predicts the masking function in Figure 11.28.
Figure 11.26 ❚ Frequency tuning curves of cat auditory nerve fibers. The characteristic frequency of each fiber is indicated by the arrows along the frequency axis. The frequency scale is in kilohertz (kHz), where 1 kHz = 1,000 Hz. (From Palmer, A. R., Physiology of the cochlear nerve and cochlear nucleus, British Medical Bulletin on Hearing, 43, 1987, 838–855, by permission of Oxford University Press.) (Axes: threshold in dB SPL versus frequency in kHz.)
Figure 11.27 ❚ The procedure for a masking experiment. (a) Threshold is determined across a range of frequencies. Each blue arrow indicates a frequency where the threshold is measured. (b) The threshold is redetermined at each frequency (blue arrows) in the presence of a masking stimulus (red arrow). The larger blue arrows indicate that the intensities must be increased to hear these test tones when the masking tone is present.
Figure 11.29 ❚ Vibration patterns caused by 200- and 800-Hz test tones, and the 400-Hz mask (shaded), taken from basilar membrane vibration patterns in Figure 11.24. Notice that the vibration caused by the masking tone overlaps the 800-Hz vibration more than the 200-Hz vibration.
All of the results we have described—(1) description of
the traveling wave, (2) tonotopic maps on the cochlea, (3) frequency tuning curves, and (4) masking experiments—support the link between frequency and activation of specific
places along the basilar membrane. The way the cochlea
separates frequencies along its length has been described
as an acoustic prism (Fettiplace & Hackney, 2006). Just as a
prism separates white light, which contains all wavelengths
in the visible spectrum, into its components, the cochlea
separates frequencies entering the ear into activity along
different places on the basilar membrane. This property
of the cochlea is particularly important when considering
complex tones that contain many frequencies.
How the Basilar Membrane Vibrates to Complex Tones
To show how the basilar membrane responds to complex
tones, we return to our discussion of musical tones from
page 263. Remember that musical tones consist of a fundamental frequency and harmonics that are multiples of the
fundamental.
Figure 11.28 ❚ Results of Egan and Hake’s (1950) masking experiment. The threshold increases the most near the frequencies of the masking noise, and the masking effect spreads more to high frequencies than to low frequencies. (From Egan & Hake, 1950.) (Axes: increase in test-tone threshold versus frequency of test tone, in Hz.)
Figure 11.30 ❚ (a) Waveform of a complex tone consisting of three harmonics (440, 880, and 1,320 Hz). (b) Basilar membrane. The shaded areas indicate locations of peak vibration associated with each harmonic in the complex tone.
Research that has measured how the basilar membrane
responds to complex tones shows that the basilar membrane vibrates to the fundamental and to each harmonic,
so there are peaks in the membrane’s vibration that correspond to each harmonic. Thus, a complex tone with a number of harmonics (Figure 11.30a) will cause peak vibration
of the basilar membrane at places associated with the frequency of each harmonic (Figure 11.30b) (Hudspeth, 1989).
The acoustic prism idea therefore describes how the cochlea
sorts each of the harmonics of a musical tone onto different
places along the basilar membrane.
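A loose computational analogy to the acoustic prism is Fourier analysis, which also separates a complex tone into its component frequencies. The sketch below is only an analogy and is not a model of the cochlea. It builds the complex tone of Figure 11.30 from three sine waves and recovers the three frequencies with a direct discrete Fourier transform. The sampling rate, the signal length, and the equal amplitudes of the harmonics are assumptions chosen so that each harmonic falls exactly on an analysis frequency.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8000                  # assumed sampling rate
N_SAMPLES = 800                        # 0.1 s of signal, so bins are 10 Hz apart
HARMONICS_HZ = [440, 880, 1320]        # fundamental and harmonics from Figure 11.30

# Build the complex tone by adding equal-amplitude sine waves.
signal = [
    sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE_HZ) for f in HARMONICS_HZ)
    for n in range(N_SAMPLES)
]


def dft_magnitude(samples, k):
    """Magnitude of discrete Fourier transform bin k (direct, unoptimized)."""
    n_total = len(samples)
    total = sum(
        x * cmath.exp(-2j * math.pi * k * n / n_total)
        for n, x in enumerate(samples)
    )
    return abs(total)


bin_width_hz = SAMPLE_RATE_HZ / N_SAMPLES
magnitudes = [dft_magnitude(signal, k) for k in range(N_SAMPLES // 2)]

# Keep the bins whose energy stands well above the rest.
threshold = 0.5 * max(magnitudes)
components_hz = [
    round(k * bin_width_hz) for k, m in enumerate(magnitudes) if m > threshold
]
print(components_hz)
```

Each recovered frequency corresponds to one of the shaded places in Figure 11.30b: one waveform goes in, and activity at three separate locations comes out.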
Updating Békésy
While the basic idea behind Békésy’s place theory has been
confirmed by many experiments, some results were difficult
to explain based on the results of Békésy’s original experiments. Consider, for example, Békésy’s picture of how the
basilar membrane vibrates to different frequencies in
Figure 11.24. A problem with these curves is that two nearby
frequencies would cause overlapping and almost identical
patterns of vibration. Yet psychophysical experiments show
that we can distinguish small differences in frequency. For
example, Békésy’s vibration patterns for 400 and 405 Hz are
almost identical, but we can distinguish between these two
frequencies.
The explanation for this discrepancy is that Békésy
made his measurements of basilar membrane vibration in
cochleas isolated from animal and human cadavers. When
modern researchers measured the basilar membrane’s vibration in live cochleas using techniques more sensitive than
the ones available to Békésy, they found that the peak vibration for a particular frequency is much more sharply localized than Békésy had observed, so there is less overlap between the curves for nearby frequencies (Johnstone & Boyle,
1967; Khanna & Leonard, 1982; Narayan et al., 1998).
These new measurements explained our ability to distinguish between small differences in frequency, but they
also posed a new question: Why does the basilar membrane
vibrate more sharply in healthy cochleas? The answer is that
the outer hair cells expand and contract in response to the
vibration of the basilar membrane, and this expansion and
contraction, which only occurs in live cochleas, amplifies
and sharpens the vibration of the basilar membrane.
Figure 11.31 shows how this works. When vibration
of the basilar membrane causes the cilia of the outer hair
cells to bend in one direction, this causes the entire outer
hair cell to elongate, which pushes on the basilar membrane
(Figure 11.31a). Bending in the other direction causes the
hair cells to contract, which pulls on the basilar membrane
(Figure 11.31b). This pushing and pulling increases the motion of the basilar membrane and sharpens its response to specific frequencies. For this reason, the action of the outer hair cells is called the cochlear amplifier.
Figure 11.31 ❚ The outer hair cells (a) elongate when cilia bend in one direction; (b) contract when the cilia bend in the other direction. This results in an amplifying effect on the motion of the basilar membrane. The difference between elongated and contracted lengths is exaggerated in this figure.
The importance of the outer hair cell’s amplifying effect is illustrated by the frequency tuning curves in Figure 11.32. The solid blue curve shows the frequency tuning
of a cat’s auditory nerve fiber with a characteristic frequency
of about 8,000 Hz. The dashed red curve shows what happened when the outer hair cells were destroyed by a chemical that attacked the outer hair cells but left the inner hair
cells intact. It now takes much higher intensities to get the
fiber to respond, especially in the frequency range to which
the fiber originally responded best (Fettiplace & Hackney,
2006; Liberman & Dodds, 1984).
How the Timing of Neural Firing Can Signal Frequency
We have been focusing on the idea that frequency is signaled by which fibers in the cochlea fire to a tone. But
frequency can also be signaled by how the fibers fire.
Remember from Figure 11.18 that inner hair cells respond
when their cilia bend in one direction and stop responding
when the cilia bend in the opposite direction. Figure 11.33
shows how the bending of the cilia follows the increases and
decreases in the pressure of a pure tone sound stimulus.
When the pressure increases, the cilia bend to the right and
firing occurs. When the pressure decreases, the cilia bend to the left and no firing occurs. This means that the hair cells fire in synchrony with the rising and falling pressure of the sound stimulus. For high-frequency tones, a hair cell may not fire every time the pressure increases because it needs to rest after it fires (see refractory period, Chapter 2, page 30). But when the cell does fire, it fires at the peak of the sound stimulus.
Figure 11.32 ❚ Effect of OHC damage on frequency tuning curve. The solid blue curve is the frequency tuning curve of a neuron with a characteristic frequency of about 8,000 Hz. The dashed red curve is the tuning curve for the same neuron after the outer hair cells were destroyed by injection of a chemical. (Adapted from Fettiplace & Hackney, 2006.) (Axes: threshold in dB versus frequency in kHz.)
This property of firing at the same place in the sound
stimulus is called phase locking. When the firing of a number of auditory nerve fibers is phase locked to the stimulus,
they fire in bursts separated by silent intervals, and the timing of these bursts matches the frequency of the stimulus.
Thus, the rate of bursting of auditory nerve fibers provides
information about the frequency of the sound stimulus.
The connection between the frequency of a sound
stimulus and the timing of the auditory nerve fiber firing
is called temporal coding. Measurements of the pattern of
firing for auditory nerve fibers indicate that phase locking
occurs up to a frequency of about 4,000 Hz.
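Phase locking and temporal coding can be illustrated with a toy simulation. In the sketch below, each hypothetical fiber fires only at pressure peaks and skips any peak that arrives before it has recovered from its last spike. The tone frequency, the recovery time, and the use of just two fibers are assumptions made for illustration, not physiological values.

```python
TONE_HZ = 250.0          # assumed pure-tone frequency, below the 4,000 Hz limit
REFRACTORY_S = 0.005     # assumed recovery time after each spike (toy value)
N_CYCLES = 40
period_s = 1.0 / TONE_HZ


def phase_locked_cycles(first_cycle):
    """Cycles on which a toy fiber fires, starting at first_cycle.

    The fiber fires only at the pressure peak of a cycle and skips any peak
    that arrives before it has recovered from its previous spike.
    """
    fired = []
    for cycle in range(first_cycle, N_CYCLES):
        if not fired or (cycle - fired[-1]) * period_s >= REFRACTORY_S:
            fired.append(cycle)
    return fired


# Two fibers that start on different cycles; each one skips every other peak.
fibers = [phase_locked_cycles(0), phase_locked_cycles(1)]

# Pool the fibers: a burst occurs on every cycle in which any fiber fired.
burst_cycles = sorted(set(c for fiber in fibers for c in fiber))
burst_times = [(c + 0.25) * period_s for c in burst_cycles]  # peak is 1/4 into a cycle

intervals = [b - a for a, b in zip(burst_times, burst_times[1:])]
estimated_hz = 1.0 / (sum(intervals) / len(intervals))
print(f"single fiber fires every {fibers[0][1] - fibers[0][0]} cycles")
print(f"estimated frequency from burst timing: {estimated_hz:.1f} Hz")
```

Although neither toy fiber fires on every cycle, the pooled bursts recur once per period, so the burst rate recovers the frequency of the tone.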
From the research we have described, we can conclude
that frequency is coded in the cochlea and auditory nerve
based both on which fibers are firing (place coding) and on
the timing of nerve impulses in auditory nerve fibers (temporal coding). Place coding is effective across the entire
range of hearing, and temporal coding up to 4,000 Hz, the
frequency at which phase locking stops operating. This information for frequency originates in the inner hair cells
and their auditory nerve fibers. In the next section we will consider how hearing is affected if the hair cells or auditory nerve fibers are damaged.
Figure 11.33 ❚ How hair cell activation and auditory nerve fiber firing are synchronized with pressure changes of the stimulus. The auditory nerve fiber fires when the cilia are bent to the right. This occurs at the peak of the sine-wave change in pressure.
Hearing Loss Due to Hair Cell Damage
The audibility curve in Figure 11.9 is the average curve for
people with normal hearing. There are, however, a number
of ways that hearing loss can occur, and this is reflected in
changes in the audibility function. Hearing loss can occur for
a number of reasons: (1) blockage of sound from reaching the
receptors, called conductive hearing loss; (2) damage to the
hair cells; and (3) damage to the auditory nerve or the brain.
Hearing loss due to damage to the hair cells, auditory nerve,
or brain is called sensorineural hearing loss. We will
focus on hearing loss caused by hair cell damage.
We have already seen that damage to the outer hair cells
can have a large effect on hearing. Inner hair cell damage, as
we would expect, also causes a large effect, with hearing loss
occurring for the frequencies corresponding to the frequencies signaled by the damaged hair cells. The most common
form of sensorineural hearing loss is presbycusis, which
means “old hearing.” (Remember that the equivalent term
for vision is presbyopia, or “old eye.” See page 46.)
Presbycusis The loss of sensitivity associated with
presbycusis, which is greatest at higher frequencies, accompanies aging and affects males more severely than females.
Figure 11.34 shows the progression of loss as a function of
age. Unlike the visual problem of presbyopia, which is an
inevitable consequence of aging, presbycusis is apparently
caused by factors in addition to aging, since people in preindustrial cultures, who have not been exposed to the noises
that accompany industrialization or to drugs that could
damage the ear, often do not experience large decreases in high-frequency hearing in old age. This may be why males, who historically have been exposed to more workplace noise than females, as well as to noises associated with hunting and wartime, experience a greater presbycusis effect.
Figure 11.34 ❚ Hearing loss associated with presbycusis as a function of frequency for groups of women and men of various ages. Losses are expressed relative to hearing for a group of young persons with normally functioning auditory systems, which is assigned a value of 0 at each frequency. (Adapted from Dubno, in press.) (Panels: women, men; curves for ages 50–59, 70–74, and >85; axes: hearing loss in dB versus frequency in kHz.)
Noise-Induced Hearing Loss Noise-induced hearing loss occurs when loud noises cause degeneration of
the hair cells. This degeneration has been observed in examinations of the cochleas of people who have worked in
noisy environments and have willed their ear structures to
medical research. Damage to the organ of Corti is often observed in these cases. For example, examination of the cochlea of a man who worked in a steel mill indicated that his
organ of Corti had collapsed and no receptor cells remained
(J. Miller, 1974). More controlled studies, of animals that
are exposed to loud sounds, provide further evidence that
high-intensity sounds can damage or completely destroy inner hair cells (Liberman & Dodds, 1984).
Because of the danger to hair cells posed by workplace
noise, the United States Occupational Safety and Health
Administration (OSHA) has mandated that workers not be exposed
to sound levels greater than 85 decibels for an 8-hour work
shift. But in addition to workplace noise hazards, other
sources of intense sound can cause hearing loss due to hair
cell damage.
If you turn up the volume on your MP3 player, you are
exposing yourself to what hearing professionals call leisure
noise. Other sources of leisure noise are activities such as
recreational gun use, riding motorcycles, playing musical
instruments, and working with power tools. A number of
studies have demonstrated hearing loss in people who listen
to MP3 players (Peng et al., 2007), play in rock/pop bands
(Schmuziger et al., 2006), use power tools (Dalton et al.,
2001), and attend sports events (Hodgetts & Liu, 2006). The
amount of hearing loss depends on the level of sound intensity and the duration of exposure. Given the high levels of
sound that occur in these activities, such as the levels above
90 dB that can occur for the three hours of a hockey game
(Figure 11.35) and levels as high as 90 dB while using power
tools in woodworking, it isn’t surprising that both temporary and permanent hearing losses are associated with these
leisure activities.
The potential for hearing loss from listening to music
at high volume on MP3 players for extended periods of time
cannot be overemphasized, because at their highest settings
MP3 players reach levels of 100 dB or higher—far above
OSHA’s recommended maximum of 85 dB. This has led Apple Computer to add a setting to iPods that limits the maximum volume, and also to develop a device that can monitor
playing time and listening levels and can either gradually
reduce maximum sound levels or provide a warning signal
when playing time and sound intensity have reached potentially damaging levels. (This feature was not in use, however, at the time this was written.)
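To see what "far above" means in physical terms, the 15-dB gap between the two levels quoted above can be converted into ratios. The sketch below assumes the standard decibel definitions (10 times the base-10 logarithm of an intensity ratio, or equivalently 20 times the base-10 logarithm of a sound pressure ratio). The variable names are chosen here for illustration.

```python
MP3_MAX_DB = 100.0     # level the text gives for MP3 players at their highest settings
OSHA_LIMIT_DB = 85.0   # 8-hour limit quoted in the text

difference_db = MP3_MAX_DB - OSHA_LIMIT_DB

# Standard decibel definitions: 10*log10 of an intensity (power) ratio,
# which equals 20*log10 of the corresponding sound pressure ratio.
intensity_ratio = 10 ** (difference_db / 10)
pressure_ratio = 10 ** (difference_db / 20)

print(f"{difference_db:.0f} dB difference")
print(f"about {intensity_ratio:.0f} times the sound intensity")
print(f"about {pressure_ratio:.1f} times the sound pressure")
```

Because the decibel scale is logarithmic, a difference that looks modest on paper corresponds to roughly a thirtyfold difference in sound intensity.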
One suggestion for minimizing the potential for hearing damage is to follow this simple rule, proposed by James
Battey, Jr., director of the National Institute on Deafness and
Other Communication Disorders: If you can’t hear someone
talking to you at arm’s length, turn down the music (“More
Noise Than Signal,” 2007). If you can’t bring yourself to
turn down the volume, another thing that would help is to
take a 5-minute break from listening at least once an hour!
[Figure 11.35 graph: sound pressure level (dB, 70 to 130) plotted
against time of day (6:00 PM to 8:55 PM) during game 3, with the
player introductions, the Oilers’ first and second goals, and the
first and second intermissions marked.]

TEST YOURSELF 11.2
1. Describe the structure of the ear, focusing on the
role that each component plays in transmitting the
vibrations that enter the outer ear to the auditory
receptors in the inner ear.
2. Describe Békésy’s place theory of hearing and the
physiological and psychophysical evidence that
supports his theory. Be sure you understand the
following: tonotopic map, frequency tuning curve,
auditory masking.
3. What does it mean to say that the basilar membrane
is an acoustic prism?
4. How can the frequency of a sound be signaled by
the timing of nerve firing? Be sure you understand
phase locking.
5. What is the connection between hair cell damage
and hearing loss? Exposure to occupational or leisure noise and hearing loss?
Central Auditory Processing
So far we have been focusing on how the ear creates electrical signals in hair cells and fibers of the auditory nerve.
But perception does not occur in the ear or in the auditory
nerve. Just as for vision, we need to follow signals from the
receptors to more central structures in order to understand
perception.
Pathway From the Cochlea to the Cortex
The auditory nerve carries the signals generated by the inner
hair cells away from the cochlea and toward the auditory receiving area in the cortex. Figure 11.36 shows the pathway
the auditory signals follow from the cochlea to the auditory cortex. Auditory nerve fibers from the cochlea synapse
in a sequence of subcortical structures—structures below
the cerebral cortex. This sequence begins with the cochlear
nucleus and continues to the superior olivary nuclei in the
brain stem, which consists of a number of subdivisions that
serve different functions, the inferior colliculus in the midbrain, and the medial geniculate nucleus in the thalamus.
(Meanwhile, signals from the retina are synapsing in the
nearby lateral geniculate nucleus in the thalamus.)
From the medial geniculate nucleus, fibers continue to
the primary auditory receiving area (A1), in the temporal
lobe of the cortex. If you have trouble remembering this
sequence of structures, remember the acronym SONIC MG
(a very fast sports car), which represents the three structures
between the cochlear nucleus and the auditory cortex, as
follows: SON = superior olivary nuclei; IC = inferior colliculus; MG = medial geniculate nucleus.

Figure 11.35 ❚ Sound level of game 3 of the 2006 Stanley Cup
finals between the Edmonton Oilers (the home team) and
the Carolina Hurricanes. Sound levels were recorded by a small
microphone in a spectator’s ear. The red line at 90 dB
indicates a “safe” level for a 3-hour game. Sounds above this
line can potentially damage hearing. (From Hodgetts & Liu, 2006.)

A great deal of processing occurs as signals travel
through the subcortical structures along the pathway from
the cochlea to the cortex. Some of this processing can be
related to perception. For example, processing in the superior olivary nuclei is important for determining auditory
localization—where a sound appears to originate in space
(Litovsky et al., 2002)—and it has been suggested that one
of the functions of subcortical structures in general is to
respond to individual features of complex stimuli (Frisina,
2001; Nelken, 2004). There has been a tremendous amount
of research on these subcortical structures, but we will focus on what happens once the signals reach the cortex.
Auditory Areas in the Cortex
As we begin discussing the auditory areas of the cortex,
some of the principles we will describe may seem familiar
because many of them are similar to principles we introduced in our description of the visual system in Chapters 3
and 4. Most of the discoveries about the auditory areas of
the cortex are fairly recent compared to discoveries about
the visual areas, so in some cases discoveries about the auditory cortex that are being made today are similar to discoveries that were made about the visual system 10 or 20 years
earlier. For example, you may remember that it was initially
thought that most visual processing occurred in the primary visual receiving area (V1), but beginning in the 1970s,
it became obvious that other areas were also important for
visual processing.
Recently it has been discovered that a similar situation
occurs for hearing. At first most research focused on the
primary auditory receiving area (A1) in the temporal lobe
(Figure 11.37). But now additional areas have been discovered that extend auditory areas in the cortex beyond A1.
Research on the monkey describes cortical processing as
starting with a core area, which includes the primary auditory cortex (A1) and some nearby areas. Signals then travel
to an area surrounding the core, called the belt area, and
then to the parabelt area (Kaas et al., 1999; Rauschecker,
1997, 1998).

Figure 11.36 ❚ Diagram of the auditory
pathways. This diagram is greatly simplified, as
numerous connections between the structures
are not shown. Note that auditory structures
are bilateral (they exist on both the left and
right sides of the body) and that messages can
cross over between the two sides. (Adapted
from Wever, 1949.)
[Figure labels: left ear, auditory nerve, cochlear nucleus,
superior olivary nuclei, inferior colliculus, medial geniculate
nucleus, primary auditory cortex (A1).]

Figure 11.37 ❚ The three main auditory areas in the
cortex are the core area, which contains the primary auditory
receiving area (A1), the belt area, and the parabelt area.
Signals, indicated by the arrows, travel from core, to belt,
to parabelt. The dark lines show where the temporal lobe
was pulled back to show areas that would not be visible from
the surface. (From Kaas, Hackett, & Tramo, 1999.)
[Figure labels: core area, A1, belt area, parabelt area.]

One of the properties of these auditory areas is
hierarchical processing—signals are first processed in the
core and then travel to the belt and then to the parabelt. One
finding that supports this idea is that the core area can be activated by simple sounds, such as pure tones, but areas outside the core require more complex sounds, such as auditory
noise that contains many frequencies, human vocalizations,
and monkey “calls” (Wissinger et al., 2001). The fact that
areas outside the auditory core require complex stimuli is
similar to the situation in the visual system in which neurons
in the visual cortex (V1) respond to spots or oriented lines,
but neurons in the temporal lobe respond to complex stimuli such as faces and landmarks (Figures 4.33 and 4.35).
In addition to discovering an expanded area in the temporal lobe that is devoted to hearing, recent research has
shown that other parts of the cortex also respond to auditory stimuli (Figure 11.38; Poremba et al., 2003). What is
particularly interesting about this picture of the brain is
that some areas in the parietal and frontal lobes are activated by both visual and auditory stimuli. Some of this
overlap between the senses occurs in areas associated with
the what and where streams for vision (Ungerleider & Mishkin, 1982); interestingly enough, what and where streams,
indicated by the arrows in Figure 11.38, have recently been
discovered in the auditory system.
What and Where Streams for Hearing
Piggybacking on visual research of the 1970s that identified
what and where streams in the visual system (see page 88),
evidence began accumulating in the late 1990s for the existence of what and where streams for hearing (Kaas & Hackett, 1999; Romanski et al., 1999). The what, or ventral, stream
(green arrow) starts in the anterior (front) part of the core
and belt, and extends to the prefrontal cortex. The where, or
dorsal, stream (red arrow) starts in the posterior (rear) part
of the core and belt, and extends to the parietal cortex and
the prefrontal cortex (Figure 11.38). The what stream is responsible for identifying sounds, and the where stream for
locating sounds.

Figure 11.38 ❚ Areas in the monkey cortex that respond to
auditory stimuli. The green areas respond to auditory stimuli,
the purple areas to both auditory and visual stimuli. The
arrows from the temporal lobe to the frontal lobe represent
the what and where streams in the auditory system. (Adapted
from Poremba et al., 2003.)
[Figure labels: frontal lobe, temporal lobe, what, where;
legend: auditory, auditory and visual.]

Some of the first evidence supporting the idea of what
and where streams for hearing came from experiments that
showed that neurons in the anterior of the core and belt responded to the sound pattern of a stimulus, and neurons in
the posterior of the core and belt responded to the location of
the stimulus (Rauschecker & Tian, 2000; Tian et al., 2001).
Cases of human brain damage also support the what/
where idea (Clarke et al., 2002). For example, Figure 11.39a
shows the areas of the cortex that are damaged in J.G., a
45-year-old man with temporal lobe damage caused by a
head injury, and E.S., a 64-year-old woman with parietal
and frontal lobe damage caused by a stroke. Figure 11.39b
shows that J.G. can locate sounds, but his recognition is
poor, whereas E.S. can recognize sounds, but her ability to
locate them is poor. Thus, J.G.’s what stream is damaged,
and E.S.’s where stream is damaged.
The what/where division is also supported by brain scan
experiments. Figure 11.40 shows areas of cortex that are
more strongly activated by recognizing pitch (a what task)
in green and areas that are more strongly activated by detecting a location (a where task) in red (Alain et al., 2001).
Notice that pitch processing causes greater activation in
ventral parts of the brain (anterior temporal cortex), and
sound localization causes greater activity in dorsal regions
(parietal cortex and frontal cortex). (Also see Maeder et al.,
2001; Wissinger et al., 2001.) Thus, evidence from animal
recording, the effects of brain damage, and brain scanning
supports the idea that different areas of the brain are activated for identifying sounds and for localizing sounds (also
see Lomber & Malhotra, 2008).

Figure 11.39 ❚ (a) Colored areas indicate brain damage
for J.G. (left) and E.S. (right). (b) Performance on recognition
test (green bar) and localization test (red bar). J.G. shows poor
recognition, and E.S. shows poor localization. (Clarke, S.,
Thiran, A. B., Maeder, P., Adriani, M., Vernet, O., Regli, L.,
Cuisenaire, O., & Thiran, J.-P., What and where in human
auditory systems: Selective deficits following focal hemispheric
lesions, Experimental Brain Research, 147, 2002, 8–15.)
Figure 11.40 ❚ Areas associated with what (green) and
where (red) auditory functions as determined by brain
imaging. (Alain, C., Arnott, S. R., Hevenor, S., Graham, S., &
Grady, C. L. (2001). “What” and “where” in the human auditory
systems. Proceedings of the National Academy of Sciences, 98,
12301–12306. Copyright 2001 National Academy of Sciences,
U.S.A.)
Pitch and the Brain
What are the brain mechanisms that determine pitch, and
where are they located? We have already seen that the frequencies of pure tones are mapped along the length of the
cochlea, with low frequencies represented at the apex and
higher frequencies at the base (Figure 11.25). This tonotopic
map also occurs in the structures along the pathway from
the cochlea to the cortex, and in the primary auditory receiving area, A1. Figure 11.41 shows the tonotopic map in
the monkey cortex, which shows that neurons that respond
best to low frequencies are located to the left, and neurons
that respond best to higher frequencies are located to the
right (Kosaki et al., 1997; also see Reale & Imig, 1980; Schreiner & Mendelson, 1990).

Figure 11.41 ❚ The outline of the core
area of the monkey auditory cortex,
showing the tonotopic map on the primary
auditory receiving area, A1, which is
located within the core. The numbers
represent the characteristic frequencies
(CF) of neurons in thousands of Hz. Low
CFs are on the left, and high CFs are on
the right. (Adapted from Kosaki et al.,
1997.)
[Map labels range from 0.125 to 20.]

Linking Physiological Responding and Perception
Just because neurons that respond best to specific frequencies are organized into maps on the cortex doesn’t mean that
these neurons are responsible for pitch perception. As we
noted for vision, we need to go beyond mapping a system’s
physiological characteristics to demonstrate a link between
physiology and perception. Just as finding neurons in the
visual cortex that respond to oriented bars does not mean
that these neurons are responsible for our perception of the
bars, finding neurons in the auditory cortex that respond
to specific frequencies doesn’t mean that these neurons are
responsible for our perception of pitch. What is necessary in
both cases is to demonstrate links between the physiological processes and perception.

Mark Tramo and coworkers (2002) studied a patient
they called A, who had suffered extensive damage to his
auditory cortex on both sides of the brain due to two successive strokes. The green bars in Figure 11.42 show that A’s
ability to judge the duration of sounds and the orientation
of lines was normal, but the red bars show that his ability
to judge the direction of frequency change (high to low or
low to high) and to detect differences in pitch was much
worse than normal. This result, which shows that damage
to the auditory cortex affects the ability to discriminate between frequencies, led Tramo to conclude that the auditory
cortex is important for discriminating between different
frequencies.

[Figure 11.42 graph labels: performance axis from 0 (normal
performance) to 15 (poorer performance); conditions include
duration of sounds and orientation of lines.]

Another approach that has been used to study the link
between pitch and the brain is to find neurons in the brain
that respond to both pure tones and complex tones that differ in their harmonics but have the same pitch. Remember
from page 264 that the pitch of a complex tone is determined by information about the tone’s fundamental frequency; even when the fundamental or other harmonics are
removed, the repetition rate of a stimulus remains the same,
so the perception of the tone’s pitch remains the same.

Daniel Bendor and Xiaoqin Wang (2005) did this experiment on a marmoset, a primate that has a range of hearing
similar to that of humans. When they recorded from single
neurons in an area just outside the primary auditory cortex and in nearby areas, they found some neurons that responded similarly to complex tones with the same fundamental frequency, but with different harmonic structures.
For example, Figure 11.43a shows the frequency spectra for
a tone with a fundamental frequency of 182 Hz. In the top
record, the tone contains the fundamental frequency and
the second and third harmonics; in the second record, harmonics 4–6 are present; and so on, until at the bottom, only
harmonics 12–14 are present. The important thing about
these stimuli is that even though they contain different
frequencies (for example, 182, 364, and 546 Hz in the top
record; 2,184, 2,366, and 2,548 Hz in the bottom record),
they are all perceived as having a pitch corresponding to the
182-Hz fundamental.

The corresponding cortical response records (Figure
11.43b) show that these stimuli all caused an increase in
firing. To demonstrate that this firing occurred only when
information about the 182-Hz fundamental frequency
was present, Bendor and Wang showed that the neuron
responded well to a 182-Hz tone presented alone, but not
to any of the harmonics when they were presented alone.
These cortical neurons, therefore, responded only to stimuli
associated with the 182-Hz tone, which is associated with a
specific pitch. Because of this, Bendor and Wang call these
neurons pitch neurons.

The two types of evidence we have just described (research
showing that damage to the auditory cortex affects
the ability to discriminate between frequencies, and the discovery of pitch neurons that respond to stimuli associated
with a specific pitch even if these stimuli have different harmonics) both support the idea that