Fundamentals of people surveillance

Prof. Rita Cucchiara
Facoltà di Ingegneria “Enzo Ferrari”
Università di Modena e Reggio Emilia (Italy)
http://Imagelab.ing.unimore.it
Abstract
People surveillance has been one of the hottest topics of the last decade in computer vision and pattern recognition research. It covers all aspects of computer engineering and computer science (models and algorithms, software and hardware architecture, real-time data processing and management, machine learning and knowledge-based reasoning) in order to detect, in the space-time dimensions, people living in the real world, starting from a tsunami of visual data acquired by networks of static and moving cameras; to recognize their presence even in cluttered and crowded environments; and to extract information about their aspect, motion, action and interaction, and eventually their behavior. Although similar approaches are proposed for many different targets (people, vehicles, aircraft, animals, etc.), this course mainly addresses people, who are the principal focus of interest in surveillance for security and safety and in multimedia analytics for forensics, and who are more complex than other objects due to their non-rigid structure and unpredictable motion and activity.
This short course is a survey of the fundamentals of people surveillance:
- People detection with motion analysis (background suppression and motion vector processing)
- People detection with appearance cues (pedestrian detection)
- People tracking with single and multiple heterogeneous sensors
- People action analysis according to their appearance and motion (trajectory, body motion..)
Many references to the main international research results will be proposed to assess the state of the art of people surveillance.
Imagelab – University of Modena and Reggio Emilia – http://imagelab.ing.unimore.it
Agenda
• Introduction
• Video Surveillance and Forensics
• Design Aspects for surveillance
• CV & PR for people surveillance
• People shape detection
• People behavior by trajectory analysis
• Conclusion
Video surveillance
Video surveillance concerns models, techniques and systems for:
• acquiring videos about the 3D external world,
• detecting targets over time and space,
• recognizing interesting or dangerous situations,
• generating real-time alarms,
• recording meaningful data about the controlled scene.
Multidisciplinary context
• Imaging, image processing, pattern recognition, computer vision, machine learning
• Physics
• Computer architecture
• Digital electronics
• Ubiquitous computing, networking, wired-wireless communication
• Multimedia
• Data management, knowledge representation
• AI…
Video surveillance
Research world:
• '60-'70: Hardware
• '80: Military research
• '90: Traffic monitoring
• '00: People surveillance
• '10?: Life surveillance
Commercial world:
• '60-'70: Analogue cameras
• '80: Digital CCTV systems
• '90: Digital surveillance systems
• '90: Network surveillance systems
• '10?: Ubiquitous, WAN systems
Video surveillance systems*
• Acquisition
• Wired/wireless network
• Storage
• (Remote) display
• Universal Multimedia Access
*R. Cucchiara, "Video sorveglianza per l'individuazione di persone e l'analisi comportamentale", in Safety&Security, 2010
Commercial video surveillance: hardware
Commercial video surveillance: software
Video surveillance market
• IMS Research (2010)
• China: 10 million security cameras will be shipped for domestic consumption in China in 2010. Network security cameras are forecast to account for only 3% of the market in terms of unit shipments.
• Russia: over 20% year-on-year revenue growth for the CCTV and video surveillance equipment market in Russia in 2010 and 2011.
• EMEA (Europe, Middle East, Africa): 33%/year growth in hardware for video surveillance. Video analytics software in surveillance: 10%/year growth (215 M£ in 2009).
• Frost & Sullivan (Dec. 2008)
• 2009, British Government (UK): 80 M£
• 2009, Department of Homeland Security (DHS), USA: 239 M$
• Research funds? DARPA, FP7, JLS, SMEs, companies (MIUR)
Commercial projects in 2010: Chicago, Operation Virtual Shield
• Chicago (2006-)
• Partner: IBM Watson Research
• 3000 cameras & video analytics (215 M$)
R. Cucchiara, "La visione artificiale per la videosorveglianza", in Mondo Digitale, vol. 8, n. 3, pp. 39-47, 2008
IBM S3 Hybrid Surveillance Solution
Modules:
• SSE: processes data from sensors and generates XML metadata.
• MILS: provides the infrastructure for indexing, retrieving and managing event metadata.
[Tian08] Tian, Y.L., Brown, L.M., Hampapur, A., Lu, M., Senior, A., Shu, C., "IBM smart surveillance system (S3): event based video surveillance system with an open and extensible framework", MVA(19), 2008
Commercial projects in 2010: Golden Shield program
• Beijing, China: "China golden shield program"
• Partner: IBM, Honeywell and General Electric
• 200,000 cameras (2008-)
• China's All-Seeing Eye: Shenzhen, 2 million cameras (2009-)
• Biometry and commercial data (VISA..)
Research projects in '90-'00: traffic surveillance
• D. Koller, K. Daniilidis, H.-H. Nagel, "Model-Based Object Tracking in Monocular Image Sequences of Road Traffic Scenes", IJCV, 1993
• Queue detection: Dr. Porikli et al., Mitsubishi Electric Research Labs, USA, 2004
• Detecting stopped vehicles on highways: ImageLab, Modena (I) - Traficon (B), WACV 2004
Research projects in human surveillance, '00: Carnegie Mellon University VSAM
• Prof. T. Kanade, VSAM, Carnegie Mellon University, USA
• DARPA program, 1997-2000
• http://www.cs.cmu.edu/~vsam/
• ObjectVIDEO..
Research projects at UCSD, San Diego: DIVA & ATON projects
• Prof. M. Trivedi, UCSD, San Diego
• DIVA: people and stopped vehicle automatic detection in dangerous zones
• Andrea Prati, Ivana Mikic, Mohan M. Trivedi, Rita Cucchiara: Detecting Moving Shadows: Algorithms and Evaluation. IEEE Trans. Pattern Anal. Mach. Intell. 25(7): 918-923 (2003)
Research projects at Univ. of Central Florida
M. Shah et al., 2007-2009
• The KNIGHT project: detection and tracking of multiple people
  - the Florida Department of Transportation
  - Orlando Police Department, DARPA
  - University of Central Florida
• COCOA project: tracking from UAV
• WHERE I AM project: auto-tracking
Research projects in the UK: i-LIDS
• i-LIDS: Imagery Library for Intelligent Detection Systems
• The government's benchmark for video-based detection systems (VBDS)
• The government is committed to promoting the development of VBDS to help in policing and counter-terrorist operations.
• Home Office Scientific Development Branch (HOSDB), Dr. Paul Hosmer
  - 5 cameras
  - 1.35 million frames
  - Single and multiple targets
  - 1000+ target events
IEE International Symposium on Imaging for Crime Detection and Prevention, Dec. 2009, London
Research projects at Univ. of Modena and Sydney: automatic abandoned package detection
Project: automatic real-time detection of infiltrated objects for security of airports and train stations (2006-2008)
• Imagelab, University of Modena
• University of Technology Sydney (Australian Research Council)
Research projects at Univ. of Modena: surveillance and privacy
2006-2008 LAICA: Laboratorio di Ambient Intelligence per una Città Amica
• Regione Emilia Romagna, In 1.1 P. Telematico
• ImageLab, Comune Reggio Emilia
• WTI (Bridge 129)
• Univ. Bologna, Modena, Parma…
FREE SURF: Free Surveillance in a privacy respectful way
• MIUR PRIN 2006-2008
Research projects in behavioral analysis …
Many…
• Dr. K. Sudo et al., NTT Cyber Space Lab, Japan: detecting anomalies at ATMs
  (a), (b) normal; (c), (d) abnormal
Video Surveillance & Video Analytics
Intelligent, smart video surveillance, to video analytics, to video and multimedia forensics
Human-in-the-loop
(When) multimedia meets surveillance and forensics in people security
• Multimedia: the use of different media contents (text, audio, images, video, animation, interactivity…) to convey information to users
• Digital forensics: a form of digital investigation that can be entered in a legal court
• Forensic video analysis: the scientific examination, comparison and/or evaluation of video in legal matters for investigation
• People security: the degree of protection of people against damage, danger, criminal or terrorist actions
• Surveillance: the manual/automatic monitoring of situations, behaviour and activity to generate alarms and record meaningful situations
• Video surveillance: the acquisition and processing of visual data about the external world to detect targets over time and space, to recognize interesting and dangerous situations, to generate alarms and record data on the scene
R. Cucchiara, "When multimedia meets surveillance and forensics in people security", keynote, Workshop MIFOR2010 at ACM Multimedia 2010
Multimedia surveillance
• The integration of multimedia technologies and sensor networks constitutes the fundamental infrastructure of a new generation of multimedia surveillance systems,
• where many different media streams (audio, video, text, 3D graphics, sensor data..) concur to provide an automatic analysis of the controlled environment and a support for human interpretation of the scene [Cuc05].
• Multimedia surveillance systems to
  - Enlarge the view
  - Enhance the view
  - Explore new views for human security employees
[Cuc05] R. Cucchiara, "Multimedia surveillance systems", Proc. of VSSN'05 at ACM Multimedia, Singapore, 2005
Digital Forensics and Computer Forensics
DIGITAL FORENSICS
• "The digital revolution introduces digital trace of our activity in Real Life" [Franke09]
  - Computer activities
  - Digital social interactions
  - Digital evidence of analog processes
COMPUTER FORENSICS
• "When computers are involved in criminal activities" [Kruse01]:
  - Tools for committing crimes
  - Substrate where crimes are committed
[Franke09] Franke, K., Srihari, S.: Computational Forensics: an Overview, 2009
[Kruse01] Kruse, W., Heiser, J.: Computer Forensics: Incident Response Essentials. Addison Wesley, Reading (2001)
Multimedia Forensics
MULTIMEDIA FORENSICS [Bohme09]
• Many data are collected through sensors
• Create a digital counterpart of reality
• Digital data can be probative elements in many investigations (e.g. video, audio, photos..)
• "Data must be authentic and reliable"
ONTOLOGY ON FORENSICS
[Bohme09] Böhme, R., Freiling, F. C., Gloe, T., and Kirchner, M. 2009. Multimedia Forensics Is Not Computer Forensics. In Proc. of the 3rd International Workshop on Computational Forensics
Computational Forensics is not Computer Forensics
Computational forensics refers to applying computer-aided techniques to digital data understanding:
  - Assist in basic and applied research and data mining
  - Establish or prove the scientific basis of a particular investigative procedure
  - Support the forensic examiner in their daily case work.
"Modern crime investigation shall profit from the hybrid-intelligence of humans and machines"
[Franke09] Franke, K., Srihari, S.: Computational Forensics: an Overview, 2009
Computational Forensics Techniques
• Signal / image processing: one-dimensional signals and two-dimensional images are transformed for the purpose of better human or machine processing,
• Computer vision: images are automatically recognized to identify objects,
• Computer graphics / data visualization: two-dimensional images or three-dimensional scenes are synthesized from multidimensional data for better human understanding,
• Statistical pattern recognition: abstract measurements are classified as belonging to one or more classes, e.g., whether a sample belongs to a known class and with what probability,
• Data mining: large volumes of data are processed to discover nuggets of information, e.g., presence of associations, number of clusters, outliers in a cluster,
• Robotics: human movements are replicated by a machine, and
• Machine learning: a mathematical model is learnt from examples.
[Franke09] Franke, K., Srihari, S.: Computational Forensics: an Overview, 2009
Revisited Ontology of Forensics
Computational forensics:
• Surveillance
• Biometrics
• Bioinformatics
• Data mining
• 3D reconstruction
• Document analysis…
Examples
Multimedia (video) surveillance vs. multimedia (video) forensics
Multimedia (video) surveillance:
• On-line analysis
• Real-time response
• Many fixed conditions and pre-defined constraints
• Correlation between cameras
• Consistent labeling
• Recognition
Multimedia (video) forensics:
• Off-line analysis on logged data
• Fast processing in large sets of data
• Often undefined camera settings and constraints
• Correlation between objects
• Re-identification
• Mining…
Both:
• Noise and uncertainty on (movement) data
• High variability of visual data
• Need of multimedia data management and analysis tools for people security
Design aspects: from hardware…
• Acquisition
• Wired/wireless network
• Storage
• (Remote) display
• Universal Multimedia Access
Design aspects: …to software
• Acquisition
• Wired/wireless network
• Storage
• (Remote) display
• Universal Multimedia Access
• Software for analysis, control, prediction…
Architecture: parallelism and multimodality
• Parallelism
• Modality
• aerial,…
Design aspects
• For multicamera, distributed surveillance and sensor networks:
1. sensor topology
2. architecture topology
3. communication aspects
4. data fusion
5. data processing
Design aspects
1.1) Sensor topology: placement
• Best placement definition using linear programming and heuristics
• Using simulated annealing
• Definition of a 3D visibility model of a tag (for people) and optimization via binary integer programming
[E. Hörster, R. Lienhart, Optimal Placement of Multiple Sensors, in M…, Academic Press 2009; Proc. of ACM VSSN 2006]
[A. Mittal, L. Davis, A general method for sensor planning in multi-sensor systems: extension to random occlusion, International Journal of Computer Vision 2008]
[J. Zhao, S. Cheung, T. Nguyen, Multicamera surveillance with visual tagging and generic camera placement, in M…, Academic Press 2009; Proc. of ICDSC 2007]
Vm(x, y, θ, r)
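A minimal sketch of placement as a binary selection problem, under stated assumptions: the room is a 2D grid of floor points, each candidate camera pose is (x, y, heading), and visibility is a simple range-and-angle test. The names `visible` and `best_placement`, the room layout and all numeric values are hypothetical and are not taken from the cited papers. An exhaustive search over subsets stands in for a binary integer programming solver, which is only practical for a handful of candidates.

```python
import math
from itertools import combinations

def visible(cam, point, max_range, half_fov):
    """Return True if `point` falls inside the camera's 2D field of view."""
    cx, cy, theta = cam
    dx, dy = point[0] - cx, point[1] - cy
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return True
    if dist > max_range:
        return False
    # Smallest absolute angle between the optical axis and the point direction.
    diff = math.atan2(dy, dx) - theta
    diff = abs(math.atan2(math.sin(diff), math.cos(diff)))
    return diff <= half_fov

def best_placement(candidates, points, k, max_range=4.5, half_fov=math.radians(30)):
    """Pick k candidate poses that cover the most points (exhaustive search)."""
    cover = [
        frozenset(i for i, p in enumerate(points) if visible(c, p, max_range, half_fov))
        for c in candidates
    ]
    best_idx, best_cov = (), frozenset()
    for subset in combinations(range(len(candidates)), k):
        cov = frozenset().union(*(cover[i] for i in subset))
        if len(cov) > len(best_cov):
            best_idx, best_cov = subset, cov
    return best_idx, len(best_cov)

# Hypothetical 4 m x 4 m room sampled on a 5 x 5 grid of floor points.
points = [(x, y) for x in range(5) for y in range(5)]
# Candidate poses: the four corners, each looking towards the room centre.
candidates = [
    (0, 0, math.radians(45)), (4, 0, math.radians(135)),
    (4, 4, math.radians(225)), (0, 4, math.radians(315)),
]
chosen, covered = best_placement(candidates, points, k=2)
print("chosen candidates:", chosen, "covered points:", covered)
```

For realistic numbers of candidate poses the same 0/1 decision variables would be handed to an integer programming solver or a heuristic such as simulated annealing, as the slide suggests.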
Design aspects
1.2) Sensor topology: learning
• Detecting the topology of an existing camera network
• Inferring the topology of a camera network by measuring statistical dependence between entrances and exits (GPS-based performance analysis)
[T. Ellis, D. Makris, J. Black, Learning a multicamera topology, Proc. of VS-PETS 2003]
[T. Ellis, D. Makris, J. Black, Bridging the gap across cameras, CVPR 2004]
[Tieu, Dalley, Grimson, Inference of non-overlapping camera network topology by measuring statistical dependence, Proc. of ICCV 2005]
[Besma R. Abidi, Nash R. Aragam, Yi Yao, Mongi A. Abidi: Survey and analysis of multimodal sensor planning and integration for wide area surveillance, ACM Computing Surveys 2009]
Figure: from Ellis et al., PETS 2003
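The idea of measuring dependence between exits and entrances can be illustrated with a simplified sketch. It is not the method of the cited papers: it only builds a histogram of delays between every exit event in one camera and every entry event in another, and treats a sharp peak as evidence of a link. The function names, the timestamps and the scoring rule are all hypothetical.

```python
from collections import Counter

def transit_histogram(exits, entries, max_lag, bin_width=1.0):
    """Histogram of (entry - exit) delays in [0, max_lag) over all event pairs."""
    hist = Counter()
    for t_out in exits:
        for t_in in entries:
            delay = t_in - t_out
            if 0 <= delay < max_lag:
                hist[int(delay // bin_width)] += 1
    return hist

def link_score(hist, n_bins):
    """Ratio of the highest bin to the mean bin count; about 1.0 means flat."""
    total = sum(hist.values())
    if total == 0:
        return 0.0, None
    peak_bin, peak = max(hist.items(), key=lambda kv: kv[1])
    return peak / (total / n_bins), peak_bin

# Hypothetical timestamps (seconds). People leaving camera A reach B about 5 s later;
# entries in camera C are unrelated to A.
exits_a = [0, 10, 25, 40, 60, 75]
entries_b = [5, 15, 30, 45, 65, 80]
entries_c = [3, 22, 31, 52, 58, 77]

max_lag, n_bins = 20, 20
score_ab, lag_ab = link_score(transit_histogram(exits_a, entries_b, max_lag), n_bins)
score_ac, lag_ac = link_score(transit_histogram(exits_a, entries_c, max_lag), n_bins)
print("A->B score:", round(score_ab, 2), "peak lag bin:", lag_ab)
print("A->C score:", round(score_ac, 2), "peak lag bin:", lag_ac)
```

A higher score for the A-to-B pair than for A-to-C suggests a direct path between A and B, with the peak bin giving a rough transit time. No identity matching between cameras is needed, which is the appeal of this family of approaches.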
Design aspects
2) Architecture topology
• Centralized (PRISMATICA project: central servers with dedicated nodes for cameras, audio, smart cameras and so on)
• Semi-distributed (ADVISOR project: many independent nodes, each one connected with more cameras)
• Distributed network with embedded systems
• Distributed network with agent-based communication
[P. Lai Lo, J. Sun, S. Velastin, Fusing visual and audio information in a distributed surveillance system, Acta Automatica Sinica 2003]
[M. Valera Espina and S.A. Velastin, "Intelligent Distributed Surveillance Systems: A Review," IEE Proc. Vision, Image and Signal Processing, Apr. 2005, pp. 192-204]
[M. Christensen, R. Alblas, V2: design issues in distributed video surveillance systems, Denmark 2000]
[M. A. Patricio, J. Carbó, O. Pérez, J. García, and J. M. Molina, Multi-Agent Framework in Visual Sensor Networks, EURASIP Journal on Advances in Signal Processing, Volume 2007 (2007)]
Design aspects
3) Communication aspects:
• Shared memory and synchronization
• Bandwidth allocation
• Wireless and mobile protocols
• Sensor networks
Security:
• In distributed systems
• Privacy & authentication
[C. Regazzoni, V. Ramesh, G. Foresti, Special issue on video communication, processing and understanding for third generation surveillance systems, Proc. of the IEEE 2001]
[Akyildiz, W. Su, Y. Sankarasubramaniam and E. Cayirci, Wireless sensor networks: a survey, Computer Networks, Volume 38, Issue 4, 15 March 2002]
[Tubaishat, S. Madria, Sensor networks: an overview, IEEE Potentials, 2003]
[Adrian Perrig, J. Stankovic, D. Wagner, Security in wireless sensor networks, Communications of the ACM, Volume 47, Issue 6 (June 2004)]
A book covering all design aspects: M…, Aghajan, Cavallaro eds., Academic Press 2009
Design aspects
4) Data fusion:
• Calibration of the multicamera system, for object association across multiple cameras
• Sharing of information obtained by different types of sensors
  - E.g. color and thermal cameras
  - E.g. mobile, wireless, fire, sound…
  - E.g. visual and PIR sensors
5) Data processing:
• Computer vision & pattern recognition
[J. Han, B. Bhanu, Fusion of color and infrared video for moving human detection, Pattern Recognition 2007]
[Y. Tseng, Y. Wang, K. Cheng, Y. Hsieh, iMouse: an integrated mobile surveillance and wireless sensor system, IEEE Computer 2007]
[R. Cucchiara, A. Prati, R. Vezzani, L. Benini, E. Farella, P. Zappi, "Using a Wireless Sensor Network to Enhance Video Surveillance", in Journal of Ubiquitous Computing and Intelligence (JUCI), vol. 1, pp. 1-11, 2006]
Algorithms for data processing
• Multiple stationary cameras
• PTZ cameras
• Moving cameras
• Airborne cameras
• Heterogeneous cameras
• Distributed cameras
• Sensor networks
From detecting to reasoning on people
• People detection: people segmentation, localization (single and multiple objects, single and multiple sensors)
• People tracking
• People re-identification: search, soft biometry
• People identification: face detection & recognition, identity assessment, biometry
• Motion analysis and activity analysis: posture, gesture, gait, trajectories..
• Action (& interaction) analysis: with environment, with moving objects, with people
• Behavior analysis: understanding, modeling
System architecture: people segmentation and tracking
2000: PEOPLE DETECTION AND TRACKING WITH STATIC CAMERAS
• C. Wren, A. Azarbayejani, T. Darrell, and A.P. Pentland, "Pfinder: real-time tracking of the human body," IEEE Trans. PAMI, (19) 7, 1997.
• C. Stauffer and W.E.L. Grimson, "Learning patterns of activity using real-time tracking," IEEE Trans. PAMI, (22) 8, 2000.
• I. Haritaoglu, D. Harwood, and L.S. Davis, "W4: real-time surveillance of people and their activities," IEEE Trans. PAMI, (22) 8, 2000.
• Tao, Sawhney, Kumar, Object tracking with Bayesian estimation of dynamic layer representation, IEEE Trans. on PAMI 24, 1, 2002
• R. Cucchiara, C. Grana, M. Piccardi, A. Prati, "Detecting Moving Objects, Ghosts and Shadows in Video Streams", IEEE Trans. on PAMI, 2003
2004: OCCLUSION DETECTION
• Nguyen, H.T., Smeulders, A., Fast occluded object tracking by a robust appearance filter, IEEE Trans. on PAMI, 2004
• Tao Zhao, Nevatia, R., Tracking multiple humans in complex situations, IEEE Trans. PAMI 2004
2008: PEOPLE TRACKING WITH MULTIPLE STATIC CAMERAS
• A. C. Sankaranarayanan, A. Veeraraghavan, and R. Chellappa, Object Detection, Tracking and Recognition for Multiple Smart Cameras, Proceedings of the IEEE, Vol. 96, No. 10, October 2008
• S. Calderara, A. Prati, R. Cucchiara, "Bayesian-competitive Consistent Labeling for People Surveillance", IEEE Trans. on PAMI, Feb. 2008
• Saad M. Khan and Mubarak Shah, Tracking Multiple Occluding People by Localizing on Multiple Scene Planes, IEEE Trans. on PAMI, vol. 31, no. 3, March 2009
2012: ..
• With forest of cameras
• With moving and mobile cameras
• In crowd… (Shah's talk, ACM MM2010)
Background suppression: milestones
• Gray level vs. color analysis: S.J. McKenna, S. Jabri, Z. Duric, A. Rosenfeld, and H. Wechsler, "Tracking groups of people," Computer Vision and Image Underst., (80) 1, 2000.
• Mixture of Gaussians: C. Stauffer and W.E.L. Grimson, "Learning patterns of activity using real-time tracking," IEEE Trans. PAMI, (22) 8, 2000. (1400 citations on Google!)
• Background and shadow detection, a short survey: A. Prati, I. Mikic, M.M. Trivedi, R. Cucchiara, "Detecting Moving Shadows: Algorithms and Evaluation," IEEE Trans. on PAMI, July 2003
• Background and layered representation: Tao, Sawhney, Kumar, Object tracking with Bayesian estimation of dynamic layer representation, IEEE Trans. on PAMI 24, 1, 2002
• Background with kernel density: A. Elgammal, R. Duraiswami, D. Harwood, L.S. Davis, Background and Foreground Modeling Using Nonparametric Kernel Density for Visual Surveillance, Proceedings of the IEEE 2003
• …and hundreds of other papers…
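As a rough illustration of background suppression, the sketch below keeps one running Gaussian (mean and variance) per pixel and flags pixels that deviate by more than k standard deviations. This is deliberately the simplest statistical model, not the mixture of Gaussians or the kernel density approaches listed above. The class name, the learning rate `alpha`, the threshold `k` and the variance floor are assumed values chosen for the example.

```python
class RunningGaussianBackground:
    """Per-pixel running mean/variance background model for grayscale frames."""

    def __init__(self, first_frame, alpha=0.05, k=2.5, min_var=4.0):
        self.alpha = alpha      # learning rate (assumed value)
        self.k = k              # threshold in standard deviations (assumed value)
        self.min_var = min_var  # variance floor, avoids a zero threshold
        self.mean = [[float(v) for v in row] for row in first_frame]
        self.var = [[min_var for _ in row] for row in first_frame]

    def apply(self, frame):
        """Return a binary foreground mask and update the background pixels."""
        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, value in enumerate(row):
                diff = value - self.mean[y][x]
                is_fg = diff * diff > (self.k ** 2) * self.var[y][x]
                if not is_fg:
                    # Selective update: only background pixels adapt the model.
                    self.mean[y][x] += self.alpha * diff
                    self.var[y][x] = max(
                        self.min_var,
                        (1 - self.alpha) * self.var[y][x] + self.alpha * diff * diff,
                    )
                mask_row.append(1 if is_fg else 0)
            mask.append(mask_row)
        return mask

# Hypothetical 3 x 3 grayscale frames: a flat background, slight noise, then one bright pixel.
background = [[100] * 3 for _ in range(3)]
noisy = [[101] * 3 for _ in range(3)]
with_object = [[100, 100, 100], [100, 200, 100], [100, 100, 100]]

model = RunningGaussianBackground(background)
mask_noise = model.apply(noisy)
mask_object = model.apply(with_object)
print(mask_object)
```

The selective update keeps a stopped object from being absorbed immediately, which is also where ghosts and shadows start to matter and where the more elaborate models above come in.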
Detection and tracking
Initial steps: DETECTION & TRACKING
• At each frame: segmentation (into blobs), then tracking (observation-model correspondence)
• At each frame: region of interest selection, then tracking (model-observation correspondence)
People tracking: milestones
• Many surveys:
  - Moeslund, Hilton, Kruger, A survey of advances in vision-based human motion capture and analysis, CVIU vol. 104, 2006
  - Yilmaz, Javed, Shah, Object tracking: a survey, ACM Computing Surveys vol. 38 n. 4, 2006
  - Wu, H.; Sankaranarayanan, A.; Chellappa, R.; Online Empirical Evaluation of Tracking Algorithms, IEEE Trans. PAMI 2009
• Most used techniques:
  - Mean shift: Dorin Comaniciu, Visvanathan Ramesh: Mean Shift and Optimal Prediction for Efficient Object Tracking. ICIP 2000
  - Particle filtering: M. Isard and A. Blake. A smoothing filter for condensation. In Proc. ECCV, 1998.
  - Appearance-based tracking: Tao Zhao, Nevatia, R., Tracking multiple humans in complex situations, IEEE Trans. PAMI 2004
  - Kernel-based tracking: Han, Comaniciu, Davis, Sequential kernel density approximation for real-time visual tracking, IEEE Trans. PAMI 2009
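To make the mean shift idea concrete, here is a minimal sketch under simplifying assumptions: the target likelihood is already available as a per-pixel weight map (for example from a color histogram comparison), and the kernel is a flat square window. It is not the formulation of the cited ICIP paper. The function name, the window size and the synthetic weight map are hypothetical.

```python
def mean_shift(weights, center, half_size, max_iter=20, eps=0.5):
    """Move a square window to the local centroid of `weights` until it settles."""
    cy, cx = center
    h, w = len(weights), len(weights[0])
    for _ in range(max_iter):
        total = sum_y = sum_x = 0.0
        for y in range(max(0, int(cy) - half_size), min(h, int(cy) + half_size + 1)):
            for x in range(max(0, int(cx) - half_size), min(w, int(cx) + half_size + 1)):
                wgt = weights[y][x]
                total += wgt
                sum_y += wgt * y
                sum_x += wgt * x
        if total == 0:
            break  # no support under the window: keep the last position
        new_cy, new_cx = sum_y / total, sum_x / total
        shift = abs(new_cy - cy) + abs(new_cx - cx)
        cy, cx = new_cy, new_cx
        if shift < eps:
            break
    return cy, cx

# Hypothetical 12 x 12 weight map with a 3 x 3 blob centred at row 8, column 7.
weight_map = [[0.0] * 12 for _ in range(12)]
for row in range(7, 10):
    for col in range(6, 9):
        weight_map[row][col] = 1.0

# Start from the previous frame's position (5, 5) and let the window climb to the blob.
position = mean_shift(weight_map, center=(5, 5), half_size=3)
print(position)
```

In a tracker this loop runs once per frame, started from the previous position, so the cost depends on the window size rather than on the image size.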
Single camera surveillance at Imagelab
• Detection with SAKBOT: Statistical and Knowledge-Based Object deTection [TPAMI2003]
• Smoke analysis [MVA J 2009]
• Abnormal behavior [CVPR2008]
• Tracking with AD HOC: Appearance-based Discriminative Handling with Occlusion [ICPR2004]
• People and posture classification [TSMC2005]
• Recognition of stopped vehicles
To parallel and distributed surveillance
Surveillance systems:
• A review of commercial systems
• A review of hardware and software requirements
• A survey of multicamera and distributed surveillance
[M. Valera, S.A. Velastin (Digital Imaging Res. Centre, Kingston Univ., UK), Intelligent distributed surveillance systems: a review, IEE Proceedings - Vision, Image and Signal Processing, 2005]
[Hu, Tan, Wang, Maybank, A survey on visual surveillance of object motion and behaviors, IEEE Trans. on Systems, Man and Cybernetics, vol. 34, n. 3, 2004]
[A. C. Sankaranarayanan, A. Veeraraghavan, and R. Chellappa, Object Detection, Tracking and Recognition for Multiple Smart Cameras, Proceedings of the IEEE, Vol. 96, No. 10, October 2008]
To parallel and distributed surveillance
• Multicamera (multiview) surveillance
  - fully synchronized acquisition
  - 1 frame grabber with 1-20 fixed and PTZ cameras
  - 1 (multiprocessor) computer for many cameras
  - shared memory architecture
  Challenges: more precision, 3D reconstruction, consistent labeling in multiview, occlusion handling, people identification, behavior analysis
• Distributed (network camera) surveillance
  - loosely coupled acquisition and processing
  - potentially thousands of nodes with smart cameras and traditional network cameras
  - message passing architecture
  Challenges: large coverage, communication, bandwidth, trade-off between local computation and computational power; less precision, multiple hypothesis generation, search for similarity
• Distributed surveillance and sensor networks: multicamera surveillance plus freely moving cameras on vehicles, moving infrastructure, hand-held cameras
Homography and data association
• In real contexts, noise, errors in the homography and lack of planar constraints introduce uncertainty in the position.
From [A. C. Sankaranarayanan, A. Veeraraghavan, and R. Chellappa, Object Detection, Tracking and Recognition for Multiple Smart Cameras, Proceedings of the IEEE, Vol. 96, No. 10, October 2008]
Multicamera surveillance
• Using multiple sensors/cameras has many advantages:
  - Wider coverage of the scene
  - Multi-modal sensing
  - Redundant data (improved accuracy)
  - Fault tolerance
Data fusion for multicamera surveillance
Fusion at pixel level:
• Each camera: acquisition, preprocessing
• Fusion: calibration, homography, …
• Then: segmentation, tracking
Fusion at feature level (people axis):
• Each camera: acquisition, preprocessing, segmentation, tracking
• Fusion on homography, … for consistent labeling
The solution at Imagelab
HECOL (Homography and Epipolar-based COnsistent Labeling)
• Ground-plane homography and epipolar geometry are automatically computed from training videos
• The person’s main axis is warped to the other view, and Bayesian inference is used to validate hypotheses
[S. Calderara, A. Prati, R. Cucchiara, “Bayesian-competitive Consistent Labeling for People Surveillance,” IEEE Trans. on PAMI, Feb. 2008]
[S. Calderara, A. Prati, R. Cucchiara, “HECOL: Homography and Epipolar-based Consistent Labeling for Outdoor Park Surveillance,” Computer Vision and Image Understanding, 2008]
Automatic Homography computation
• Automatic learning phase to compute the overlapping zones and the ground-plane homography
• Take many correspondences among ground-plane support points (with a tracking algorithm, for a single person)
• Define the Entry Edges of Field of View (E2oFoV) using least-squares optimization
• Define the overlapping zones and the extreme points
• Compute the homography from the point correspondences
[Figure labels: E2oFoV, EoFoV]
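The last step above, computing a homography from point correspondences, can be sketched with the standard Direct Linear Transform. This is a minimal illustration and not the HECOL implementation: the function names and sample points are hypothetical, no coordinate normalization or outlier rejection is done, and the support points are assumed to lie on the ground plane.

```python
import numpy as np

def estimate_homography(src, dst):
    """Least-squares homography H (3x3) such that dst ~ H @ src, via DLT.

    src, dst: sequences of at least 4 matching (x, y) points.
    """
    rows = []
    for (x, y), (u, v) in zip(np.asarray(src, float), np.asarray(dst, float)):
        # Each correspondence gives two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]  # assumes h[2, 2] is not zero

def warp_points(h, pts):
    """Map (x, y) points through H, e.g. the ends of a person's main axis."""
    pts = np.asarray(pts, float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical ground-plane support points seen in camera 1 and camera 2.
cam1 = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
h_demo = np.array([[1.2, 0.1, 5.0], [-0.05, 0.9, -3.0], [0.01, 0.02, 1.0]])
cam2 = warp_points(h_demo, cam1)
h_est = estimate_homography(cam1, cam2)
```

With noisy tracker output, a robust estimator (for example RANSAC around this routine) would normally be preferred.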
Video surveillance at ImageLab Modena
[Architecture diagram. Recoverable labels: fixed camera; fixed/moving camera; PTZ camera; moving and mobile camera; sensors; Sakbot; HECOL (homography & epipoles); Moses; ViSOR; segmentation; tracking; ad-hoc geometry recovery; ROI and model-based tracking; sensor data acquisition; mosaicing; segmentation & tracking; people detection; consistent labeling and multicamera tracking; posture analysis; action analysis; trajectory analysis; PTZ control; head tracking; face selection; high-resolution detection; face obscuration; face recognition & people identification; behavior recognition; video surveillance ontology; people annotation; annotated video storage; MPEG streaming; web; security control centers; mobile surveillance platforms; multicamera systems]
Experiments in Surveillance
Experiments in forensics in Modena…
Example: data correlation for manual identification, in support of investigation
New…
• Improving distributed multi-sensor surveillance
• Motes: CITRIC cameras and RFIDs
• Autocalibration of cameras and RFID sensor network (ICDSC 2010)
[Figure labels: state of a tag-wearing person moving on a random pathway; boundary detected by RFID signal strength; RFID reader operating area]
Agenda
• Introduction
• Video Surveillance and Forensics
• Design Aspects for surveillance
• CV & PR for people surveillance
• People shape detection
• People behavior by trajectory analysis
• Conclusion
From moving visual objects to shape detection
• People detection with 3D models
– Zhao, Nevatia, Wu, “Segmentation and tracking of multiple humans in crowded environments,” IEEE Trans. PAMI 2009
• Pedestrian detection in still images with machine learning
– Dalal, Triggs, “Histograms of oriented gradients for human detection,” CVPR 2005
– M. Enzweiler, D. Gavrila, “Monocular pedestrian detection: survey and experiments,” IEEE Trans. PAMI, Dec. 2009
– Dollar, P.; Wojek, C.; Schiele, B.; Perona, P., “Pedestrian detection: A benchmark,” CVPR 2009
– Wojek, C.; Walk, S.; Schiele, B., “Multi-cue onboard pedestrian detection,” CVPR 2009
• …
Pedestrian detection
• When the camera is moving
• When the environment is too complex
• When the background is not available…
People (pedestrian) detection with a machine learning approach in still images (and video)
[Diagram: training set and test set (on-line data) -> feature detection -> classification -> people / no people]
• Many features
• Many classifiers:
– SVMs
– Boosting classifiers with cascade classifiers
• Sliding-window search
Biblio: Enzweiler, Gavrila, “Monocular Pedestrian Detection: Survey and Experiments,” Trans. on PAMI 2009 …
Histograms of Oriented Gradients
• Scan the input image
• Resize the window to 64x128
• Divide it into overlapping 16x16 blocks
• Compute the histogram of gradients over 9 directions
Slides: courtesy of www.andrew.cmu.edu
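The HOG steps above can be sketched in a few lines of NumPy. This is a simplified illustration and not the reference implementation: the 8x8 cell size and 2x2-cell blocks follow the usual Dalal and Triggs configuration rather than anything stated on the slide, votes are hard-binned (no interpolation), and only L2 block normalization is applied.

```python
import numpy as np

def hog_descriptor(window, cell=8, block=2, bins=9):
    """Simplified HOG of a grayscale window given as rows x cols (e.g. 128 x 64)."""
    window = np.asarray(window, dtype=float)
    gy, gx = np.gradient(window)              # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    n_rows, n_cols = window.shape[0] // cell, window.shape[1] // cell
    hist = np.zeros((n_rows, n_cols, bins))
    for r in range(n_rows):
        for c in range(n_cols):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            # Magnitude-weighted vote of every pixel of the cell.
            hist[r, c] = np.bincount(bin_index[sl].ravel(),
                                     weights=magnitude[sl].ravel(),
                                     minlength=bins)

    blocks = []
    for r in range(n_rows - block + 1):       # overlapping blocks, stride 1 cell
        for c in range(n_cols - block + 1):
            v = hist[r:r + block, c:c + block].ravel()
            blocks.append(v / np.sqrt(np.sum(v ** 2) + 1e-6))
    return np.concatenate(blocks)

# A 64x128 window gives 7 x 15 blocks of 4 cells x 9 bins each.
descriptor = hog_descriptor(np.random.default_rng(0).random((128, 64)))
```

The resulting vector is what a linear SVM or a boosted classifier would receive for each candidate window.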
HOG + Deformable Part Model
P. Felzenszwalb, D. McAllester, D. Ramanan, “Object Detection with Discriminatively Trained Part Based Models,” TPAMI 2009
This work exploits the same HOG feature as Dalal et al. The model of the target object is made of:
a) a coarse root filter (it corresponds to the Dalal model of a pedestrian)
b) several higher-resolution part filters
c) a spatial model for the location of each part relative to the root
[Figure panels: (a), (b), (c)]
Covariance and LogitBoost Classifier
O. Tuzel, F. Porikli, and P. Meer, “Pedestrian detection via classification on Riemannian manifolds,” IEEE Trans. on PAMI, Oct. 2008
F is a set of pixel-wise features.
For generic region matching:
For texture classification:
For pedestrian detection:
Covariance and LogitBoost Classifier
[Diagram: image -> sub-region R -> extract pixel-wise features 1, 2, …, M (mean and variance of each) -> covariance C_R (M x M matrix, symmetric positive definite) -> cascade (Casc 1, Casc 2, …, Casc N)]
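The covariance descriptor C_R of a sub-region can be sketched as follows. The five pixel-wise features used here (x, y, intensity, |Ix|, |Iy|) are an illustrative choice and not the exact feature set of Tuzel et al.; the function name is hypothetical.

```python
import numpy as np

def region_covariance(image, top, left, height, width):
    """M x M covariance of pixel-wise features inside a rectangular region."""
    image = np.asarray(image, dtype=float)
    iy, ix = np.gradient(image)                       # first derivatives
    ys, xs = np.mgrid[0:image.shape[0], 0:image.shape[1]]
    # One M-dimensional feature vector per pixel (here M = 5).
    features = np.stack([xs, ys, image, np.abs(ix), np.abs(iy)], axis=-1)
    region = features[top:top + height, left:left + width]
    region = region.reshape(-1, features.shape[-1])
    return np.cov(region, rowvar=False)               # symmetric, M x M

c_r = region_covariance(np.random.default_rng(1).random((60, 40)),
                        top=10, left=5, height=32, width=16)
```

Because such matrices are symmetric positive definite, they do not form a vector space, which is why the classifier on the following slides works on a Riemannian manifold.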
LogitBoost Classifier on Riemannian Manifolds
[Diagram: covariance descriptor -> cascade (Casc 1, Casc 2, …, Casc N) of LogitBoost classifiers on Riemannian manifolds. Each uses linear logistic regressors, for which a Euclidean space is needed]
Examples
[Figure: detection examples for Dalal, Tuzel, and Felzenszwalb]
Ped. Detection with Sliding Window
• On each frame, apply the pedestrian classifier on each window
• Exhaustive sliding-window approach
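A minimal sketch of the exhaustive sliding-window enumeration, useful to see how quickly the number of classifier evaluations grows. The stride, scale step, and number of scales are assumed values chosen for illustration only.

```python
def sliding_windows(frame_w, frame_h, win_w=64, win_h=128,
                    stride=8, scale_step=1.2, n_scales=5):
    """Yield (left, top, width, height) for every window at every scale."""
    for s in range(n_scales):
        scale = scale_step ** s
        w, h = int(round(win_w * scale)), int(round(win_h * scale))
        step = max(1, int(round(stride * scale)))
        # Windows that would not fit in the frame are simply not generated.
        for top in range(0, frame_h - h + 1, step):
            for left in range(0, frame_w - w + 1, step):
                yield left, top, w, h

# Each yielded window would be cropped, resized and passed to the classifier.
n_windows = sum(1 for _ in sliding_windows(640, 480))
```

A smaller stride or a finer scale step multiplies the count, which is the computation-time problem discussed on the next slide.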
Ped. Detection with Sliding Window
• Two problems
1. Accuracy
– Many false positives
– Localization errors
2. Computation time
– Grows with the number of windows (set by window size and overlap)
– Higher on Riemannian manifolds
• Two approaches
Exploiting other cues:
1) Learning context
2) Using relevance feedback
Exploiting statistics:
1) Learning distribution
2) Multi-stage search
1) Using context and relevance feedback
[Diagram: LogitBoost classifier on Riemannian manifolds, cascade Casc 1, Casc 2, …, Casc N]
• Along the cascade, the rejection of negatives (precision) increases
• The detection of positives (recall) decreases
• Goal: to increase precision without affecting recall
Estimate the Size of Pedestrian within the Frame
G. Gualdi, A. Prati, R. Cucchiara, “Covariance Descriptors on Moving Regions for Human Detection in Very Complex Outdoor Scenes,” in ACM/IEEE ICDSC 2009
Exploit the pedestrian detector to infer the pedestrian size in the frame
[Diagram: sliding-window pedestrian detector -> RANSAC + LSQ with a linear model of perspective -> h(x,y)]
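The "RANSAC + LSQ with a linear model of perspective" step can be sketched as below, under the assumption that the expected pedestrian height is modelled as h(x, y) = a·x + b·y + c and that each detection provides one (x, y, h) sample. The parameter values and function names are hypothetical; this is not the code of the cited paper.

```python
import numpy as np

def fit_plane(samples):
    """Least-squares fit of h = a*x + b*y + c; samples is (N, 3) as (x, y, h)."""
    design = np.column_stack([samples[:, 0], samples[:, 1], np.ones(len(samples))])
    coeffs, *_ = np.linalg.lstsq(design, samples[:, 2], rcond=None)
    return coeffs

def ransac_perspective(samples, n_iter=200, tol=5.0, seed=0):
    """RANSAC over 3-sample planes, then a final LSQ refit on the inliers."""
    samples = np.asarray(samples, dtype=float)
    rng = np.random.default_rng(seed)
    design = np.column_stack([samples[:, 0], samples[:, 1], np.ones(len(samples))])
    best = np.zeros(len(samples), dtype=bool)
    for _ in range(n_iter):
        pick = rng.choice(len(samples), size=3, replace=False)
        coeffs = fit_plane(samples[pick])
        inliers = np.abs(design @ coeffs - samples[:, 2]) < tol
        if inliers.sum() > best.sum():      # keep the largest consensus set
            best = inliers
    return fit_plane(samples[best]), best

def expected_height(coeffs, x, y):
    """Evaluate h(x, y), e.g. to discard windows of implausible size."""
    return coeffs[0] * x + coeffs[1] * y + coeffs[2]
```

Once h(x, y) is known, the detector only needs to test windows whose height is close to the expected one at each position, which is where the reduction in the number of windows comes from.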
Relevance Feedback (1/2)
[Figure: training dataset]
Relevance Feedback (2/2)
RELEVANCE FEEDBACK
(1) IMPLICIT (it is automatic): estimate background images => negative training set
(2) EXPLICIT (it needs user assessment)
[Diagram: response of the ped. classifier -> positives / negatives -> training dataset]
G. Gualdi, A. Prati, R. Cucchiara, “Perspective and Appearance Context for People Surveillance in Open Areas,” in Proceedings of the 2nd International Workshop on Use of Context in Video Processing (UCVP 2010), at CVPR 2010
Experimental Results
Perspective Estimation
[Figure labels: ~300K windows, ~30K windows; 10%, 90%. Curves: plain ped. detection; with perspective; with persp. + rel. feed.]
Experimental Results
Perspective Estimation and Relevance Feedback
[Plot: Precision % vs Recall %]
2) Use statistics: multi-stage particle windows
A probabilistic Bayesian paradigm for object detection:
• Detection is achieved with a multi-stage search
• Object detection is estimated as a pdf
• Use particle windows instead of sliding windows
G. Gualdi, A. Prati, R. Cucchiara, “Multi-stage Sampling with Boosting Cascades for Pedestrian Detection in Images and Videos,” ECCV 2010
2) Use statistics: multi-stage particle windows
The measure of each sample is a function of the rejection level: the detection response
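The idea of replacing the exhaustive scan with particle windows can be sketched as follows. This is a toy illustration under stated assumptions and not the algorithm of the ECCV 2010 paper: windows are reduced to an (x, y) position, the number of particles per stage and the Gaussian spread are invented values, and `toy_response` is a hypothetical stand-in for the cascade, returning the fraction of cascade stages passed.

```python
import numpy as np

def toy_response(x, y, target=(300.0, 200.0), n_stages=20, radius_per_stage=10.0):
    """Hypothetical detector response: fraction of cascade stages passed."""
    dist = np.hypot(x - target[0], y - target[1])
    passed = max(0, n_stages - int(dist / radius_per_stage))
    return passed / n_stages

def multistage_particle_windows(response, bounds, n_particles=(200, 100, 50),
                                sigma0=40.0, shrink=0.5, seed=0):
    """Sample windows in stages, concentrating on high-response regions."""
    rng = np.random.default_rng(seed)
    (x_min, x_max), (y_min, y_max) = bounds
    # Stage 1: uniform proposal over the whole frame.
    particles = np.column_stack([rng.uniform(x_min, x_max, n_particles[0]),
                                 rng.uniform(y_min, y_max, n_particles[0])])
    weights = np.array([response(x, y) for x, y in particles])
    all_particles, all_weights = [particles], [weights]
    sigma = sigma0
    for n in n_particles[1:]:
        if weights.sum() <= 0:
            break  # nothing promising was found, stop refining
        # Resample proportionally to the response, then perturb with a
        # Gaussian whose spread shrinks at every stage.
        idx = rng.choice(len(particles), size=n, p=weights / weights.sum())
        particles = particles[idx] + rng.normal(0.0, sigma, size=(n, 2))
        particles[:, 0] = np.clip(particles[:, 0], x_min, x_max)
        particles[:, 1] = np.clip(particles[:, 1], y_min, y_max)
        weights = np.array([response(x, y) for x, y in particles])
        all_particles.append(particles)
        all_weights.append(weights)
        sigma *= shrink
    return np.vstack(all_particles), np.concatenate(all_weights)

windows, scores = multistage_particle_windows(toy_response, ((0, 640), (0, 480)))
```

The sketch relies on the response having a basin of attraction around the true object, so that partially accepted windows guide later samples toward it.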
Region of support
• Often there is a basin of attraction in position and size
Face detection
• Also with the Viola and Jones face detector
Experiments with m=5 stages
In Video: Exploit Bayesian Recursive Filter
[Diagram labels: prior; likelihood; predicted; posterior; measurements; sampled detections; final detections]
Experimental Results: Miss Rate vs False Positives Per Image
Dollar, P.; Wojek, C.; Schiele, B.; Perona, P., “Pedestrian detection: A benchmark,” CVPR 2009
[Plot: Tuzel and our solution]
Experiments
• Save time
• Or same time and better accuracy
[Figures: standard SW; MS-PW]
Agenda
• Introduction
• Video Surveillance and Forensics
• Design Aspects for surveillance
• CV & PR for people surveillance
• People shape detection
• People behavior by trajectory analysis
• Conclusion
After people detection… People search
[Diagram labels: multimedia; forensics; surveillance]
People search for forensics and surveillance: why?
1) search people to answer a specific query (e.g., “I am looking for people with sunglasses and a blue jacket, with red luggage”)
2) search people similar to a given shape (similarity search, CBIR)
3) search people moving in an area or with a given behaviour (people data and metadata annotation for search and mining)
After people detection… People re-identification
Search for similarity (in a plain database of images and videos)
[Diagram labels: multimedia; forensics; surveillance]
Extension of the tracking problem (in videos)
• The tracking problem aims at finding an association between prediction and observation.
• Tracking matches a previously seen target if it appears again in the same camera, after a short time, in a position close to the previous one, and with a similar appearance.
Re-identification: a short survey
• Many dimensions of the problem
[Bar chart: papers per year, 1997 to 2010; vertical axis from 0 to 10]
Many research works
[Timeline, 1998 to 2010]
• Overlapped cameras, homography based [Cai98]
• Disjoint but close, common ground plane [Bla02, Jav03]
• Disjoint, CBIR-like [Lan03]
• 3-slice model [Lan03]
• 10-slice model [Bir05]
• First body model [Ghe06]
• Symmetry-driven body model [Far10]
• 3D body model [Bal10]
• [Cai98] Q. Cai and J.K. Aggarwal, “Automatic tracking of human motion in indoor scenes across multiple synchronized video streams,” Sixth International Conference on Computer Vision, 1998, pp. 356-362.
• [Bla02] J. Black, T. Ellis, and P. Rosin, “Multi view image surveillance and tracking,” Proceedings of Workshop on Motion and Video Computing, IEEE Comput. Soc, 2002, pp. 169-174.
• [Jav03] O. Javed, Z. Rasheed, K. Shafique, and M. Shah, “Tracking across multiple cameras with disjoint views,” Proc. IEEE International Conference on Computer Vision, 2003, pp. 952-957, vol. 2.
• [Lan03] M. Lantagne, M. Parizeau, and R. Bergevin, “VIP: Vision tool for comparing Images of People,” Vision Interface, 2003.
• [Bir05] N.D. Bird, O. Masoud, N.P. Papanikolopoulos, and A. Isaacs, “Detection of Loitering Individuals in Public Transportation Areas,” IEEE Transactions on Intelligent Transportation Systems, vol. 6, Jun. 2005, pp. 167-177.
• [Ghe06] N. Gheissari, T.B. Sebastian, and R. Hartley, “Person Reidentification Using Spatiotemporal Appearance,” Conference on Computer Vision and Pattern Recognition, 2006, pp. 1528-1535.
• [Far10] M. Farenzena, L. Bazzani, A. Perina, V. Murino, and M. Cristani, “Person re-identification by symmetry-driven accumulation of local features,” Conference on Computer Vision and Pattern Recognition, IEEE, 2010, pp. 2360-2367.
• [Bal10] D. Baltieri, R. Vezzani, and R. Cucchiara, “3D Body Model Construction and Matching for Real Time People Re-Identification,” Proc. of Eurographics Italian Chapter Conference 2010 (EG-IT 2010), 2010.
Requirements for re-identification
• Object/people detection (bounding box)
• Foreground detection (mask)
• Face/body-part detection (segmented regions)
• Single-camera tracking (temporal consistency and motion information)
A multi-dimensional problem
• Camera positioning: same camera, overlapping cameras, disjoint cameras
• Same camera: the system should be able to re-detect the same person whenever he or she appears again in the same camera. Viewpoint and color correction problems can be neglected [Yan99]
• Overlapping cameras: geometrical information can be exploited; main assumption: the people to match are captured at the very same instant [Cal08]
• Disjoint cameras: a much more complex case [Far10]
[Yan99] J. Yang, X. Zhu, R. Gross, J. Kominek, Y. Pan, and A. Waibel, “Multimodal people ID for a multimedia meeting browser,” International ACM Multimedia Conference, 1999, p. 159.
[Cal08] S. Calderara, R. Cucchiara, and A. Prati, “Bayesian-competitive consistent labeling for people surveillance,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, Feb. 2008, pp. 354-60.
[Far10] M. Farenzena, L. Bazzani, A. Perina, V. Murino, and M. Cristani, “Person re-identification by symmetry-driven accumulation of local features,” 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, 2010, pp. 2360-2367.
A multi-dimensional problem
• Single/multiple shots
• Single shot: these methods associate pairs of images, each containing one instance of an individual. They are mostly similar to those proposed for image retrieval, with some specialization to people.
– PROS: simple, fast. CONS: view dependent; less stable with occlusions and noise
• Multiple shots: information coming from multiple frames (or images) containing the same person is used as training data.
– PROS: more information gathered for the same person. CONS: alignment and increased data dimensionality
A multi-dimensional problem
• Signature: color, shape, position, texture, soft biometry
• Color:
– mean, histogram, Gaussian model, mixture of Gaussians
– color space: RGB, rgb, HSV
• Shape:
– width, height, h/w ratio, contour, spatial features
• Position/trajectory: position in the image or on the ground plane
• Texture: covariance matrix, SIFT/SURF
• Soft biometry: face, gait
A multi-dimensional problem: the body model
• No body model
• 2D body model
– Cylindrical model
– LTH (legs, torso, head)
• 3D body model
No body model: search for similarity
Content-based retrieval methods
Global descriptors: histograms, texture, Medioni’s circular histograms, mixture of Gaussians, covariance matrix…
S. Calderara, R. Cucchiara, A. Prati, “Multimedia Surveillance: Content-based Retrieval with Multicamera People Tracking,” Proc. of workshop VSSN at ACM Multimedia 2006
Cylindrical body model
• Cylindrical shape (or, more generally, a solid of revolution)
– The horizontal variations of the person’s appearance are neglected, supposing that the color or texture distribution along the vertical axis is the only important data.
– [Bir05]: the person mask is divided into ten horizontal stripes, and the mean color of each stripe is stored as the representative feature.
[Bir05] N.D. Bird, O. Masoud, N.P. Papanikolopoulos, and A. Isaacs, “Detection of Loitering Individuals in Public Transportation Areas,” IEEE Transactions on Intelligent Transportation Systems, vol. 6, Jun. 2005, pp. 167-177.
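The ten-stripe signature described for [Bir05] can be sketched as below. This is an illustrative reading of the slide and not the authors' code: the function names, the Euclidean comparison of stripe colors, and the handling of empty stripes are assumptions, and the mask is assumed to contain at least one foreground pixel.

```python
import numpy as np

def stripe_signature(image, mask, n_stripes=10):
    """Mean color of each horizontal stripe of the person mask.

    image: (H, W, C) color image; mask: (H, W) boolean foreground mask.
    Returns an (n_stripes, C) array; stripes without foreground stay NaN.
    """
    image = np.asarray(image, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    rows = np.flatnonzero(mask.any(axis=1))
    top, bottom = rows[0], rows[-1] + 1           # vertical extent of the person
    edges = np.linspace(top, bottom, n_stripes + 1).round().astype(int)
    signature = np.full((n_stripes, image.shape[2]), np.nan)
    for i in range(n_stripes):
        stripe_mask = mask[edges[i]:edges[i + 1]]
        if stripe_mask.any():
            signature[i] = image[edges[i]:edges[i + 1]][stripe_mask].mean(axis=0)
    return signature

def signature_distance(a, b):
    """Mean Euclidean distance between corresponding stripe colors."""
    return float(np.nanmean(np.linalg.norm(a - b, axis=1)))
```

Two detections of the same person would then be compared with `signature_distance`, with a smaller value suggesting a more similar vertical color layout.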
Legs-torso-head model
• The legs-torso-head model, instead, is mainly motivated by traditional Western clothing.
• The target silhouette is divided into three horizontal parts, ideally corresponding to:
– legs (and thus to the pants/skirt appearance)
– torso (i.e., shirt or jacket)
– head (i.e., hair).
[Lan03] M. Lantagne, M. Parizeau, and R. Bergevin, “VIP: Vision tool for comparing Images of People,” Vision Interface, 2003.
[Far10] M. Farenzena, L. Bazzani, A. Perina, V. Murino, and M. Cristani, “Person re-identification by symmetry-driven accumulation of local features,” 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, 2010, pp. 2360-2367.
[Figure: fixed partition [Lan03]; estimated partition [Far10]]
3D body models (1)
• Panoramic Appearance Map: surface of a 3D cylindrical model [Gan06].
[Figure: signature to compare]
[Gan06] T. Gandhi and M. Trivedi, “Panoramic Appearance Map (PAM) for Multi-camera Based Person Re-identification,” 2006 IEEE AVSS, IEEE, 2006, pp. 78-78.
3D body models (2)
• 3D vertex model with features stored and related to each vertex [Bal10]
[Bal10] D. Baltieri, R. Vezzani, and R. Cucchiara, “3D Body Model Construction and Matching for Real Time People Re-Identification,” Proceedings of Eurographics Italian Chapter Conference 2010 (EG-IT 2010), Genova, Italy: 2010.
A multi-dimensional problem
• Spatial localization of features: global, unmapped local, mapped local
• Global features: global color histogram, shape descriptors [Orw99]
• Unmapped local features: features are computed on patches or blocks but are not mapped to a body model or a relative position, e.g., Bag-of-Words with SIFT descriptors [Liu09]
• Mapped local features: features are referred to a human body model and to specific regions [Lan03, Met10]
[Orw99] J. Orwell, P. Remagnino, and G.A. Jones, “Multi-camera colour tracking,” VS’99, pp. 14-21.
[Liu09] K. Liu and J. Yang, “Recognition of People Reoccurrences Using Bag-Of-Features Representation and Support Vector Machine,” Chinese Conference on Pattern Recognition, 2009, pp. 1-5.
[Lan03] M. Lantagne, M. Parizeau, and R. Bergevin, “VIP: Vision tool for comparing Images of People,” Vision Interface, 2003.
[Met10] M. Metternich, M. Worring, and A. Smeulders, “Color Based Tracing in Real-Life Surveillance Data,” Trans. on Data Hiding and Multimedia Security V, vol. 6010, 2010, pp. 18-33.
[Figure: single shot vs multiple shots, by signature (color, shape, position, texture, soft biometry)]
Papers for Camera positioning and Adopted Feature
[Bar chart: number of papers (axis from 0 to 16) by camera positioning (same, overlapping, close-disjoint, disjoint) and adopted feature (colour, shape, texture, trajectory, position)]
Example: 3D body model for re-identification
• Camera positioning: disjoint
• Body model: 3D vertex-based body model, about 600 vertices, with scale factor
• Signature: local color histograms
• Requirements: calibration
• Re-identification is provided by comparing 3D models or view-specific projections of the model
ViSOR Re-Identification Dataset
• A new dataset designed for people re-identification
• 50+ people: at least 4 snapshots for each person, from different angles
• Position and orientation of each person w.r.t. the camera, for correct 2D/3D alignment
• Thanks to: an EU project in the Prevention, Preparedness and Consequence Management of Terrorism and other Security-related Risks Programme, European Commission – DG: JLS
http://www.openvisor.org
Agenda
• Introduction
• Video Surveillance and Forensics
• Design Aspects for surveillance
• CV & PR for people surveillance
• People shape detection
• People behavior by trajectory analysis
• Conclusion
After people detection… recognition
• Gait analysis
• Posture analysis
• Trajectory analysis
• Action/interaction/activity analysis
• Behavior analysis
• Motion in crowd
• Anomaly detection…
M. S. Ryoo, J. K. Aggarwal, “Semantic Representation and Recognition of Continued and Recursive Human Activities,” International Journal of Computer Vision, 2009
Kaiqi Huang; Dacheng Tao; Yuan Yuan; Xuelong Li; Tieniu Tan, “View-Independent Behavior Analysis,” IEEE Trans. SMC 2008
Cheriyadat, A.M.; Radke, R.J., “Detecting Dominant Motions in Dense Crowds,” IEEE Journal of Selected Topics in Signal Processing, 2008
Vijay Mahadevan, Weixin Li, Viral Bhalodia, Nuno Vasconcelos, “Anomaly Detection in Videos Using Mixtures of Dynamic Textures,” in Proc. of CVPR 2010
People trajectories in open space
• Given all the trajectories acquired by a video surveillance system:
– Which are the trajectories that share some specific location properties?
– Which are the trajectories that share some specific shape properties?
– Which are the most frequent behaviors?
– Who performed them? -> people retrieval
Available datasets of trajectories
Various time series (including trajectories):
http://www.cis.temple.edu/~latecki/TestData/TS_Koegh/
http://www.cs.ucr.edu/~eamonn/time_series_data/
Character Trajectories Data Set:
http://archive.ics.uci.edu/ml/datasets/Character+Trajectories
Pen-Based Recognition of Handwritten Digits Data Set:
http://archive.ics.uci.edu/ml/datasets/Pen-Based+Recognition+of+Handwritten+Digits
Video surveillance ETISEO project:
http://www-sop.inria.fr/orion/ETISEO/download.htm#video_data
Soccer player trajectories (1000 trajectories on soccer video):
T. D’Orazio, M. Leo, N. Mosca, P. Spagnolo, P. L. Mazzeo, “A Semi-Automatic System for Ground Truth Generation of Soccer Video Sequences,” in Proceedings of the 6th IEEE International Conference on Advanced Video and Signal Surveillance, Genoa, Italy, Sep. 2-4, 2009
ImageLab video surveillance dataset:
More than 1000 trajectories of a video surveillance scenario
Simone Calderara, Andrea Prati, Rita Cucchiara, “Body Part Tracking for Action Recognition,” J. Multimedia Intelligence and Security, 2010
Trajectory analysis in surveillance
• Trajectories are time series of data
• Working on datasets of time series is a well-studied data mining problem, which requires:
– A set of features characterizing trajectories
– A similarity measure between two time series
– A clustering technique to classify trajectories
In video-surveillance research:
• data availability is limited -> discover a model of the data
• data are imprecise and noisy -> statistical methods
• lack of reproducibility and high dynamicity -> adaptive methods for classification and clustering
Literature on Trajectory analysis
• Literature approaches to trajectory comparison can be classified:
– Depending on the data dimension (complete vs selected): use all the temporal data or select a subset
– Depending on the representation (original vs transformed): original feature space or a transformed space
– Depending on the feature (point-to-point vs statistical): adopt a point-to-point comparison or exploit a statistical model for data representation
– Depending on the similarity measure
B. Morris and M. Trivedi, “A survey of vision-based trajectory learning and analysis for surveillance,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, no. 8, pp. 1114–1127, Aug. 2008.
S. Calderara, A. Prati, R. Cucchiara, “Mixtures of von Mises Distributions for People Trajectory Shape Analysis,” in press, IEEE Trans. on Circuits and Systems for Video Technology, 2010-2011
Related Works
[Table: related works classified by feature (point to point vs statistical), representation (original vs transformed), dimension (complete vs selected), and distance.
Works (rows): Basharat08 (UCF, CVPR08); Hu06 (Maybank, PAMI06); Porikli04 (CVPRWs04); Junejo04 (UCF, ICPR04); Bashir03 (Schonfeld, ICIP03); Chen08 (Schonfeld, CVPR08); Piotto09 (TMM09); Ding08 (VLDB08); Shieh08 (KDD08); Calderara09 (AVSS09); Picciarelli09 (TCMS09).
Cell entries: statistical Gaussian; statistical HMM; HMM cross distance; Hausdorff; sampling; subsampling; PCA; Euclidean; null space projection; eigendecomposition; PCNSA (Principal Component Null Space Analysis) distance; SAX (symbolic aggregate approximation); breakpoints; breakpoints quantization; LB_Keogh; SAX symbol subspace; symbol-to-symbol DTW distance; symbol-to-symbol global alignment distance; approximated wrapped linear Gaussian (MoAWLG); GA KL-divergence pdf distance; SVM learning]
References:
(Basharat08) A. Basharat, A. Gritai, and M. Shah, “Learning object motion patterns for anomaly detection and improved object detection,” in Proc. of IEEE Int’l Conference on Computer Vision and Pattern Recognition, 2008.
(Porikli04) F. Porikli and T. Haga, “Event detection by eigenvector decomposition using object and frame features,” in Proc. of Computer Vision and Pattern Recognition (CVPR) Workshop, volume 7, pages 114–121, 2004.
(Hu06) W. Hu, X. Xiao, Z. Fu, D. Xie, T. Tan, and S. Maybank, “A system for learning statistical motion patterns,” IEEE Trans. on PAMI, 28(9):1450–1464, September 2006.
(Junejo04) I. Junejo, O. Javed, and M. Shah, “Multi feature path modeling for video surveillance,” in Proc. of Int’l Conference on Pattern Recognition, vol. 2, Aug. 2004, pp. 716–719.
(Bashir03) F. I. Bashir, A. A. Khokhar, and D. Schonfeld, “Segmented trajectory based indexing and retrieval of video data,” in Proc. of IEEE Int’l Conference on Image Processing, 2003, pp. 623–626.
(Chen08) X. Chen, D. Schonfeld, and A. Khokhar, “Robust null space representation and sampling for view invariant motion trajectory analysis,” in Proc. of IEEE Int’l Conference on Computer Vision and Pattern Recognition, 2008.
(Ding08) H. Ding, G. Trajcevski, P. Scheuermann, X. Wang, and E. J. Keogh, “Querying and mining of time series data: experimental comparison of representations and distance measures,” Proceedings of the VLDB Endowment, vol. 1, no. 2, pp. 1542–1552, 2008.
(Shieh08) J. Shieh and E. Keogh, “iSAX: Indexing and Mining Terabyte Sized Time Series,” SIGKDD 2008.
(Piotto09) N. Piotto, N. Conci, and F. De Natale, “Syntactic matching of trajectories for ambient intelligence applications,” IEEE Transactions on Multimedia, 11(7):1266–1275, Nov. 2009.
(Calderara09) S. Calderara, A. Prati, and R. Cucchiara, “Learning people trajectories using semi-directional statistics,” in Proceedings of IEEE International Conference on Advanced Video and Signal Based Surveillance (IEEE AVSS 2009), Genova, Italy, Sept. 2009.
(Picciarelli08) C. Piciarelli, C. Micheloni, and G.L. Foresti, “Trajectory-Based Anomalous Event Detection,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, no. 11, pp. 1544–1554, Nov. 2008.
Ding‐Keogh 08 proposal (point to point)
• The method proposed in (Ding-Keogh 08) compares time series in the original x-y data space:
$T_j = \{ (x_{k,j}, y_{k,j}) \mid k = 1 \ldots n_p \}$
• The comparison uses dynamic programming and Dynamic Time Warping (DTW)
• Inexact matching such as DTW is required to account for different lengths in time series and for temporal shifts
H. Ding, G. Trajcevski, P. Scheuermann, X. Wang, and E. J. Keogh, "Querying and mining of time series data: experimental comparison of representations and distance measures," Proceedings of the VLDB Endowment, vol. 1, no. 2, pp. 1542–1552, 2008.
(Ding08) Point‐to‐point Complete Original
• DTW algorithm
• The method is effective when comparing similar sequences, hence it is suitable when a very large dataset is available
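The point-to-point comparison can be illustrated with a minimal DTW sketch. It assumes a Euclidean local cost between 2D points and the classic three-way recurrence; the function name `dtw_distance` is hypothetical and this is not the exact configuration evaluated in (Ding08).

```python
import math

def dtw_distance(a, b):
    """Dynamic Time Warping cost between two sequences of (x, y) points.

    Assumption: the local cost is the Euclidean distance between points.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # extend the cheapest of: insertion, deletion, match
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

path = [(0, 0), (1, 0), (2, 0)]
slow_path = [(0, 0), (0, 0), (1, 0), (1, 0), (2, 0)]  # same shape, different length
print(dtw_distance(path, slow_path))
```

Because the warping absorbs repeated samples, the two sequences of different length are matched at zero cost, which is the behavior the slide motivates.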
Statistical models for trajectories
• Instead of point-to-point comparison
• create a statistical model of trajectory data
– To cope with noise
– To cope with the lack of a large database
– To cope with the uncertainty of the measurements
Trajectory shape analysis
• Trajectory shape analysis for "abnormal behavior" recognition
• Trajectory shape similarity, invariant to space shifts
• Not only space-based or time-based similarity
The approach of Imagelab
• Statistical model: mixture of representative pdfs
• Trajectory transformed into a sequence of symbols, each corresponding to the most representative pdf
• Trajectory alignment (with global alignment)
• Similarity based on pdf similarity
• Clustering (k-medoids), classification, detection of anomalies or similarities
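The alignment step can be sketched as a Needleman-Wunsch style global alignment over symbol sequences. This is only an illustration under stated assumptions: the symbol distance `dist` is taken to lie in [0, 1], the substitution score is set to `1 - 2 * dist` (so identical pdfs score +1 and maximally distant ones score -1), and the gap penalty of -1 is arbitrary. None of these constants come from the slides.

```python
def global_alignment_score(s, t, dist, gap=-1.0):
    """Global alignment score between two symbol sequences.

    dist(a, b) is a hypothetical symbol-to-symbol distance in [0, 1],
    for example a distance between the pdfs the symbols stand for.
    """
    n, m = len(s), len(t)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # s[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap          # t[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 1.0 - 2.0 * dist(s[i - 1], t[j - 1])  # assumed score mapping
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    return score[n][m]

# Toy distance: 0 for equal symbols, 1 otherwise
def toy_dist(a, b):
    return 0.0 if a == b else 1.0

print(global_alignment_score(["A", "B", "C"], ["A", "C"], toy_dist))
```

Replacing `toy_dist` with a graded pdf distance makes the match/mismatch reward vary smoothly with how similar the two underlying distributions are.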
1) Gaussian Model for spatial analysis
Sequence of 2D spatial coordinates:
$T_j = \{ (x_{1,j}, y_{1,j}), (x_{2,j}, y_{2,j}), \ldots, (x_{n_j,j}, y_{n_j,j}) \}$
Advantages of using spatial coordinates:
• Natural representation
• Embodies additional information about velocity and acceleration
• Some paths are more common than others depending on their position in the scene
• Partially represents the reaction of people to the structure of the scenario
Gaussian Model for spatial analysis
• Due to the uncertainties in the measurement of point coordinates, a Gaussian model is used for every point location
• The simplest way: a bivariate Gaussian centered on the point coordinates, with fixed variance:
$N_{i,k} = \mathcal{N}(x, y \mid \mu_{i,k}, \Sigma)$
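As a small illustration of the per-point model, the sketch below evaluates a bivariate Gaussian centered on a trajectory point. It assumes an isotropic covariance $\Sigma = \sigma^2 I$, which is only one possible reading of "fixed variance"; the function name and the value of `sigma` are hypothetical.

```python
import math

def point_gaussian_pdf(x, y, mu, sigma):
    """Density of N((x, y) | mu, sigma^2 * I) for a 2D point model.

    Assumption: isotropic covariance shared by all trajectory points.
    """
    dx, dy = x - mu[0], y - mu[1]
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    return norm * math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2))

# Each observed point becomes the mean of its own Gaussian
trajectory = [(10.0, 5.0), (11.0, 5.5), (12.5, 6.0)]
densities = [point_gaussian_pdf(11.0, 5.5, mu, sigma=1.0) for mu in trajectory]
print(densities)
```

The density is highest for the point model centered exactly on the queried location and decays with distance, which is how measurement uncertainty is absorbed.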
Clustering Trajectories
Positional Gaussian Clustering
• Frequent and anomalous behaviors can be obtained by clustering trajectories according to positions and detecting the most frequent activity zones (Gaussian model)
S. Calderara, A. Prati, R. Cucchiara, "Trajectory analysis in Video surveillance for multimedia forensic", in Proc. of 1st ACM Workshop on Multimedia in Forensics (MiFOR 2009), Beijing, China, 2009
Trajectory Shape Analysis by angles
Sequence of 2D spatial coordinates:
$T_j = \{ (x_{1,j}, y_{1,j}), (x_{2,j}, y_{2,j}), \ldots, (x_{n_j,j}, y_{n_j,j}) \}$
Sequence of 1D angles:
$\bar{T}_j = \{ \theta_{1,j}, \theta_{2,j}, \ldots, \theta_{n_j,j} \}$
Advantages of using angles:
• more compact representation
• invariant to spatial translations (both local and global), thus describing trajectory shape
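A possible way to obtain the angle sequence is sketched below. It assumes each angle is the direction of the segment between two consecutive points, wrapped to $[0, 2\pi)$; the slides do not spell out the exact definition, and with this choice $n$ points yield $n - 1$ angles.

```python
import math

def trajectory_angles(points):
    """Direction angle of each segment between consecutive (x, y) points.

    Assumption: angles are measured with atan2 and wrapped to [0, 2*pi).
    """
    angles = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angles.append(math.atan2(y1 - y0, x1 - x0) % (2.0 * math.pi))
    return angles

track = [(0, 0), (1, 0), (1, 1)]
shifted = [(x + 5, y - 3) for x, y in track]  # same shape, translated
print(trajectory_angles(track))
print(trajectory_angles(shifted))
```

Both calls print the same angles, illustrating the translation invariance listed among the advantages.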
Imagelab Proposal
1. Trajectory description with angle sequence
$\bar{T}_j = \{ \theta_{1,j}, \theta_{2,j}, \ldots, \theta_{n_j,j} \}$
2. Statistical representation with a Mixture of von Mises distributions (MovM), with definition of an EM algorithm for MovM
Von Mises:
$V(\theta \mid \theta_0, m) = \frac{1}{2\pi I_0(m)} e^{m \cos(\theta - \theta_0)}$
$I_0(m) = \frac{1}{2\pi} \int_0^{2\pi} e^{m \cos\theta} \, d\theta$
3. Coding with a sequence of selected vM pdf identifiers
4. Code alignment, using dynamic programming
5. Clustering with k-medoids, with definition of the Bhattacharyya distance for vM and on-line EM
A. Prati, S. Calderara, R. Cucchiara, "Using Circular Statistics for Trajectory Analysis", in Proceedings of CVPR 2008
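The von Mises density above can be evaluated with the standard library alone. In this sketch the modified Bessel function $I_0$ is computed from its power series, $I_0(m) = \sum_k (m/2)^{2k} / (k!)^2$; the truncation at 50 terms is an assumption that is adequate for moderate concentrations $m$, and the function names are hypothetical.

```python
import math

def bessel_i0(m, terms=50):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (m / 2.0) ** 2 / ((k + 1) ** 2)  # ratio of consecutive series terms
    return total

def von_mises_pdf(theta, theta0, m):
    """V(theta | theta0, m) = exp(m * cos(theta - theta0)) / (2 * pi * I0(m))."""
    return math.exp(m * math.cos(theta - theta0)) / (2.0 * math.pi * bessel_i0(m))

# Numerical check that the density integrates to about 1 over one period
n = 1000
step = 2.0 * math.pi / n
area = sum(von_mises_pdf(i * step, 1.0, 2.0) for i in range(n)) * step
print(area)
```

With $m = 0$ the density reduces to the uniform distribution on the circle, and larger $m$ concentrates the mass around $\theta_0$.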
Training set and on-line classification
[Diagram: two processing chains.
Training: trajectory repository → $\bar{T}_j = \{ \theta_{1,j}, \theta_{2,j}, \ldots, \theta_{n_j,j} \}$ → EM for MovM → coding with MAP, giving $\langle \bar{S}_j = \{ S_{1,j}, \ldots, S_{n_j,j} \}, \mathrm{MovM}(\bar{T}_j) \rangle$ → alignment → clustering with Bhattacharyya distance → trajectory clusters repository.
On-line: surveillance system → $\bar{T}_j = \{ \theta_{1,j}, \theta_{2,j}, \ldots, \theta_{n_j,j} \}$ → on-line EM for MovM → coding with MAP → alignment → classification with Bhattacharyya distance → normal/abnormal.]
Mixture of von Mises and Mixture of Gaussians
• MovM: $p(\theta) = \sum_{k=1}^{K} \pi_k V(\theta \mid \theta_{0,k}, m_k)$
• MoG: $p(x) = \sum_{k=1}^{K} \pi_k \mathcal{N}(x \mid \mu_k, \Sigma_k)$
[Figure: example three-component MovM and MoG densities, with weights $\pi_1 = 0.2$, $\pi_2 = 0.5$, $\pi_3 = 0.3$]
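The mixture density and the MAP coding step can be sketched together as follows. The weights reuse the values visible in the figure (0.2, 0.5, 0.3), while the component means and concentrations are made-up values for illustration; `map_code` is a hypothetical helper that assigns each angle the index of the component with the largest weighted density.

```python
import math

def bessel_i0(m, terms=50):
    """I0(m) from its power series (truncation length is an assumption)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (m / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def von_mises_pdf(theta, theta0, m):
    return math.exp(m * math.cos(theta - theta0)) / (2.0 * math.pi * bessel_i0(m))

def movm_pdf(theta, weights, means, concentrations):
    """p(theta) = sum_k pi_k * V(theta | theta0_k, m_k)."""
    return sum(w * von_mises_pdf(theta, t0, m)
               for w, t0, m in zip(weights, means, concentrations))

def map_code(angles, weights, means, concentrations):
    """Replace each angle by the index of its maximum a posteriori component."""
    symbols = []
    for theta in angles:
        post = [w * von_mises_pdf(theta, t0, m)
                for w, t0, m in zip(weights, means, concentrations)]
        symbols.append(max(range(len(post)), key=post.__getitem__))
    return symbols

weights = [0.2, 0.5, 0.3]                 # from the figure
means = [0.0, math.pi / 2, math.pi]       # hypothetical
concentrations = [4.0, 4.0, 4.0]          # hypothetical
print(map_code([0.1, 1.5, 3.0], weights, means, concentrations))
```

Each trajectory thus becomes a short sequence of component identifiers, which is the symbol sequence later compared by alignment.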
Inexact matching
• Since the symbols we are comparing correspond to pdfs, match/mismatch should be proportional to the distance between the two corresponding pdfs
• Need to evaluate the distance between two pdfs:
Angular: von Mises distributions $V(\theta \mid \theta_{0,a}, m_a)$ and $V(\theta \mid \theta_{0,b}, m_b)$
• Bhattacharyya distance between pdfs (closed form) [Cal08]
$d_B = 1 - \frac{1}{\sqrt{I_0(m_a) I_0(m_b)}} \, I_0\!\left( \frac{\sqrt{m_a^2 + m_b^2 + 2 m_a m_b \cos(\theta_{0,a} - \theta_{0,b})}}{2} \right)$
Spatial: Gaussian distributions $\mathcal{N}(x, y \mid \mu_{a,k}, \Sigma_a)$ and $\mathcal{N}(x, y \mid \mu_{b,m}, \Sigma_b)$
• Bhattacharyya distance between pdfs ($\Sigma_a = \Sigma_b = \Sigma$)
$d_B = \frac{1}{8} (\mu_a - \mu_b)^T \Sigma^{-1} (\mu_a - \mu_b)$
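Both distances can be written in a few lines. In this sketch the von Mises case follows from the Bhattacharyya coefficient $\int \sqrt{V_a V_b}\, d\theta = I_0(R/2) / \sqrt{I_0(m_a) I_0(m_b)}$ with $R = \sqrt{m_a^2 + m_b^2 + 2 m_a m_b \cos(\theta_{0,a} - \theta_{0,b})}$; placing the radicals this way is an assumption, since they are not legible on the slide. The Gaussian case assumes a shared 2x2 covariance. Function names are hypothetical.

```python
import math

def bessel_i0(m, terms=50):
    """I0(m) from its power series (truncation length is an assumption)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (m / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def bhattacharyya_von_mises(theta_a, m_a, theta_b, m_b):
    """d_B = 1 - I0(R / 2) / sqrt(I0(m_a) * I0(m_b))."""
    r2 = m_a ** 2 + m_b ** 2 + 2.0 * m_a * m_b * math.cos(theta_a - theta_b)
    r = math.sqrt(max(0.0, r2))  # guard against tiny negative rounding
    return 1.0 - bessel_i0(r / 2.0) / math.sqrt(bessel_i0(m_a) * bessel_i0(m_b))

def bhattacharyya_gaussian(mu_a, mu_b, sigma):
    """d_B = (1/8) (mu_a - mu_b)^T Sigma^{-1} (mu_a - mu_b), shared 2x2 Sigma."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dx, dy = mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]
    # Sigma^{-1} = (1 / det) * [[d, -b], [-c, a]]
    quad = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return quad / 8.0

print(bhattacharyya_von_mises(0.0, 2.0, math.pi, 2.0))
print(bhattacharyya_gaussian((0.0, 0.0), (2.0, 0.0), [[1.0, 0.0], [0.0, 1.0]]))
```

The von Mises distance stays in [0, 1), so it can be plugged directly into an alignment score that expects a bounded symbol distance.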
Comparison between VS and VLDB approaches
• Results on real dataset
Experimental comparison
• Clustering accuracy was measured using the same K-medoids based clustering on distance matrices computed with the different methods described

Test ID   Number of trajectories   (Ding08)   (Piotto09)   Our approach
T1        140                      78%        73%          95%
T2        108                      80%        87%          99%
T3        145                      94%        86%          96%
T4        100                      90%        80%          97%
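Since all methods are compared through the same clustering applied to a precomputed distance matrix, a minimal k-medoids sketch may help. It uses a simple alternate assign/update scheme initialized with the first k items; this initialization and the function name are assumptions, and the exact variant used in the experiments is not specified here.

```python
def k_medoids(dist, k, max_iter=100):
    """Cluster items given a full distance matrix dist[i][j].

    Returns (medoid indices, cluster label per item).
    Assumption: medoids are initialized to the first k items.
    """
    n = len(dist)
    medoids = list(range(k))
    labels = [0] * n
    for _ in range(max_iter):
        # Assignment step: each item joins its nearest medoid
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        # Update step: the new medoid minimizes total distance inside its cluster
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(members,
                                   key=lambda cand: sum(dist[cand][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels

values = [0, 1, 2, 10, 11, 12]
matrix = [[abs(a - b) for b in values] for a in values]
print(k_medoids(matrix, 2))
```

Because only the distance matrix is needed, any of the trajectory distances discussed above can be swapped in without changing the clustering code.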
Available data set
VISOR: Video Surveillance Online Repository
http://Imagelab.ing.unimore.it/visor
http://www.openvisor.org
Outdoor multicamera
Synchronized views
Applications
• The proposed system can be used for trajectory retrieval in forensic investigation:
– Query by shape (a)
– Location filtering (b)
– Snapshot and trajectory retrieval (c)
Working on trajectory at ImageLab
• People trajectory analysis for:
– Fetch video data from raw and annotated video files
– Find anomalies in people paths
– Compare trajectories in the dataset to retrieve similar elements:
• Shape
• Location
• Clustering
– Compare people appearances to retrieve similar elements
– View query results graphically
– View video sequences associated with:
• Trajectories
• Snapshots
Calderara S., Prati A., Cucchiara R., Mixtures of von Mises Distributions for People Trajectory Shape Analysis, in press, Trans. on Circuits and Systems for Video Technology, 2010-2011
Data Fetching and Main UI
M. Aravecchia, S. Calderara, S. Chiossi and R. Cucchiara, A Video Surveillance Data Browsing Software Architecture for Forensics: from Trajectories Similarities to Video Fragments. MIFOR 2010 at ACM Multimedia 2010
Query by Example (Shape)
Query by Example (Location)
Query by Drawing (Shape)
Clustering (by Shape or Location)
Query by Appearances
Video Segment Retrieval
Query Optimizer
• Inside the query engine a query optimization module is designed
• The Query Optimizer uses alternative feature similarity measures if provided.
• Rule-based optimization:
– When an alternative technique is provided, a rule can also be provided to trigger the alternative procedure.
• Index-based optimization:
– If clusters of the data are available, the optimizer limits the comparison to the cluster centers.
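The index-based strategy can be illustrated with a short sketch: the query is compared only with the cluster centers, and the members of the closest cluster are returned as candidates. The data layout (a list of `(center, members)` pairs), the function name and the toy distance are all hypothetical and do not describe the actual query engine.

```python
def index_based_query(query, clusters, distance):
    """Return the members of the cluster whose center is closest to the query.

    clusters: list of (center, members) pairs (hypothetical layout).
    Also returns the number of distance evaluations performed.
    """
    best_members, best_d = None, float("inf")
    comparisons = 0
    for center, members in clusters:
        d = distance(query, center)
        comparisons += 1
        if d < best_d:
            best_members, best_d = members, d
    return best_members, comparisons

# Toy data: scalar "trajectories" compared with absolute difference
clusters = [(1, [0, 1, 2]), (11, [10, 11, 12])]
print(index_based_query(9, clusters, lambda a, b: abs(a - b)))
```

With K clusters the query costs K distance evaluations instead of one per stored trajectory, trading some accuracy for query time.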
Performances and Results
The query optimizer automatically chooses the best-performing query strategy according to the dataset cardinality and the user's choice of investigation.
Achieved Goals
A three-layer architecture for video surveillance data browsing for forensic analysis, designed:
• To be flexible, with the possible addition of new feature models (Feature Model)
• To be easy to use, with a graphical user interface whose presentation models are feature specific (Presentation Layer)
• To filter people trajectories to obtain a few interesting samples and the related video sequences (Query Engine)
• To exploit different similarity measures to obtain a trade-off between query time and accuracy (Query Optimizer Module)
An extension: Action trajectories
• space-time trajectory (STT)
Simone Calderara, Andrea Prati, Rita Cucchiara, Body Part Tracking for Action Recognition, J. Multimedia Intelligence and Security, 2010
Agenda
• Introduction
• Video Surveillance and Forensics
• Design Aspects for surveillance
• CV & PR for people surveillance
• People shape detection
• People behavior by trajectory analysis
• Conclusion
Some Conclusion (1)
• Smart video surveillance, video analytics and multimedia forensics are no longer only a research game: it is time to transfer knowledge to companies
• Is the 80% done? Or is this another effect of Pareto's 80-20 principle?
• Processing terabytes of video is now straightforward. We need (but we have) data.
• Real-time processing is not a chimera
Some Conclusion (2)
The Future of Multimedia Surveillance & Forensics Architecture
• Architecture for non-IT experts.
• With software solutions for:
– Combining different analyses (real-time knowledge extraction and data mining)
– Allowing 3D-4D world reconstruction
– Presenting data in innovative, intuitive and interactive ways (touch, mobile..)
– Allowing traceability of operations (for legal issues)
– Dealing with privacy
• Focus on performance when analysing very large databases
Some Conclusion (3)
• There is so much work to do..
• Working on crowd
• Working on challenging partial crowd
• Working on moving sensors
• Working on 3D
• Working on multisensor, multimedia
• Working on forecasting and pro-active behavior analysis
• … will we solve the problems before 2020?
Thanks
..And thanks to Imagelab
For any details: http://Imagelab.ing.unimore.it
Andrea Prati, Roberto Vezzani, Costantino Grana, Simone Calderara, Giovanni Gualdi, Paolo Piccinini, Paolo Santinelli, Daniele Borghesani, Davide Baltieri, Sara Chiossi, Adnan Rashid, Michele Fornaciari, Manuel Aravecchia, Rudy Melli, Emanuele Perini, Giuliano Pistoni..